What to do if you see hieroglyphs instead of text (in Word, a browser, or a text document). How to copy text from PDF to Word and deal with hieroglyphs after conversion

User's question

Hello.

Please tell me why some pages in my browser show hieroglyphs, squares, and who knows what else instead of text (nothing can be read). This did not happen before.

Thanks in advance...

Good day!

Indeed, sometimes when you open a web page you see various "krakozyabry" (as I call them) instead of text, and reading it is impossible.

This happens because the text on the page is written in one encoding, and the browser tries to open it in another. Because of this mismatch, you get an incomprehensible set of characters instead of text.
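
To make the mismatch concrete, here is a minimal Python sketch (an illustration only, not part of the original answer). It assumes the page was written in UTF-8 and wrongly read as Windows-1251; the sample word is arbitrary.

```python
# The same bytes, read with two different encodings.
text = "Привет"                    # sample text, chosen only for illustration

raw = text.encode("utf-8")         # the bytes as the site actually sent them
garbled = raw.decode("cp1251")     # the reader wrongly assumes Windows-1251
print(garbled)                     # an unreadable set of characters

# Undo the mistake: go back to the bytes, then decode with the right encoding.
restored = garbled.encode("cp1251").decode("utf-8")
print(restored)
```

The bytes never change; only the table used to turn them into characters does, which is why switching the encoding in the program is enough to make the text readable again.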

Let's try to fix it ...

Fixing hieroglyphs in text

Browser

In general, Internet Explorer used to produce such krakozyabry often, whereas modern browsers (Chrome, Yandex Browser, Opera, Firefox) detect the encoding quite well and are very rarely mistaken. Moreover, some browser versions have already removed the encoding selector, and to set this parameter manually you need to download add-ons or dig through a dozen checkboxes deep in the settings ...

So, suppose the browser detected the encoding incorrectly and you saw the following (as in the screenshot below) ...

Most often, the confusion occurs between UTF-8 (Unicode) and Windows-1251 (most Russian-language sites use one of these encodings).

  1. press the left Alt key: a menu appears at the top. Click the "View" menu;
  2. select "Text Encoding", then choose Unicode. Voila: the hieroglyphs on the page immediately become normal text (screenshot below)!

Another tip: if you cannot find the encoding setting in your browser (and giving instructions for every browser is simply unrealistic), I recommend opening the page in another browser. Very often another program opens the page correctly.

Text documents

Many questions about krakozyabry come up when opening text documents, especially old ones, for example the Readme of some program from the last century (say, a game).

Of course, many modern notepads simply cannot read the DOS encoding that was used back then. To solve this problem, I recommend using the Bred 3 editor.

BRED 3.

A simple and convenient text notepad. An indispensable tool when you need to work with old text files. Bred 3 lets you change the encoding with one mouse click and make unreadable text readable! Besides text files, it supports quite a wide variety of documents. In general, I recommend it!

Try opening your problematic text document in Bred 3. An example is shown in my screenshot below.
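
If you prefer a script to an editor, the same conversion can be sketched in Python. This is a minimal sketch under assumptions: the old file is taken to be saved in cp866 (the Russian DOS code page), and the function name and file names are hypothetical, chosen only for the example.

```python
import tempfile
from pathlib import Path

def convert_dos_text(src: Path, dst: Path, source_encoding: str = "cp866") -> str:
    """Read a text file saved in a DOS code page and write a UTF-8 copy."""
    text = src.read_bytes().decode(source_encoding)
    dst.write_text(text, encoding="utf-8")
    return text

# Demo on a throwaway file that stands in for an old README.
with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp) / "README.TXT"          # hypothetical file name
    new_file = Path(tmp) / "README_utf8.txt"     # hypothetical file name
    old_file.write_bytes("Читай меня".encode("cp866"))
    print(convert_dos_text(old_file, new_file))
```

The original file is left untouched; the UTF-8 copy should open correctly in any modern notepad.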

Another notepad that works well with text files in various encodings is Notepad++. In general, of course, it is better suited to programming, because it supports syntax highlighting for more convenient reading of code.

An example of changing the encoding is shown below: to make the text in the example readable, it is enough to change the encoding from ANSI to UTF-8.

Word documents

Very often, the problem with krakozyabry in Word comes from confusing two formats, DOC and DOCX. The point is that since Word 2007 (if I'm not mistaken) there has been a DOCX format (it compresses the document more strongly than DOC and protects it more reliably).

So, if you have an old Word that does not support this format, then when you open a DOCX document you will see hieroglyphs and nothing more.

There are 3 solutions:

  1. download the special add-on from the Microsoft website that lets you open new documents in the old Word. From personal experience, though, not all documents open, and the document's layout suffers badly (which in some cases is critical);
  2. use Word alternatives (though the document's layout will also suffer);
  3. update Word to a modern version.

Also, when you open a document whose encoding Word "doubts", it asks you to specify the encoding yourself. An example is shown in the figure below; try choosing:

  1. Windows (default);
  2. MS DOS;
  3. Other ...

Windows and menus in various Windows applications

It happens that a window or menu in a program is shown as hieroglyphs (reading or making out anything is, of course, impossible).

  1. Russifier. Quite often a program has no official Russian-language support, but many enthusiasts make Russifiers (localization patches). Most likely, this Russifier refused to work on your system. So the advice is simple: try installing another one;
  2. Switching the language. Many programs can be used without Russian by switching to English in the language settings. Really, why do you need a translated label in some utility instead of a "Start" button?
  3. If the text used to display correctly but no longer does, try restoring Windows, provided, of course, that you have restore points;
  4. Check the language and regional settings in Windows; the cause often lies there.

Language and regional settings in Windows

To open the settings menu:

  • press Win + R;
  • enter intl.cpl and press Enter.

intl.cpl: language and regional settings

Check that the "Formats" tab is set to "Russian (Russia) // Use the Windows interface language (recommended)" (example in the screenshot below).

In the Location tab, set the location to Russia.

And in the Advanced tab, set the system language to "Russian (Russia)". After that, save the settings and restart the PC. Then check again whether the interface of the program displays correctly.
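
To see which text encoding the system currently hands to programs, a short Python check can help. This is only a diagnostic sketch: it reads the setting and changes nothing.

```python
import locale

# The encoding that programs fall back to for text when none is specified.
# On a Windows PC whose system language is Russian this is typically "cp1251";
# other systems and configurations report other values.
system_encoding = locale.getpreferredencoding(False)
print("System text encoding:", system_encoding)
```

If the reported value is not the one you expect after changing the settings, the restart has most likely not happened yet or the setting was not saved.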

And finally, something that is probably obvious to many, yet some people open files in programs not intended for them: for example, they try to read a DOCX or PDF file in ordinary Notepad. Naturally, in this case you will see krakozyabry instead of text; use the programs intended for the file type (Word 2007+ and Adobe Reader for the examples above).
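
When the file extension is missing or misleading, the first bytes of the file usually tell you what it really is. Below is a small Python sketch (the function name is hypothetical) that checks three well-known signatures: PDF files start with "%PDF-", DOCX files are ZIP archives, and old DOC files are OLE containers.

```python
def sniff_format(first_bytes: bytes) -> str:
    """Guess a document type from the first bytes of a file."""
    if first_bytes.startswith(b"%PDF-"):
        return "PDF"
    if first_bytes.startswith(b"PK\x03\x04"):
        return "ZIP container (DOCX, XLSX and similar)"
    if first_bytes.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "OLE container (old DOC, XLS and similar)"
    return "unknown, possibly plain text"

# Sample inputs; for a real file, read its first 8 bytes in binary mode.
print(sniff_format(b"%PDF-1.4 ..."))
print(sniff_format(b"PK\x03\x04rest"))
print(sniff_format(b"Hello, world"))
```

A file that reports a ZIP or OLE container will never look right in Notepad, whatever encoding you pick.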

That's all, good luck!

Question from the user

Good day.

Please tell me. I have a file in PDF format, and I need to edit it (change part of the text, add headings and highlighting). I think it is best to do this in Word.

How do I convert this file to DOCX format (which Word works with)? I tried several services, but some give an error, and others transfer the text but lose the pictures. Is it possible to do better?

Marina Ivanova (Nizhny Novgorod)

Good day!

Yes, in office work you have to deal with this task from time to time. In some cases it is solved quite easily; in others, everything is very difficult ☺.

The fact is that PDF files can be different:

  1. as pictures: each page is a photo/picture, i.e. there is no text in it at all. This is the hardest variant to work with, because turning it all into text is like working with a scanned sheet (anyone with a scanner will understand ☺). In this case it is advisable to use special programs;
  2. as text: the file contains text that is compressed into PDF format and is protected (or not protected) from editing (this type is, as a rule, easier to work with). In this case both online services and programs will do.

This article looks at several ways to convert PDF to Word. I think everyone will be able to find the most suitable one among them and get the task done ☺.

Programs

Microsoft Word.

New versions of Word (at least 2016) have a special tool for converting PDF files. Moreover, nothing extra is required of you: just open a PDF and agree to the conversion. In a couple of minutes you get the result.

And, by the way, this function in Word works quite well (and with any type of PDF file). That is why I recommend trying this method first.

How to use: first open Word, then click "File / Open" and select the file you need.

When asked about the conversion, just agree. After some time you will see your file as text.

Pros: fast; no actions required from the user; acceptable result.

Cons: a paid program; part of the document's formatting may be lost; not all pictures will be transferred; you cannot influence the conversion process, everything runs in automatic mode.

Note!

Instead of Word and Excel, you can use free alternatives with similar functionality.

ABBYY FineReader.

Restrictions in the trial version: 100 pages for recognition; the software works for 30 days after installation.

But this program is one of the most universal: it can process any PDF file, picture, photo, or scan. It works on the following principle: it marks out blocks of text, pictures, and tables (there is an automatic mode and a manual one), and then recognizes the text in these blocks. At the output you get a regular Word document.

By the way, the latest versions of the program are aimed at the novice user: using the program is very simple. In the first welcome window, select "Image or PDF file to Microsoft Word" (see the screenshot below).

FineReader: popular tasks in the welcome window

Next, the program automatically splits your document into pages, marks out all the blocks on each page, and recognizes them. All that is left for you is to correct errors and save the document in DOCX format (by the way, FineReader can also save to other formats: HTML, TXT, DOC, etc.).

FineReader: recognizing text and pictures in a PDF file

Pros: can turn any picture or PDF file into text; the best recognition algorithms; options for checking the recognized text; works even with the most hopeless files that all other services and programs gave up on.

Cons: a paid program; you need to manually specify blocks on each of the pages.

Readiris Pro.

Trial limit: 10 days of use or processing 100 pages.

This program is a competitor of sorts to FineReader. It will help you scan a document from the printer (even if you have no drivers for it!), and then recognize the information from the scan and save it in Word (in this article we are interested in the second part, namely recognition ☺).

By the way, thanks to very tight integration with Word, the program can recognize mathematical formulas, various non-standard characters, hieroglyphs, etc.

Pros: recognition of different languages (English, Russian, etc.); many formats for saving; good algorithms; lower system requirements than comparable programs.

Cons: paid; errors occur and manual processing is needed.

Free PDF to Word Converter

Developer site: http://www.free-pdf-to-word-converter.com/

A very simple program for quickly converting PDF files to DOC. The program is completely free, and when converting it tries to fully preserve the original formatting (which many alternatives lack).

Although the program has no Russian interface, everything is easy to figure out: in the first window, specify the PDF files (Select file); in the second, the format for saving (for example, DOC); in the third, the folder where the converted documents will be saved (by default, "My Documents" is used).

All in all, a good and convenient tool for converting relatively simple files.

Online services

Smallpdf.

Free

Smallpdf.com: a free solution to all PDF problems

An excellent free service for converting and working with PDF files. It has everything you might need: compression, conversion between JPG, Word, PPT, and PDF, merging, rotating, editing, etc.!

Benefits:

  1. high-quality and fast conversion and editing;
  2. a simple and convenient interface: even a complete novice will figure it out;
  3. available on all platforms: Windows, Android, Linux, etc.;
  4. the service is free to use.

Disadvantages:

  1. it does not work with some types of PDF files (those where pictures need to be recognized).

Converter PDF.

Cost: about $9 per month

This service lets you process only two pages for free (you have to pay for the rest). But it can convert a PDF file to a wide variety of formats: Word, Excel, PowerPoint, images, etc. It also uses different algorithms than its counterparts (they give processing quality an order of magnitude higher than the alternatives). Actually, it is thanks to this functionality and these algorithms that I added it to the review ...

By the way, the first two pages are enough to judge whether it is worth buying a subscription (about $9 per month).

Zamzar

Free

A multifunctional online converter that works with a bunch of formats: MP4, MP3, PDF, DOC, MKV, WAV, and many others. Although the service looks somewhat odd, it is quite simple to use, because all actions are performed step by step (see the screenshot above: Step 1, 2, 3, 4).

  1. Step 1: select a file.
  2. Step 2: choose the format to convert to.
  3. Step 3: enter your email address.
  4. Step 4: press the button to start the conversion.

Features:

  1. a bunch of formats for converting from one to another (including PDF);
  2. the possibility of batch processing;
  3. very fast algorithm;
  4. the service is free;
  5. there is a limit on the file size: no more than 50 MB;
  6. the conversion result arrives by email.

Convertio.

Free

A powerful free online service for working with various formats. As for PDF, the service can convert files to DOC format (by the way, it copes even with complex PDFs that others could not handle), compress them, merge them, etc.

No restrictions on file size or structure were found. To add a file, you do not even need to have it on disk: it is enough to specify its URL and then download the finished document in DOC format from the service. Very convenient, I recommend it!

ilovepdf.

Free

Similar to the previous site: it also has the full set of functions for working with PDF, including compression, merging, splitting, and conversion (to various formats). It lets you quickly convert various small PDF files.

Of the minuses: the service cannot process files that consist of pictures (i.e. PDFs with no text; you will get nothing out of them here, and the service returns an error saying there is no text in the file).

PDF.io.

Free

A very interesting and multifunctional online service. It converts PDF to Excel, Word, JPG, HTML, and PNG (and performs the same operations in the opposite direction). In addition, you can use it to compress files of this type and to merge and split pages. In general, a convenient helper in office work ☺.

Of the minuses: the service does not cope with all types of files (in particular, for some it reports that they contain no text).

Additions are welcome ...

When printing a PDF file, the printer prints hieroglyphs, or, as my accountants at my old job used to say, "Vitaly, come over, our PDF is printing abracadabra". Today the same thing happened at work, and since I try to describe solutions to such problems on my blog as fully as I can, I decided to post instructions for fixing hieroglyphs in PDF files. This problem can be solved in several ways (there may be more, but I will describe the ones I know).

Method 1

This is the most reliable and time-tested way!

  1. Open Registry Editor (Start -> Run -> regedit.exe)
  2. Go to
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontSubstitutes
  3. Delete the values: "Courier,0"="Courier New,204"
    "Arial,0"="Arial,204"
  4. Restart the PC

P.S. Restarting the computer is mandatory!

Method 2

Probably the longest of the ways: download the Russian-language version of Adobe Reader itself:

  • Download the latest version of Adobe Reader from the official site http://get.adobe.com/ru/Reader/
  • After that, open the file and enjoy life

Method 3

This way is the fastest, but it also affects the print quality of the document:

  • When printing the document, go to Advanced and select Print As Image (File - Print - Advanced - Print As Image)

Method 4

This method is the most effective and radical, because the problem is solved at the Windows registry level:

  • Download Adobe Reader (this matters for the future, because it is better to have the latest version of this program)
  • Create a .reg file with the following lines in it, then run it, agree to everything it asks, and restart the computer.
    Windows Registry Editor Version 5.00

    "1250"="c_1251.nls"
    "1251"="c_1251.nls"
    "1252"="c_1251.nls"
    "1253"="c_1251.nls"
    "1254"="c_1251.nls"
    "1255"="c_1251.nls"

    "ARIAL"=dword:000000cc

    "Arial,0"="Arial,204"
    "Arial Cyr,0"="Arial,204"
    "Comic Sans MS,0"="Comic Sans MS,204"
    "Courier,0"="Courier New,204"
    "Courier,204"="Courier New,204"
    "Courier New Cyr,0"="Courier New,204"
    "Fixedsys,0"="Fixedsys,204"
    "Helv,0"="MS Sans Serif,204"
    "MS Sans Serif,0"="MS Sans Serif,204"
    "MS Serif,0"="MS Serif,204"
    "Small Fonts,0"="Small Fonts,204"
    "System,0"="Arial,204"
    "Tahoma,0"="Tahoma,204"
    "Times New Roman,0"="Times New Roman,204"
    "Times New Roman Cyr,0"="Times New Roman,204"
    "Tms Rmn,0"="MS Serif,204"
    "Verdana,0"="Verdana,204"

That's all! :-) So we have learned how to fix hieroglyphs when printing a PDF document. Thank you all for your attention.

The PDF format is quite often used to publish all kinds of electronic documents. Scientific papers, essays, books, magazines, and much more are published in PDF.

When faced with a document in PDF format, users often do not know how to copy its text into Word. If you have a similar problem, our article should help. Here you will learn 4 ways to copy text from PDF to Word.

The easiest way to copy text from PDF to Word is the ordinary copying you use all the time. Open your PDF file in any PDF viewer (for example, Adobe Reader), select the desired part of the text, right-click it, and select "Copy".

You can also copy the text with the Ctrl-C key combination. After copying, the text can be pasted into Word or any other text editor.

Unfortunately, this method is far from always suitable. If the PDF document is protected from copying, you will not be able to copy the text. Also, a PDF document may contain tables or pictures that cannot simply be copied. If you run into such a problem, the following ways to copy text from PDF should help.

Copying text from a PDF file to Word using ABBYY FineReader

ABBYY FineReader is a text recognition program. It is typically used to recognize text in scanned images, but it can also recognize PDF files. To do this, open ABBYY FineReader, click the "Open" button, and select the PDF file you need.

After the program finishes text recognition, click on the "Go to Word" button.

After that, a Word document with the text from your PDF file should open.

Copying text from a PDF file to Word using a converter

If you cannot use ABBYY FineReader, you can resort to converter programs. Such programs let you convert a PDF document to a Word file. For example, you can use the free program UniPDF.

To convert a PDF document to a Word file using UniPDF, you just need to open the program, add the PDF file to it, select conversion to Word, and click the "Convert" button.

Copying text from a PDF file to Word using online converters

There are also online converters that let you convert a PDF file to a Word file. Usually such online converters work worse than specialized programs, but they let you copy text from PDF to Word without installing additional software, so they deserve a mention too.

Using such converters is quite simple. All you need to do is upload a file and click the "Convert" button. Once the conversion is complete, download the resulting file.

Good day.

Probably every PC user has faced this problem: you open a web page or a Microsoft Word document, and instead of text you see hieroglyphs (various "krakozyabry", unfamiliar letters, numbers, etc. (as in the picture on the left ...)).

It is fine if the document (with hieroglyphs) is not particularly important, but what if you need to read it?! I am asked such questions, and asked to help open such texts, quite often. In this short article I want to look at the most common causes of hieroglyphs (and, of course, eliminate them).

Hieroglyphs in text files (.txt)

The most common problem. The point is that a text file (usually in TXT format, but it may also be PHP, CSS, INFO, etc.) can be saved in different encodings.

An encoding is a set of characters needed to fully support writing text in a particular alphabet (including digits and special signs). More about it here: https://ru.wikipedia.org/wiki/Nab_Simvolov

Most often the cause is the same: the document is simply opened in the wrong encoding, which causes confusion, and the codes of some characters are interpreted as other characters. Various incomprehensible characters appear on the screen (see Fig. 1) ...

Fig. 1. Notepad - the problem with encoding

How to deal with it?

In my opinion, the best option is to install an advanced notepad, such as Notepad++ or Bred 3. Let's look at each in more detail.

Notepad++.

One of the best notepads for both novice users and professionals. Pros: free, supports Russian, works very quickly, has syntax highlighting, opens all common file formats, and a huge number of options let you customize it.

As for encodings, everything is in perfect order: there is a separate "Encoding" menu (see Fig. 2). Just try changing ANSI to UTF-8 (for example).

After I changed the encoding, my text document became normal and readable: the hieroglyphs disappeared (see Fig. 3)!

Fig. 3. The text has become readable ... Notepad++

BRED 3.

Another wonderful program designed to completely replace the standard Notepad in Windows. It also works easily with many encodings, switches between them easily, supports a huge number of file formats, and supports new versions of Windows (8, 10).

By the way, Bred 3 helps a lot when working with "old" files saved in MS DOS formats. When other programs show only hieroglyphs, Bred 3 easily opens them and lets you work with them comfortably (see Fig. 4).

If you see hieroglyphs instead of text in Microsoft Word

The very first thing to pay attention to is the file format. The point is that starting with Word 2007 a new format, "DOCX", appeared (previously it was just "DOC"). Usually the old Word cannot open new file formats, but it sometimes happens that these "new" files do open in the old program.

Just open the file's properties and look at the "Details" tab (as in Fig. 5). This tells you the file format (in Fig. 5, the file format is "TXT").

If the file format is DOCX and you have an old Word (older than 2007), just update Word to 2007 or newer (2010, 2013, 2016).

Next, when opening a file, note that Word will ask you which encoding to open the file in (by default this option is always enabled, unless of course you have some unknown custom build; the message appears at any "hint" of problems when opening a file, see Fig. 6).

Fig. 6. Word: File Conversion

Most often Word determines the required encoding automatically, but the text does not always come out readable. You need to move the selector to the encoding at which the text becomes readable. Sometimes you literally have to guess how the file was saved in order to read it.
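
The guessing can be automated to a degree. The Python sketch below is a crude heuristic of my own, not how Word does it: it tries a short list of candidate encodings (the list is an assumption suited to Russian text) and keeps the one where the largest share of characters are Russian letters or plain ASCII. All names are hypothetical.

```python
CANDIDATES = ["utf-8", "cp1251", "cp866"]   # assumed candidates for Russian text

def is_expected(ch: str) -> bool:
    """True for plain ASCII text characters and basic Russian letters."""
    if ch in "\r\n\t" or (ch.isascii() and ch.isprintable()):
        return True
    low = ch.lower()
    return "а" <= low <= "я" or low == "ё"

def readable_share(text: str) -> float:
    """Fraction of characters that look like normal text."""
    if not text:
        return 0.0
    return sum(is_expected(ch) for ch in text) / len(text)

def guess_encoding(raw: bytes) -> str:
    """Return the candidate encoding that yields the most readable text."""
    best, best_score = "", -1.0
    for enc in CANDIDATES:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue                      # these bytes are invalid in this encoding
        score = readable_share(text)
        if score > best_score:            # on a tie, the earlier candidate wins
            best, best_score = enc, score
    return best

sample = "Привет, мир".encode("cp866")    # pretend we do not know the encoding
print(guess_encoding(sample))
```

Such scoring cannot tell apart encodings that both map the same bytes to Russian letters (KOI8-R versus Windows-1251, for example), so treat the result as a first guess and check the text with your own eyes.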

Fig. 8. The browser detected the wrong encoding

To fix how the site is displayed, change the encoding. This is done in the browser settings:

  1. Google Chrome: settings (icon in the upper right corner) / advanced settings / encoding / Windows-1251 (or UTF-8);
  2. Firefox: the left Alt key (if the top panel is turned off), then View / Page Encoding / select the desired one (most often Windows-1251 or UTF-8);
  3. Opera: Opera (red icon in the upper left corner) / Page / Encoding / select the desired one.
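
Before guessing, a browser normally looks at the encoding the server declares in the Content-Type header. A minimal Python sketch of reading that declaration is shown below; the header values are made-up samples, the function name is hypothetical, and falling back to UTF-8 is my assumption.

```python
from email.message import Message

def declared_charset(content_type: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default

# Sample header values, not taken from any real site.
print(declared_charset("text/html; charset=windows-1251"))
print(declared_charset("text/html"))

# Decoding the page body with the declared charset gives readable text.
body = "Привет".encode("cp1251")
print(body.decode(declared_charset("text/html; charset=windows-1251")))
```

When the declared charset is missing or wrong, the browser has to guess, and that is exactly the situation where a manual encoding switch helps.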

PS.

Thus, this article covered the most frequent cases of hieroglyphs caused by an incorrect encoding. Using the methods above, you can solve all the main problems with wrong encodings.

I would be grateful for additions on the topic. Good luck 🙂