Translation Memory technology. Translation memory systems: concept and implementation. OCR software

A project's translation memory (TM) is a repository of source strings and their translations into different languages. You can use it to speed up the translation of identical or similar strings in the same project or in other projects.

Each translation made in a project is automatically added to the project's translation memory. The project owner or managers can also upload an existing translation memory to the project if necessary.

Downloading or uploading translation memories

  1. In the project settings, open the TM & MT tab.
  2. Click Download or Upload.

You can upload and download TMs in the following file formats: .tmx, .csv, .xlsx.

If you are uploading a TM in .csv or .xlsx format, map the columns to the appropriate languages in the configuration dialog.
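The column mapping step can be illustrated outside the dialog as well. The following is only a sketch of the idea, not the service's own import logic: the function name `read_tm_csv`, the sample rows and the language codes are all assumptions made for illustration.

```python
import csv
import io

def read_tm_csv(text, column_languages):
    """Turn CSV rows into translation units keyed by language code.

    column_languages maps a zero-based column index to a language code
    (a hypothetical stand-in for the mapping made in the dialog);
    columns that are not listed in the mapping are ignored.
    """
    units = []
    for row in csv.reader(io.StringIO(text)):
        unit = {
            lang: row[index].strip()
            for index, lang in column_languages.items()
            if index < len(row) and row[index].strip()
        }
        if len(unit) >= 2:  # keep only rows with at least two languages filled in
            units.append(unit)
    return units

# Illustrative data: column 0 is English, column 1 Russian, column 2 German.
sample = "Save,Сохранить,Speichern\nCancel,Отмена,\n"
units = read_tm_csv(sample, {0: "en", 1: "ru", 2: "de"})
print(units)
```

Rows with an empty cell simply produce a unit without that language, so an incomplete file can still be imported.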

Linking TMs

To link specific TMs to your project, follow these steps:

  1. In the project settings, open the TM & MT tab.
  2. Click **Link TM**.
  3. Select the necessary TMs in the dialog box that opens.
  4. Click Save.

You can set the priority of selected TMs in the same dialog.

Sharing TM

With all your TMs shared, you can pre-translate any of your projects using all the shared TMs. In addition, the editor will show translation suggestions from all TMs assigned to projects you own.


Applying translation memory through pre-translation

Pre-translation via TM lets you apply 100% matches and perfect matches.

This article lists programs (translation memory programs, electronic dictionaries, text recognition programs, statistics programs, application localization programs, website translation programs, and other programs for translators), including free ones, that let you translate more text in less time. Short descriptions of these programs are given, with links to the original sources for downloading and installing them. We hope you find something useful here.

TRANSLATION MEMORY PROGRAMS

Translation memory (TM) programs allow you "not to translate the same thing twice." They are databases that contain previously translated units of text. If a new text contains a unit that is already in the database, the system automatically adds its translation. Such programs save the translator a significant amount of time, especially when working with texts of the same type.

Trados. At the time of writing, one of the most popular translation memory programs. Allows you to work with MS Word documents, PowerPoint presentations, HTML documents, and other file formats. Trados has a glossary module. Website: http://www.translationzone.com/trados.html

Deja Vu. Also one of the leaders in popularity. Allows you to work with documents in almost all popular formats. There are separate versions of the program for freelance translators and for translation agencies. Website: http://www.atril.com/

OmegaT. Supports a large number of popular formats, but documents in MS Word, Excel, PowerPoint need to be converted to other formats. Nice feature: the program is free. Website: http://www.omegat.org/

MetaTexis. Allows you to work with documents of the main popular formats. Two versions of the program are offered - a module for MS Word and a server program. Website: http://www.metatexis.com/

MemoQ. The functionality is similar to that of Trados and Deja Vu, while the cost of the program (at the time of writing) is lower than that of the more popular systems. Website: http://kilgray.com/

Star Transit. Designed for translation and localization. At the moment it is compatible with Windows only. Website: http://www.star-group.net/DEU/group-transit-nxt/transit.html

WordFisher. A free translation memory system created and maintained by a professional translator. Website: http://www.wordfisher.com/

Across. There are 4 different versions of the program, differing in the amount of functionality. Website: http://www.across.net/us/translation-memory.aspx

Catnip. A free program, the "successor" of the MT2007 program. Website: http://mt2007-cat.ru/catnip/

ELECTRONIC DICTIONARIES

Here we present only electronic dictionaries for offline use (without internet access). There are many more online dictionaries, and a separate article will be devoted to them. Although the Internet has reached the most remote corners of the planet, it is useful to have at least one dictionary that works offline. We considered only dictionaries for professional use; phrasebooks and dictionaries for language learners are not included.

ABBYY Lingvo. It currently allows you to translate from 15 languages. There are several versions of the program with different sets of dictionaries, including a version for mobile devices. The paid version of the dictionary is installed on a computer and can work without an Internet connection; the free one is available only online. The program is compatible with Windows, Symbian, Mac OS X, iOS, and Android. Website: http://www.lingvo.ru/

Multitran. Not everyone knows that there is an offline version of this popular dictionary. It can be installed on desktop and handheld computers and on smartphones. Works with Windows, Symbian and Android, as well as Linux (via a browser). At the moment it allows you to translate from and into 13 languages. Website: http://www.multitran.ru/c/m.exe

Promt. This program has versions for professional use. The advantage of Promt is that it can work with Trados. Website: http://www.promt.ru/

Slovoed. Translates from and into 14 languages. It can be installed on desktop computers, laptops, mobile devices, and Amazon Kindle readers. Works with the iOS, Android, Windows, Symbian, BlackBerry, bada, and Tizen operating systems. The dictionary has several versions, including highly specialized thematic dictionaries. Website: http://www.slovoed.ru/

SOFTWARE FOR TEXT RECOGNITION

ABBYY FineReader. Recognizes text in photos, scans, and PDF documents. The latest version (at the time of writing) recognizes text in 190 languages and checks spelling for 48 of them. You can save the resulting text in almost all popular formats (Word, Excel, PowerPoint, PDF, HTML, etc.). Website: http://www.abbyy.ru/finereader/

CuneiForm (OpenOCR). The program was created as a commercial product but is currently distributed freely. Compatible with the Linux, Mac OS X, and Windows operating systems. Website: http://openocr.org/

PROGRAMS FOR COUNTING STATISTICS

Translator's Abacus - a free program for counting the number of words in documents of various types. Website: http://www.globalrendering.com/

AnyCount - a paid program with a large number of settings. For example, you can count the number of characters with or without spaces, the number of words, lines, or pages, or define the counting unit yourself. Website: http://www.anycount.com/

FineCount - the program is available in two versions, paid and free, which differ in the number of functions. Website: http://www.tilti.com/

APP LOCALIZATION PROGRAMS

PROGRAMS FOR SITE TRANSLATION

OTHER PROGRAMS FOR TRANSLATORS

ApSIC Comparator - a program for comparing files (the original text versus the text with the translator's changes).

Computer-assisted translation (CAT) programs are designed specifically to facilitate translation on a computer, much as AutoCAD does for engineers or ArchiCAD for architects. Such software creates, stores, reads, and writes information in files called "translation memories". For each language pair the program creates a file named, for example, RU_EN or RU_IT, in which a word or phrase in one language is matched to its equivalent in the other language.

What is translation memory and how does it work

Translation memory (TM) is a large file containing technical terms, abbreviations, and set expressions.

If you have to translate, for example, the abbreviation "CCCP" from Russian into English using a CAT program, the program will immediately offer you a translation: Soviet Union.

At first glance this looks simple, but it is not. If the document being translated has nothing to do with history, the abbreviation can mean something completely different: carbonyl cyanide m-chlorophenyl hydrazone, a toxic ionophore and respiratory chain uncoupler. Or it could mean "Combined Community Codec Pack", a software package for Microsoft Windows for playing multimedia files.

That is why the company "Example" does not use automatic translation in the program. We use only terminological databases for technical translations.

One entry in a translation memory database corresponds to a segment, or "translation unit", which is usually a sentence (less often, part of a compound sentence or a whole paragraph). If a translation unit of the source text exactly matches a translation unit stored in the database (an exact match), its translation can be inserted automatically. The new segment may also differ slightly from the stored one (a fuzzy match). Such a segment can also be inserted into the translation, but the translator will have to make the necessary changes.
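The difference between exact and fuzzy matching can be shown with a short sketch. It assumes a translation memory held as a plain dictionary and uses the character-level similarity ratio from Python's `difflib` as the match score; real TM systems use their own scoring, so the function `lookup`, the 0.75 threshold and the sample entries are illustrative assumptions only.

```python
from difflib import SequenceMatcher

def lookup(segment, memory, threshold=0.75):
    """Return (match_type, stored_source, translation, score) or None.

    memory maps source segments to translations. The score is difflib's
    similarity ratio, used here as a stand-in for a tool's match percentage.
    """
    if segment in memory:
        return ("exact", segment, memory[segment], 1.0)
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[3]):
            best = ("fuzzy", source, target, score)
    return best

# Illustrative memory with two stored translation units.
memory = {
    "Press the Start button.": "Нажмите кнопку «Пуск».",
    "Close the cover.": "Закройте крышку.",
}

print(lookup("Press the Start button.", memory))  # exact match, usable as is
print(lookup("Press the Stop button.", memory))   # fuzzy match, needs editing
```

An exact match can be substituted directly, while a fuzzy match returns the stored pair only as a starting point for the translator.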

In addition to speeding up the translation of repeated fragments and of changes made to already translated texts (for example, new versions of software products or changes in legislation), TM systems also ensure that terminology is translated uniformly in identical fragments, which is especially important for technical translation. On the other hand, if a translator regularly inserts exact matches retrieved from the translation memory without checking how they fit the new context, the quality of the translated text may deteriorate.

Each TM system stores its data in its own format (a text format in Wordfast, an Access database in Deja Vu), but there is an international standard, TMX (Translation Memory eXchange format), which is based on XML and which almost all TM systems can generate. Thanks to this, completed translations can be used in different applications; that is, a translator working with OmegaT can use a TM created in Trados and vice versa.
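To give an idea of what such an exchange file looks like, here is a minimal sketch that writes and reads a TMX-style document with Python's standard `xml.etree.ElementTree` module. It covers only the basic `tu`/`tuv`/`seg` structure and plain-text segments; the helper names, the tool name in the header and the sample data are assumptions, and real TMX files may carry inline markup and extra attributes that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# ElementTree stores the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def build_tmx(units, source_lang="en"):
    """Serialize a list of {language: text} dicts as a minimal TMX document."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for unit in units:
        tu = ET.SubElement(body, "tu")  # one translation unit
        for lang, text in unit.items():
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})  # one language variant
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

def parse_tmx(document):
    """Read translation units back as a list of {language: text} dicts."""
    units = []
    for tu in ET.fromstring(document).iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")})
    return units

units = [{"en": "Close the cover.", "ru": "Закройте крышку."}]
document = build_tmx(units)
print(document)
print(parse_tmx(document))
```

Because both functions rely only on the shared structure, a file written by one program can in principle be read by another, which is the point of the exchange format.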

Most TM systems support at least the creation and use of user dictionaries, the creation of new databases from parallel texts (alignment), and semi-automatic extraction of terminology from source and parallel texts.

Popular TM software systems

According to surveys of the use of TM systems, the most popular systems include:

The English Wikipedia has a list comparing the capabilities of different systems.

Translation memory standards and formats

  • TMX (Translation Memory eXchange format). This standard allows translation memories to be exchanged between different translation memory providers. TMX is widely used among translators and is best suited for importing and exporting translation memories. The latest version of the format, 1.4b, allows the original documents and their translations to be restored from a TMX file.
  • TBX (TermBase eXchange format). This LISA (Localization Industry Standards Association) format is currently being revised and republished as ISO 30042. The standard allows terminology to be exchanged, including detailed lexical information. The core of TBX is defined by the standards ISO 12620, ISO 12200 and ISO 16642. ISO 12620 provides a registry of well-defined "data categories" with standardized names that function as data element types or predefined values. ISO 12200 (also known as MARTIF) provides the basis for the core structure of TBX. ISO 16642 (also known as the Terminological Markup Framework) includes a structural metamodel for terminological markup languages in general.
  • SRX (Segmentation Rules eXchange) is designed to complement the TMX format and make the transfer of translation memories between programs more efficient. Being able to specify the segmentation rules that were used in the previous translation makes it easier to match segments of the current text against the content of the TM.
  • GMX (GILT Metrics). GILT stands for Globalization, Internationalization, Localization, and Translation. The GILT Metrics standard consists of three parts: GMX-V for volume metrics, GMX-C for complexity metrics, and GMX-Q for quality metrics. The proposed standard aims to quantify the scope of work and the quality requirements of GILT tasks.
  • OLIF is an XML-compatible open standard used for the exchange of terminological and lexical data. Although it was originally used to exchange lexical data between proprietary machine translation lexicons, the format has gradually evolved into a more general standard for terminology exchange.
  • XLIFF (XML Localization Interchange File Format) was created as a single interchange file format recognized by all localization software tools. XLIFF is the best way to exchange information in XML format in today's translation industry.
  • TransWS (Translation Web Services) defines the parameters required for calling web services when sending and receiving files and messages related to localization projects. It was conceived as a comprehensive system for automating the localization process using services on the Internet.
  • xml:tm. This approach to translation memory is based on the concept of text memory, which combines author memory and translation memory. The xml:tm format was provided to LISA OSCAR by XML-INTL.

Advantages and disadvantages

Advantages

  • Reducing the time and volume of the translator's work
  • Improving translation consistency, especially when a group of translators are working on the same project.
  • Increasing profits by raising the productivity of a translator or a group of translators
  • Improving the quality of services by increasing the accuracy and uniformity of the translation of terms, especially in specialized texts.

Disadvantages

  • Can make the translation more "dry"; the very essence of the text is lost if translation with a translation memory is performed by an unskilled translator
  • The sentence or text proposed by the program often has no connection with neighboring sentences or with the text as a whole
  • The original must be in electronic format
  • One unnoticed mistake can spread to the entire project
  • Training in the program itself is required, possibly more than once when changing jobs (if employers work with different TM programs)
  • Not suitable for all types of texts
  • High cost of licensed software

Wikimedia Foundation. 2010.


Translation memory systems: concept and implementation

  1. Ideology of TM tools
  2. General principle of TM
  3. Composition of a TM system
  4. Functions of TM
  5. Overview of the main programs of the Translation Memory class: TRADOS 3.0, Deja Vu, SDLX 3.0, Transit and TermStar, WordFisher 4, IBM Translation Manager 2.0
  6. Advantages and disadvantages of TM

Ideology of TM tools

Makoto Nagao, Japan, University of Kyoto. In 1982, he proposed a new concept of machine translation, based on the assertion that texts should be translated by analogy with texts previously translated by hand. M. Nagao called his approach "example-based translation". M. Nagao's idea was used by some ...

What is Translation Memory?

Translation Memory (TM) is a database in which completed translations are stored. TM technology works on the principle of accumulation: during translation, the source segment (sentence) and its translation are stored in the TM. When a new text arrives for translation, the system compares each of its sentences with the segments stored in the database. If an identical or similar segment is found, it is displayed together with its translation and a match percentage. Words and phrases that differ from the stored text are highlighted. Thus, the translator needs to translate only the new segments and edit the partially matching ones. Each change or new translation is saved in the TM.
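The match percentage and the highlighting of differing words can be sketched as follows. This is an assumption-laden illustration rather than the method of any particular program: it compares segments word by word with `difflib.SequenceMatcher`, reports the similarity ratio as a percentage, and marks differing words with square brackets; the function name `compare_segments` is hypothetical.

```python
from difflib import SequenceMatcher

def compare_segments(new_segment, stored_segment):
    """Return (match_percent, marked_text) for a new segment and a stored one.

    Words of the new segment that do not occur at the same place in the
    stored segment are wrapped in [brackets] as a stand-in for highlighting.
    """
    new_words = new_segment.split()
    stored_words = stored_segment.split()
    matcher = SequenceMatcher(None, stored_words, new_words)
    marked = []
    for tag, _, _, j1, j2 in matcher.get_opcodes():
        words = new_words[j1:j2]
        if tag == "equal":
            marked.extend(words)
        else:
            # 'replace' and 'insert' carry new words; 'delete' carries none.
            marked.extend(f"[{word}]" for word in words)
    return round(matcher.ratio() * 100), " ".join(marked)

percent, marked = compare_segments("Press the Stop button.", "Press the Start button.")
print(percent, marked)
```

In this example three of the four words coincide, so the translator would only need to change the translation of the bracketed word.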

Ideology of TM tools

TM tools are designed to store pairs of sentences in a translation database. Each pair consists of a sentence from the original and its translation into another language. It is also possible to store text fragments that are longer than a sentence or are only part of one. In automatic mode, however, it is sentences that are saved, so such programs are quite often called "sentence memory".

Operating principle of TM tools

TM programs integrate with office applications such as Word. Some of them have their own text editing tools. Their interfaces differ little from those of the text editors familiar to a modern translator.

Translation memory and auxiliary programs for translation

Classes: MT (machine translation) - automatic or machine translation; CAT (computer-assisted/aided translation) - programs that automate and facilitate various aspects of a translator's work and implement the concept of translation memory, such as Trados, OmegaT, Deja Vu, Wordfast, etc.

How modern CAT programs work

The program divides the source text into segments (as a rule, these are sentences or parts of sentences), and the translator enters the translation of each segment directly below the source text or, if the text is presented in the form of a table, to the right of it. The translation of the segment is saved along with the original text. The name of the translator and the date of the translation are also recorded (which is important for teamwork). You can return to the segment at any time to check or change the translation. The program places the segment in the translation memory, so that if it occurs again in the source text, its translation will be substituted from the TM automatically. In addition, the CAT program has a fuzzy matching function: it detects segments that are only partially similar to those already translated (for example, matching by 75%), and gives "hints" for their translation.
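The workflow described above (segment the text, store each translation with the translator's name and date, and reuse it when the segment recurs) can be sketched in a few lines. The segmentation rule here is deliberately naive, and the names `Entry`, `split_into_segments` and `pretranslate`, as well as the sample data, are assumptions made for illustration; actual CAT programs use configurable segmentation rules.

```python
import re
from dataclasses import dataclass
from datetime import date

@dataclass
class Entry:
    translation: str
    translator: str      # recorded for teamwork
    translated_on: date

def split_into_segments(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def pretranslate(text, memory):
    """Pair each segment with its stored translation, or None if it is new."""
    return [
        (segment, memory[segment].translation if segment in memory else None)
        for segment in split_into_segments(text)
    ]

# The translator saves one segment; it is then reused wherever it recurs.
memory = {}
memory["Open the cover."] = Entry("Откройте крышку.", "A. Translator", date(2024, 1, 15))

result = pretranslate("Open the cover. Remove the filter. Open the cover.", memory)
for segment, translation in result:
    print(segment, "->", translation)
```

Segments paired with `None` are the ones left for the translator; fuzzy matching, omitted here for brevity, would offer hints for them.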

Trados automated translation system

Trados is a computer-aided translation system developed by the German company Trados GmbH in 1992. It is one of the world leaders in the class of Translation Memory (TM) systems. The Trados system includes several modules designed for translating texts in various formats (Microsoft Word documents, PowerPoint presentations, HTML texts and other metadata, FrameMaker and Interleaf documents, and others), as well as for maintaining terminological databases (the MultiTerm module).

How the Trados system works

The concept of Translation Memory involves identifying fragments of the text being translated whose translations are already available in the translation database, thereby reducing the translator's workload. This identification is called alignment, or matching. Fragments that remain untranslated after alignment are passed on for manual processing by a translator or to a machine translation (MT) system. At this stage, the translator can select the newly translated fragments and enter new pairs of parallel texts in two languages into the database. This scheme works best on texts of the same type, where phrases are repeated often.

The main modules of the Trados system

  • Trados Workbench - the main module for translating documents, integrated into the shell of Microsoft Word
  • TagEditor - a module for translating documents in HTML, XML and similar formats
  • WinAlign - a module for creating a translation memory from previously translated bilingual texts
  • S-Tagger - a module for translating documents in FrameMaker and Interleaf formats
  • T-Window - a module for translating documents in PowerPoint format
  • MultiTerm - a module for maintaining glossaries
  • ExtraTerm

What Translation Memory programs have in common

  • Alignment
  • Maintenance
  • Glossary (terminology program)
  • Text editor (document editor)
  • Concordance (linking usage to context)

Advantages and disadvantages of Translation Memory programs

Advantages

  • Reduced time and effort for the translator
  • Improved translation consistency, especially when several translators work on the same project
  • Increased profits through higher productivity
  • Improved quality of services through more accurate translation of terms, especially in specialized texts

Disadvantages

  • Can make the translation more "dry"; the very essence of the text is lost
  • The sentence or text proposed by the program often has no connection with neighboring sentences or with the text as a whole
  • The original must be in electronic form
  • One error spreads to the entire project
  • Training in the program itself is required, possibly more than once when changing jobs (if employers work with different TM programs)
  • Not suitable for all types of texts
  • High cost
