4 Helpful Ways to Use Regular Expression in Trados

Regex is short for regular expression. In SDL Trados Studio, regex can be used to filter segments that match a pattern, find content that matches a pattern, set up translation checks, create new segmentation rules that break segments in a TM, and run advanced searches.

Replace tab characters and paragraph marks

In regex, tab characters and paragraph marks (new line or soft return) are written as follows:

\n: new line (shift+Enter)

\t: tab
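The effect of these two escapes can be tried outside Trados. The following is a minimal sketch in Python, assuming that Python's re module treats \n and \t the same way as the Trados search box for these simple cases; the sample segments are made up for illustration.

```python
import re

# Hypothetical segments; in Trados these would be source or target segments.
segments = [
    "First line\nsecond line",   # contains a soft return
    "Name:\tValue",              # contains a tab
    "Plain sentence.",           # contains neither
]

# \n matches a line break and \t matches a tab.
line_break = re.compile(r"\n")
tab = re.compile(r"\t")

# Keep only the segments in which the pattern is found, like a display filter.
with_breaks = [s for s in segments if line_break.search(s)]
with_tabs = [s for s in segments if tab.search(s)]

print(len(with_breaks), len(with_tabs))
```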

For example, to filter segments containing paragraph marks, first open the find tool. There are three ways to reach the find and filter tools in Trados.

  • Search box in Review tab (1)
  • Advanced Display Filter in View tab (2)
  • Ctrl + F or Ctrl + H (3)

Then, choose to find in Source or Target text. Enter characters representing the paragraph mark in the search box and press Enter.

Note: The Regex box under the Search box in the Review tab is ticked by default. With the other two options, tick the Regex box manually to use regular expressions.

Option 1 (default)

Option 2 (tick to choose)

Option 3 (tick to choose)

Set up Verification tools

The translation checking tool of Trados, Verification, allows regex to be used in order to find potential errors. For example, we can create a rule for Trados to check if the paragraph marks in the target text are the same as in the source text.

First, go to the settings by selecting Project Settings > Verification > QA Checker 3.0 > Regular Expressions.

Tick the Search regular expressions box. In the Description box, enter a name for the verification rule. Next, enter the regex to be verified in the corresponding source and target boxes. Finally, select an action in the Condition box (e.g. report if the same regex in the source text can’t be found in the target text).

Thus, when you run Verification, Trados will report any segment in which the regex matches in the translation differ from those in the source text.
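The idea behind such a rule can be sketched in Python. This is only an illustration of the comparison, not how QA Checker is implemented; the function names and the warning text are hypothetical.

```python
import re

LINE_BREAK = re.compile(r"\n")

def count_line_breaks(text):
    # Number of line breaks found in one segment.
    return len(LINE_BREAK.findall(text))

def check_segment(source, target):
    # Return a warning string when the counts differ, otherwise None.
    src, tgt = count_line_breaks(source), count_line_breaks(target)
    if src != tgt:
        return f"Line breaks differ: source has {src}, target has {tgt}"
    return None

print(check_segment("Line one\nLine two", "Line one Line two"))
```

A segment pair with the same number of line breaks on both sides returns None, which corresponds to Trados reporting nothing for that segment.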

Create new segmentation rules to break segments in TM

By default, Trados breaks segments at common punctuation marks such as the period “.” and the colon “:”. However, we can customize the rules so that Trados also breaks segments at a specified regex, e.g. the newline character (\n).

To do this, go to the settings of the project’s TMs (Project Settings > [Language Pairs > All Language Pairs > Translation Memory and Automated Translation] > Settings). In a default project, the “Translation Memory and Automated Translation” window opens automatically when you select Project Settings.

Then, select Language Resources > Segmentation Rules.

Select Add to add segmentation rules to break segments.

In the Before break column, enter the regex at which segments should break. The Regular Expression checkbox becomes available only after a regex is entered manually.

Click OK to save. Then re-add the file to apply the new segmentation rules.
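To see what an extra break at \n does to a text, here is a rough Python sketch. It assumes a much simpler rule set than Trados actually uses (no handling of abbreviations, for example), and the function name split_segments is hypothetical.

```python
import re

def split_segments(text):
    # Break after a period or colon followed by whitespace, or at a newline.
    parts = re.split(r"(?<=[.:])\s+|\n", text)
    # Drop empty pieces and surrounding whitespace.
    return [p.strip() for p in parts if p.strip()]

text = "Title line\nFirst sentence. Second sentence."
for segment in split_segments(text):
    print(segment)
```

Without the \n alternative in the pattern, "Title line" and "First sentence." would stay in the same segment, which is the situation the custom rule is meant to fix.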

Advanced Search with Regex

Example 1: Find British English words such as: behaviour, colour, humour.

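Enter the following pattern:

Find: (\w+)our [Explanation: words ending in ‘our’]

In the above example, the parentheses to the left of ‘our’ define a group, and \w+ inside it matches one or more word characters (letters, digits, or underscores). Because the pattern does not anchor the end of the word, it also matches inside longer words such as ‘colourful’; appending \b (a word boundary) restricts the matches to words that actually end in ‘our’.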
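The pattern can be tested on a sample sentence before it is used in Trados. The sketch below uses Python's re module as a stand-in, on the assumption that it behaves like the Trados regex engine for this simple pattern; the sample sentence is invented.

```python
import re

pattern = re.compile(r"(\w+)our")
text = "Her behaviour and sense of humour add colour to the story."

# group(0) is the whole match, group(1) is the part captured by the parentheses.
matches = [m.group(0) for m in pattern.finditer(text)]
stems = [m.group(1) for m in pattern.finditer(text)]

print(matches)
print(stems)
```

Common words such as "four" or "hour" fit the same pattern, so the search results still need a manual review.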

Example 2: Find all dates in October, November and December in the text. Example: 20th November

Enter the following pattern:

Find: (\d+th)(\s)(October|November|December)

In this example, we used a regex containing 3 groups:

Group 1: (\d+th) – one or more numeric characters followed by “th” (e.g.: 20th)

Group 2: (\s) – a single whitespace character

Group 3: (October|November|December) – Any of the 3 words
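The three groups can be inspected individually. The following Python sketch is an approximation of what the Trados search does, run on a made-up sentence.

```python
import re

pattern = re.compile(r"(\d+th)(\s)(October|November|December)")
text = "The launch moved from 20th November to 5th December, not 14th July."

dates = []
for m in pattern.finditer(text):
    # group(1) is the day, group(3) is the month; group(2) is the space between.
    dates.append((m.group(0), m.group(1), m.group(3)))

print(dates)
```

Since group 1 requires the suffix "th", dates such as 1st, 2nd, 3rd, or 21st are not found by this pattern.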

Example 3: Find numbers by format. Example: 100,000.00

Find: \d+,\d+\.\d+

In this example, “\d+” represents any run of one or more digits. The entire search string above is interpreted as [digits],[digits].[digits], where each pair of square brackets stands for one or more digits. As such, a number like 10,00.2 will show up in the search results even though it is not well formed, while 15,231,562 will not, because it has no decimal point.
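A quick way to check which number formats the pattern accepts is to run it against a few samples. This is a small Python sketch with invented sample values, assuming Python's re module matches this pattern the same way Trados does.

```python
import re

pattern = re.compile(r"\d+,\d+\.\d+")
samples = ["100,000.00", "10,00.2", "15,231,562", "1000.50"]

# True when the pattern is found somewhere in the sample.
results = {s: bool(pattern.search(s)) for s in samples}

for sample, found in results.items():
    print(sample, found)
```

Only samples that contain both a comma and a decimal point are found, so the pattern checks the overall shape of a number rather than the correctness of its digit grouping.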

Thus, with regex, translators have more options to handle a file or filter segments at will. This will greatly help with translation quality control.
