Learn about language industry standards

A standard can be understood simply as an agreed, unified way of doing something. In other words, every job needs an order and rules of implementation. Standards help managers control work systematically, control product quality, and meet market criteria.

Standards help standardize processes and products, and they become a symbol of quality. Or, as ISO puts it, standards “make things work”.

We are familiar with many standards in different fields, such as ISO standards for quality management, medical devices, European standards, German standards, etc. There are also standards for the language industry. You may not know it, but many of these standards are being applied every day without being recognized.

1. The importance of standards

Imagine a world without standards…

That’s an interesting assumption. To start, there would presumably be no currency, so trade would be limited to barter, which would be further limited to only those things that one could create by hand. Getting to trade centers itself would also be an uncertain proposition. Would the road be wide enough for your cart when there are no width standards?

There is a great, early episode of the Radiolab podcast that describes life in the American Midwest in the 1800s before the practice of standardized timekeeping. Prior to the introduction of the railroad system, the clocks in banks, saloons and shops as well as people’s watches were set against the general rising and setting of the sun.

As these towns became connected through a network of rail lines, the absence of a time standard emerged as a problem. How to develop route paths with connections if one train’s “late” is another train’s “early”? It was a problem that ultimately required the adoption of the time standard “rail time”. Today, the world (and the trains) use Coordinated Universal Time.

The language industry also ran into problems in the era before standards. Before the wide adoption of Unicode, every language and platform had its own encoding permutations. A paragraph of text in Czech would need to be encoded in Windows-1250 if it was to be read on a Windows machine, while for the Mac it would need to be “Code Page 10029”. Back then, we were constantly transcoding content, and character corruption was frequent.
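The mismatch described above can be reproduced with a few lines of Python. This is a minimal sketch, assuming that Python's `cp1250` codec stands in for Windows-1250 and its `mac_latin2` codec for the Mac Central European code page; the sample sentence is only an illustration.

```python
# Czech pangram containing most Czech diacritics (illustrative sample text).
text = "Příliš žluťoučký kůň úpěl ďábelské ódy"

# The same sentence as it would have been stored for each platform.
windows_bytes = text.encode("cp1250")      # Windows-1250
mac_bytes = text.encode("mac_latin2")      # Mac OS Central European

# Same characters, different bytes: the two files are not interchangeable.
print(windows_bytes == mac_bytes)

# Reading the Windows bytes with the Mac table corrupts the accented letters.
garbled = windows_bytes.decode("mac_latin2", errors="replace")
print(garbled)

# With Unicode (here UTF-8) on both sides, the round trip is lossless.
restored = text.encode("utf-8").decode("utf-8")
print(restored == text)
```

The plain ASCII letters survive the wrong decoding, which is why this kind of corruption typically showed up only on accented characters.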

It was a problem made worse by the fact that a lot of translation was done within Microsoft Word using the RTF format, a changing protocol that made encoding decisions based on the name of the font used when typing the text. Since fonts themselves could have different encoding variants, and people often did not have the same fonts on their computers, text that looked perfect in one instance of Microsoft Word could appear corrupted in another!

Fortunately, Unicode became the norm and RTF as a bi-text format was replaced by XLIFF. Thanks to that, we hardly ever run into character corruption today.

2. Language industry standards

Unicode and XML are two of the best-known standards for representing and structuring multilingual content. You may also be familiar with the standards TMX, TBX and SRX (for the exchange of translation memory content, terminology assets and segmentation rules, respectively), developed by the now defunct Localization Industry Standards Association (or LISA).

Here are a few that you might not be familiar with.

XLIFF 2

XLIFF 2 is a version of XLIFF, but it is technically its own standard because it is not backward compatible with its predecessor. XLIFF 2 is the result of years of real-world application of XLIFF, and therefore a response to some of XLIFF’s limitations. It provides greater extensibility (through modules), including the Metadata Module, a valuable capability in the context of content enrichment.

XLIFF 2 is called out because, as an industry that did a good job supporting the earlier XLIFF, we are under the misapprehension that we are already “standardized”. XLIFF 1 is certainly better than no XLIFF, but XLIFF 2 is better than XLIFF 1. There is more work to be done.
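To make the XLIFF 2 structure and its Metadata Module more concrete, here is a minimal sketch that reads a tiny, hand-written XLIFF 2.0 document with Python's standard library. The sample file, the `enrichment` category, the `topic` metadata type and the helper name `read_units` are all assumptions made for illustration; real files exported by a TMS will be richer.

```python
import xml.etree.ElementTree as ET

# Namespaces of the XLIFF 2.0 core and of its Metadata Module.
NS = {
    "x": "urn:oasis:names:tc:xliff:document:2.0",
    "mda": "urn:oasis:names:tc:xliff:metadata:2.0",
}

# Hypothetical sample document: one unit, one segment, one metadata entry.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       xmlns:mda="urn:oasis:names:tc:xliff:metadata:2.0"
       version="2.0" srcLang="en" trgLang="cs">
  <file id="f1">
    <unit id="u1">
      <mda:metadata>
        <mda:metaGroup category="enrichment">
          <mda:meta type="topic">greeting</mda:meta>
        </mda:metaGroup>
      </mda:metadata>
      <segment>
        <source>Hello, world!</source>
        <target>Ahoj, světe!</target>
      </segment>
    </unit>
  </file>
</xliff>
"""

def read_units(xml_text):
    """Return each unit's id, (source, target) pairs and metadata entries."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    units = []
    for unit in root.iterfind(".//x:unit", NS):
        # Metadata Module entries attached to this unit, keyed by their type.
        meta = {m.get("type"): m.text for m in unit.iterfind(".//mda:meta", NS)}
        segments = [
            (seg.findtext("x:source", default="", namespaces=NS),
             seg.findtext("x:target", default="", namespaces=NS))
            for seg in unit.iterfind("x:segment", NS)
        ]
        units.append({"id": unit.get("id"), "segments": segments, "meta": meta})
    return units

units = read_units(SAMPLE)
print(units)
```

Because the metadata lives in its own namespace, a tool that only understands the core can still process the segments and simply carry the module content along.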

TAPICC

TAPICC (Translation API Cases and Classes) is a standards-oriented initiative sponsored by GALA (the Globalization and Localization Association). It seeks to standardize the API methods used by the various CMS, TMS and LQA tools and other systems that need to exchange information during localization.

MQM/DQF

The Multidimensional Quality Metrics (MQM) standard and the MQM-harmonized Dynamic Quality Framework (DQF) from TAUS constitute a framework with which we, as an industry, can define and quantify language quality using, well, the same language.
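As a rough illustration of what "quantifying" quality can look like, the sketch below turns a list of annotated issues into a penalty total and a per-word score. The severity weights, the category names and the formula are assumptions chosen for the example, not values prescribed by MQM or DQF; real projects define their own.

```python
from dataclasses import dataclass

# Illustrative severity weights (an assumption; each project sets its own).
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 25}

@dataclass
class Issue:
    category: str   # e.g. "accuracy", "terminology", "fluency"
    severity: str   # one of the keys in SEVERITY_WEIGHTS

def penalty_points(issues):
    """Sum the weight of every annotated issue."""
    return sum(SEVERITY_WEIGHTS[issue.severity] for issue in issues)

def quality_score(issues, word_count):
    """Return 1.0 for an error-free text, lower as penalties accumulate."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1 - penalty_points(issues) / word_count

issues = [Issue("terminology", "minor"), Issue("accuracy", "major")]
print(penalty_points(issues))
print(quality_score(issues, word_count=200))
```

The point of a shared typology is that two reviewers, or two vendors, annotating the same text end up with comparable numbers.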

Internationalization Tag Set (ITS)

The Internationalization Tag Set is a way of enriching XML and HTML documents with information that supports the localization effort (or with information about the localization work already done on them).
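One of the simplest ITS annotations is the local `its:translate` attribute, which marks content as translatable or not. The sketch below is a simplified reading of it, assuming a small made-up document and a hypothetical helper `translatable_text`; it handles only local attributes with inheritance and ignores global rules and the other ITS data categories.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE_ATTR = f"{{{ITS_NS}}}translate"

# Hypothetical sample document annotated with local its:translate flags.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <title>User guide</title>
  <para>Press <code its:translate="no">Ctrl+S</code> to save.</para>
  <legal its:translate="no">Acme GmbH <note its:translate="yes">registered name</note></legal>
</doc>
"""

def translatable_text(elem, inherited=True):
    """Collect text chunks whose effective translate flag is 'yes'."""
    flag = elem.get(TRANSLATE_ATTR)
    # A local flag overrides the inherited one; element text defaults to translatable.
    translate = inherited if flag is None else (flag == "yes")
    chunks = []
    if translate and elem.text and elem.text.strip():
        chunks.append(elem.text.strip())
    for child in elem:
        chunks.extend(translatable_text(child, translate))
        # Tail text belongs to the parent element, so the parent's flag applies.
        if translate and child.tail and child.tail.strip():
            chunks.append(child.tail.strip())
    return chunks

chunks = translatable_text(ET.fromstring(SAMPLE))
print(chunks)
```

Here the keyboard shortcut and the company name are protected from translation, while the note nested inside the protected element is explicitly switched back on.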

In our interview with Professor Dave Lewis from the ADAPT Research Centre on the subject of the Provenance of Global Content, the professor described some of the work he has done to support the notion of provenance tracking within the tag set.

This article has introduced some of the most popular and commonly used standards in the language industry. Subscribe here to learn more about language as well as digital content.

As a pioneer in the field of translation and language solutions in Vietnam, AM Vietnam always updates and applies the latest standards in the industry. With two basic foundations, ISO 9001 and ISO 17100, AM Vietnam is confident that it can meet all customer requirements, whether in terms of quality or technical factors.