Multilingual Typesetting: What’s it all about?

What is multilingual typesetting? How should one go about it? What are the best tips for obtaining professional multilingual typesetting?

This site provides a reference resource to answer these questions and more on multilingual typesetting.

Below is a brief introduction, and you can read more on specific issues in our blog.

What is multilingual typesetting?

First things first: typesetting in this context refers to using computers to arrange and format text. The word derives from the time when individual letters cast in metal were composed into lines of text. Thin strips of lead were inserted between lines to create additional space between successive baselines (hence: “increase the leading”!). The internet offers many histories of typesetting. Nowadays commercial typesetting uses software such as Adobe InDesign or QuarkXPress on a Mac or PC. The document markup language LaTeX is also still used, but largely in academia.

We use multilingual to refer to the use of several languages within the text being typeset. In practice, “multilingual typesetting” is often used interchangeably with “foreign language typesetting” to mean typesetting in languages other than English. While this is linguistically incorrect, it does not seem unreasonable given that typesetting is an English word, and the default language of computing has long been English. Therefore, this site does not restrict itself to the issues raised by typesetting one document containing many languages, but also looks at converting documents from one language to another.

Early multilingual typesetting by Caslon
Early multilingual typesetting: a typeface and language specimen sheet from 1728 by William Caslon

Foreign language typesetting in InDesign, Quark, Illustrator…

While many programs exist for manipulating text, any professional typesetting is most likely going to employ InDesign, Quark or Illustrator. Which is better for multilingual typesetting? It depends on the project, but the answer is often “InDesign”. It is not only an extremely powerful typesetting and design program, but also provides good support for many languages.

Perhaps not surprisingly, InDesign's sister product Illustrator also provides good support for most languages. However, if you have to use Quark for other reasons, not all is lost. Its multilingual abilities have improved over the years, supporting Unicode since Quark 7, although it remains to be seen how well Quark 9 builds on this.

Foreign languages, different scripts

Different languages use different writing systems. English is written in the Latin alphabet, also called the Roman alphabet. The letters in this alphabet are often casually referred to as Roman characters. These characters, with the addition of the odd accent, are in use throughout the languages of western Europe and by extension in most of North and South America, Australasia and central and southern Africa. The same characters form the base for writing many central and eastern European languages, albeit with some extra letters. In multilingual typesetting terminology, these languages are often referred to as CEE (Central and Eastern Europe) to flag the fact that these additional characters are required.

Further to the east, languages such as Russian, Ukrainian and Bulgarian are written using the Cyrillic script. All of the languages mentioned so far are written from left to right, but some writing systems (such as Arabic) run from right to left. Chinese languages can all be written in one of two common scripts: Simplified Chinese being used in mainland China (the PRC) and Traditional Chinese still in use elsewhere. Simplified Chinese is generally written horizontally from left to right. Historically Chinese, Japanese, and Korean were written vertically in columns going from top to bottom, with the columns arranged from right to left. Both this arrangement and left to right, horizontal orientations remain in use in Japanese, Korean and Traditional Chinese.
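As a rough illustration of how software can tell left-to-right text from right-to-left text, here is a minimal Python sketch. It uses the standard library's unicodedata module and a simplified "first strongly directional character" check. The function name base_direction and the sample words are our own choices for illustration, and a real layout engine does far more than this.

```python
import unicodedata

def base_direction(text: str) -> str:
    """Guess 'ltr' or 'rtl' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):   # Hebrew-type and Arabic-type letters
            return "rtl"
        if bidi_class == "L":           # Latin, Cyrillic, CJK and many others
            return "ltr"
    # Assumption: fall back to left-to-right when no strong character is found.
    return "ltr"

# Sample words chosen for illustration only.
samples = {
    "English": "Typesetting",
    "Russian": "Набор",
    "Arabic": "تنضيد",
    "Hebrew": "סדר",
    "Chinese": "排版",
}

for language, word in samples.items():
    print(language, base_direction(word))
```

Note that vertical setting for Chinese, Japanese or Korean is a layout decision made in the typesetting program. It is not something this kind of character check can detect.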

Needless to say, the world contains many, many more writing systems: each fascinating in its own right, with its own quirks and characteristics.

Multilingual typesetting map showing global writing systems
Multilingual typesetting uses some of these scripts more often than others!
Pic: 23prootie/Wikimedia Commons

Unicode

Anyone typesetting in languages beyond those of western Europe needs to be Unicode aware. Unicode is what the name suggests: a unifying way of encoding the world's languages. Think back to school and learning about the periodic table. Now imagine a giant table, not of chemical elements, but of characters used in writing. In fact, the table has 109,000 characters covering 93 scripts. Before Unicode, character encodings tended to cover just one language and were often incompatible with each other, making truly multilingual documents and websites hard to implement. Nowadays, Unicode should always be the default. Occasionally it is important to make this clear. For instance, when hiring a Bengali translator, it would be crucial to instruct them to use Unicode.
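To make the "giant table" idea concrete, the following Python sketch looks up a few characters in the Unicode table and shows what goes wrong when text is passed between two older, single-language encodings. The helper names describe and can_encode, and the sample string, are ours and purely illustrative. The code pages used (cp1251 for Cyrillic, cp1252 for western European) are just two examples of legacy encodings.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def can_encode(text: str, codec: str) -> bool:
    """Report whether every character in text exists in the given encoding."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Every character has one slot in the table, whatever its script.
for ch in "Łжব":
    print(describe(ch))

# A Unicode encoding such as UTF-8 holds Polish, Russian and Bengali together.
mixed = "Łódź, Москва, বাংলা"
print(can_encode(mixed, "utf-8"))    # all three scripts fit
print(can_encode(mixed, "cp1251"))   # the Cyrillic code page cannot hold them all

# The same byte means different letters in different legacy encodings.
legacy_bytes = "ж".encode("cp1251")      # saved by a Cyrillic system
misread = legacy_bytes.decode("cp1252")  # opened by a western European system
print(misread)
```

This is why asking a translator to deliver Unicode text matters: the file then means the same thing on every machine that opens it.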

DIY multilingual typesetting

In the 1990s, multilingual typesetting was an arcane art. There's no doubt that the process has become more accessible, but should you dive in and do it yourself?

The answer to this depends on your project and the level of correctness you need. If the layout does not matter too much and you just need people to be able to read it, then you can get reasonable results with little linguistic knowledge. However, if your publication is more high end, or aims to educate or persuade, then you want the text to be more than readable. It needs to be well presented. That means following local typographic conventions.

For instance, in many Slavic languages some single letter words “belong” with the word following them. No decent Polish typesetter would allow one of these to be separated from the following word by a line break. In French, many punctuation marks are preceded by a thin space. For instance, a colon or question mark should always have a thin space between it and the word before. However, in Canadian French, fewer punctuation marks follow this rule: the colon needs a thin space, the question mark does not. Breaking these language rules will not render the text unreadable. Some readers may not even notice the specific errors, but just get a sense of “something wrong”. Again, it very much depends on the level of professionalism and “correctness” the text needs to be perceived at. If you are planning to typeset a foreign language, it is often useful to look online for grammatical reference material, such as this PDF for Canadian French typesetting.
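Rules like these can be partly automated before the text reaches the layout. The Python sketch below is a rough illustration only, built on several assumptions of ours: it uses the NARROW NO-BREAK SPACE (U+202F) as a stand-in for the thin space, it handles only the colon and question mark mentioned above, and it treats a, i, o, u, w and z as the Polish single letter words. The function names french_spacing and polish_ties are hypothetical. A real style guide covers more marks and more exceptions (times such as 10:30 and web addresses would need to be excluded, for example).

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, assumed stand-in for the thin space
NBSP = "\u00a0"   # NO-BREAK SPACE, keeps two words on the same line

# Assumption: deliberately incomplete sets, following only the examples above.
FRENCH_MARKS = {"fr-FR": ":?", "fr-CA": ":"}
POLISH_SINGLE_LETTER_WORDS = "aiouwzAIOUWZ"

def french_spacing(text: str, locale: str = "fr-FR") -> str:
    """Put one narrow no-break space before the marks listed for the locale."""
    marks = re.escape(FRENCH_MARKS[locale])
    # Any run of ordinary, no-break or thin spaces before the mark is replaced.
    pattern = rf"[ \u00a0\u2009\u202f]*([{marks}])"
    return re.sub(pattern, rf"{NNBSP}\1", text)

def polish_ties(text: str) -> str:
    """Tie single letter words to the word that follows them."""
    pattern = rf"\b([{POLISH_SINGLE_LETTER_WORDS}]) +"
    return re.sub(pattern, rf"\1{NBSP}", text)

print(repr(french_spacing("Pourquoi ?")))
print(repr(french_spacing("Pourquoi ?", locale="fr-CA")))
print(repr(polish_ties("Kot i pies w domu")))
```

A pass like this is best treated as a first sweep, with a native speaker or experienced typesetter checking the result.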

If you want to be sure your text is spot on but don't want to wade through this material, you should get a typesetter who is a native speaker of your foreign language or look for a commercial multilingual typesetting service. There are several online and they are often inexpensive (certainly compared to getting something disastrously wrong and re-printing!).

When working with several languages, no one can expect a multilingual typesetter to be fluent in them all. What can be important in “higher end” projects is to have a feel for each language. This can come from a basic level of that language, multilingual typesetting experience or even judicious use of online resources. Issues might include where the English uses typographic emphasis, which words should be picked out in the translation, how headlines break naturally over two decks, and so on.

Getting the best price for multilingual typesetting services

If you decide to use a commercial multilingual typesetting service, how can you get the best rates? Foreign language typesetting is usually charged by the page, with typesetting rates varying by language. So be alert to any pages in your project that only have pictures: they should not be counted when calculating a price.

Of course, it is worth shopping around and getting a good price. Google makes this easy. But what can you do to get the best rate?

  • Supply live English artwork. The work involved, and so cost, will often be increased if the only source artwork is a PDF.
  • Provide the English in InDesign if possible. Some services charge preferential rates.
  • Finish the English before you start other languages. Making artwork amends over several languages can quickly rack up the cost.
  • If you don't have a pre-existing translation, ask if they provide a combined rate for translating and typesetting.

As with everything, the absolute lowest price isn't always best for saving you time and money in the long run. So what else should you ask?

  • How long have they been providing multilingual typesetting?
  • Can they provide any samples of similar previous work in the same languages?
  • Are they foreign language experts? Do they also supply translation? (a good sign)
  • What checks do they carry out on setting before returning it to you?
  • Is a first proof, for sign off before final output, included in the price?
  • If there are any amends, how quickly can they carry these out? Do their typesetters work in-house? Do they supply a 24 hour service?
  • The intangible “feel good” factor. Do you feel confident about liaising on a complex project with them?

Design a good multilingual template

There are some crucial facts to be aware of when designing an English document which will be used as the source for foreign language documents.

  • Languages run to different lengths. It varies according to subject but German typically requires about 20% more space. Simplified Chinese will usually run much shorter than English. Suddenly that single column of text that neatly fills the page in English looks like a potential design problem.
  • Expansion of length is not even. In a given language, text might run to the same length as the English in one story, but 50% longer in another. This is because different languages can be more efficient at conveying particular concepts. French often provides good examples of this. The ideal layout is therefore elastic, or able to absorb white space.
  • Languages have different length words. Some languages tend to have much longer words. A narrow column with unhyphenated text is going to be a problem in Russian, German or Hungarian for instance.
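The points above can be turned into a quick sanity check on a template. This Python sketch is a back-of-envelope estimate, not a measuring tool. It takes the 20% figure for German and the 50% worst case from the list above, counts characters as a crude proxy for space on the page, and uses function names (required_capacity, fits, longest_word) that we made up for illustration.

```python
TYPICAL_GERMAN_EXPANSION = 20   # percent, from the list above
WORST_CASE_EXPANSION = 50       # percent, from the list above

def required_capacity(english_chars: int, expansion_percent: int) -> int:
    """Estimate the characters a translation needs, rounded up."""
    return (english_chars * (100 + expansion_percent) + 99) // 100

def fits(english_chars: int, frame_capacity: int, expansion_percent: int) -> bool:
    """Check whether the estimated translation fits a text frame."""
    return required_capacity(english_chars, expansion_percent) <= frame_capacity

def longest_word(text: str) -> str:
    """Return the longest word, to compare against a narrow column."""
    return max(text.split(), key=len)

# Hypothetical template: 1,000 characters of English in a frame that holds 1,300.
english_chars = 1000
frame_capacity = 1300

print(fits(english_chars, frame_capacity, TYPICAL_GERMAN_EXPANSION))
print(fits(english_chars, frame_capacity, WORST_CASE_EXPANSION))
print(longest_word("narrow columns dislike unhyphenated words"))
```

If the worst-case check fails, that is a hint to leave more white space in the English design or to plan for a smaller point size in the translations.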