This tutorial is for people who are starting with, or want to optimize, the internationalization of their PHP apps.
We wrote it because of the lack of useful resources. Although there are many gettext tutorials out there, it is still a very complicated topic.
Maybe you looked into it and thought “hmm, the whole Unix world uses it, so it must be cool, but it looks painful to use”. So did we, but we didn’t give up that early ;)
Some reasons for gettext:
- de facto standard for i18n of Unix systems and their applications
- very, very fast, because it uses a binary format and native caching
- no database required, which takes load off your database server
- you can keep the translation files under version control
- many editors for every platform, which makes it easy for external translators…
- powerful command line tools for special needs.
- stable, stable, stable…
Some disadvantages:
- not thread-safe (thanks to Mike and Derick for pointing this out)
- you have to spend some time to get it working
- debugging gettext can be hard in PHP
- a restart of the webserver is needed to activate new translations (not in the CGI version).
Tutorial
Basic understanding points
- Translatable sentences look like function calls in PHP, for example:
_('This string should be translated');
tr('This string too...');
- xgettext is a command line tool that walks through your source code to find every string that should be translated. xgettext recognizes the translatable strings by the names of the functions they are passed to. You can specify your own function names, like uebersetzung('blah') in German. Remember: gettext IS powerful.
- there are three file types that you must know:
- pot - this is the template for the translation
- po - this is a single translation for each language (it is generated from the pot file)
- mo - this is the “compiled” binary database for the application (generated from the po file)
First step: Generate Functions and init gettext
Make sure you are familiar with setlocale()!
$lang = 'de_DE.UTF-8'; // Dummy
$textdomain = 'cms';
setlocale(LC_ALL, $lang); // see "Useful tips & tricks and debugging" for hints on setlocale
// Pass the base path of your mo files, without the locale and LC_MESSAGES parts!
// Example: /html/locales/de_DE/LC_MESSAGES/cms.mo
// Use: bindtextdomain('cms', '/html/locales');
bindtextdomain($textdomain, 'path/to/your/mo/files');
textdomain($textdomain);
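setlocale() fails silently when the requested locale is not installed, which is a common reason for translations not showing up. The following is a minimal sketch of a check, not part of the original setup: pick_locale() is a hypothetical helper name, and the list of candidate spellings is an assumption about how your system names the German locale.

```php
<?php
/**
 * Try several locale names and return the one the system accepted,
 * or false if none of them is installed.
 * pick_locale() is a hypothetical helper, not a PHP built-in.
 */
function pick_locale(array $candidates)
{
    // setlocale() accepts an array of names and returns the first one
    // that could be set, or false when none is available.
    return setlocale(LC_ALL, $candidates);
}

// The spellings below are assumptions; systems differ in how they name locales.
$active = pick_locale(array('de_DE.UTF-8', 'de_DE.utf8', 'de_DE'));
if ($active === false) {
    error_log('None of the requested locales is installed');
}
```

Checking the return value once at startup makes a missing locale visible right away instead of leaving you with untranslated output and no hint why.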
This is an example translate function that wraps the native “_” function but allows us to populate the strings with parameters like “I have {number} strings in my translation file”.
We’ll show you later how to tell the gettext parser to recognize our special function, too.
/**
 * Translate function
 *
 * @param string $msgid  Text which should be translated
 * @param array  $params Values to substitute for placeholders (enclosed in {})
 *
 * @return string Translated text
 */
function tr($msgid, array $params = array()) {
    $trans = _($msgid); // native PHP function (alias of gettext())
    if (!count($params)) {
        return $trans;
    }
    $search = array();
    foreach (array_keys($params) as $element) {
        $search[] = '{' . $element . '}';
    }
    return str_replace($search, $params, $trans);
}
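To see the placeholder idea in action, here is a small self-contained sketch. It is a variant of the function above, not a replacement for it: it swaps str_replace() for strtr(), and the lookup() helper is a hypothetical fallback added only so the example also runs where the gettext extension is not loaded.

```php
<?php
// Hypothetical helper: uses gettext() when the extension is loaded and
// otherwise returns the message unchanged (assumption for this demo only).
function lookup($msgid)
{
    return function_exists('gettext') ? gettext($msgid) : $msgid;
}

function tr($msgid, array $params = array())
{
    $trans = lookup($msgid);
    if (!count($params)) {
        return $trans;
    }
    $pairs = array();
    foreach ($params as $name => $value) {
        $pairs['{' . $name . '}'] = (string) $value;
    }
    // strtr() replaces all placeholders in one pass and never re-scans
    // text that was just inserted.
    return strtr($trans, $pairs);
}

echo tr('I have {number} strings in my translation file', array('number' => 3)), "\n";
```

The single pass means a parameter value that happens to contain something like {user} is left alone, which is usually what you want when values come from users.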
Second step: Translate your Application
Replace all sentences that should be translated with a call to the PHP function.
Example of a very simple PHP file:
<html>
<head>
<title>This is my website</title>
</head>
<body>
Hi, <? echo htmlspecialchars($username); ?>! Blaah
</body>
</html>
Translated:
<html>
<head>
<title><? echo htmlspecialchars(_('This is my website')); ?></title>
</head>
<body>
<? echo htmlspecialchars(tr('Hi, {user}! Blaah', array('user' => $username))); ?>
</body>
</html>
Third step: Generate a pot file (Template)
The pot file collects all translation strings and is the template for the localized language files.
The command line tool xgettext generates the pot file for you.
#!/bin/sh
# Generate a list of your files with "find". This helps you to exclude
# .svn folders or other things you do not want to translate.
cd /path/to/app
find . \( -path "./dir_to_include/*" -o -path "./another_dir_to_include/*" \) \
    -name "*.php" \
    ! -path "./exclude_this_path/*" ! -path "./exclude_this_path_too/*" \
    ! -path "*.svn*" > /path/to/potfiles.txt
# Generate the POT file (template).
# `--keyword=tr' tells xgettext to look for your own tr() function
# in addition to the default keywords such as _().
xgettext --from-code=utf-8 --keyword=tr --default-domain=cms \
    --output=/path/to/cms.pot --files-from=/path/to/potfiles.txt
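If you would rather build the file list from PHP than with find, something along these lines might do. It is only a sketch: collect_php_files() is a hypothetical helper, and it excludes by simple substring match, which is cruder than the -path patterns above.

```php
<?php
/**
 * Collect all *.php files below $root, skipping every path that contains
 * one of the $exclude fragments.
 * collect_php_files() is a hypothetical helper, not part of gettext.
 */
function collect_php_files($root, array $exclude = array('.svn'))
{
    $files = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }
        $path = $file->getPathname();
        foreach ($exclude as $fragment) {
            if (strpos($path, $fragment) !== false) {
                continue 2; // excluded path, go on with the next file
            }
        }
        $files[] = $path;
    }
    sort($files);
    return $files;
}

// Usage (paths are placeholders): write one path per line for --files-from.
// file_put_contents('/path/to/potfiles.txt',
//     implode("\n", collect_php_files('/path/to/app')) . "\n");
```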
Fourth step: Generate your PO Files from your POT Template and translate
We use poedit (http://www.poedit.net/). Choose “New Catalog from POT File…”, there you can insert the language, the team etc. in a comfortable way.
Then start translating with poedit…
Last step: Generate Binaries
Now it’s time to get our translated texts ready for the application. The po files have to be compiled into the binary mo format.
msgfmt --output="/path/to/de_DE/LC_MESSAGES/cms.mo" /path/to/cms.po
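Most “it doesn’t work” cases come down to the mo file not being where gettext looks for it. The sketch below rebuilds the expected path from the pieces used in the first step and checks whether the file is readable. mo_path() is a hypothetical helper, and the values passed to it simply mirror the example path from above.

```php
<?php
/**
 * Build the path at which gettext expects the compiled catalog.
 * mo_path() is a hypothetical helper for debugging, not a gettext function.
 */
function mo_path($baseDir, $localeDir, $domain)
{
    return rtrim($baseDir, '/') . '/' . $localeDir . '/LC_MESSAGES/' . $domain . '.mo';
}

// Same values as in the bindtextdomain() example: base dir, locale folder, domain.
$path = mo_path('/html/locales', 'de_DE', 'cms');
echo $path, is_readable($path) ? " found\n" : " missing or unreadable\n";
```

The first argument is exactly what you pass to bindtextdomain(), so a mismatch between the two shows up immediately.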
Update your translation files
If you have new or changed strings in your code, the procedure is just the same:
- Create the POT file again (see third step)
- Update the po files (this merges the new pot into the po; you’ll have to do it for each language you have)
msgmerge --update /path/to/cms.po /path/to/cms.pot
- Translate again (poedit will recognize strings that have changed only slightly, which is very useful)
- Generate the binaries again (see last step)
- restart your webserver :)
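Since the merge has to be repeated for every language, a tiny script can print the commands for you. This is a sketch under an assumption that is not from the steps above: it expects each po file to live next to its mo file in <locale>/LC_MESSAGES. msgmerge_commands() is a hypothetical helper name.

```php
<?php
/**
 * Build one "msgmerge --update" command per locale.
 * Assumes the po files are stored as <root>/<locale>/LC_MESSAGES/<domain>.po,
 * which is a layout chosen for this example only.
 */
function msgmerge_commands($localeRoot, $domain, $potFile, array $locales)
{
    $commands = array();
    foreach ($locales as $locale) {
        $po = $localeRoot . '/' . $locale . '/LC_MESSAGES/' . $domain . '.po';
        $commands[] = 'msgmerge --update ' . escapeshellarg($po) . ' ' . escapeshellarg($potFile);
    }
    return $commands;
}

// Print the commands so they can be reviewed before running them.
foreach (msgmerge_commands('/html/locales', 'cms', '/path/to/cms.pot', array('de_DE', 'fr_FR')) as $command) {
    echo $command, "\n";
}
```

Printing instead of executing keeps the script harmless; translating and running msgfmt still happen afterwards, as described in the list.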
Useful tips & tricks and debugging
- It doesn’t work? After updating, restart your webserver (unless you use the CGI version)!
- Remember: your application uses the MO files, not the PO files. So don’t forget to regenerate them.
- Be sure that the locales you want to use are installed on your Linux system and that you use the .UTF-8 variants (you want i18n, so please use UTF-8!):
# Example for Debian
dpkg-reconfigure locales
- WARNING: setlocale(LC_ALL, $lang) changes the output of floats (, instead of .) in some languages like German (your SQL queries could fail!). Use prepared statements! You can use setlocale(LC_MESSAGES, $lang) so that only your translations are localized… (thanks to Mike for pointing this out)
- To find all strings in your app that are not translated yet, use a debug switch that turns every translated string into “--”. In this mode, every readable string you still see in your app is one that does not go through the translation function :) Easy and very effective for large apps.
define('DEBUG_LANGUAGE', true);
function tr($msgid, array $params = array()) {
    if (DEBUG_LANGUAGE) {
        return '--';
    }
    [...]
}
- Look at http://www.gnu.org/software/gettext/manual/gettext.html. You will find information about the very powerful command line utilities.
- Search at http://sourceforge.net/ for tools. There are many tools which fit nearly every special need.
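Regarding the float warning above: where a number has to end up in a string (a log line, a CSV export, or SQL when prepared statements are not an option), formatting it with explicit separators avoids the locale problem. A minimal sketch; plain_float() is a hypothetical helper name.

```php
<?php
/**
 * Format a float with "." as decimal separator and no thousands separator,
 * independent of the current locale.
 * plain_float() is a hypothetical helper, not a PHP built-in.
 */
function plain_float($value, $decimals = 2)
{
    // number_format() uses the separators passed in, not those of setlocale().
    return number_format((float) $value, $decimals, '.', '');
}

echo plain_float(1234.5), "\n"; // 1234.50
```

Prepared statements remain the better choice for queries; this only covers the places where a float is turned into text by hand.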
If you have any questions, corrections or additions please let us know. Any feedback is appreciated as usual.


May 4th, 2007 at 11:00
Hey come on…
This is 2007…
1) Example with Short Open Tags
2) Example that requires register_globals=on
3) Example with XSS Vulnerability
May 4th, 2007 at 11:04
http://horde.org/horde/docs/?f=TRANSLATIONS.html has some notes on troubleshooting gettext issues with PHP.
May 4th, 2007 at 11:11
Hi “Anonymous Coward”,
thanks for commenting. You may have seen that our code snippets are short examples reduced to the minimum. Short open tags are some sort of coding guideline and IMHO ok for just echoing. If you don’t like them you don’t have to use them.
The example does not use any GPC variables so there is neither a register_globals nor a XSS issue (assuming $username and the translations to be clean). But that’s not the focus of the article.
Thanks.
May 4th, 2007 at 12:10
Hi,
good tutorial, cover main issues for po files.
But it’s not because gettext is used that we can echo anything without escaping it:
echo _('This is my website');
Must become:
echo htmlspecialchars(_('This is my website'));
or other escaping functions
Don’t trust the translators…
May 4th, 2007 at 13:30
Hi Ludovic,
> But it’s not because gettext is used that we can echo anything without escaping it
As I said, our intention was to keep the examples simple but you’re right that it’s good style to always use htmlspecialchars(). I changed the examples to be “secure”.
May 4th, 2007 at 13:42
Hi, you forgot a major disadvantage: setlocale() and thus gettext is not thread safe.
On another note: there’s LC_MESSAGES though.
May 4th, 2007 at 14:26
Hi Mike,
thanks for those great hints. I’ve updated the entry.
May 4th, 2007 at 14:37
You also fail to mention that gettext does not work a single bit on threaded platforms as it relies on the locale environment settings…
May 5th, 2007 at 02:18
Please don’t encourage the use of _().. it dramatically reduces the readability of the code, and when you have teams of programmers who use multiple languages, and occasionally use PHP, it is asking for another ‘who the heck wrote this language’ comment.
May 5th, 2007 at 10:01
It’s probably worth mentioning php-gettext, a pure-php implementation of gettext. It works like a charm and is used by some major apps, like e.g. WordPress.
May 5th, 2007 at 17:40
First of all thanks for a great tutorial.
I’ll appreciate some information about i18n handling on multi threaded servers.
June 7th, 2007 at 06:43
A very good tutorial.
Keep it up. Like to see more
June 13th, 2007 at 05:30
Nice tutorial.. read = never read
June 23rd, 2007 at 21:04
Hi,
good tutorial, covered all the main issues for po files. keep up the good work.
Brian
my tech blog
http://www.britec.co.uk/techblog/
July 7th, 2008 at 23:41
Thank you for the tutorial.
I have a problem in the third step, nevertheless.
The command line to generate the .POT file works fine with .PHP files, I tested it with your second step example of translated .PHP file. But if I want to use it with .TPL files (Smarty templates), and I just changed the extension of the previous file, xgettext says something like “unknown extension, will try as C” and does not generate anything.
Anyone knows how to solve this?
Mario
P.S. One minor glitch in the command line to generate the .POT file:
where it says
output=/path/to/pot.cms
should be substituted with
output=/path/to/cms.pot
August 3rd, 2008 at 21:11
Hi. Sorry, whatever I did these days to use gettext, I failed. I am so sad about this. Is it possible for you to look at my sample files and tell me what the problem is? By the way, I am using Ubuntu. Thank you very much.
November 23rd, 2009 at 22:02
@Mario: one year later, OK, but someone might search for this as we did!
“tpl unknown extension”
Check the -L option; you can set it to C or lisp and then the tpl file is parsed correctly!
“Will try as C” seems to be more like “You should try as C”!
Cheers