
Archive for May, 2007

Durable Network Messaging with PHP? Spread? ActiveMQ?

Over the last few days I had an e-mail discussion with Theo Schlossnagle from OmniTI about the network messaging toolkit Spread, which he mentions in his great book “Scalable Internet Architectures“.

The thread is about Spread and whether it’s possible to build a durable message queuing system with it. The main issue is that Spread takes care of reliable network delivery of messages, but if one “listener” of a group goes down (host or daemon death) it will never get the messages sent in the meantime because, as far as Spread is concerned, it is no longer a member of the group.

Theo posted our e-mail discussion on the blog for his “Scalable Internet Architectures” book (read on!).

The result is that you either have to build your own queuing system on top of Spread that takes care of durability, or use something “enterprise” like Apache ActiveMQ, which might be overkill.
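To make the “build your own” option a bit more concrete, here is a minimal sketch of the durability part only. It is an assumption of mine, not something from the discussion with Theo: every message is written to a file with a sequence number before it is sent, and a listener that comes back asks for everything after the last number it processed. The class name DurableLog and its methods are made up, it contains no Spread calls at all, and it assumes a single writer process.

```php
<?php
// Hypothetical sketch: a file-backed message log that lets a listener
// catch up after downtime. Not part of Spread; assumes a single writer.
final class DurableLog
{
    private $file;

    public function __construct(string $file)
    {
        $this->file = $file;
    }

    // Persist the message before it is sent; returns its sequence number.
    public function append(string $payload): int
    {
        $seq = count($this->readAll()) + 1;
        $line = json_encode(array('seq' => $seq, 'payload' => $payload)) . "\n";
        file_put_contents($this->file, $line, FILE_APPEND | LOCK_EX);
        return $seq;
    }

    // Messages a listener missed, given the last sequence number it processed.
    public function replaySince(int $lastSeen): array
    {
        $missed = array();
        foreach ($this->readAll() as $message) {
            if ($message['seq'] > $lastSeen) {
                $missed[] = $message;
            }
        }
        return $missed;
    }

    private function readAll(): array
    {
        if (!is_file($this->file)) {
            return array();
        }
        $messages = array();
        foreach (file($this->file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
            $messages[] = json_decode($line, true);
        }
        return $messages;
    }
}

// Usage: the listener processed message 1, then went down.
$log = new DurableLog(sys_get_temp_dir() . '/durable-demo-' . uniqid() . '.log');
$log->append('first');
$log->append('second');
$missed = $log->replaySince(1);
```

Re-reading the whole file on every call is fine for a demo only; a real queue would need an index, log rotation and acknowledgements per listener.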

Maybe you have other approaches for a durable network messaging system in PHP? I’d like to hear from you!

Getting the PHP fatal errors

One big issue with PHP’s error handling is that there’s no built-in way to catch fatal errors with a user-defined error handler. So I thought about it a little, and maybe you have better approaches or solutions …

In short, the goal is to send the error via e-mail to the developer(s). As we are security-aware, we log errors and do not display them to the world. (Hint: that should be your default on every production machine!)

With the error_log directive we have three possibilities: syslog, the SAPI logger and a plain log file. The SAPI in my case is the mod_php Apache module, which logs to Apache’s error log by default, so here we can treat SAPI logging and a normal log file as the same thing.

Now I see two possibilities: watching the log file, or using syslog.

First I looked into whether the syslog daemon is able to send e-mails. This would be nice …
The original BSD syslogd used by most Linux distributions can’t. And there’s a reason, yeah ;) The syslog daemon logs many events, even critical system ones (emergency etc.). If the system is unstable and something goes really wrong, the mailing subsystem shouldn’t be invoked. That sounds plausible to me. [1]

Some people have combined the syslog daemon with named pipes [2], but that got too weird for me ;) Another issue is that the syslog behavior of PHP’s error_log (facility, priority) doesn’t seem to be really documented. So I didn’t dig deeper into syslog… Have you had other or better experiences with it? I’d appreciate feedback.

So let’s get back. The more pragmatic approach is an external log file analyzer. The usual suspects are Logwatch, Logcheck, Logsurfer and Swatch.

Logwatch and Logcheck are not real-time, but we want real-time.

That leaves Logsurfer and Swatch. Because Swatch is available as a Debian package (and I like standardized systems) I chose it. Swatch has a clean but flexible configuration approach and is really easy to set up. The heart of it is this config file:

watchfor /(PHP.*error:.*?)$/i
mail addresses=root
threshold=on
threshold track_by=$1,type=limit,count=1,seconds=10

In plain words, this means:

  1. Look for any line matching this regular expression (Fatal, Parse, Recoverable errors)
  2. Send this line to root
  3. Limit repeated lines to one per ten seconds (and track them by the $1 substitution of the matched line, which is the error message itself without the date)

Number 3 is the most interesting point because you can tune it so that you don’t get spammed if the fatal error occurs many times.
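If you want to see what the regular expression captures and how the limit behaves before letting Swatch loose on a real log, the logic can be replayed in a few lines of PHP. This is only a rough model of my understanding of "type=limit" (report the first count hits per interval, ignore the rest), not Swatch’s actual implementation. The helper should_notify() is made up and the log line is a shortened version of the one shown further down.

```php
<?php
// The pattern from the Swatch config above, applied with PCRE in PHP.
$pattern = '/(PHP.*error:.*?)$/i';

// Shortened sample line in the Apache error log format.
$logLine = "[Tue May 15 13:24:12 2007] [error] [client 192.168.2.16] "
         . "PHP Parse error:  syntax error in /home/soenke/tests/fatal.php on line 1";

preg_match($pattern, $logLine, $match);
$key = $match[1]; // the message without the date/client prefix

// Hypothetical helper modelling "type=limit,count=1,seconds=10":
// returns true when a mail should be sent for $key at time $now.
function should_notify(array &$windows, string $key, int $now, int $count = 1, int $seconds = 10): bool
{
    if (!isset($windows[$key]) || $now - $windows[$key]['start'] >= $seconds) {
        $windows[$key] = array('start' => $now, 'seen' => 0); // start a new interval
    }
    $windows[$key]['seen']++;
    return $windows[$key]['seen'] <= $count;
}

$windows = array();
$first  = should_notify($windows, $key, 1000); // first hit in the interval
$second = should_notify($windows, $key, 1005); // same message 5 seconds later
$third  = should_notify($windows, $key, 1011); // interval has expired
```

Because the tracking key is the captured message and not the whole line, two identical errors with different timestamps count as the same event.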

Now we start the daemon:

# swatch --daemon -c /root/php_fatal.conf --tail-file=/var/log/apache2/error.log

To test it, raise a fatal error and wait for the mail:

From: root 
To: root@server01.intern.northclick.de
Subject: Message from Swatch

[Tue May 15 13:24:12 2007] [error] [client 192.168.2.16] PHP Parse error:  syntax error, unexpected T_CONSTANT_ENCAPSED_STRING, expecting ‘)’ in /home/soenke/tests/fatal.php on line 1

You will want to put the swatch daemon into your rc.local or some other startup file so that it is started when the machine boots.

This technique seems the cleanest to me.
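One in-process alternative that may be worth a look is a shutdown function combined with error_get_last() (available from PHP 5.2). The sketch below is a different technique from the Swatch setup, written for a current PHP version, and it has limits: a parse error in the entry script itself never reaches it because the script does not start at all. The helper name is_fatal_error_type() and the mail address are placeholders.

```php
<?php
// Hypothetical helper: is this error type one that ends the script?
function is_fatal_error_type(int $type): bool
{
    return in_array($type, array(E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR), true);
}

// Runs when the script ends, whether it finished normally or died.
register_shutdown_function(function () {
    $error = error_get_last(); // null if no error was recorded
    if ($error !== null && is_fatal_error_type($error['type'])) {
        $text = sprintf('%s in %s on line %d', $error['message'], $error['file'], $error['line']);
        // Placeholder address; replace it with your developer list.
        mail('root@localhost', 'PHP fatal error', $text);
    }
});
```

Unlike the log watcher there is no rate limiting here, so a fatal error on a busy page would send one mail per request.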

What do you use? What other approaches do you have?

References:

[1] comp.os.linux.networking - How to send a syslog message by E-mail?
[2] http://www.softpanorama.org/Logs/Syslog/pipes_in_syslog.shtml

Handling Browser-Caching of JS / CSS

No real secret, but maybe not every developer knows it: JS and CSS files are cached, and users do not clear their cache after you update your application :) There is a simple but effective way to handle it:

define('CACHE_STYLE_KEY', '3243jadfafas');
define('CACHE_JS_KEY', '3243jadfafas');

// ...

<link href="style/mystyle.css?<?php echo CACHE_STYLE_KEY; ?>" rel="stylesheet" type="text/css" />
<script src="javascript/app.js?<?php echo CACHE_JS_KEY; ?>" type="text/javascript"></script>

If you change your CSS/JS files, just change the associated constant as well and the GET parameter will change with it. Your users’ browsers will reload the files, and there’s no need to change the filename or anything else. And: the customers won’t call you and ask how to empty their browser cache :)

In the same way you could handle caching of images and other “static” data.
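If you tend to forget to change the constant, a variation is to derive the key from the file’s modification time. This is just a sketch of that idea; the helper asset_url() and its parameters are made up for illustration, and it costs one stat call per asset and request.

```php
<?php
// Hypothetical helper: append the file's modification time as cache key.
function asset_url(string $docRoot, string $path): string
{
    $file = rtrim($docRoot, '/') . '/' . ltrim($path, '/');
    // Fall back to the plain path if the file cannot be found.
    $mtime = is_file($file) ? filemtime($file) : false;
    return $mtime === false ? $path : $path . '?' . $mtime;
}
```

In a template you would then echo asset_url($docRoot, 'javascript/app.js') inside the src attribute instead of the path plus constant.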

Tutorial for the easy use of gettext for internationalization of PHP Apps

This tutorial is for people who start or want to optimize the internationalization of their PHP Apps.

We wrote it due to the lack of useful resources. Although there are many tutorials for gettext out there, it is still a very complicated topic.

Maybe you looked into it and thought “mhm, the whole Unix world uses it so it must be cool, but it looks painful to use”. So did we, but we didn’t give up that early ;)

Some reasons for gettext:

  • de facto standard for i18n of Unix systems and their applications
  • very, very fast, because it uses a binary format and native caching
  • no database required, which takes load off it
  • you can put the translation files under version control
  • many editors for every platform, so it is easy for external translators…
  • powerful command line tools for special needs
  • stable, stable, stable…

Some disadvantages:

  • not thread-safe (thanks to Mike and Derick for pointing this out)
  • you have to spend some time to get it to work
  • debugging gettext can be hard in PHP
  • a restart of the web server is needed to activate new translations (not in the CGI version)

Tutorial

Basic understanding points

  • Strings to be translated look like function calls in PHP, for example:
_('This string should be translated');
tr('This string too...');
  • xgettext is a command line tool that walks through your source code to find every string which should be translated. xgettext recognizes the translations by the function names. You can specify your own function names, like uebersetzung('blah') in German. Remember: gettext IS powerful.
  • there are three file types that you must know:
    • pot - this is the template for the translation
    • po - this is a single translation for each language (it is generated from the pot file)
    • mo - this is the “compiled” binary database for the application (generated from the po file)

First step: Generate Functions and init gettext

Make sure you are familiar with setlocale() !

$lang = 'de_DE.UTF-8'; // Dummy
$textdomain = 'cms';
setlocale(LC_ALL, $lang); // see "Useful tips & tricks and debugging" for hints about setlocale

// path/to/your/mo/files without LC_MESSAGES and locale!!
// Example: /html/locales/de_DE/LC_MESSAGES/cms.mo
// Use: bindtextdomain('cms', '/html/locales');
bindtextdomain('cms', 'path/to/your/mo/files');  

textdomain('cms');
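Most “it doesn’t work” moments at this step come from the .mo file not being where gettext looks for it. As a debugging aid, a small helper can list the places where a catalog would be expected. This is a simplified model and an assumption on my side: the real lookup is done by the C library and may try more variants. The function name mo_candidates() is made up.

```php
<?php
// Hypothetical debugging helper, a simplified model of the lookup:
// lists the paths where a catalog for $locale would be expected.
function mo_candidates(string $baseDir, string $locale, string $domain): array
{
    $names = array($locale);
    $withoutCodeset = preg_replace('/\..*$/', '', $locale);        // de_DE.UTF-8 -> de_DE
    $languageOnly   = preg_replace('/_.*$/', '', $withoutCodeset); // de_DE -> de
    foreach (array($withoutCodeset, $languageOnly) as $name) {
        if (!in_array($name, $names, true)) {
            $names[] = $name;
        }
    }

    $paths = array();
    foreach ($names as $name) {
        $paths[] = rtrim($baseDir, '/') . '/' . $name . '/LC_MESSAGES/' . $domain . '.mo';
    }
    return $paths;
}

$paths = mo_candidates('/html/locales', 'de_DE.UTF-8', 'cms');
```

Looping over $paths with is_readable() quickly shows whether the directory you passed to bindtextdomain() matches your layout.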

This is an example translate function that wraps the native “_” function but allows us to populate the strings with parameters like “I have {number} strings in my translation file”.

We’ll show you later how to tell the gettext parser to recognize our special function, too.

/**
 * Translate function
 *
 * @param    string   Text which should be translated
 * @param    array    Params with vars which should be replaced (enclosed in {})
 *
 * @return   string   Translated text
 */
function tr($msgid, array $params = array()) {
  $trans = _($msgid); // Native PHP Function    

  if (!count($params)) {
    return $trans;
  }    

  foreach (array_keys($params) as $element) {
    $search[] = '{' . $element . '}';
  }        

  return str_replace($search, $params, $trans);
}
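To try the placeholder substitution on its own, without a gettext setup, here is a stand-alone sketch. The name fill_placeholders() is made up, and it uses strtr() instead of the str_replace() call in tr(); for simple placeholders like these the result should be the same.

```php
<?php
// Same idea as the substitution in tr(), but without the gettext lookup.
function fill_placeholders(string $text, array $params = array()): string
{
    $pairs = array();
    foreach ($params as $name => $value) {
        $pairs['{' . $name . '}'] = (string) $value; // 'number' => '{number}'
    }
    return strtr($text, $pairs);
}

$result = fill_placeholders(
    'I have {number} strings in my translation file',
    array('number' => 42)
);
// $result is "I have 42 strings in my translation file"
```

Keeping the placeholder names inside the msgid means translators can move {number} to wherever it belongs in their language.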

Second step: Translate your Application

Wrap all sentences that should be translated in the PHP function.

Example of a very simple PHP file:

<html>
<head>
<title>This is my website</title>
</head>
<body>
Hi, <? echo htmlspecialchars($username); ?>! Blaah
</body>
</html>

Translated:

<html>
<head>
<title><? echo htmlspecialchars(_('This is my website')); ?></title>
</head>
<body>
<? echo htmlspecialchars(tr('Hi, {user}! Blaah', array('user' => $username))); ?>
</body>
</html>

Third step: Generate a pot file (Template)

The pot file collects all translation strings and is the template for the localized language files.

The command line and xgettext will help you to generate the pot files.

#!/bin/sh
# Generate a list of your files with "find". This helps you to exclude .svn folders or other things you do not want to translate
cd /path/to/app
find . \( -path "./dir_to_include/*" -o -path "./another_dir_to_include/*" \) -name "*.php" ! -path "./exclude_this_path/*" ! -path "./exclude_this_path_too/*" ! -path "*.svn*" > /path/to/potfiles.txt

# Generate POT File (Template).
# `--keyword=tr' tells xgettext to also look for your own tr() function, in addition to the default keywords such as _().
xgettext --from-code=utf-8 --keyword=tr --default-domain=cms --output=/path/to/cms.pot --files-from=/path/to/potfiles.txt

Fourth step: Generate your PO Files from your POT Template and translate

We use poedit (http://www.poedit.net/). Choose “New Catalog from POT File…”, there you can insert the language, the team etc. in a comfortable way.

Then start translating with poedit…

Last step: Generate Binaries

Now it’s time to get our translated texts ready for the application. They have to be compiled into the binary format.

msgfmt --output="/path/to/de_DE/LC_MESSAGES/cms.mo" /path/to/cms.po

Update your translation files

If you have new or changed translations in your code it’s just the same procedure:

  • Create the POT file again (see third step)
  • Update the PO files (this merges the new POT into the existing PO; you’ll have to do it for each language you have)
    msgmerge --update /path/to/cms.po /path/to/cms.pot
  • Translate again (poedit will recognize which strings have only changed a little bit, very useful)
  • Generate the binaries again (see last step)
  • Restart your web server :)

Useful tips & tricks and debugging

  • It doesn’t work? After updating, restart your web server (unless you use the CGI version)!
  • Remember: your application uses the MO files, not the PO files. So don’t forget to regenerate them.
  • Be sure that the locales you want to use are installed on your Linux system and that you use the .UTF-8 variants (you want i18n, so please use UTF-8!):
    	# Example for debian
    	dpkg-reconfigure locales
  • WARNING: setlocale(LC_ALL) will change the output of floats (. becomes ,) in some languages like German (your SQL queries could fail!). Use prepared statements! You can use setlocale(LC_MESSAGES) so that only your translations are localized… (thanks to Mike for pointing this out)
  • To find all strings in your app, use a debug switch that translates every string into “--”. In this mode, every readable string you still see in your app has not been wrapped for translation yet :) Easy and very effective for large apps.
    	define('DEBUG_LANGUAGE', true);
    	function tr($msgid, array $params = array()) {
    	  if (DEBUG_LANGUAGE) {
    	    return '--';
    	  }
    	[...]
    	}
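The setlocale() warning from the list above can be handled in two ways at once: localize only the messages, and never rely on the locale when a number has to be machine-readable. The following is a minimal sketch under the assumption that your PHP has the LC_MESSAGES constant (it is only defined when PHP was built with libintl); the helper machine_float() is made up.

```php
<?php
// Only localize messages, so number formatting stays untouched.
// LC_MESSAGES is only defined when PHP was built with libintl.
if (defined('LC_MESSAGES')) {
    setlocale(LC_MESSAGES, 'de_DE.UTF-8');
}

// Hypothetical helper: format a float for SQL or other machine-readable
// output with "." as decimal separator, whatever the locale says.
function machine_float(float $value, int $decimals = 2): string
{
    return number_format($value, $decimals, '.', '');
}

$price = machine_float(1234.5); // "1234.50"
```

Prepared statements remain the better fix for SQL; the helper is more for CSV exports, URLs and similar output.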

If you have any questions, corrections or additions please let us know. Any feedback is appreciated as usual.