<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Jimdo and Northclick Dev-Blog</title>
	<link>http://blog.northclick.de</link>
	<description>This Blog is about Web-Development at NorthClick and Jimdo, Hamburg, Germany</description>
	<pubDate>Wed, 23 Apr 2008 14:11:13 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2</generator>
	<language>en</language>
			<item>
		<title>Jimdo @ PHP Unconference</title>
		<link>http://blog.northclick.de/archives/40</link>
		<comments>http://blog.northclick.de/archives/40#comments</comments>
		<pubDate>Wed, 23 Apr 2008 10:25:03 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/40</guid>
		<description><![CDATA[The weekend, and with it the second PHP Unconference in Hamburg, is getting closer and closer. You can and WILL meet the Jimdo development team there, including Markus Wolff, Hinrich Sager, Boris Erdmann, Martin Denk, Christian Springub and myself (Sönke Ruempler). Boris and I will give a(n) (un)talk about our message queue dropr and [...]]]></description>
			<content:encoded><![CDATA[<p>The weekend, and with it the second PHP Unconference in Hamburg, is getting closer and closer. You can and WILL meet the Jimdo development team there, including Markus Wolff, Hinrich Sager, Boris Erdmann, Martin Denk, Christian Springub and myself (Sönke Ruempler). Boris and I will give a(n) (un)talk about our message queue <a href="https://www.dropr.org/">dropr</a> and hopefully spark an enthusiastic discussion about distributed architectures and asynchronous programming!</p>
<p>So if you can&#8217;t wait for the weekend, just watch one of our famous &#8220;Atomic Bomberman&#8221; matches, held every evening in our office:</p>
<p><object type="application/x-shockwave-flash" data="http://www.vimeo.com/moogaloop.swf?clip_id=925975&amp;server=www.vimeo.com&amp;fullscreen=1&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=" height="327" width="400"></object></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/40/feed</wfw:commentRss>
		</item>
		<item>
		<title>PECL spread module resurrected!</title>
		<link>http://blog.northclick.de/archives/39</link>
		<comments>http://blog.northclick.de/archives/39#comments</comments>
		<pubDate>Tue, 04 Mar 2008 23:08:21 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/39</guid>
		<description><![CDATA[Yesterday I got an e-mail saying that two of my PECL bug reports for the spread module have been fixed. It seems the module is getting some love from Rob Richards, who committed some fixes and cleanups. After a short test I can see that it&#8217;s basically working again (you can send messages to a group [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday I got an e-mail saying that two of my <a href="http://pecl.php.net/bugs/search.php?cmd=display&amp;package_name[]=spread&amp;status=All" target="_blank">PECL bug reports for the spread module have been fixed</a>. It seems the module is getting some love from <a href="http://www.cdatazone.org/" target="_blank">Rob Richards</a>, who committed some fixes and cleanups. After a short test I can see that it&#8217;s basically working again (you can send messages to a group without segfaults). Although <a href="http://blog.northclick.de/archives/22" target="_blank">we have no use case at the moment</a>, it&#8217;s nice to see someone is caring for it.</p>
<p>Thanks Rob! I&#8217;m going to play around with it a little :)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/39/feed</wfw:commentRss>
		</item>
		<item>
		<title>Welcome Michael Dunsky!</title>
		<link>http://blog.northclick.de/archives/37</link>
		<comments>http://blog.northclick.de/archives/37#comments</comments>
		<pubDate>Mon, 14 Jan 2008 14:15:12 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/37</guid>
		<description><![CDATA[Our team has been reinforced again - this time, we&#8217;re proud to announce our new server administrator Michael Dunsky. He&#8217;s a really smart guy and will manage our servers and world-wide deployed infrastructure. He&#8217;ll also fill this blog with useful hints, howtos and other stuff!
Welcome Michael!
]]></description>
			<content:encoded><![CDATA[<p>Our team has been reinforced again - this time, we&#8217;re proud to announce our new server administrator Michael Dunsky. He&#8217;s a really smart guy and will manage our servers and world-wide deployed infrastructure. He&#8217;ll also fill this blog with useful hints, howtos and other stuff!</p>
<p>Welcome Michael!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/37/feed</wfw:commentRss>
		</item>
		<item>
		<title>delicious, please fix your json API!</title>
		<link>http://blog.northclick.de/archives/36</link>
		<comments>http://blog.northclick.de/archives/36#comments</comments>
		<pubDate>Thu, 03 Jan 2008 12:20:34 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/36</guid>
		<description><![CDATA[One month ago we wrote an email to the guys at delicious, telling them that their JSON API breaks the specification when it comes to escaping. Unfortunately we haven&#8217;t received an answer yet.
They&#8217;re escaping single quotes, which is neither necessary nor allowed! This, for example, causes the Zend_Service_Delicious component of the ZF to fail and return [...]]]></description>
			<content:encoded><![CDATA[<p>One month ago we wrote an email to the guys at delicious, telling them that their JSON API breaks the specification when it comes to escaping. Unfortunately we haven&#8217;t received an answer yet.</p>
<p>They&#8217;re escaping single quotes, which is neither necessary nor allowed! This, for example, causes the Zend_Service_Delicious component of the ZF to fail and return an empty array if you try to get all bookmarks of a specific user whose bookmarks contain single quotes.</p>
<p>So please check the specification at <a href="http://www.json.org/">json.org</a> and fix your API!</p>
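<p>To illustrate the problem, here is a minimal sketch using PHP&#8217;s own <code>json_decode()</code>. The payload is made up for this example and is not real delicious output; the second string simulates the faulty escaping by putting a backslash in front of the single quote.</p>

```php
<?php
// Hypothetical payload for illustration only, not real delicious API output.
$valid = json_encode(array('d' => "Jimdo's bookmarks"));
// $valid is now {"d":"Jimdo's bookmarks"} - the single quote is left alone.

// Simulate the faulty API: a backslash in front of every single quote.
$invalid = str_replace("'", "\\'", $valid);

$decodedValid = json_decode($valid, true);     // array with key "d"
$decodedInvalid = json_decode($invalid, true); // null, because \' is not a legal JSON escape
$errorAfterInvalid = json_last_error();        // something other than JSON_ERROR_NONE
```

<p>A client that treats a failed decode as &#8220;no data&#8221; will then quietly end up with an empty result, which matches the symptom described above.</p>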
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/36/feed</wfw:commentRss>
		</item>
		<item>
		<title>Announcing &#8220;dropr&#8221; - the message queue framework for PHP</title>
		<link>http://blog.northclick.de/archives/35</link>
		<comments>http://blog.northclick.de/archives/35#comments</comments>
		<pubDate>Mon, 10 Dec 2007 00:34:01 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/35</guid>
		<description><![CDATA[Finally, we&#8217;ve named our new open source message queue framework &#8220;dropr&#8221;.
Why? When Boris was writing the client angel script he somehow named it &#8220;dropr&#8221;. As we neither got better suggestions nor had any other idea, we simply went with this name. Actually the name is a bit of a joke about all those stupidR startupRs. But [...]]]></description>
			<content:encoded><![CDATA[<p>Finally, we&#8217;ve named our new open source message queue framework <strong>&#8220;dropr&#8221;</strong>.</p>
<p>Why? When Boris was writing the client angel script he somehow named it &#8220;dropr&#8221;. As we neither got better suggestions nor had any other idea, we simply went with this name. Actually the name is a bit of a joke about all those stupidR startupRs. But it&#8217;s nice, and after all our framework drops messages into queues :)</p>
<p>If you search for it on Google, it&#8217;s already the first result.</p>
<p>So have a look at <a href="https://www.dropr.org/">https://www.dropr.org/</a>. We&#8217;re currently writing a little installation manual for setting up the pre-release of our framework. We&#8217;ll keep you up to date on this blog, so stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/35/feed</wfw:commentRss>
		</item>
		<item>
		<title>Message Queue Project: First working version</title>
		<link>http://blog.northclick.de/archives/34</link>
		<comments>http://blog.northclick.de/archives/34#comments</comments>
		<pubDate>Thu, 06 Dec 2007 18:32:07 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/34</guid>
		<description><![CDATA[First, we&#8217;d like to announce that our staff has been reinforced by Boris Erdmann! We&#8217;re really proud of having another geek within our development team. Among other things he is the founder and maintainer of the German OpenID provider xlogon.

Update: The project is now called dropr!
Some time has elapsed since we wrote our draft for [...]]]></description>
			<content:encoded><![CDATA[<p><strong>First, we&#8217;d like to announce that our staff has been reinforced by Boris Erdmann! We&#8217;re really proud of having another geek within our development team.</strong> Among other things he is the founder and maintainer of the German OpenID provider <a href="http://www.xlogon.net/" target="_blank">xlogon</a>.<br />
<strong><br />
Update: The project is now called dropr!</strong></p>
<p>Some time has elapsed since we <a href="http://blog.northclick.de/archives/31">wrote our draft</a> for a message queue system written in and for PHP. Now it&#8217;s time to give you guys an update and working beta-code.</p>
<p>Boris and I have spent some nights getting a prototype running. First, we tried to work on TCP stream connections with PHP for the transport layer, but that was buggy and slow. Our second approach was to use HTTP uploads with cURL (thanks to <a href="http://jan.kneschke.de/" target="_blank">Jan Kneschke</a> for the tip!). For the storage we decided to try a simple filesystem storage and it seems our predictions were right:</p>
<ul>
<li>By using the filesystem as client and server storage, you don&#8217;t need to set up any database system</li>
<li>The filesystem actually IS a (restricted) DBMS</li>
<li>If you have files, you can get all the I/O work done by fast C code. On the client side, the data is loaded via cURL (we only pass the filenames to it) and uploaded to the server, where PHP handles the uploads and turns them into temp files =&gt; the server storage can again use a simple rename() operation to put them into its store</li>
<li>Filesystem operations like moving and deleting are ATOMIC - a file either exists or it doesn&#8217;t</li>
<li>You can encode all metadata like priority in the filename - there&#8217;s enough space in it</li>
<li>You can read huge directories quickly with the &#8220;scandir()&#8221; function, and the result is sorted by filename</li>
<li>You&#8217;re still able to implement your own storages and transports, though, because everything is written within an abstract class model.</li>
</ul>
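<p>The storage idea from the list above can be sketched roughly like this. The function names and the file name layout are made up for this example and are not dropr&#8217;s actual API; the sketch assumes that a lower number means a higher priority and that the spool directory already exists.</p>

```php
<?php
// Hypothetical helper names; this is not dropr's actual API.

/** Store a message atomically: write a temp file, then rename() it into place. */
function enqueueMessage(string $spoolDir, string $payload, int $priority): string
{
    // The temp file lives in the spool directory so rename() stays on one filesystem.
    $tmp = tempnam($spoolDir, '.tmp');
    file_put_contents($tmp, $payload);

    // Metadata goes into the file name: zero-padded priority, then a unique id.
    $name = sprintf('%03d_%s', $priority, uniqid('', true));
    rename($tmp, $spoolDir . '/' . $name);

    return $name;
}

/** List pending messages; scandir() returns the names sorted alphabetically. */
function pendingMessages(string $spoolDir): array
{
    $names = array();
    foreach (scandir($spoolDir) as $name) {
        if ($name[0] !== '.') { // skips ".", ".." and unfinished temp files
            $names[] = $name;
        }
    }
    return $names;
}
```

<p>Because a reader only ever sees files that have already been renamed into place, it never picks up a half-written message.</p>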
<p>Some notes:</p>
<ul>
<li>With this setup we&#8217;re actually able to send 200-250 messages per second (benchmarked on a damn slow machine).</li>
<li>we&#8217;ve set up an rc.d script for starting and stopping the client queue daemon</li>
<li>the client queue daemon has an angel process that takes care of PHP breaking down</li>
<li>the message queue daemon gets notified by the client via System V inter-process communication, so there is no sleep() overhead within</li>
<li>the system is also in production in a non-critical area of our software, and we&#8217;ve been testing it for a few weeks.</li>
</ul>
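<p>As a rough idea of how such a notification can work without polling, here is a sketch based on PHP&#8217;s sysvmsg extension. It assumes a System V <em>message queue</em>; the notes above do not say which System V mechanism is actually used, and the function names and the message type are invented for this example.</p>

```php
<?php
// Sketch only: assumes the sysvmsg extension and a System V message queue.
const NOTIFY_TYPE = 1; // arbitrary message type chosen for this example

/** Called by the application after it has stored a new message file. */
function notifyDaemon($ipcQueue, string $fileName): bool
{
    // false = send the plain string instead of a serialized value
    return msg_send($ipcQueue, NOTIFY_TYPE, $fileName, false);
}

/** Called by the daemon: blocks until a notification arrives. */
function waitForNotification($ipcQueue): ?string
{
    $receivedType = null;
    $fileName = null;
    $ok = msg_receive($ipcQueue, NOTIFY_TYPE, $receivedType, 1024, $fileName, false);
    return $ok ? $fileName : null;
}
```

<p>Both sides would open the same queue with <code>msg_get_queue()</code> and a shared key, for instance one derived with <code>ftok()</code>. The daemon then simply loops on <code>waitForNotification()</code> and sleeps inside the call until there is work to do.</p>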
<p>You can find the project homepage at <strong><a href="https://www.dropr.org/">https://www.dropr.org/</a></strong>.</p>
<p>Please check out the sources with &#8220;svn co https://www.dropr.org/svn/trunk/&#8221; and have a look at them. We&#8217;re sorry that we have no real documentation at this stage of the project, but hopefully it will follow soon.</p>
<p>Stay tuned, we&#8217;d like to get out an alpha/beta as a PEAR (or maybe Debian) package soon.</p>
<p><em>A tip:</em> Get a <a href="http://www.jimdo.com/" title="free website">free website</a> with integrated content management at Jimdo.com - no programming knowledge required!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/34/feed</wfw:commentRss>
		</item>
		<item>
		<title>Welcome Hinrich Sager!</title>
		<link>http://blog.northclick.de/archives/33</link>
		<comments>http://blog.northclick.de/archives/33#comments</comments>
		<pubDate>Fri, 02 Nov 2007 12:19:15 +0000</pubDate>
		<dc:creator>Springub</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/33</guid>
		<description><![CDATA[We’re proud to announce our new software engineer Hinrich Sager. He’s a well known guy in the PHP scene in Hamburg.
]]></description>
			<content:encoded><![CDATA[<p>We’re proud to announce our new software engineer Hinrich Sager. He’s a well known guy in the PHP scene in Hamburg.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/33/feed</wfw:commentRss>
		</item>
		<item>
		<title>RFC: Draft for a Message Queue system in PHP</title>
		<link>http://blog.northclick.de/archives/31</link>
		<comments>http://blog.northclick.de/archives/31#comments</comments>
		<pubDate>Wed, 03 Oct 2007 11:34:53 +0000</pubDate>
		<dc:creator>Soenke Ruempler</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/31</guid>
		<description><![CDATA[
Because of the lack of messaging system implementations in PHP, we at Jimdo are going to implement such a solution on our own. Some time ago I wrote another post and asked if anybody has experience with such techniques or has already designed such a beast.
You&#8217;ll find a Draft for a php-based messaging system below. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.northclick.de/wp-content/uploads/2007/10/message_que_draft.gif" title="Click on the image for a bigger view!"><img src="http://blog.northclick.de/wp-content/uploads/2007/10/message_que_draft-small.gif" alt="Click on the image for a bigger view!" /></a></p>
<p>Because of the lack of messaging system implementations in PHP, we at <a href="http://www.jimdo.com/">Jimdo</a> are going to implement such a solution on our own. Some time ago <a href="http://blog.northclick.de/archives/22">I wrote another post</a> and asked if anybody has experience with such techniques or has already designed such a beast.</p>
<p>You&#8217;ll find a Draft for a php-based messaging system below. We&#8217;d be glad if we get some comments from the readers. Because we&#8217;re heavily using open source we want to give something back to the community and <strong>make the message queue system open source</strong>. And, yes, if someone is planning something like this or already knows a solution, please let us know, too. We don&#8217;t wanna reinvent the wheel!</p>
<p>But now let&#8217;s come to the details ;-)</p>
<ol>
<li>Problem</li>
<li>Existing Solutions</li>
<li>Implementation Draft</li>
</ol>
<p><strong>1. The problem</strong></p>
<p>When you&#8217;re building an infrastructure that is distributed all over the internet, you&#8217;ll come to a point where you can&#8217;t rely on synchronous remote calls that - for example - synchronize data on 2 servers:</p>
<p>a) You don&#8217;t have any failover system that resends messages if something went wrong (network outages, software failures)<br />
b) Messages are processed over time and you have no control if something gets overloaded by too many requests</p>
<p>Even if you don&#8217;t have to send messages all over the Internet, there are enough points of failure where something can go wrong. You want a reliable and durable system that fails gracefully and ensures the delivery of messages even after temporary outages of any machine within the system.</p>
<p><strong>2. Existing solutions</strong></p>
<p>In the Java world there is a standard called <a href="http://en.wikipedia.org/wiki/Java_Message_Service"><strong>Java Message Service (JMS)</strong></a>. ActiveMQ from the Apache project implements such a solution.</p>
<p>In theory you can connect to a JMS service with PHP using a PECL module called <a href="http://pecl.php.net/package/sam">SAM (Simple Asynchronous Messaging for PHP)</a>. While it sounds easy to install and use, I ran into quite a few problems. First, you need to register at IBM and download <a href="http://www.ibm.com/developerworks/websphere/library/techarticles/0509_phillips/0509_phillips.html">&#8220;XMS - The IBM Message Service API&#8221;</a>. After you&#8217;ve downloaded the 60MB (!) package you can compile SAM. Add the extension to your php.ini and BAM: &#8220;PHP could not startup&#8221;. At this point I got really frustrated and gave up. I want something more lightweight.</p>
<p>But in the PHP world I didn&#8217;t find anything similar.</p>
<p><strong>3. Our approach to a PHP-based message queueing system</strong></p>
<p>Vocabulary:</p>
<p>Client: The peer that invokes a message<br />
Client-queue: The local client queue that will run as a daemon and send the messages to the servers<br />
Server: The peer that receives messages</p>
<p><strong>3.1 Features</strong></p>
<p><strong>3.1.1 Main Architecture</strong></p>
<p>The system is peer-to-peer based with no central server (and therefore no single point of failure). Every (sending/client) node has its own local queue.</p>
<p>The local queue will have an &#8220;angel process&#8221; that restarts the daemon if it dies.</p>
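<p>An angel process of this kind could look roughly like the following sketch. It assumes the pcntl extension on a Unix-like system; <code>superviseWorker()</code> is a made-up name, and the restart limit exists only to keep the example bounded (a real angel would probably restart forever).</p>

```php
<?php
// Sketch only: assumes the pcntl extension (CLI, Unix-like systems).

/**
 * Runs $worker in a child process and starts it again whenever it exits,
 * up to $maxRestarts times. Returns the number of restarts performed.
 */
function superviseWorker(callable $worker, int $maxRestarts): int
{
    $restarts = 0;
    while (true) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child: this is the actual daemon.
            $worker();
            exit(0);
        }
        // Parent (the angel): block until the child is gone.
        pcntl_waitpid($pid, $status);
        if ($restarts >= $maxRestarts) {
            return $restarts;
        }
        $restarts++;
    }
}
```

<p>The angel itself does almost nothing, so there is very little in it that can crash; all the risky work happens in the child.</p>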
<p><strong>3.1.2 Asynchronous / Durability</strong></p>
<p>Messaging is never direct. On the client side (the system that wants to send some message to another peer) there will be a local queue that sends out messages to its peers (called servers). There will be an API available for your applications to put messages into the queue, but the actual message transport will be handled by a separate process.</p>
<p>On the server-side there will be another process that writes the received messages into its queue. This queue will be queried by YOUR application using the provided API. See code examples below.</p>
<p>We will have to take care that operations are atomic. If the client gets a &#8220;true&#8221;, it can be sure that the data is saved to the local queue. And if the client queue gets a &#8220;true&#8221; from the transport layer, it can be sure the message is in the server&#8217;s queue.</p>
<p>As you may have noticed already, the solution never executes your code. It&#8217;s always queried by the application that makes use of the messaging service.</p>
<p>(HINT: We could still implement a mode on the server side that calls your code. This would be an interface with only one method, like &#8220;handleMessage(MQ_Message $message)&#8221;, that you&#8217;ll have to implement. If your method returns true, the message will be deleted from the server-side queue, otherwise it will be kept for later handling.)</p>
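<p>A possible shape for such a callback mode, purely as a sketch: only <code>handleMessage(MQ_Message $message)</code> comes from the draft, while the interface name <code>MQ_MessageHandler</code>, the minimal <code>MQ_Message</code> class and <code>dispatchMessages()</code> are placeholders invented for this example.</p>

```php
<?php
// Placeholder classes for illustration; none of this is a released API.

class MQ_Message
{
    private $body;

    public function __construct(string $body)
    {
        $this->body = $body;
    }

    public function getMessage(): string
    {
        return $this->body;
    }
}

interface MQ_MessageHandler
{
    /** Return true if the message was handled and may be deleted. */
    public function handleMessage(MQ_Message $message): bool;
}

/**
 * Passes every queued message to the handler and returns the messages
 * that must stay in the queue because the handler did not return true.
 */
function dispatchMessages(array $queued, MQ_MessageHandler $handler): array
{
    $kept = array();
    foreach ($queued as $message) {
        if ($handler->handleMessage($message) !== true) {
            $kept[] = $message;
        }
    }
    return $kept;
}
```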
<p><strong>3.1.3 Scalability</strong></p>
<p>Because there is no single instance that could get overloaded, scalability issues shouldn&#8217;t occur. The system scales more or less horizontally.</p>
<p><strong>3.1.4 Content Independence</strong></p>
<p>The message content is arbitrary. You could even send movies ;-)</p>
<p><strong>3.1.5 Modularity</strong></p>
<p>The underlying database is switchable. We will do our POC-implementation with PDO_sqlite but switching to other PDO-drivers, maybe even text-files, DBM etc. should be possible.</p>
<p>Also, the transport layer should be replaceable (XML-RPC, REST, SOAP, spread, &lt;put here what you want&gt;).</p>
<p><strong>3.1.6 Easy to setup and use</strong></p>
<p>The default configuration should work out of the box with an SQLite backend. See the examples.</p>
<p><strong>3.1.7 &#8220;Job done&#8221; messages</strong></p>
<p>If a message is completed and processed at the &#8220;server&#8221;, it can notify the original client that the operation was successful. We&#8217;ll have to think about possible solutions because this certainly implies a 2-way-communication.</p>
<p>One approach could be that the sender specifies whether it wants to get an &#8220;OK - message processed&#8221; from the peer. The client queue will then poll the server after submitting the message and check if the message has been set to the &#8220;PROCESSED&#8221; state.</p>
<p><strong>3.1.8 Priority handling</strong></p>
<p>Of course, you can set the priority of a message so you can be sure that more important messages will be favored by the queue.</p>
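<p>The draft does not say how priorities will be implemented. Just to show the intended behaviour, here is a small in-memory sketch with PHP&#8217;s built-in <code>SplPriorityQueue</code> (available since PHP 5.3); the message texts and priority values are made up.</p>

```php
<?php
// Sketch only: SplPriorityQueue lives in memory and is not durable,
// so it illustrates the ordering, not the storage.
$queue = new SplPriorityQueue();

// insert($value, $priority): a higher number is served first.
$queue->insert('nightly statistics', 1);
$queue->insert('password reset mail', 10);
$queue->insert('newsletter', 5);

$order = array();
while (!$queue->isEmpty()) {
    $order[] = $queue->extract();
}
// $order is now: password reset mail, newsletter, nightly statistics
```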
<p><strong>3.1.9 Multicasting</strong></p>
<p>It&#8217;s possible to send a message to several servers with a single command. The queue will set the state to PROCESSED once all servers have notified it that the message was processed.</p>
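<p>The bookkeeping behind this rule is simple. A minimal sketch, with a class name and methods that are hypothetical and not part of the draft API:</p>

```php
<?php
// Hypothetical class for illustration; the names are not from the draft.
class MulticastState
{
    private $pending = array();

    public function __construct(array $serverUrls)
    {
        // Remember every server that still has to confirm the message.
        foreach ($serverUrls as $url) {
            $this->pending[$url] = true;
        }
    }

    /** Record that one server reported the message as processed. */
    public function acknowledge(string $serverUrl): void
    {
        unset($this->pending[$serverUrl]);
    }

    /** The multicast counts as PROCESSED only when no server is pending. */
    public function isProcessed(): bool
    {
        return count($this->pending) === 0;
    }
}
```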
<p><strong>3.2 Code Examples</strong></p>
<p><strong>3.2.1 Client-side</strong></p>
<p>Starting the queueing daemon should look like this:</p>
<pre>
/opt/MQ/daemon start
</pre>
<p>FIXME: Configuration of the Daemon</p>
<p>Sending a message would look like this:</p>
<pre>
$queueDriver = MQ_QueueDriver::factory('PDO', 'dsn ....');
$transport = MQ_Transport::factory('REST');
$serverOne = new MQ_Peer('http://server1.com/path/to/server');
$serverTwo = new MQ_Peer('http://server2.com/path/to/server');
$queue = new MQ_Client($queueDriver, $transport);
$queue->send($serverOne, 'this is a test message');
$queue->multicast(array($serverOne, $serverTwo), 'this is a test multicast message');

// Send with feedback
$message = $queue->send($serverOne, 'this is a test message', true);

// Wait for the feedback
while ($message->getStatus() != MQ_PROCESSED) {
   sleep(1);
} 

echo "Now it's processed by the peer, yieehaaa!";
</pre>
<p><strong>3.2.2 Server-side (get something out of the queue)</strong></p>
<p>Because your main application is not involved in the messaging process, the only thing it has to do is query the queue:</p>
<pre>
$queueDriver = MQ_QueueDriver::factory('PDO', 'dsn ....');
$queue = new MQ_Server($queueDriver);

// Getting the messages with an infinite-loop iterator
foreach ($queue as $message) {
    echo "Sender was: " . $message->getSender();
    echo "Message is: " . $message->getMessage();

    // Mark the message as processed; the client queue checks this state,
    // so the origin has the opportunity to see whether we could process the message properly.
    $message->setProcessed(true);
}
</pre>
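<p>The &#8220;infinite-loop iterator&#8221; could be built in several ways. One option in current PHP versions (not the PHP 5.2 named in the requirements) is a generator; in this sketch <code>$fetchNext</code> is a stand-in for whatever lookup the real queue driver would perform and has to return the next message or null.</p>

```php
<?php
// Sketch only: queuedMessages() and $fetchNext are hypothetical names.
function queuedMessages(callable $fetchNext, int $idleMicroseconds = 200000): Generator
{
    while (true) {
        $message = $fetchNext();
        if ($message === null) {
            usleep($idleMicroseconds); // queue is empty: wait before asking again
            continue;
        }
        yield $message;
    }
}
```

<p>A <code>foreach</code> over <code>queuedMessages()</code> never ends on its own, so the application decides when to <code>break</code> out of it.</p>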
<p><strong>3.3 Requirements</strong></p>
<p>- PHP 5.2+<br />
- certainly the Zend-Framework</p>
<p><strong>4. The Name?</strong></p>
<p>No name until we have a working prototype - haha.</p>
<p><strong>5. Next steps</strong></p>
<p>- evaluate feedback<br />
- setup a trac, fill tickets<br />
- begin to code</p>
<p>We appreciate your feedback, wishlists, improvements and criticism in the comments, as development will start soon.</p>
<p>Sönke with assistance from Markus and Christian.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/31/feed</wfw:commentRss>
		</item>
		<item>
		<title>PHP Work @ jimdo.com !</title>
		<link>http://blog.northclick.de/archives/26</link>
		<comments>http://blog.northclick.de/archives/26#comments</comments>
		<pubDate>Thu, 20 Sep 2007 16:47:11 +0000</pubDate>
		<dc:creator>MarkusWolff</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/26</guid>
		<description><![CDATA[Welcome to yet another job offer post :-)
We at Jimdo are looking for reinforcements! You&#8217;ll be able to work in Hamburg, one of the most beautiful cities in Germany and work with a young team in an enthusiastic atmosphere. Honestly, I haven&#8217;t had this much fun for years before I started working here :-)
We&#8217;re looking [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to yet another job offer post :-)</p>
<p>We at Jimdo are looking for reinforcements! You&#8217;ll be able to work in Hamburg, one of the most beautiful cities in Germany and work with a young team in an enthusiastic atmosphere. Honestly, I haven&#8217;t had this much fun for years before I started working here :-)</p>
<p>We&#8217;re looking for PHP experts, system administrators, web designers and Javascript gods (or a human vessel with equal powers).</p>
<p>What are you still waiting for? Go to the official job page, right now, don&#8217;t hesitate:<br />
<a href="http://www.jimdo.com/jobs.php">http://www.jimdo.com/jobs.php</a></p>
<p>Note: If you&#8217;re a resident of the European Union, you&#8217;re free to work in any EU country you like! You don&#8217;t speak German? Not a problem as long as you speak and understand English well enough. You&#8217;ll miss out on some of the jokes for a while, but you can still see me lose each and every single Bomberman tournament held in this office, which should be satisfactory enough. Oh, we also got excellent coffee! Still not interested? What&#8217;s wrong with you? :-)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/26/feed</wfw:commentRss>
		</item>
		<item>
		<title>A comma is a comma is a comma&#8230; or is it?</title>
		<link>http://blog.northclick.de/archives/25</link>
		<comments>http://blog.northclick.de/archives/25#comments</comments>
		<pubDate>Wed, 19 Sep 2007 15:14:42 +0000</pubDate>
		<dc:creator>MarkusWolff</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://blog.northclick.de/archives/25</guid>
		<description><![CDATA[When you&#8217;re building international websites, there&#8217;s always something new to learn. Especially if one of the languages your website is available in uses a character set different from anything you&#8217;re used to. For jimdo.com, the greatest challenge so far is the Chinese version.
Jimdo lets you define tags for your website. You can separate the [...]]]></description>
			<content:encoded><![CDATA[<p>When you&#8217;re building international websites, there&#8217;s always something new to learn. Especially if one of the languages your website is available in uses a character set different from anything you&#8217;re used to. For jimdo.com, the greatest challenge so far is the <a href="http://cn.jimdo.com/index.php" target="_blank">Chinese version</a>.</p>
<p>Jimdo lets you define tags for your website. You can separate the tags with whitespace, but it&#8217;s also possible to use commas, like this:</p>
<p><code>tag1, tag2, tag3</code></p>
<p>Chinese users naturally are way more used to characters outside the ASCII range than us westerners, and, lo and behold, Unicode has its own special fullwidth comma character (U+FF0C) with integrated whitespace, which is used quite frequently by Chinese users:</p>
<p><code>科学，思考，心情</code></p>
<p>As we&#8217;re using good ol&#8217; regular expressions to split up the tag strings into single tags, one might think, &#8220;no problem, I&#8217;ll just add another character to the regex pattern&#8221;, like so:</p>
<p><code>$tags = preg_split("/[\s,;:，]+/", $input, -1, PREG_SPLIT_NO_EMPTY);</code></p>
<p>And eureka, it works! Or does it? Nope. As UTF-8 works with multiple bytes per character and preg_split, like so many other current PHP functions, assumes that one byte equals one character, you may encounter strange side effects. Here&#8217;s an example using the above pattern on a random string with some German umlauts:</p>
<p><code><br />
Splitting up 'Bääh Blöök Dübel', becomes:<br />
Array<br />
(<br />
[0] =&gt; Bääh<br />
[1] =&gt; Blöök<br />
[2] =&gt; D�<br />
[3] =&gt; bel<br />
)<br />
</code></p>
<p>What to do? Simple: Add the UTF-8 modifier, &#8220;u&#8221;, to the pattern:</p>
<p><code>$tags = preg_split("/[\s,;:，]+/u", $input, -1, PREG_SPLIT_NO_EMPTY);</code></p>
<p>Now preg_split correctly recognizes multibyte characters and yields the expected results:</p>
<p><code><br />
Splitting up 'Bääh Blöök Dübel', becomes:<br />
Array<br />
(<br />
[0] =&gt; Bääh<br />
[1] =&gt; Blöök<br />
[2] =&gt; Dübel<br />
)<br />
</code></p>
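<p>Why exactly &#8220;Dübel&#8221; and not the other words? In UTF-8, the fullwidth comma is the byte sequence 0xEF 0xBC 0x8C and &#8220;ü&#8221; is 0xC3 0xBC. Without the modifier, the character class contains the three single bytes of the comma, so the 0xBC inside &#8220;ü&#8221; counts as a separator. The following sketch reproduces both results; <code>splitTags()</code> is a made-up helper name, and the file and the input are assumed to be UTF-8 encoded.</p>

```php
<?php
// Sketch only: splitTags() is a hypothetical helper for this example.
function splitTags(string $input, bool $unicode = true): array
{
    // With "u", pattern and subject are treated as UTF-8 instead of raw bytes.
    $pattern = '/[\s,;:，]+/' . ($unicode ? 'u' : '');
    return preg_split($pattern, $input, -1, PREG_SPLIT_NO_EMPTY);
}

$good = splitTags('Bääh Blöök Dübel');        // u modifier: three intact words
$bad  = splitTags('Bääh Blöök Dübel', false); // byte mode: "Dübel" is torn apart
$tags = splitTags('科学，思考，心情');          // the fullwidth comma separates the tags
```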
<p>Another lesson learned.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.northclick.de/archives/25/feed</wfw:commentRss>
		</item>
	</channel>
</rss>
