


Planet eZ publish




ez publish community gateway

› How to use the Doctrine ORM to persist data in eZ Publish 5.3

This is a short outline of how to use the Doctrine ORM to persist data in eZ Publish 5.3. Frankly, it took me more than a day to figure it out...

Please correct me if I did something stupid here, I found zero resources for it so far...

02/09/2014 2:23 pm (UTC)   http://share.ez.no

ez projects

› New release (compatible with 2014.07.0) and Twitter API 1.1

If you use this extension, or did in the past, you should review our new GitHub repository:

git clone git@github.com:brookinsconsulting/nxc_twitter.git ezpublish_legacy/extension/nxc_twitter;

Or you may download a package instead: nxc_twitter_2014.07.tar.gz

Warning: This solution is deprecated. Please do not use for new projects. Please use http://projects.ez.no/nxc_social_networks instead!

31/08/2014 8:10 pm (UTC)   eZ Projects

derick rethans

› Natural Sorting with MongoDB


Arranging English words in order is simple--well, most of the time. You simply arrange them in alphabetical order. However, sorting a set of German words, or French words with all their accents, or Chinese with their different characters is a lot harder than it looks. Sorting rules are specified through "locales", which determine how accents are sorted, which order the letters are in, and how to do case-insensitive sorts. There is a good set of those sorting rules available through CLDR, and there is a neat example to play with all kinds of sorting at ICU's demo site. If you really want to know how the algorithms work, have a look at the Unicode Consortium's report on the Unicode Collation Algorithm.

Right now, MongoDB does not support indexes or sorting on anything but Unicode Code Points. Basically, that means that it can't sort anything but English. There is a long-standing issue, SERVER-1920, that is at the top of the priority list, but it is not scheduled for a future release. I expect this to be addressed at some point in the near future. However, with some tricks there is a way to solve the sorting problem manually.

Many languages have their own implementation of the Unicode Collation Algorithm, often implemented through ICU. PHP has an ICU-based implementation as part of the intl extension, and the class to use is the Collator class.

The Collator class encapsulates the Collation Algorithm to allow you to sort an array of text yourself, but it also allows you to extract the "sort key". By storing this generated sort key in a separate field in MongoDB, we can sort by locale, and even by multiple locales.

Take for example the following array of words:

$words = [
        'bailey', 'boffey', 'böhm', 'brown', 'серге́й', 'сергій', 'swag',
        'svere'
];

Which we can turn into sort keys with a simple PHP script like:

$collator = new Collator( 'en' );
foreach ( $words as $word )
{
        $sortKey = $collator->getSortKey( $word );
        echo $word, ': ', bin2hex( $sortKey ), "\n";
}

We create a collator object for the en locale, which is generic English. When running the script, the output is (after formatting):

bailey: 2927373d2f57010a010a
boffey: 294331312f57010a010a
böhm:   2943353f01859d060109
brown:  294943534101090109
серге́й: 5cba34b41a346601828d05010b
сергій: 5cba34b41a6066010a010a
swag:   4b53273301080108
svere:  4b512f492f01090109

Those sort keys can be used to then sort the array of names. In PHP, that would be:

$collator->sort( $words );
print_r( $words );

Which returns the following list:

[0] => bailey
[1] => boffey
[2] => böhm
[3] => brown
[4] => svere
[5] => swag
[6] => серге́й
[7] => сергій
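To make the link between sort keys and ordering explicit, here is a minimal sketch that sorts by comparing the binary sort keys byte by byte, and checks the result against Collator::sort(). It assumes the intl extension is loaded. The helper name sortBySortKey() is made up for this example and is not part of the original script.

```php
<?php
// Hypothetical helper (not from the original post): order words by a
// byte-wise comparison of their ICU sort keys.
function sortBySortKey( Collator $collator, array $words ): array
{
        usort( $words, function ( $a, $b ) use ( $collator ) {
                return strcmp( $collator->getSortKey( $a ), $collator->getSortKey( $b ) );
        } );
        return $words;
}

$collator = new Collator( 'en' );
$words = [ 'swag', 'brown', 'böhm', 'svere', 'boffey', 'bailey' ];

// Order obtained from the sort keys alone.
$byKey = sortBySortKey( $collator, $words );

// Order obtained by letting the collator sort the array itself.
$byCollator = $words;
$collator->sort( $byCollator );

// Both approaches should agree on the order.
var_dump( $byKey === $byCollator );
```

This is the property the rest of the approach relies on: anything that can compare strings byte by byte can reproduce the collator's order once it has the sort keys.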

We can extend this script to use multiple collations, and import each word, including its sort keys, into MongoDB.

Below, we define the words we want to sort on, and the collations we want to compare. They are in order: English, German with phone book sorting, Norwegian, Russian and two forms of Swedish: "default" and "standard":
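The listing for this step is missing from this copy of the post. A plausible reconstruction is sketched here, assuming the same word list as earlier. The collation names and their order are taken from the field names in the MongoDB shell output further down, so they are an inference rather than the original code.

```php
<?php
// Reconstruction, not the original listing: the words are the ones used
// earlier in the post.
$words = [
        'bailey', 'boffey', 'böhm', 'brown', 'серге́й', 'сергій', 'swag',
        'svere'
];

// Assumed from the field names of the stored documents: ICU locale
// identifiers, with an optional @collation= keyword to pick a variant.
$collations = [
        'en', 'de_DE@collation=phonebook', 'no', 'ru',
        'sv@collation=standard', 'sv@collation=default',
];
```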


      

Make the connection to MongoDB and clean out the collection:

$m = new MongoClient;
$d = $m->demo;
$c = $d->collate;
$c->drop();

Create the Collator objects for each of our collations:

$collators = [];

foreach ( $collations as $collation )
{
        $c->createIndex( [ $collation => 1 ] );
        $collators[$collation] = new Collator( $collation );
}

Loop over all the words, and for each collation we have defined, use the created Collator object to generate the sort key. We encode the sort key with bin2hex() because sort keys are binary data, and MongoDB requires UTF-8 for strings. My original plan of using MongoDB's BinData type did not work, as it sorts first according to the length of the data. Encoding with base64_encode() also does not work, as its encoding scheme does not keep the original order. Encoding with utf8_encode() does work, but as it creates some binary (but valid-for-MongoDB-UTF-8) data, it's not good to use as an example.
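The difference between the two encodings can be seen with a small, self-contained check. The two single-byte strings below are made-up stand-ins for sort keys, chosen only because they make the effect visible. No MongoDB or intl is needed to run it.

```php
<?php
// Two binary strings standing in for sort keys; "\x00" sorts before "\xD0".
$low  = "\x00";
$high = "\xD0";

// Raw byte comparison: true means $low comes first.
$raw = strcmp( $low, $high ) < 0;

// bin2hex() maps each byte to two characters from 0-9a-f, whose ASCII order
// matches their numeric order, so the comparison result is unchanged.
$hex = strcmp( bin2hex( $low ), bin2hex( $high ) ) < 0;

// base64_encode() uses the alphabet A-Z, a-z, 0-9, +, /, which is not in
// ASCII order, so the comparison can flip ("AA==" versus "0A==" here).
$b64 = strcmp( base64_encode( $low ), base64_encode( $high ) ) < 0;

var_dump( $raw, $hex, $b64 );
```

The first two results agree and the third does not, which is why the hex form is safe to index and sort on while the base64 form is not.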

foreach ( $words as $word )
{
        $doc = [ 'word' => $word ];
        foreach ( $collations as $collation )
        {
                $sortKey = $collators[$collation]->getSortKey( $word );
                $doc[$collation] = bin2hex( $sortKey );
        }
        $c->insert( $doc );
}

When we run the script, and see what's in the database, we find something like the following for böhm:

> db.collate.find( { word: 'böhm' }).pretty();
{
        "_id" : ObjectId("53fc721844670a35498b4569"),
        "word" : "böhm",
        "en" : "2943353f01859d060109",
        "de_DE@collation=phonebook" : "29432f353f0186870701848f06",
        "no" : "295aa105353f018687060108",
        "ru" : "2b45374101859d060109",
        "sv@collation=standard" : "295aa106353f01080108",
        "sv@collation=default" : "295aa106353f01080108"
}

To see the sorting for the words in all the locales, I've added the following to the end of the script:

foreach ( $collations as $collation )
{
        echo $collation, ":\n";

        $r = $c->find()->sort( [ $collation => 1 ] );
        foreach ( $r as $res )
        {
                echo $res['word'], ' ';
        }

        echo "\n\n";
}

As you can see, we call sort() and specify which field to sort on. The $collation variable contains the name of the collation. In each stored document, the field with the name of the collation stores the sort key for that collation, as you saw in the previous MongoDB shell output.

Running with this part of the code added, we get:

en:
bailey boffey böhm brown svere swag серге́й сергій

de_DE@collation=phonebook:
bailey böhm boffey brown svere swag серге́й сергій

no:
bailey boffey brown böhm svere swag серге́й сергій

ru:
серге́й сергій bailey boffey böhm brown svere swag

sv@collation=standard:
bailey boffey brown böhm swag svere серге́й сергій

sv@collation=default:
bailey boffey brown böhm svere swag серге́й сергій

  • In English, the ö in böhm sorts as an o.

  • In Germany's phone book collation, the ö in böhm sorts like an oe.

  • In Norwegian, the ö in böhm sorts as an extra letter after z.

  • In Russian, the Cyrillic letters sort before Latin letters.

  • In Sweden's "standard" collation, the v and w are considered equivalent letters.

By generating a sort key for your data, you get to choose with which locale MongoDB will do the sorting, but with the overhead of having to maintain an index yourself. ICU, the library that lies underneath PHP's intl extension, supports a lot more customisations for collators, and even allows you to define your own custom rules. In the future, we will likely see some of this functionality make it into MongoDB as well. Until this is implemented, generating your own sort-key field for each document, as this article shows, is your best MongoDB-only approach. If you find collation sorting in MongoDB important, feel free to vote on the SERVER-1920 issue in Jira.
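As one illustration of such a customisation, the sketch below turns on ICU's numeric collation attribute, which compares runs of digits by their numeric value. It assumes the intl extension is loaded; the file names are invented for the example and do not come from the original post.

```php
<?php
// Invented sample data: names that differ only in a trailing number.
$files = [ 'file10', 'file2', 'file1' ];

// Default English collation compares the digits character by character.
$plain = new Collator( 'en' );
$plainOrder = $files;
$plain->sort( $plainOrder );

// With NUMERIC_COLLATION switched on, digit runs are compared by value.
$numeric = new Collator( 'en' );
$numeric->setAttribute( Collator::NUMERIC_COLLATION, Collator::ON );
$numericOrder = $files;
$numeric->sort( $numericOrder );

print_r( $plainOrder );
print_r( $numericOrder );
```

Sort keys generated by a collator configured this way carry the same ordering, so they could be stored in their own field in the same manner as the locale-specific keys.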

27/08/2014 10:33 am (UTC)   Derick Rethans

mugo web

› eZ Publish workflows: multi-language, collaboration, and scheduled publishing

Mugo's eZ Collaboration Workflow extension has been available for a few years now. We've been able to make continuous improvements over time to solve different and more complex client needs. Here's an update on some of the recent new functionality around multi-language workflows, editing other users' drafts, and scheduled publishing.

25/08/2014 3:47 pm (UTC)   Mugo Web

mugo web

› Adding data to the eZ Find index with Index Time Plugins

Index time plugins are one of the most important techniques for extending eZ Find functionality; they allow you to control how and what data is indexed. Combined with custom eZ Find queries, this opens up huge opportunities for providing access to content, well beyond mere 'search'.

In this post we will look at some typical use cases, briefly consider out-of-the-box functionality, and then dive into why you would want to make use of index time plugins and how you would go about setting one up.

"Index time" refers to the fact that eZ Find and Solr maintain a digest of your data independent of the eZ Publish datastore. The digest, properly called "the eZ Find index", is maintained while you edit your content, and not at "query time". The eZ Find index is optimized in ways that eZ Publish is not, and this provides an important location for functionalities that are difficult or impossible within eZ Publish.

These use cases may be more or less common, depending on the application you're working on. Let's take a look at a few ...

25/08/2014 7:52 am (UTC)   Mugo Web

netgen

› PHP / eZ Publish Summer Camp 2014 - latest insights

With less than 2 weeks till both PHP Summer Camp and eZ Publish Summer Camp, it's time to share what's currently going on with preparations and reveal a few more interesting facts.

22/08/2014 12:34 am (UTC)   http://www.netgenlabs.com/Blog

derick rethans

› On Backwards Compatibility and not Being Evil


This is a repost of an email I sent to PHP internals as a reply to:

And since you're targetting[sic] the next major release, BC isn't an issue.

This sort of blanket statement, that "Backwards Compatibility is not an issue" with a new major version, is extremely unwarranted. Extreme care should be taken when deciding to break Backwards Compatibility. It should not be "oh we have a major new version so we can break all the things"™.

There are two main types of breaking Backwards Compatibility:

  1. The obvious case where running things through php -l instantly tells you your code no longer works. Bugs like the two default cases fall in this category. I have no problem with this, as it's very easy to spot the difference (in the case of allowing multiple "default" cases, it's a fricking bug fix too).

  2. Subtle changes in how PHP behaves. There is a large number of those things currently under discussion. There is the nearly undetectable change of the "Uniform Variable Syntax", that I already wrote about, the current discussion on "Binary String Comparison", and now changing the behaviour of << and >> in a subtle way. These changes are not okay, because they are nearly impossible to test for.

    Changes that are so difficult to detect mean that our users need to re-audit and check their whole code base. It makes people not want to upgrade to a newer version as there would be more overhead than gains. Heck, even changing the $ in front of variables to £ is a better change, as it's immediately apparent that stuff changed. And you can't get away with "But Symfony and ZendFramework don't use this" either, as there is so much code out there.

As I said, the first type isn't much of a problem, as it's easy to find what causes such a Backwards Compatibility break, but the second type is what causes our users an enormous amount of frustration. That then results in a much slower adoption rate, if they bother upgrading at all. Computer Science "purity" reasons to "make things better" have little to no meaning for PHP, as it's clearly not designed in the first place.

Can I please urge people not to take Backwards Compatibility issues so lightly? Please think really carefully when you suggest breaking Backwards Compatibility; it should only be considered if there is a real and important reason to do so. Changing binary comparison is not one of those, changing behaviour for everybody regarding << and >> is not one of those, and subtle changes to what syntax means are certainly not one of them.

Don't be Evil

21/08/2014 5:31 pm (UTC)   Derick Rethans

mugo web

› ezurl() links in eZ Publish 5

In eZ Publish 4 / legacy, formatting link URLs is handled by the well-known ezurl() template operator. This is especially useful when you have multiple siteaccesses and you use URL-based matching. In eZ Publish 5, there is no single ezurl() equivalent; instead, there are several options depending on the type of link you want to display.

21/08/2014 1:52 am (UTC)   Mugo Web

ez publish community gateway

› ezurl() links in eZ Publish 5

In eZ Publish 4 / legacy, formatting link URLs is handled by the well-known ezurl() template operator. This is especially useful when you have multiple siteaccesses and you use URL-based matching. In eZ Publish 5, there is no single ezurl() equivalent; instead, there are several options depending on the type of link you want to display.

20/08/2014 5:24 pm (UTC)   http://share.ez.no

ez publish community gateway

› Community Project Board meeting minutes - August 4

Aloha, here are the minutes of the Community Project Board meetings that happened July 23rd and August 4th. Our previous minutes can be found here. Minutes for meetings 44 and 45 are included.

20/08/2014 4:05 pm (UTC)   http://share.ez.no