W-JAX 2009: JCR – Java Content Repositories

Disclaimer: This entry was written while listening to the talk. Please forgive any typographical or grammatical errors resulting from this approach.

Carsten Ziegeler is going to present the basic concepts behind Java Content Repositories, the features of version 2 of the standard and its use as a potential alternative to classical database approaches. Since I often work with less structured data and web applications (and their respective CMS requirements) I’m curious about the new JCR features. Our last internal talk about JCR at QuinScape was about two years ago – so quite a few things have probably happened since.

Carsten is involved in the JCR standardization process and works in his day job for Day, the main driver behind that standard. JCR currently also positions itself in the NoSQL movement – for whatever that is worth.

Carsten starts with the statement that everything in an application is content, but there are different types of content, some of it unstructured and without an existing schema. Usually you will find a different kind of storage for each type of content. Large data sets also pose interesting problems. The usual consequence is that structured data is stored in the database, large binary data in the file system, and other content possibly somewhere else again.

JCR combines the advantages of databases (e.g. referential integrity, transactions, …) with the strengths of hierarchical file systems (e.g. locking, access controls, handling of large data sets). Additionally it provides new features like versioning, historization and observation of the data, so that applications are notified by events whenever something happens in the content repository. The result is one content repository for all kinds of data.

Data is structured with nodes and properties. Data can be structured via node types, but it also can be unstructured.

Next Carsten shows a small demo application based on the Sling framework – a simple digital asset management system that stores photos (with auto-generated thumbnails) in albums.

Apache Jackrabbit provides an implementation of a Java Content Repository (as specified in JSR-170). JSR-283 describes the JCR 2.0 standard with a vast number of improvements.

Repositories are accessed through a session, which is returned after logging into the repository with system-specific credentials. The JCR standard provides an API for these operations. Using the session it is possible to navigate the nodes, retrieve properties, etc. Additionally it is possible to change nodes and node properties. The changes are stored permanently in the repository only when the session is saved.
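To make the "changes only become permanent on save" behaviour a bit more concrete, here is a minimal sketch in plain Java. It is a toy model and not the javax.jcr API: the class ToySession and all of its methods are made up for illustration, and properties are simply addressed by an absolute path string. In the real API the corresponding calls would be along the lines of Node.setProperty and Session.save.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy model (hypothetical, NOT javax.jcr): a session that buffers property
 * changes and writes them to the shared store only when save() is called.
 */
final class ToySession {
    // Persisted state shared by all sessions: property path -> value,
    // e.g. "/albums/rome/title" -> "Rome 2009".
    private final Map<String, String> persisted;
    // Changes made through this session that have not been saved yet.
    private final Map<String, String> pending = new LinkedHashMap<>();

    ToySession(Map<String, String> persisted) {
        this.persisted = persisted;
    }

    /** Records a change; it is visible to this session only until save(). */
    void setProperty(String path, String value) {
        pending.put(path, value);
    }

    /** Reads a property, preferring this session's unsaved changes. */
    String getProperty(String path) {
        return pending.containsKey(path) ? pending.get(path) : persisted.get(path);
    }

    /** Writes all pending changes to the persisted store. */
    void save() {
        persisted.putAll(pending);
        pending.clear();
    }

    /** Throws away all pending changes without persisting them. */
    void discardChanges() {
        pending.clear();
    }
}
```

With a shared map as the "repository", a second ToySession on the same map does not see a property set through the first session until the first session calls save().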

Concerning actual data modeling Carsten recommends “Davids Model” as a keyword for the JCR Apache Wiki – there should be ample advice starting there.

Additionally a query API exists for navigating the content structure. SQL-like queries also are possible.

Finally, the content repository observation mechanisms are very interesting. E.g. file-system based repositories can notice that new files have been added to the repository and post corresponding events to listening application modules, which in turn can react in some way (e.g. indexing files, …).
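The observation idea can be sketched in a few lines of plain Java. Again this is a toy model under my own assumptions and not the javax.jcr.observation API: ToyObservableRepository and its methods are hypothetical names, only a "node added" event exists, and listeners are called immediately, which is a simplification compared to a real repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Toy model (hypothetical, NOT javax.jcr): a repository that notifies
 * registered listeners whenever a node is added.
 */
final class ToyObservableRepository {
    private final List<String> nodePaths = new ArrayList<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    /** Registers a listener that receives the path of every newly added node. */
    void addNodeAddedListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Stores the node path and notifies all listeners (immediately, for brevity). */
    void addNode(String path) {
        nodePaths.add(path);
        for (Consumer<String> listener : listeners) {
            listener.accept(path);
        }
    }

    /** Returns a copy of all node paths added so far. */
    List<String> getNodePaths() {
        return new ArrayList<>(nodePaths);
    }
}
```

An indexing module would register itself via addNodeAddedListener and react to each path it receives, for example by adding the new file to a search index.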

A final cool feature is versioning, which can store versions of either nodes or the workspace – the user has the choice. The basic mechanism uses subtree versioning. Links are also stored as information.
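As a rough illustration of what versioning means for a single node, here is a plain-Java sketch. It is a toy model and not the javax.jcr.version API: ToyVersionedNode, checkin and restore are names I picked for this example, and it only snapshots the properties of one node, so it does not cover subtrees, workspaces or links.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model (hypothetical, NOT javax.jcr): a node whose property state can be
 * frozen as a version and restored later.
 */
final class ToyVersionedNode {
    private final Map<String, String> properties = new HashMap<>();
    // Each entry is an immutable snapshot of the properties at check-in time.
    private final List<Map<String, String>> history = new ArrayList<>();

    void setProperty(String name, String value) {
        properties.put(name, value);
    }

    String getProperty(String name) {
        return properties.get(name);
    }

    /** Freezes a copy of the current properties and returns its version index. */
    int checkin() {
        history.add(Collections.unmodifiableMap(new HashMap<>(properties)));
        return history.size() - 1;
    }

    /** Replaces the current properties with those of the given version. */
    void restore(int version) {
        Map<String, String> snapshot = history.get(version);
        properties.clear();
        properties.putAll(snapshot);
    }
}
```

Later changes to the node do not affect an earlier snapshot, because checkin copies the property map instead of keeping a reference to it.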

Answering a question, Carsten responded that query performance is “as good as it can be” – receiving quite a bit of jolly laughter 🙂 He noted that, similar to databases, there are a lot of optimizations in the implementations concerning lazy loading, etc. Actual data is only retrieved when requested; usually only nodes are returned. Additionally, compression algorithms exist for the storage files, etc.

While Carsten’s talk was very brief (he was given just 30 minutes), it was very precise and gave a very good overview. Yet another very good short session at this year’s W-JAX – I am thoroughly impressed by the continually rising standard. Kudos to Software & Support for that!
