Category Archives: Content

JamSpoon!

First of all, I am writing this on my iPhone on a bumpy road in the middle of Laos. So don’t expect this to look very nice… I have already lost some of my edits due to silly iPhone carry-on, so this is now an exercise in pure stubbornness.

I have been designing in my head my JSON CMS thing that I was going to call Spatula. But now the concepts and ideas are much clearer to me, so a rename is in order – JamSpoon it is! (name to be explained soon). In a nutshell, it will combine a rich UI for editing JSON documents with an extensible REST-based architecture for storing these documents on the web. More detail needed? Come on then, let’s go.

First off, I think the best description for what this system will be is an RMS. That stands for Resource Management System. By Resource, I mean resource in the web sense – i.e. everything that sits at the end of a URL. It’s therefore also a resource in the REST sense, and can be manipulated in the normal RESTful way.

So this system will provide the UI and mechanisms for managing web resources. It’s no coincidence that RMS sounds a bit like CMS. This is because one of my goals is to provide certain features that have become commonplace in most CMSs (such as templates, edit/publish workflows, media management). But at the same time, I want to step away from the usual focus of CMSs, which is to provide an end-to-end means for managing web sites. Web site creation is a much more specific and detailed set of functions (sometimes with in-page content editing, SEO, HTML templating engines etc.), into which resource management (editing text, uploading images, publishing blog posts) is often nestled in an inextricable way. JamSpoon will only do resource management. Nothing else.

Why is this useful? See my previous blog post “CM bloody S” for some background, and my “Spatula” post for some more details on how I have got to this point (can’t link! On iPhone!). But basically it comes down to the fact that content-driven sites/services are often tormented by the need for basic content management, but don’t need web site management. Often the web site either exists already, or is developed in another framework (ASP.NET, Ruby on Rails) that has nothing to do with a CMS. All they need is a mechanism for their model to be populated.

Another angle of use is coming from the document/NoSQL world. Typically these systems provide a schema-free storage mechanism for holding web resources (usually in JSON). Having a nice UI to edit these documents would be very handy, and would often remove the need to write a custom UI for every particular purpose. These two usages seem like pretty good motivators to me, and should find an audience. Hopefully.

To begin explaining the design, I’ll start with the name. JamSpoon will be a system for pushing (spooning) around JSON documents (for textual resources) and binary resources, i.e. JSON And Media – Jam. It’s a catchy name, I’m happy.

There are 2 main aspects to the design which I think are the key elements. They are how the UI is configured to edit JSON documents with a given schema (called Jam Recipes), and how these documents are then stored on the web (Jam Spoons).

Firstly, Jam Recipes. These are roughly equivalent to templates in CMSs, or schemas for relational models. This is where guidance can be given about how the UI should lie over a JSON document. I should emphasise this is not schema in the strict sense – the document being edited might have all sorts of fields not mentioned in the recipe. That’s fine, the UI will just ignore those bits.

The format of recipes will itself be JSON. Specifically, JSON Schema documents (no link sorry, google it). This format is very useful, as it allows validation information, nesting, arrays, and other custom fields. Being JSON, the same UI will be used to edit the recipes as the documents these recipes describe. Nice and tidy. I’d love to give an example in JSON Schema, but can’t easily on the phone. But in words, a recipe would be something like:

Car story
– Title (string < 100 chars, required)
– Body text (string, required)
– Array of photos
— Image (binary resource, required)
— Alt text

So this describes how a UI can be constructed such that a JSON document of this form can be edited. Hopefully users of Raven/Mongo/Couch might find this useful. So imagine a UI in which a hierarchy of forms lets you create car stories, filling in the text, creating images underneath. Very standard CMS stuff. When you are finished, you can save it, and this resource is persisted somewhere.

I'm thinking of supporting 2 modes: CRUD for normal admin type interfaces (where once saved, the resource exists, deletion is possible, no change history is maintained) and Publishing Workflow (where resources are in draft state until published, can't be deleted (only unpublished), and full change history is maintained). These are two very common patterns, each with a lot of uses. The Jam Recipe will state which mode a document should be edited in. Sounds good to me anyway.

There is lots more to it (specifically how links between documents could be displayed, and how displaying lists of documents/searching would be handled) but I won't get into it here.
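For anyone who wants something more concrete than words, here is a rough sketch of what the car story recipe might look like, written as a Python dict that serialises straight to JSON. It only uses standard JSON Schema keywords (type, properties, required, maxLength, items), apart from "jamMode", which is a made-up key standing in for the CRUD/publishing choice. The field names and the tiny check function are assumptions for illustration, not a real validator.

```python
import json

# Hypothetical recipe for the car story. "jamMode" is an invented extension
# key (not part of JSON Schema) naming the editing mode for this document.
CAR_STORY_RECIPE = {
    "title": "Car story",
    "type": "object",
    "jamMode": "publishing",
    "properties": {
        "title": {"type": "string", "maxLength": 99},  # string < 100 chars
        "body": {"type": "string"},
        "photos": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    # Assumed: the binary resource is referenced by its URL.
                    "image": {"type": "string"},
                    "altText": {"type": "string"},
                },
                "required": ["image"],
            },
        },
    },
    "required": ["title", "body"],
}


def check(document, recipe):
    """Return a list of problems. Fields the recipe does not mention are ignored."""
    if not isinstance(document, dict):
        return ["should be an object"]
    problems = []
    for name in recipe.get("required", []):
        if name not in document:
            problems.append(f"missing required field: {name}")
    for name, rule in recipe.get("properties", {}).items():
        if name not in document:
            continue
        value = document[name]
        if rule.get("type") == "string":
            if not isinstance(value, str):
                problems.append(f"{name} should be a string")
            elif "maxLength" in rule and len(value) > rule["maxLength"]:
                problems.append(f"{name} is longer than {rule['maxLength']} chars")
        elif rule.get("type") == "array":
            if not isinstance(value, list):
                problems.append(f"{name} should be an array")
            else:
                for i, item in enumerate(value):
                    for problem in check(item, rule.get("items", {})):
                        problems.append(f"{name}[{i}]: {problem}")
    return problems


# The recipe is plain JSON, so the same UI could edit it like any other document.
recipe_as_json = json.dumps(CAR_STORY_RECIPE, indent=2)

story = {"title": "Fast car", "body": "Vroom", "photos": [{"altText": "Red"}]}
print(check(story, CAR_STORY_RECIPE))
```

Note that a document with extra fields passes untouched, which matches the "not schema in the strict sense" idea.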

Time to move onto the persistence mechanism. My idea is to use the analogy of spooning. Not in a sexy way, in a slurping stuff around way :). Jam (the JSON And Media) will come from the UI and be sent to Spoons via a uniform REST interface. For example, if Car Story 12 is to be deleted, the HTTP request DELETE /carstory/12 would be sent. So the main UI website would contain no code to directly store anything, it would only be configured to point to a RESTful end point over HTTP. The expectation is that these end points would act as adapters, translating these standardised REST requests into whatever form is required to store the Jam. The external systems that actually store the Jam will be called JamJars, to emphasise that's where the Jam ultimately lives. Examples I can think of right now:

RavenJamSpoon: translates calls into calls to Raven (either over http again, or the clientAPI)
MongoJamSpoon: translates calls into the Mongo Api
CouchJamSpoon: ditto
LocalFileJamSpoon: translates calls into local file system operations
AmazonS3Spoon: translates calls to the Amazon rest Api for objects/buckets

Cassandra, MySql, Memcached(!), WordPress, Umbraco would all be theoretically possible…

The point here is that because this is happening over a REST interface, this spooning can be done in any technology, on any platform. All that the spoons need to do is implement the uniform HTTP interface the UI code is expecting, and translate it into calls to an external system however they see fit. Many spoons would be very simple, some more complex. Not sure which yet!
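To make the adapter idea a bit less abstract, here is a minimal sketch of what a local file spoon might do once a request has arrived. The HTTP plumbing is left out. The class name, the handle method, the one-JSON-file-per-resource layout and the status codes returned are all my assumptions for illustration, not a settled interface.

```python
import json
import tempfile
from pathlib import Path


class LocalFileJamSpoon:
    """Hypothetical spoon: keeps each JSON document as a file under a root folder."""

    def __init__(self, root):
        self.root = Path(root)

    def _path(self, resource_path):
        # "/carstory/12" -> <root>/carstory/12.json
        # Empty, "." and ".." segments are dropped so paths stay under root.
        parts = [p for p in resource_path.split("/") if p not in ("", ".", "..")]
        if not parts:
            raise ValueError("resource path must name a resource")
        return self.root.joinpath(*parts[:-1], parts[-1] + ".json")

    def handle(self, method, resource_path, body=None):
        """Translate one uniform request into a file system operation.

        Returns an (http_status, document_or_None) pair.
        """
        target = self._path(resource_path)
        if method == "PUT":
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(json.dumps(body))
            return 200, body
        if method == "GET":
            if not target.exists():
                return 404, None
            return 200, json.loads(target.read_text())
        if method == "DELETE":
            if not target.exists():
                return 404, None
            target.unlink()
            return 204, None
        return 405, None  # any other verb is not supported in this sketch


# Usage: a temporary folder plays the part of the JamJar.
with tempfile.TemporaryDirectory() as jam_jar:
    spoon = LocalFileJamSpoon(jam_jar)
    spoon.handle("PUT", "/carstory/12", {"title": "Fast car"})
    print(spoon.handle("GET", "/carstory/12"))
    print(spoon.handle("DELETE", "/carstory/12"))
```

A Raven or Mongo spoon would keep the same handle shape and swap the file operations for calls to that system, which is the whole point of the uniform interface.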

As an interesting side effect, this uniform interface could also be used to access these documents outside of JamSpoon, if the raw format is hard to work with. Good ones (like Raven) I would expect would just be used as is, i.e. through the Raven REST interface.

I guess that's about it. I think the UI part will be done in a Ruby framework (Rails prob) and I'll start with the RavenDb spoon, in C#. Just to prove the neutrality of it all. Well I will start, as soon as the old tropical holiday is over…

Spatula – aka jumping on the NoSQL bandwagon

That’s what I’ll call it. This O/C/DMS thing. It’s a simple name for a simple idea.

And I mean a really simple idea. It really won’t do much. It’s not ambitious, overly clever or particularly revolutionary. It’s just something I wish I had. And based on the assumption I am not in a completely weird position, it might be useful for others. It might disappear, I might find it is not novel and should be replaced with something else blatantly obvious that already exists. But for now, it’s interesting to me.

So, in short terms, this is what I think it should do:

  1. Provide a friendly UI for the editing of content, including the usual mixture of dates, hyperlinks, HTML, numbers, text etc. I think this content should be very much limited to the actual real core content – avoiding wherever possible view/layout specific stuff.
  2. Provide workflow facilities on top of this content, to allow the publishing model almost every real world content editing scenario needs.
  3. Incorporate versioning in this workflow, so that content clients can detect and act on changes.
  4. Not be statically dependent on any model or schema for this content, to allow general reuse and consistency.
  5. Handle assets, such as images and video.
  6. Have a mechanism to make this content available to a client, preferably using a strong domain specific model. This is in contrast to the common situation of being faced with key/string pairs that are a nightmare to write code on top of.
  7. Allow items of content to have structure and relationships with other items of content.

These are the things I have found to be necessary when creating content for websites and services.

Here is my broad plan for how this could be done (note, this assumes basic knowledge of document databases, you might need to look some of this up to follow what I mean):

  1. Use document databases (such as CouchDB, Mongo, Raven) for their ability to store JSON documents without having to have static knowledge of the document resources they are storing.
  2. Use the attachment features of these dbs to manage assets such as videos and images.
  3. Use the document structure to represent the “natural” aggregate structure of content. For example, a car page is made of subparts (the car name, review, makes, models) which are most easily understood by editors as a single thing.
  4. Use the index features in these databases to allow relationships to be set to documents not in the current aggregate. An example might be a home page aggregate, in which you would choose a number of articles via an index into those articles. This index could limit the articles in any way desired, such as by date range, category or any arbitrary part of the article document. These references between documents are a natural part of all document dbs.
  5. Use the versioning features of these dbs to handle workflow. The versioning strategy may depend on the db, but will probably require one document per version, with an extra key to tie these versions together.
  6. JSON schema documents will be used as the “ui overlay” to allow these documents to be created, validated and edited easily and dynamically. That is, Spatula will read a list of JSON schemas (annotated with quite a few extras) and use these to construct a UI around this schema. When this UI is filled out, a document matching this schema would then be written to the document db.
  7. The client then simply needs to read straight from the document db and deserialize these documents into memory objects using whatever techniques the document db has available. Most seem to have HTTP REST, at the very least. The result is very simple – the client (e.g. a ReST service) has all the objects it needs in its own native format – no mapping, slicing or coercing needed. These could even be updated (even in ways that invalidate the original schema) with no hassles caused to Spatula. Probably.
  8. As a bonus… any website or service written on top of this would absolutely fly. This is because most pages would involve the loading of a single figure number of documents from a document db.
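Point 5 in the plan above (one document per version, with an extra key tying the versions together) can be sketched in a few lines. This uses a plain in-memory list as a stand-in for the document db, and the field names content_key, version and status are invented for the example. A real db would do the lookups with an index rather than a loop.

```python
class VersionedStore:
    """In-memory stand-in for a document db (an assumption for illustration)."""

    def __init__(self):
        self._docs = []

    def save_draft(self, content_key, body):
        """Store a new version as its own document and return its version number."""
        version = 1 + sum(1 for d in self._docs if d["content_key"] == content_key)
        self._docs.append(
            {
                "content_key": content_key,  # the extra key tying versions together
                "version": version,
                "status": "draft",
                "body": body,
            }
        )
        return version

    def publish(self, content_key, version):
        """Mark one specific version as published."""
        for doc in self._docs:
            if doc["content_key"] == content_key and doc["version"] == version:
                doc["status"] = "published"
                return
        raise KeyError((content_key, version))

    def latest_published(self, content_key):
        """Return the newest published version, or None if nothing is published."""
        published = [
            d
            for d in self._docs
            if d["content_key"] == content_key and d["status"] == "published"
        ]
        return max(published, key=lambda d: d["version"], default=None)


# Usage: clients keep seeing version 1 while version 2 is still a draft.
store = VersionedStore()
first = store.save_draft("car-page", {"name": "Old name"})
store.publish("car-page", first)
store.save_draft("car-page", {"name": "New name"})
print(store.latest_published("car-page"))
```

Because every version carries its own number, a content client can compare the version it last saw with the latest published one to detect changes.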

A lot of these things are already done in the Top Gear system. The end result is very similar. Except we use a flat key/value based CMS database that is mapped at publish time to a strong domain model. This is then stored using NHibernate into a SQL database. Then on page render, this model is loaded out of the database into the domain model again, which is then rendered in the usual MVC way. Spatula, I believe, could achieve this much more directly, more simply and definitely much more quickly.
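As a rough illustration of the "more directly" claim, here is what a client might do with a document read straight from the db: deserialize it into its own domain objects in one step, with no key/value mapping layer in between. The CarStory and Photo classes and the JSON field names are made up for the example.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Photo:
    image: str
    alt_text: str = ""


@dataclass
class CarStory:
    title: str
    body: str
    photos: List[Photo] = field(default_factory=list)


def car_story_from_json(raw):
    """Build the domain object from one JSON document, ignoring unknown fields."""
    data = json.loads(raw)
    photos = [
        Photo(image=p["image"], alt_text=p.get("altText", ""))
        for p in data.get("photos", [])
    ]
    return CarStory(title=data["title"], body=data["body"], photos=photos)


# Usage: the raw string stands in for a document fetched from the db.
raw = (
    '{"title": "Fast car", "body": "Vroom", '
    '"photos": [{"image": "/media/1.jpg", "altText": "Red car"}], '
    '"layoutHint": "wide"}'
)
print(car_story_from_json(raw))
```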

My Plan A is to use Mongo as the db and RoR for the document editing UI. Mongo for its attachment and versioning support, RoR for its general no-fuss-ness and dynamic nature. I suspect the RoR bit could possibly even be replaced with some sort of plugin into another CMS. I think I will only know once I’m there though.

In my next post I might be at the point where I can give some very specific examples, or even code.

CM bloody S

There mustn’t be many sentences in the world of IT more frightening than:

“There isn’t a CMS that suits our needs. I think we should write one”

Having said that, of all the sites I have worked on at the BBC, they all used…. a custom CMS. Urk! Why? Dear god!!!

The reason we did this, as far as I can tell, is because…. well… we never actually needed a CMS. Not what most people call a CMS anyway.

For me a pure CMS is this:
A system that allows editorial users to manage web content.

For everyone else on the internet, CMS seems to mean:
A system that allows editorial users to manage a website. So we need articles. And to be able to set page titles. And layout. And set colours of the heading. And control SEO. And set what the 404 page looks like. And introduce paging of comments. And forums (with moderation!). And blogs of course. And it has to allow extensibility through scripting. And manage users. And, and, and…

According to this definition, there are many stable, strong, excellent products in the world. Their job is to provide powerful, simple tools for creating websites. Umbraco, Drupal, Joomla, WordPress, Expression Engine. These products can tick through the average set of requirements with confident ease. If you need one of these things – well happy days, son. You’ve got choices!

But I don’t want any of them. I actually don’t need all of the things these systems are so proud of.

The systems I have worked on take a very opinionated and kind of arrogant attitude. They are all written by developers. Good developers backed by strong design and editorial control. We know how to write web sites! We don’t need a CMS to get paging going on a gallery. We really don’t. And we like the advantages that this control affords us.

Does it cost money? Definitely. But it means we can use whatever technology we want, and change when we want. We can run it all off a database with strong schema. We can control deployment through local, testing and production environments. It means that the website part of the system is easy to integrate with other behind the scenes processes that shovel content to and from other 3rd parties. It means when we need to write a shopping cart we just go ahead and do it, without needing to tiptoe around a “CMS” that has decided to set the rules of the game. For sure, this choice has serious implications. But time and time again, we have decided to take that choice.

Not only that, sometimes we aren’t even writing a damn website! My last project was writing a service to back an iPhone app. What does Drupal have to say to that? Service content (in this case versioned content via JSON/ReST) is still content. So why am I left in the wilderness? But by doing so very much, the big guns are just massively inappropriate for these kinds of needs. So we went our own way. Again.

But….. we still need something that controls how we get content into our system. So we wrote it, called it a CMS, and from there onwards confused everyone at our company essentially forever when they try and compare it to all the other ones everyone raves about.

So either we (we being BBC Worldwide, in particular creators of http://www.topgear.com) are mad… or just daring. Maybe there should be a rule that everyone, when faced with a big (website/something that can be forced into being thought of as a website) project should just download Drupal/Expression Engine/etc and get on with it, no exceptions. Maybe this is true, and I (and many others) am simply blind to this sense.

Or alternatively we are similar to many others in the world. We write our systems by ourselves, thank you. We just need a good way of getting content into them. Top Gear developed such a way. It’s not bad, and it’s not perfect. It was quite hard to do.

I have recently had a better idea about how to create such a system.

Let’s call it an… O(bject)MS. Or an R(esource)MS. Or, maybe even a D(ocument)MS. I would LOVE to call it a CMS, but it seems the internet has simply outvoted me.

My next post will explain this new idea.