Saturday, 28 November 2009

The unknown approach to Database Publishing

What do you associate with the words Database Publishing? Heavy scripting, clumsy, old-fashioned, hard to adopt, impossible to sustain over time, only for really big productions?

These would probably have been my answers had you asked me years ago. Now, on the other hand, I’ve come to realise that there’s another approach, one that I fear not many are aware of. Without knowing about it you might unfortunately miss out on a lot of opportunities.

Before explaining the approach I’m talking about, I’d like to start by explaining the traditional approach to Database Publishing.

Traditional Database Publishing can basically be described as this:
  1. Export the data from the data source into a text file.
  2. Read the exported text with a custom written script or a commercial tool.
  3. Let the script or tool create the publication, page by page, formatting the text and creating the layout as the pages are built - one after the other.
This sounds fairly easy but there are a few hurdles to pass. First of all you are likely to encounter problems with the exported data. It probably doesn’t contain all the data you need, like units, headings etc. Then there could be problems with the structure of the data. The data might for instance come in a different order than you’d expect. All these data “hurdles” need to be addressed, and that’s usually handled by the script or the tool used.
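
As a rough illustration, the three steps and the data “hurdles” could look something like the sketch below. It is not taken from any particular tool: the column names, the sample rows and the unit table are all made up, and a real script would create pages in a layout application rather than plain text.

```python
import csv
import io

# Hypothetical export (step 1): column names and rows are assumptions.
EXPORT = """sku,name,weight,price
B-200,Garden hose,1.8,24.50
A-100,Watering can,0.6,12.00
"""

# The export carries no units, so the script has to supply them.
UNITS = {"weight": "kg", "price": "EUR"}


def read_export(text):
    """Step 2: read the exported text into a list of rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Handle the data hurdles: fix the order and add the missing units."""
    ordered = sorted(rows, key=lambda row: row["sku"])
    for row in ordered:
        for column, unit in UNITS.items():
            row[column] = f"{row[column]} {unit}"
    return ordered


def build_pages(rows, per_page=1):
    """Step 3: push the rows onto pages, one page after the other."""
    pages = []
    for start in range(0, len(rows), per_page):
        chunk = rows[start:start + per_page]
        lines = [f"{r['sku']}  {r['name']}  {r['weight']}  {r['price']}" for r in chunk]
        pages.append("\n".join(lines))
    return pages


pages = build_pages(clean(read_export(EXPORT)))
```

Note that everything the publication needs beyond the raw export (the sort order, the units) ends up living inside the script.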

Then there are all the “hurdles” associated with changes, adjustments and reuse.

What do you do when the publication is generated and you find an error in the original data? Repeating the process from step 1 is the obvious answer, but what if you have made several manual adjustments to the layout? Repeating from step 1 would mean redoing all those manual adjustments. If you repeat this cycle a couple of times... well, you know where I’m going with this, don’t you?

Another interesting question is what you do in the highly customised layout case. It’s not that uncommon that a layout has to look a certain way, a way that doesn’t accommodate “flowing” data very well.

In this case I’m sure most consultants and power users will come to the rescue and offer to customise the script or tool to do exactly what you need. This might solve your immediate problems, but beware: it is usually a good way of making sure your solution becomes costly to maintain and very hard to reuse for other database publishing needs.

All these “hurdles” are the reasons why database publishing is associated with words like heavy scripting, clumsy, old fashioned, hard to adopt and impossible to sustain over time. The consequence is that database publishing is really only considered for really big productions where the manual approach isn’t a feasible choice.

Traditional database publishing like this can be described by one word - “push”. This of course comes from the approach where the data is being pushed onto the layout.

So, this approach obviously has limitations. What’s the other approach then? Well, if there’s a “push”, then there’s of course a “pull”.

Imagine a solution where the data is being pulled instead of pushed. What would it look like? What’s being pulled and who’s doing the pulling?

Of course it’s the data that’s being pulled and it’s the layout that’s doing the pulling. And since you are in control of the layout, you’re the one doing the pulling. In fact the idea is not that different from how you place an image in a layout.

You create the page, including the boxes and all data that isn’t to be found in the database. Then you pull the desired data from the database into the correct place in the layout and format it. Just like when placing images in the layout, the pulled data stays linked to the database. When the data in the database gets updated it will show in the layout and you can choose to update the data with a single click. Just like an image.
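
To make the idea of linked data a bit more concrete, here is a minimal sketch of it. It does not show how any of the InDesign plugins work internally; the `LinkedField` class, the in-memory `database` and the sample values are all invented for the illustration.

```python
from dataclasses import dataclass

# Stand-in for the database; keys and values are made up for illustration.
database = {("A-100", "name"): "Watering can", ("A-100", "price"): "12.00"}


@dataclass
class LinkedField:
    """A spot in the layout that remembers where its content came from."""
    record: str
    column: str
    placed: str = ""

    def pull(self):
        """Copy the current database value into the layout."""
        self.placed = database[(self.record, self.column)]

    def is_stale(self):
        """True when the database has changed since the last pull."""
        return self.placed != database[(self.record, self.column)]


# The layout decides what to pull and where it goes.
layout = [LinkedField("A-100", "name"), LinkedField("A-100", "price")]
for field in layout:
    field.pull()

# Someone corrects the price in the database...
database[("A-100", "price")] = "11.50"
stale = [f.column for f in layout if f.is_stale()]

# ...and the "single click" refreshes only the fields that are out of date.
for field in layout:
    if field.is_stale():
        field.pull()
```

Because each field keeps its link, an update touches only the linked content and leaves the rest of the layout as it was.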

This works with pricing information, product text, headings, units and tables. It even works with images.

This “pull” approach gives you total control over the entire layout and it is very easy to adapt to different productions. It allows you to create entire catalogues as well as a single advertisement without having to alter the data source or the tool being used.

It’s also very easy to get started. It’s just a matter of opening a previous production and replacing the static information with data pulled from the database. Then you have your linked layout and you can be confident that your prices, product information etc. will stay up to date. This should also affect the approval process in a very positive way, right? There’s no longer the same need to check that the information is correct.

Does this sound like magic? It really isn’t.
There are in fact several plugins for Adobe InDesign that make exactly this possible. There’s LiveMerge from Cacidi, Smart Catalog from Woodwing, CCR from Ctrl Publishing and my personal favorite, EasyCatalog from 65bit Software. I’m sure there are even more tools than these out there. If you know of any, please mention them in the comments.

EasyCatalog presents the data in an Excel-like panel. This allows you to restructure the data and format it in different ways, for instance by adding units. Inserting and linking data to the layout is easy. Just select the data in the panel, then click the little pen icon in the panel’s bottom left, and the data will be inserted (or pulled) into the layout.

EasyCatalog also works with InDesign Libraries. This is very handy when creating, for instance, DM leaflets that are commonly built from modules. Create a Library item with linked data for each module type, then select the data row for a product in the Excel-like panel, then drag & drop the module (Library item) onto the page. All corresponding data, including images, will be inserted and formatted automatically.

I’ve used EasyCatalog to help reduce the production time of 12-16 page DM leaflets from over a week to 1-2 days.

Before you get too excited and start downloading the trial of one of these tools, you should reflect on the following lessons I learned from helping customers implement this.

1. Does the database really contain all the data you need?
You may find that not all the data you need exists in the database. If so, you need to find a way to add that data, and that can unfortunately prove to be a rather cumbersome process in some cases. You might even need to consider implementing a Product Information Management System.

2. Is the data in the database correct?
A common problem you might discover is that some data, for instance product names, are incorrect. They might be the correct internal names but not the names used when marketing the product.

3. Is the data accessible?
If the database you want to access is a proprietary ERP system you might have trouble accessing the data. If you can’t access the data directly via ODBC or similar, make sure the data can be exported into a text file or something similar.

4. Be prepared to change the way you work
Usually proofreading and approval are performed by other people than those creating the publications. If the people doing the proofreading and approval are used to marking text adjustments on the physical proof, this has to change. They now have to make the changes themselves, directly in the database.

5. Image database or a Digital Asset Management system?
If you want to include images in the database publishing process you need to have your images structured. They need to be tagged with, or named according to, a unique identifier: something that can be connected to a product in the exported data. If you have many images per product you also need a way of identifying which image should be used.
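
As an example of what “structured” can mean in practice, the sketch below matches image files to products by file name. The naming convention (product identifier, underscore, sequence number) is only an assumption for the illustration, as are the sample file names; your own convention or DAM system may look quite different.

```python
import re
from collections import defaultdict

# Assumed naming convention: <sku>_<sequence>.<ext>, e.g. A-100_01.jpg
PATTERN = re.compile(r"^(?P<sku>[A-Z]-\d+)_(?P<seq>\d+)\.(jpg|tif|psd)$")


def index_images(filenames):
    """Group image files by product identifier, ordered by sequence number."""
    by_sku = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if match:  # files that don't follow the convention are skipped
            by_sku[match["sku"]].append((int(match["seq"]), name))
    return {sku: [name for _, name in sorted(found)] for sku, found in by_sku.items()}


def primary_image(index, sku):
    """Pick the image with the lowest sequence number, or None if there is none."""
    images = index.get(sku)
    return images[0] if images else None


files = ["A-100_02.jpg", "A-100_01.jpg", "B-200_01.tif", "logo.png"]
index = index_images(files)
```

Here the sequence number is what settles which image to use when a product has several.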

If this still seems tough, I will offer a last piece of advice. If you have a database solution serving the web, I’m willing to bet you can check off many of the above points right away. Looking into the systems powering the web site is usually a good place to start.

To summarise: Database Publishing doesn’t have to involve heavy scripting and isn’t just something for the really big productions. There is another approach that not many are aware of, the “pull” approach. With standard commercial plugins, data can be linked to layouts just like images, making data population and data updates very easy. This makes database publishing feasible for pretty much any publication containing product information.

The great benefit of all this is that you will be able to shorten production times, which in turn will make it possible to either save money or find the time to create other publications. Perhaps all those targeted or personalised publications you are unable to create today?

Feel free to drop a comment if you have questions or need advice.
Good luck!


  1. I totally agree on all points, but Easy Catalog would be great if it supported web services or some other standard that would allow for a modern system setup. As I understand it, you need to either supply database access using ODBC or export a file to disk. This is not a good way to go and makes it hard to implement, as most modern systems (CMS, PIM, DAM etc) are not designed for this. There is a solution from Priint that would be great if not for the price tag and some other drawbacks. All the ones that I have evaluated have failed, in one way or another, to deliver all that our customers need. Someone will have to develop it, methinks!

  2. @Mr Meta
    You are very much correct in that Easy Catalog would benefit if it supported Web Services.

    That said, until today, I haven't come across a situation where I've actually needed any such functionality.

    As you said Easy Catalog supports ODBC or export file (even combining export files from different data sources). The common scenario is using some sort of export file. This could be .csv or .xml and quite often it's the same export file that's currently being used for exporting to web.

    Of course, I have never used Easy Catalog together with any PIM-system. The reason for that is, of course, that all good PIM-systems already have their own "Easy Catalog"-like solution.

    Perhaps Easy Catalog should best be positioned as the state-of-the-art independent solution you should use if you don't have a PIM or CMS system with similar functionality?