In the early days of WebManager 9, I did a content migration by reading data from a set of XML files and storing it in WebManager. That was a hell of a job! Over the last couple of weeks I’ve been doing about the same thing, but this time it was much easier. Why? We now have the Connector API and the Content API!
Basically, all I had to do was read a set of XML files that each contain a set of articles, and create a media repository article for each of those articles. Special thanks go to my colleague Ivo Ladage, who built a framework for transferring data from a source of one type to a destination of another type; that made my task a lot easier. The only things I had to build myself were reading the XML and writing to the media repository.
The Connector API is implemented in the wmsconnectorapi WCB and is freely available on WCMExchange. After downloading and building it (mvn clean package), you get JavaDoc that is very helpful in understanding how to implement your own connector. Actually, all that needs to be done is to create a service that implements either the ImportConnector or the ExportConnector interface.
When you have at least one import connector and one export connector installed in WebManager, you can open the … panel and connect them together. You now have two options: either you start the transfer manually, by clicking the “Run now” button on the “Jobs” tab, or you enter a schedule. When the job starts, the framework sets up the importer and the exporter and invokes doImport() on the ImportConnector to obtain an iterator, which is passed to the doExport() method of the ExportConnector. From this point on, the ExportConnector is in control, requesting one item at a time from the ImportConnector.
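To make the handoff concrete, here is a minimal sketch of that pull-based flow. The Importer and Exporter interfaces and the runJob() method are hypothetical, simplified stand-ins that I made up for illustration; they are not the real ImportConnector and ExportConnector interfaces, and plain strings stand in for the transferred items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TransferSketch {

    // Hypothetical stand-in for ImportConnector: it only hands out an iterator.
    interface Importer {
        Iterator<String> doImport();
    }

    // Hypothetical stand-in for ExportConnector: it pulls items from the iterator.
    interface Exporter {
        void doExport(Iterator<String> items);
    }

    // In essence what the framework does when a job starts (assumption).
    static void runJob(Importer importer, Exporter exporter) {
        Iterator<String> items = importer.doImport();
        exporter.doExport(items); // from here on the exporter drives the transfer
    }

    static List<String> demo() {
        final List<String> stored = new ArrayList<>();
        Importer importer = () -> Arrays.asList("article-1", "article-2").iterator();
        Exporter exporter = items -> {
            while (items.hasNext()) {
                stored.add("stored:" + items.next()); // one item at a time
            }
        };
        runJob(importer, exporter);
        return stored;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is only the direction of control: the importer never pushes anything, it just answers hasNext() and next() whenever the exporter asks.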
Let’s start with the very easy part: the ExportConnector.
The interested reader is referred to the excellent JavaDoc, but basically the
framework invokes the initialize() method to pass some objects, like a logger
and a mapper of parameter values, invokes preExport(), doExport() and
postExport() and finishes with an invocation of destroy(). My preExport()
method parses the parameter values received on initialize() and stores them in
a convenient way, such that I can use them during doExport(), which has the
following overall structure:
    public void doExport(Iterator<ContentTransferObject> ctoIterator) throws ExportException {
        while (ctoIterator.hasNext()) {
            final ContentTransferObject cto = ctoIterator.next();
            MediaItem mediaItem = myMediaRepositoryManagementService.createMediaItem(myWebsite, CONTENT_TYPE_NAME);
            MediaItemVersion mediaItemVersion = mediaItem.getVersions()[0];
            mediaItemVersion.setExternalId(cto.getExternalId());
            // Fill the media item version metadata and place elements on it
            myMediaRepositoryManagementService.performPostUpdateActions(mediaItemVersion);
            myLogger.info("Imported " + cto.getExternalId() + " to media item " + mediaItem.getId());
            myLogger.incrementExportCreatedCount();
        }
    }
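For readers who want to see the call order in one place, here is a small sketch of the lifecycle described above. The Exporter interface below is a hypothetical stand-in with simplified signatures (a plain Map instead of the logger and parameter mapper); the real ExportConnector signatures are in the JavaDoc. The "contentType" parameter name is also made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class LifecycleSketch {

    // Hypothetical stand-in; the real ExportConnector signatures differ.
    interface Exporter {
        void initialize(Map<String, String> parameters);
        void preExport();
        void doExport(Iterator<String> items);
        void postExport();
        void destroy();
    }

    static class RecordingExporter implements Exporter {
        final List<String> calls = new ArrayList<>();
        private Map<String, String> parameters;
        private String contentType;

        public void initialize(Map<String, String> parameters) {
            this.parameters = parameters; // just keep what the framework passes in
            calls.add("initialize");
        }

        public void preExport() {
            // Parse the raw parameter values once, so doExport() can use them directly.
            contentType = parameters.getOrDefault("contentType", "article");
            calls.add("preExport");
        }

        public void doExport(Iterator<String> items) {
            while (items.hasNext()) {
                calls.add("export " + contentType + " " + items.next());
            }
        }

        public void postExport() { calls.add("postExport"); }

        public void destroy() { calls.add("destroy"); }
    }

    // Plays the role of the framework and returns the recorded call order.
    static List<String> demo() {
        RecordingExporter exporter = new RecordingExporter();
        Map<String, String> parameters = new HashMap<>();
        parameters.put("contentType", "news");
        exporter.initialize(parameters);
        exporter.preExport();
        exporter.doExport(Arrays.asList("a1", "a2").iterator());
        exporter.postExport();
        exporter.destroy();
        return exporter.calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```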
The import side is also easy, but has a small twist. For reading the XML files, I use the Castor framework. I could equally well have built a SAX handler, but some people near me already know Castor, so I decided to draw on their experience. In either case, you pass an XML source (a Reader object in my case) to the Castor or SAX framework and it parses the whole document in one method invocation. However, the Connector API puts the exporter in control, so the importer has to deliver one object at a time. So we have two frameworks that both want to be in control…
To solve this, I decided to use the producer/consumer design pattern. I start a new thread that parses the XML document, and each time a complete article element has been processed, an object is added to a BlockingQueue. The Iterator returned by the doImport() method takes these objects from the BlockingQueue and returns them to the ExportConnector. The doImport() method looks like:
    public Iterator<ContentTransferObject> doImport() throws ImportException {
        return new UnmarshallerThread(myLogger, myReaders, myFactory);
    }
The UnmarshallerThread is available for download. Although it references some other classes and a castor-mapping.xml file, you don’t need those to understand the basic concept. The UnmarshallerThread class still has to be finished by adding some error checking and interrupt handling, e.g. to prevent producer threads from hanging around when the job is aborted or the ExportConnector throws an exception and stops reading from the Iterator.
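If you just want the idea without the Castor details, the following is a minimal, self-contained sketch of a queue-backed Iterator. It is not the actual UnmarshallerThread: the class name QueueBackedIterator, the END sentinel and the close() method are my own assumptions for illustration, and looping over a List stands in for the XML parser producing article objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueIteratorSketch {

    /** Iterator fed by a background producer thread through a bounded queue. */
    static class QueueBackedIterator implements Iterator<String> {
        // Sentinel marking the end of the stream; compared by identity, not equals().
        private static final String END = new String("END");

        private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        private final Thread producer;
        private String next; // look-ahead element, null if nothing fetched yet

        QueueBackedIterator(final List<String> source) {
            producer = new Thread(() -> {
                try {
                    for (String item : source) { // stand-in for the parser callback
                        queue.put(item);         // blocks while the consumer lags behind
                    }
                    queue.put(END);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // consumer gave up, so stop
                }
            });
            producer.setDaemon(true);
            producer.start();
        }

        @Override
        public boolean hasNext() {
            if (next == null) {
                try {
                    next = queue.take(); // blocks until the producer delivers
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    next = END; // treat an interrupted wait as end of stream
                }
            }
            return next != END;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = next;
            next = null;
            return result;
        }

        /** Lets the consumer stop the producer when it aborts early. */
        void close() {
            producer.interrupt();
        }
    }

    static List<String> demo() {
        QueueBackedIterator it = new QueueBackedIterator(Arrays.asList("a", "b", "c", "d", "e"));
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The bounded queue keeps the parser from running far ahead of the exporter, and interrupting the producer in close() is one possible way to address the hanging-thread problem mentioned above; a real implementation would also need to pass parse errors on to the consumer.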
Mark is a software engineer with a special interest in Security and Digital WebTV. Mark writes about daily engineering with GX WebManager.