How To Build Fast-Flow Content Conversion Pipelines With OmniMark

OmniMark Fast-Flow Content Conversion Pipeline Components:

Parsers. OmniMark includes parsers for many common data formats, including XML, SGML and RTF. You can also write custom parsers using OmniMark script.
OmniMark Script. For creating your own filters, parsers, validators and business rules, OmniMark provides a high-level scripting language designed for creating content conversion components that operate in a streaming pipeline environment.
Multiple Inputs. An OmniMark conversion pipeline can integrate content from multiple sources into a single content stream. Supported scenarios include combining similar content from different suppliers or enriching content with data drawn from corporate databases or the internet.
Filters. OmniMark includes a number of pre-built filters for common content conversion operations. You can use these filters in your pipelines or use them as templates for developing your own filters.
Multiple Outputs. An OmniMark pipeline can be split to send output to two or more different destinations. Supported scenarios include outputting the same content in multiple formats, and splitting a single content stream into two streams with different content in each.
Database Interface. OmniMark can pull data from or send data to most popular databases.
File System Access. OmniMark provides complete access to local and networked file systems.
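
The Multiple Inputs and Multiple Outputs components describe a merge step and a split step. The sketch below illustrates that shape in Python rather than OmniMark script; it is an analogy only and does not show OmniMark's actual API. The names merge_sources and fan_out, and the record fields sku and lang, are hypothetical and chosen for illustration.

```python
def merge_sources(*sources):
    """Combine several input streams into one content stream, one record at a time."""
    for source in sources:
        yield from source


def fan_out(records, routes):
    """Send each record to every destination whose predicate accepts it.

    `routes` is a list of (predicate, sink) pairs; a sink here is just a list,
    standing in for a file, database, or downstream pipeline.
    """
    for record in records:
        for accepts, sink in routes:
            if accepts(record):
                sink.append(record)


# Hypothetical content from two suppliers.
supplier_a = [{"sku": "A1", "lang": "en"}, {"sku": "A2", "lang": "fr"}]
supplier_b = [{"sku": "B1", "lang": "en"}]

english, everything = [], []
fan_out(
    merge_sources(supplier_a, supplier_b),
    [
        (lambda r: r["lang"] == "en", english),  # split: English-only stream
        (lambda r: True, everything),            # same content to a second destination
    ],
)
```

Each record is handled once as it flows through, so neither the merge nor the split requires holding the whole content set in memory before routing begins.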

Enabling Organizations To Meet Critical Demand

Large organizations today need to process increasing volumes of content, including corporate data, office documents, plain text and markup (XML, SGML, HTML), for delivery to enterprise information portals or supply chain partners.

When you need to acquire content from multiple sources, and convert, transform, validate and integrate it into your mission-critical business systems, the processing of that content can rapidly become a major business issue.

Processing bottlenecks can develop within enterprise information architectures that are designed to provide real-time delivery of content to hundreds, or possibly thousands, of online users. When you need to modify your system to handle new content types or integrate new business rules, and processing volumes increase substantially, bottlenecks can multiply and system performance can rapidly deteriorate.

Building high-performance content conversion solutions requires specialist content engineering skills, supported by specialist processing tools.

Event-Based Parsing

With conventional tools, there is little more you can do to optimize the development process and/or the overall conversion time. With OmniMark, however, there is another option. OmniMark allows you to create conversion pipelines which can be broken down into smaller steps without the need to serialize and parse the data between each conversion step.

Like some other tools, OmniMark uses an event-based parsing approach. Unlike other tools, however, OmniMark allows you to combine multiple parsing sources in a common parse event stream, and to generate parse events at each stage in the pipeline. Because each filter in the pipeline can catch incoming parse events and insert new parse events into the parse event stream, there is no need to serialize data between filters, which means the pipeline runs faster and uses fewer resources.
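
The idea of filters that catch parse events and insert new ones, with no serialization between stages, can be sketched with Python generators. This is a conceptual analogy under stated assumptions, not OmniMark code: the event tuples, the element names (item, title, hr), and the function names are all hypothetical, and the source stage stands in for a real parser.

```python
def source(records):
    """Source stage: emit (kind, value) parse events for each (title, body) record."""
    for title, body in records:
        yield ("start", "item")
        yield ("start", "title")
        yield ("text", title)
        yield ("end", "title")
        yield ("text", body)
        yield ("end", "item")


def uppercase_titles(events):
    """Filter: catch text events inside a title element and rewrite them."""
    in_title = False
    for kind, value in events:
        if kind == "start" and value == "title":
            in_title = True
        elif kind == "end" and value == "title":
            in_title = False
        elif kind == "text" and in_title:
            value = value.upper()
        yield (kind, value)


def add_separator(events):
    """Filter: insert new events into the stream after each item ends."""
    for kind, value in events:
        yield (kind, value)
        if kind == "end" and value == "item":
            yield ("start", "hr")
            yield ("end", "hr")


def serialize(events):
    """Sink: turn events into markup once, at the very end of the pipeline."""
    parts = []
    for kind, value in events:
        if kind == "start":
            parts.append(f"<{value}>")
        elif kind == "end":
            parts.append(f"</{value}>")
        else:
            parts.append(value)
    return "".join(parts)


def run_pipeline(records):
    # Events flow from stage to stage as in-memory tuples; nothing is
    # written out and re-parsed between the two filters.
    return serialize(add_separator(uppercase_titles(source(records))))
```

Because each stage pulls events lazily from the one before it, the two filters run interleaved over a single pass of the data, and markup is produced only at the final sink.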

Solving the Time Crunch

Because there is no need to serialize and parse between each step, you can break the process down much more finely, which keeps each filter as simple as possible and allows you to build a library of reusable filters. This helps you to maintain and update your conversion pipeline with minimal effort and disruption.
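
As a rough illustration of a library of small, reusable filters, the following Python sketch composes single-purpose stages into a named pipeline. It is an analogy rather than OmniMark script; compose, the individual filters, and the element names para and p are hypothetical.

```python
from functools import reduce


def compose(*filters):
    """Chain filters left to right into one reusable pipeline stage."""
    def pipeline(events):
        return reduce(lambda stream, f: f(stream), filters, events)
    return pipeline


def strip_whitespace(events):
    """Trim leading and trailing whitespace from text events."""
    for kind, value in events:
        if kind == "text":
            value = value.strip()
        yield (kind, value)


def drop_empty_text(events):
    """Remove text events that carry no content."""
    for kind, value in events:
        if kind == "text" and not value:
            continue
        yield (kind, value)


def rename_element(old, new):
    """Build a filter that renames one element in start and end events."""
    def _filter(events):
        for kind, value in events:
            if kind in ("start", "end") and value == old:
                value = new
            yield (kind, value)
    return _filter


# A pipeline assembled from library filters; swapping or adding a step
# means editing this one line, not the filters themselves.
clean_supplier_feed = compose(
    strip_whitespace, drop_empty_text, rename_element("para", "p")
)
```

Keeping each filter to a single job is what makes this practical: a change in one supplier's format touches one small filter, and the rest of the pipeline is left alone.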

Because OmniMark is a full-featured content processing platform, there is no need to use different programming languages for different parts of the process. All the capabilities you need for content processing are present in OmniMark. Taken together, these features provide the solution to the content conversion time crunch: rapid development and rapid execution add up to rapid completion of the content conversion.

The OmniMark Solution

OmniMark allows developers to build efficient content conversion pipelines that support the rapid insertion of multiple content filter elements without loss of processing speed. Organizations can easily create purpose-built conversion pipelines that enable them to convert structured, semi-structured and unstructured content, even content that is unique to their business in either its format or its meaning.

The modular nature of the OmniMark pipeline architecture means that content conversion specialists can develop plug-and-play conversion modules that can be swapped into the pipeline architecture as needed, with confidence, and without impacting the flow of the working pipeline.

OmniMark offers outstanding speed, scalability and stability, regardless of the format or semantics of the content being processed or the business rules that are applied during processing. It is superior performance in high-volume, time-sensitive content conversion environments that sets OmniMark apart.