extract-before-paragraph

Syntax

extract-before-paragraph
extract-after-paragraph

Purpose

The extract-before-paragraph and extract-after-paragraph annotations pull an element out of another element and place it in a new paragraph directly before or after the location from which it was extracted. They are most often used to remove images and tables from headings when the insertion point falls inside a sentence or heading rather than just before or just after it. Many software programs render these items where we expect to see them, so it isn’t always obvious that an insertion point sits in the middle of a title, for example.

One way to handle this type of issue is pre-conversion cleanup: making sure that insertion points don’t break up other content. These annotations cover the cases where pre-conversion cleanup wasn’t possible or simply missed an example or two.

Note that these annotations work on spans, so you may need to use the on feature to accurately identify the element you want to extract. For more information on how to use the on feature, see the on page.

Example:

Here the heading and image look fine in the source document. The insertion point is not always visible, and source formats often display the image in a paragraph after the heading even when it is actually anchored in the middle or at the beginning of the heading.

[Image: extract-source]

Here we see the image in the Rules Editor appearing at the start of the heading. The Rules Editor is placing the image exactly at its insertion point.

[Image: extract-rules-editor-view]
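
For comparison, output converted without either annotation might look something like the following. This is a hypothetical sketch rather than actual Migrate output: it assumes the image element lands inside the title at its insertion point, and it reuses the file name and dimensions from the example output on this page.

<!-- Hypothetical "before" state: the image sits inside the title -->
<concept>
  <title><image href="image1.jpg" width="15.024cm" height="11.27cm"/>Issue 7: Insertion points for images appearing in headings</title>
  <conbody>
    <p>
      Sometimes insertion points for images end up in headings or other text that you don’t
      want to contain an image.
    </p>
  </conbody>
</concept>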

Let’s apply the extract-after-paragraph annotation now to clean this up. This is what your rule may look like.

[Image: extract after paragraph rule]

The following shows you what the output may look like after this annotation is applied. The image has been removed from the heading and appears after it in a new paragraph.

<concept>
  <title>Issue 7: Insertion points for images appearing in headings</title>
  <conbody>
    <p><image href="image1.jpg" width="15.024cm" height="11.27cm"/></p>
    <p>
      Sometimes insertion points for images end up in headings or other text that you don’t
      want to contain an image. There are ways to work around this in Migrate, using some
      of the content manipulation annotations.
    </p>
  </conbody>
</concept>

Let’s apply the extract-before-paragraph annotation now. This is what your rule may look like.

[Image: extract-before-rule]

The following shows you what the output may look like after this annotation is applied. The image has been removed from the heading and appears in a new paragraph at the end of the previous topic in the document.

    <p>
      Other issues with tables occur where they are not really tables but just used for
      visual representation.
    </p>
    <p>
      <image href="image1.jpg" width="15.024cm" height="11.27cm"/>
    </p>
  </conbody>
</concept>