RegexXMLReader Documentation: An Implementation of the XMLReader Interface
Chapter 2. The RegexXMLReader Stylesheet

2.2. Directives

2.2.1. for-each

This directive divides the contextual text stream according to the criteria given in one of its two attributes and then applies its children to each resulting part.

Table 2-1. for-each Attributes

regex	Each matching part of this element's contextual text is separated out, and the children of this node are processed against each individual matching part.
split	The contextual text is broken up on the regular expression in this attribute, and the children of this node are applied to each individual part.

During each iteration over a matching part of the incoming text stream, or over a split part of it, the contextual text stream is narrowed to only the part that is pertinent. For example, consider <for-each regex=".">. Because this particular regular expression (".") matches each and every character in the incoming contextual text stream, its children are processed once for each character, and that character is all they are aware of and have access to (there is one exception to this, however).
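The difference between the two attributes can be sketched with java.util.regex. This is only an illustration of the semantics described above, not RegexXMLReader's own code: the class ForEachSketch and its method names are hypothetical, and each returned list entry stands for the contextual text seen by the children during one iteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of RegexXMLReader.
class ForEachSketch {
    // Mirrors the "regex" attribute: one iteration per matching part.
    static List<String> byRegex(String context, String regex) {
        List<String> parts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(context);
        while (m.find()) {
            parts.add(m.group()); // the children would see only this part
        }
        return parts;
    }

    // Mirrors the "split" attribute: one iteration per piece between separators.
    static List<String> bySplit(String context, String regex) {
        return Arrays.asList(Pattern.compile(regex).split(context));
    }
}
```

Under these assumptions, byRegex("abc", ".") visits "a", "b" and "c" in turn, while bySplit("a,b,c", ",") visits the same three parts by treating the comma as a separator.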

2.2.2. replace

This directive modifies the current textual context according to the given criteria and applies the modified textual context upon its children. It is useful for such things as stripping out newlines or certain characters as well as normalizing text.

Table 2-2. replace Attributes

regex	The regular expression used to match the current contextual stream.
with	The character data that replaces the text matched by the regular expression supplied in regex.
trim	Invokes the Java String method trim() on the newly generated text before the children are processed.
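The three attributes can be pictured as the following Java sketch. It assumes that every occurrence of the pattern is replaced and that trim behaves as a simple on/off switch; neither detail is stated above. The class ReplaceSketch is hypothetical. Note also that Matcher.replaceAll gives "$" and "\" a special meaning in the replacement text, which may or may not be true of the with attribute.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of RegexXMLReader.
class ReplaceSketch {
    static String replace(String context, String regex, String with, boolean trim) {
        // Assumption: all occurrences of the pattern are replaced.
        String result = Pattern.compile(regex).matcher(context).replaceAll(with);
        // The trim step runs on the newly generated text, as the table describes.
        return trim ? result.trim() : result;
    }
}
```

For instance, replace("  a\nb\n ", "\\n", " ", true) turns the newlines into spaces and then trims the result to "a b".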

2.2.3. group

The group directive is used for processing either a regular expression that contains grouping commands (e.g. "([^,]+)") or an individual part of the split directive. The text of the group becomes the current contextual text.

A group's position among its siblings determines which part of the match or split is used, unless the item attribute is given.

Table 2-3. group Attributes

item	Use the numerical value in this attribute, rather than the order in which the group node appears under its parent, to determine which part of the matched grouped text or split part is used.
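A rough Java picture of how grouped text might be handed to successive group nodes follows. The class GroupSketch is hypothetical, and the sketch assumes that the first group node receives the first capturing group, the second receives the second, and so on. Whether item counts from zero or from one is not stated here, so the sketch does not model it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of RegexXMLReader.
class GroupSketch {
    // Returns the captured groups in order; entry n would become the
    // contextual text of the (n+1)-th group node under the same parent.
    static List<String> groups(String context, String regex) {
        List<String> captured = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(context);
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                captured.add(m.group(i));
            }
        }
        return captured;
    }
}
```

With the pattern "^([A-Z]+)(.*)$" and the text "TITLE rest of line", the first entry is "TITLE" and the second is " rest of line".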

2.2.4. match-string

Sends the entire contextual text string to the output document.

2.2.5. text

Sends literal text to the output document.

Example 2-1. text Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <re:text>Text Sent to Output Document</re:text>
</output>

2.2.6. warning

Sends any textual-node children to the ErrorHandler as a warning.

Example 2-2. warning Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <title re:match="^[A-Z]+$">
    <re:match-string/>
  </title>
  <re:otherwise>
    <re:warning>
      <re:text>No match for: </re:text>
      <re:match-string/>
    </re:warning>
  </re:otherwise>
</output>

2.2.7. error

Sends textual-node children (not raw text) to the ErrorHandler as a non-fatal error; an exception is then thrown and processing stops (thus, this is effectively a fatal error, although it is not reported to the ErrorHandler as such).

Example 2-3. error Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <title re:match="^[A-Z]+$">
    <re:match-string/>
  </title>
  <re:otherwise>
    <re:error>
      <re:text>No match for: </re:text>
      <re:match-string/>
      <re:text>Please correct this and try again.</re:text>
    </re:error>
  </re:otherwise>
</output>
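The contrast between warning and error can be sketched in Java, assuming that the ErrorHandler in question is the SAX interface org.xml.sax.ErrorHandler and that the exception thrown is the same one that was reported. Both points are assumptions, and the class ErrorSketch is hypothetical.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; not part of RegexXMLReader.
class ErrorSketch {
    // warning: report the message and let processing continue.
    static void warning(ErrorHandler handler, String message) throws SAXException {
        handler.warning(new SAXParseException(message, null));
    }

    // error: report through error() (not fatalError()), then stop by throwing.
    static void error(ErrorHandler handler, String message) throws SAXException {
        SAXParseException problem = new SAXParseException(message, null);
        handler.error(problem);
        throw problem; // assumption: the reported exception is the one thrown
    }
}
```

A handler that wants to treat these errors as fatal has to do so in its error() callback, since fatalError() is never invoked in this sketch.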

2.2.8. otherwise

During processing, a count of matches against the contextual text stream is kept for each group of siblings (in other words, each group of children of a node has its own count, beginning at zero). The children of this directive are processed only if the count is zero (0), which is to say only if there has been no match prior to this element.

Example 2-4. otherwise Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <title re:match="^[A-Z]+$">
    <re:match-string/>
  </title>
  <content re:match="[A-Za-z]+$">
     <re:match-string/>
  </content>
  <re:otherwise>
    <unknown>
      <re:match-string/>
    </unknown>
  </re:otherwise>
</output>
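The counting rule can be sketched in Java using the two patterns from Example 2-4. This is an illustration only: the class OtherwiseSketch is hypothetical, and it assumes that a match means the pattern is found somewhere in the contextual text, as the explicit "^" and "$" anchors in the examples suggest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of RegexXMLReader.
class OtherwiseSketch {
    // Element names and patterns copied from Example 2-4.
    static final String[] NAMES = {"title", "content"};
    static final String[] PATTERNS = {"^[A-Z]+$", "[A-Za-z]+$"};

    // Returns the names of the output elements that would be produced.
    static List<String> process(String context) {
        List<String> emitted = new ArrayList<>();
        int matchCount = 0; // each group of siblings starts at zero
        for (int i = 0; i < NAMES.length; i++) {
            // Assumption: find() semantics rather than a whole-string match.
            if (Pattern.compile(PATTERNS[i]).matcher(context).find()) {
                emitted.add(NAMES[i]);
                matchCount++;
            }
        }
        if (matchCount == 0) {
            emitted.add("unknown"); // the otherwise branch
        }
        return emitted;
    }
}
```

Under these assumptions, "HELLO" satisfies both patterns, "hello" satisfies only the second, and "123" satisfies neither, so only the last case reaches the otherwise branch.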

2.2.9. attribute

This adds an attribute to the previous output element; it is not for adding an attribute to any stylesheet directive.

Table 2-4. attribute Attributes

name	The name of the attribute.

Example 2-5. attribute Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <part re:match="^([A-Z]+)(.*)$">
    <re:group>
      <re:attribute name="title">
        <re:match-string/>
      </re:attribute>
    </re:group>
    <re:group>
      <re:match-string/>
    </re:group>
  </part>
</output>

The above example places the first matching group in an attribute entitled "title" within the <part> element.

2.2.10. match

Applies the regular expression in the supplied regex attribute to the current contextual text stream; if there is a match, the children of this directive are processed.

Table 2-5. match Attributes

regex	The regular expression that must match for the children of this directive to be processed.

Example 2-6. match Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <re:match re:regex="^([A-Z]+)(.*)$">
    <re:group>
      <title>
        <re:match-string/>
      </title>
    </re:group>
    <re:group>
      <description>
        <re:match-string/>
      </description>
    </re:group>
  </re:match>
</output>

2.2.11. split

The split directive causes the current contextual text to be split using the regular expression found in one of its two attributes, split or regex. It is important to note that both of these attributes do exactly the same thing; two different attributes are provided for convenience.

The split directive does not iterate over the resulting parts of the contextual text. Rather, it simply breaks up the contextual text stream based on the given criteria. split is usually followed closely by one or more group directives. For iteration, the for-each directive is provided.

Table 2-6. split Attributes

split	The regular expression on which the contextual text is split.
regex	The regular expression on which the contextual text is split.

Example 2-7. split Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <information>
    <re:split re:split=",">
      <re:group>
        <re:attribute name="title">
          <re:match-string/>
        </re:attribute>
      </re:group>
      <re:group>
        <part-number>
          <re:match-string/>
        </part-number>
      </re:group>
    </re:split>
  </information>
</output>

2.2.12. call-template

This directive transfers execution to the node referenced by the ref-id attribute. The referenced id must be in the processor's namespace (http://regexxmlreader.sourceforge.net/1.0), although the node that carries it does not need to be a directive.

Table 2-7. call-template Attributes

ref-id	A reference to an id attribute in some other node.

Example 2-8. call-template Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <page-number re:id="process-page-number" re:match="[0-9]+">
    <re:match-string />
  </page-number>
  <entry re:match="^([A-Z]+) +([0-9]+)$">
    <re:group>
      <title>
        <re:match-string />
      </title>
    </re:group>
    <re:group>
      <re:call-template ref-id="process-page-number"/>
    </re:group>
  </entry>
</output>

The above operates exactly the same as:


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <page-number re:match="[0-9]+">
    <re:match-string />
  </page-number>
  <entry re:match="^([A-Z]+) +([0-9]+)$">
    <re:group>
      <title>
        <re:match-string />
      </title>
    </re:group>
    <re:group>
      <page-number re:match="[0-9]+">
        <re:match-string />
      </page-number>
    </re:group>
  </entry>
</output>

2.2.13. wrapper

This directive is simply a placeholder and is meant to be used in conjunction with the call-template directive.

Example 2-9. wrapper Example


<output xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
  <information>
    <title re:match="^[A-Z]+$" re:id="handle-title">
      <re:match-string/>
    </title>
    <re:wrapper id="deal-with-pages">
      <page re:match="^[0-9]+$">
        <re:match-string/>
      </page>
      <page-range re:match="^([0-9]+)-([0-9]+)$">
        <re:group>
          <start>
            <re:match-string/>
          </start>
        </re:group>
        <re:group>
          <end>
            <re:match-string/>
          </end>
        </re:group>
      </page-range>
      <!-- note that wrapper resets the match count -->
      <re:otherwise>
        <re:warning>
          <re:text>Pages did not match: </re:text>
          <re:match-string/>
        </re:warning>
      </re:otherwise>
    </re:wrapper>
    <re:match regex="^([A-Z]+) ([-0-9]+)$">
      <re:group>
        <re:call-template ref-id="handle-title"/>
      </re:group>
      <re:group>
        <re:call-template ref-id="deal-with-pages"/>
      </re:group>
    </re:match>
  </information>
</output>

2.2.14. pull-out-following-split

FIXME:

Table 2-8. pull-out-following-split Attributes

regex	FIXME:
inforce-order	FIXME:
