To parse an incoming text file, RegexXMLReader needs instructions describing how that file should be processed. These instructions take the form of an XML document that I call a Regular Expression Stylesheet, or simply "the stylesheet", although there are major differences between this type of stylesheet and an XSLT stylesheet, after which it is loosely patterned.
The stylesheet is made up of directives, which are essentially commands that must be in the RegexXMLReader namespace, "http://regexxmlreader.sourceforge.net/1.0". Using these directives, the incoming text is transformed into a series of SAX events, thus turning the arbitrary text file into an XML document.
One concept that must be made clear is what I refer to in the documentation as contextual text. At the beginning of processing, the contextual text is the entire incoming text file: it is just one large stream of data. As processing proceeds, this stream usually narrows to the more relevant parts; that is to say, the context changes for each inner child in the RegexStylesheet. The contextual text normally becomes smaller and smaller, depending on the directives of its enclosing (parent) elements.
For example, consider the replace directive. It modifies the contextual text, and its children are then processed against that modified version. At the level of the replace directive itself, the contextual text does not change; the change is visible only to the directive's children:
<!-- Assume that the contextual text at this level is:
     "abcdefg hijklmno pqrstuvwxyz" -->
<re:replace regex="^." with="X" xmlns:re="http://regexxmlreader.sourceforge.net/1.0">
    <!-- The contextual text is now:
         "Xbcdefg hijklmno pqrstuvwxyz" -->
    <re:for-each split=" ">
        <!-- In this case the contextual text will change for each
             iteration through the split parts, namely:
             1) Xbcdefg
             2) hijklmno
             3) pqrstuvwxyz
             etc. -->
    </re:for-each>
    <!-- The contextual text is what it was prior to the previous directive:
         "Xbcdefg hijklmno pqrstuvwxyz" -->
</re:replace>
<!-- And now it is the same way it was at the beginning:
     "abcdefg hijklmno pqrstuvwxyz" -->
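The scoping behaviour in the example above can be mimicked in a few lines of Python. This is only a rough model of the idea, not how RegexXMLReader is implemented: the function names `replace`, `for_each_split` and `trace_contexts` are hypothetical, and it assumes that the split value is treated as a literal separator. Each "directive" hands its own view of the contextual text to its children and leaves the caller's view untouched.

```python
import re


def replace(context, regex, replacement, children):
    """Hypothetical model of a replace-like directive.

    The children are processed against the modified text; the caller's
    own `context` is never changed.
    """
    children(re.sub(regex, replacement, context))


def for_each_split(context, separator, body):
    """Hypothetical model of a for-each directive with a split value.

    Assumption: the separator is a literal string, not a regex.
    """
    for part in context.split(separator):
        body(part)


def trace_contexts(text):
    """Record the contextual text seen at each level of the example."""
    seen = [("before replace", text)]

    def inside_replace(ctx):
        seen.append(("inside replace", ctx))
        for_each_split(ctx, " ", lambda part: seen.append(("for-each part", part)))
        # Back at the replace level: the for-each did not alter ctx.
        seen.append(("after for-each", ctx))

    replace(text, r"^.", "X", inside_replace)
    # Back at the outer level: the replace did not alter text.
    seen.append(("after replace", text))
    return seen


if __name__ == "__main__":
    for label, ctx in trace_contexts("abcdefg hijklmno pqrstuvwxyz"):
        print(f"{label}: {ctx!r}")
```

Running the script prints the contextual text at each level in the same order as the comments in the stylesheet fragment: the original text, the modified text inside the replace, each split part in turn, and then the two outer contexts restored.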