Tuesday, October 28, 2008

Using the FME GREPPER Factory

Instead of using the typical FME Workbench factories for filtering records by attribute, e.g. the AttributeFilter, AttributeClassifier or Tester factories, you could use a powerful regular expression processing factory: the Grepper. As the name suggests, it performs a Unix 'grep'-like operation on the input attributes. This factory is available under the Strings category of the FME Workbench, as shown in the screen shot below.


The screen shot below shows a typical usage of the Grepper factory in the FME Workbench.

In the example above, the records from the LAND_PARCELS feature are filtered through the GREPPER factory. Records that do not match the GREPPER's regular expression are output to the new LAND_PARCELS feature; records that do match are ignored.

Display the GREPPER's parameters to set the regular expression as shown below.


You will have to click the Attribute field and choose an attribute to grep or filter against with the regular expression. In the example above, the field LOT_NO was chosen.
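
Outside FME, the same matched/unmatched split can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Grepper is implemented: the function name grep_records, the sample records and the choice to treat a missing attribute as an empty string are all assumptions made for this example, and Python's re module is not necessarily the same regular expression engine that FME uses.

```python
import re


def grep_records(records, attribute, pattern):
    """Split records into (matched, unmatched) by testing one attribute.

    Hypothetical helper for illustration; it is not part of FME.
    """
    regex = re.compile(pattern)
    matched, unmatched = [], []
    for record in records:
        # Assumption: a missing attribute is treated as an empty string.
        value = str(record.get(attribute, ""))
        if regex.search(value):
            matched.append(record)
        else:
            unmatched.append(record)
    return matched, unmatched


# Made-up sample data standing in for the LAND_PARCELS records.
land_parcels = [
    {"LOT_NO": "NO12-12345"},
    {"LOT_NO": ""},
    {"LOT_NO": "A77"},
]

matched, unmatched = grep_records(land_parcels, "LOT_NO", r"^$")
print(len(matched), len(unmatched))
```

In the workspace described above, only the unmatched list would be passed on to the new LAND_PARCELS feature.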

Find Null Strings
If you want to find matches for null strings, i.e. empty strings that contain no characters, then define the Regular Expression '^$' as shown in the example screen shot above. In plain English, the caret character '^' anchors the expression to the beginning of the string while the dollar character '$' anchors the expression to the end of the string; put together with nothing in between, the expression says 'match all strings with nothing between the start and the end of the string'.
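
As a quick check of how '^$' behaves, here is a minimal sketch using Python's re module. It is an assumption that FME's regular expression engine treats these simple inputs the same way; the sample values are made up for the example.

```python
import re

# '^' anchors to the start, '$' to the end, with nothing allowed in between.
EMPTY = re.compile(r"^$")


def is_empty(value: str) -> bool:
    """Return True when the string contains no characters at all."""
    return EMPTY.search(value) is not None


samples = ["", "12", " "]
empty_flags = [is_empty(s) for s in samples]
print(empty_flags)  # [True, False, False]
```

Note that a string holding a single space is not empty, so it does not match.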

Find Strings that Begin with a Character
If you want to find matches for strings that begin with a character such as 'N', then define the Regular Expression '^N'. Some matching strings include the following:
NO12-12345
Nobody
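
The same expression can be tried in Python as a rough check. This is a sketch with made-up sample values, and it assumes the match is case sensitive, as it is by default in Python's re module.

```python
import re

# Match only strings whose first character is an upper-case 'N'.
STARTS_WITH_N = re.compile(r"^N")

samples = ["NO12-12345", "Nobody", "nobody", "LOT N"]
starts_with_n = [s for s in samples if STARTS_WITH_N.search(s)]
print(starts_with_n)  # ['NO12-12345', 'Nobody']
```

The lower-case 'nobody' and the string that merely contains an 'N' later on are both rejected.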

Find Strings that Match a Pattern
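The following Regular Expression 'NO\d\d-\d\d\d\d\d' will match any string that contains the pattern NOxx-xxxxx, where x is any numeric character. Because the expression is not anchored with '^' or '$', the pattern may appear anywhere in the string. Some examples of matching strings include the following: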
NO12-12345
Hello World NO12-12345 Hello World
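
The difference between the unanchored pattern and an anchored variant can be sketched in Python. The anchored expression and the sample strings are additions for illustration and do not come from the FME workspace; the '{2}' and '{5}' repeat counts are standard regular expression syntax, but whether the Grepper accepts them is an assumption.

```python
import re

# Unanchored: the pattern may appear anywhere in the string.
LOT_PATTERN = re.compile(r"NO\d\d-\d\d\d\d\d")
# Anchored variant (illustrative): the whole string must be NOxx-xxxxx.
LOT_EXACT = re.compile(r"^NO\d{2}-\d{5}$")

samples = [
    "NO12-12345",
    "Hello World NO12-12345 Hello World",
    "NO1-12345",    # only one digit before the hyphen
    "NO12-123456",  # one digit too many after the hyphen
]

unanchored = [LOT_PATTERN.search(s) is not None for s in samples]
anchored = [LOT_EXACT.search(s) is not None for s in samples]
print(unanchored)  # [True, True, False, True]
print(anchored)    # [True, False, False, False]
```

The last sample shows why anchoring matters: the unanchored pattern is satisfied by the first ten characters and ignores the extra digit.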

Find Null Numeric Values
The Grepper can also be used on numeric fields such as double or integer fields. FME converts the value to a string before passing it to the Grepper factory. If you want to find null values in double or integer database fields, you can define the Regular Expression '^$'.
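
The convert-then-match idea can be sketched as follows. The to_text helper is hypothetical, and mapping a null value to an empty string is an assumption made for this example; how FME actually performs the conversion is not modelled here.

```python
import re

EMPTY = re.compile(r"^$")


def to_text(value) -> str:
    """Hypothetical conversion: null (None) becomes '', numbers become text."""
    return "" if value is None else str(value)


# Made-up values standing in for a double or integer database field.
values = [12.5, None, 0, 7]
null_flags = [EMPTY.search(to_text(v)) is not None for v in values]
print(null_flags)  # [False, True, False, False]
```

A zero converts to the text '0', so it is not reported as null.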
