Dag Hovland

Search using regular expressions with numerical constraints

The article Regular Expressions with Numerical Constraints and Automata with Counters (ICTAC 2009) describes a new algorithm for searching for strings matching regular epxressions with numerical constraints in text documents. The algorithm is implemented in a manner similar to Gnu egrep (http://www.gnu.org/software/grep). It lacks features to be a full replacement for grep.

The algorithm is, unlike the one used in GNU grep, polynomial time, but only extended regular expressions that are counter-1-unmabiguous are allowed. The program exits with a warning if given other epxressions. With the flag "-g" it outputs information about the "Finite Automaton with Counter" (FAC) which is constructed by the program, and how the FAC matches the string.

Building

Extract with "tar -xjvf facgrep.tar.bz2", move into the directory with "cd cgrep" and run "make".

Running

The executable is "grep" and lies in the folder "cgrep" made in the section above. The available flags are "-x" for exact matches, also called "line-regexp" (usually it returns a match if some part of a line matches) "-g" for printing out the full FAC and matching process, "-v" for inverted match, "-n" for printing line numbers of occurrences, "-E" for compatibility with grep (extended regular expressions are the standard) and "-c" for counting number of matches. Note that the regular expressions supported are those usually called "extended regular expressions".

Download

facgrep.tar.bz2

Last updated: 11. September 2009

Valid XHTML 1.0 Strict