| DEELX Regular Expression Engine FeaturesResearch project of RegExLab. 
 Content:
 Coded using templateDEELX is coded using template, so DEELX can be used on char, wchar_t and other base types, such as unsigned char, int etc. But only simple types can be used, not struct or union. For example: 
              
                
                  | #include "deelx.h" 
 int pattern[] = {65, 
                  '+', 66, 0}; // 0 to indicate end
 int text[] = {180000, 
                  65, 65, 66, 0};
 
 CRegexpT <int> regexp(pattern);
 MatchResult result = regexp.Match(text);
 
 if(result.IsMatched())
 {
 printf("start at: %d\nlength: %d\n", result.GetStart(),
 result.GetEnd() - result.GetStart());
 }
 |  More details: 
              [ Program Help] 
 RIGHTTOLEFT match modeDEELX supports right to left match mode, which lets regular expression matches from the end of text to the begin. Within the pattern, the expression in the right side will do match first. To compose a pattern for "RIGHTTOLEFT" is the same as to compose a common 
            pattern. The quantifiers still lays on the right side; ^ still matches the begin of text; Lookahead assertion still matches rightward; The group number still counts from left to right; etc. There is one point should be paid attention to. If RIGHTTOLEFT, the expression in the right side will do match first. If you used backward reference, the group which is refered should lays on the right side. For example: 
              
                
                  | "RIGHTTOLEFT" Mode | Common |  
                  | \2(.*?)('|") | ('|")(.*?)\1 |  The recursive expression has not this limit, so both of the following is right: 
              
                
                  | (?2)(.*?)(") |  
                  | (")(.*?)(?1) |  
 Backward assertion"Backward assertion" is to ask the text on the left side of current position must matches certain expression. The format is (?<=xxx) or (?<!xxx). As \b, it is just a condition of the position, and it will not match any character. For example: 
              
                
                  | Expression | Description |  
                  | \b | Matches word boundary. |  
                  | (?<!\w)(?=\w) | Matches only begin of word. |  
                  | (?<=\w)(?!\w) | Matches only end of word. |  On the function of "backward assertion", the details in Perl, Java, GRETA and DEELX are different from each other: 
              
                
                  | Engine | Description | Example |  
                  | Perl | Works only for fixed-width lookbehind. | (?<=\t)print |  
                  | Java | Non-fixed width, but maximum needed. | (?<=\{\s{0,100})print |  
                  | GRETA | Non-fixed width, but the negative has some bugs. | (?<=\{\s*)print |  In DEELX: DEELX uses RIGHTTOLEFT mode to match "Backward assertion". The backward assertion has the same logic as lookahead assertion, except the direction. So, in DEELX, the backward assertion works for non-fixed width lookbehind. For example, in DEELX: 
              
                
                  | Text | Expression | Result |  
                  | {print} | (?<!\{\s*)print | Fail |  
                  | (?<=\{\s*)print | Success |  
 Easy to useThe whole DEELX is coded using template, so there is no cpp or lib. All the DEELX source code is in a single header file (deelx.h). You need not to create a project for DEELX, nor add any cpp or lib. When runs, no special dll file is needed. You can simplely add an include in your program: 
              
             DEELX is included into your project directly, so need not worry about link problem. 
 CompatibilityDEELX is coded in pure C++, no STL nor MFC class is used. DEELX has been tested on following compiler and OS: 
              
                
                  | Compiler | Version | OS | Remark |  
                  | VC++ | 6.0, 7.1, 8.0 | Windows |   |  
                  | GCC | 3.4 | Cygwin |   |  
                  | GCC | 3.4 | Linux |   |  
                  | Turbo C++ | 3.0 | DOS |   |  
                  | C++ Builder | 6.0 | Windows |   |  
                  | Borland C++ | 5.5 | Windows |   |  
                  | GCC | 2.7 | FreeBSD 2.2 |   |  
                  | GCC | 3.4 | Solaris 10 Unix | http://www.unix-center.net/ |  We have not tested it on other platforms. If you have compiled on other platforms, you can tell us whether you have succeeded or failed, through regexlab@gmail.com. Thanks very much. 
 Named groupDEELX supports named group of Python style and .NET style: 
              
                
                  | Expression | Style | Description |  
                  | (?P<the_name>xxxx) | Python | To define a named group 'the_name' |  
                  | (?<the_name>xxxx) | .NET |  
                  | (?'the_name'xxxx) |  After a successful match, the captured content of named group can be gotten through the name of the group. DEELX allows more than one groups to have the same group name. Logically, they are the same group, for example: (?<string>".*?")|(?<string>'.*?') 
            . If one named group contains another group which has the same name, the one which is contained will do not capture, for example: (?<number>(?<number>\d+)\.?) 
              . Other features concerned: backward reference, recursive expression, conditional expression, replace operation: 
              
                
                  | Feature | Expression | Style | Description |  
                  | Backwardreference
 | \g<the_name> | Python | Backward refer to 'the_name' |  
                  | \k<the_name> | .NET |  
                  | \k'the_name' |  
                  | Recursiveexpression
 | (?R<the_name>) |  | Recursive refer to named 'the_name' |  
                  | (?R'the_name') |  
                  | Conditionalexpression
 | (?(the_name)yes|no) | .NET | According to if named 'the_name' captured |  
                  | Replace | ${the_name} |   | Stands for what captured by 'the_name' |  
 Conditional expressionAccording to a certain condition, conditional expression matches one of two options. The condition can be: 
              Whether a certain group captured or not.Whether a certain expression can match at current position or not. Conditional expression and description: 
              
                
                  | Expression | Characteristic | Description |  
                  | (?(1)yes|no) | Is number | If group 1 captured, match yes, else match no |  
                  | (?(?=a)aa|bbb) | Assertion | If next char is 'a', match aa, else match bbb |  
                  | (?(xxx)aa|bbb) | Not named group | If there is no named group 'xxx', same as (?=xxx)
                   |  
                  | (?(name)yes|no) | Is named group | If there is named group 'name', captured or not |  More: 
              If it is RIGHTTOLEFT mode,   (?(xxx)aa|bbb) 
                is same as (?(?<=xxx)aa|bbb)
                .If there is only one option, the option will do match if the condition success.If there are more than 2 options separated by "|", only the first "|" is regarded as conditional expression's separator. For example: (?(?=xxx)yes|no1|no2), 
              if condition success, "yes" matches, or "no1|no2" matches. 
 Recursive expression"Recursive expression" is to refer to another part of expression itself, not to refer to what is captured. For example: 
              
                
                  | Expression | Equivalent1 | Equivalent2 | Matches |  
                  | (\w)(?1) | (\w)(\w) |   | ab |  
                  | (?1)(\w(?2))(\d) | (?1)(\w(\d))(\d) | (\w(\d))(\w(\d))(\d) | a1b23 |  If the part which is referred, contains the "recursive expression", this makes a recursion. For example: 
              
                
                  | Expression | Equivalent1 | Equivalent1 | Matches |  
                  | (\w(?1)?) | (\w(\w(?1)?)?) | (\w+) | ghjk5...... |  
                  | \(([^()]|(?R))*\) | \(([^()]|\(([^()]|(?R))*\))*\) |   | (a * (c + 2)) |  Formats of recursive expression: 
              
                
                  | Format | Description |  
                  | (?R) | Recursively refer to the whole expression. |  
                  | (?R1), (?R2) | Recursively refer to certain group. |  
                  | (?1), (?2) |  
                  | (?R<named>) | Recursively refer to named group. |  
                  | (?R'named') |  
 Prevent infinite loopIf a repeated patterns matches zero-length substring, it may cause infinite loop in many engines. DEELX has resolved this problem and find a technical method to prevent infinite loop on that situation. For example: 
              
                
                  | Expression | Description |  
                  | (a*)* |  a* matches zero-width, but it will not cause infinite loop in DEELX |  
                  | (a|b|c?)* | One option matches zero-width, but it will not cause infinite loop in DEELX |    |