Simple Text Patterns and Regular Expressions
When searching, you can use a regular expression to specify not just a single string but a whole set of strings that match it.
Simple Text Patterns
A simple text pattern is a sequence of characters in which an asterisk '*' can be used as a placeholder for any number of arbitrary characters.
You can use simple text patterns when searching for names.
Examples of simple patterns:
- Anna returns Anna
- An* returns Anna and Anthony
- *nn* returns Anna and Hannah
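The behavior of the asterisk in the examples above can be sketched in Python. This is only an illustration, not Innovator's implementation: the helper names `simple_pattern_to_regex` and `matches_simple` are hypothetical, and it is assumed that the pattern has to cover the whole name and that matching is case-sensitive.

```python
import re

def simple_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a simple text pattern, where '*' is the only placeholder, into a regex."""
    # Escape every literal piece, then join the pieces with '.*' for each asterisk.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))

def matches_simple(pattern: str, name: str) -> bool:
    # Assumption: the pattern must cover the whole name, so fullmatch is used.
    return simple_pattern_to_regex(pattern).fullmatch(name) is not None

names = ["Anna", "Anthony", "Hannah"]
print([n for n in names if matches_simple("Anna", n)])   # ['Anna']
print([n for n in names if matches_simple("An*", n)])    # ['Anna', 'Anthony']
print([n for n in names if matches_simple("*nn*", n)])   # ['Anna', 'Hannah']
```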
Using Regular Expressions
You can use text patterns or regular expressions in various Innovator components to specify not just a single string but a set of strings that all match the text pattern or the regular expression.
Regular expressions are used in Innovator in the following places:
- In the Innovator model editor when searching for contained text in specification texts or annotations
- In the configuration editor to describe a name constraint
Here, an entered name is checked for validity against this pattern and then accepted or rejected.
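A name check of this kind might look as follows in Python. The constraint pattern and the function `is_valid_name` are made up for illustration, and it is an assumption that the whole name has to match the pattern.

```python
import re

# Hypothetical name constraint: an uppercase letter followed by letters or digits.
NAME_CONSTRAINT = re.compile(r"[A-Z][A-Za-z0-9]*")

def is_valid_name(name: str) -> bool:
    # Assumption: the entire name must match, not just a part of it.
    return NAME_CONSTRAINT.fullmatch(name) is not None

print(is_valid_name("Customer"))  # True
print(is_valid_name("customer"))  # False
print(is_valid_name("Order 2"))   # False
```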
The regular expressions in Innovator use a subset of the options for forming text patterns described, for example, in Wikipedia (http://en.wikipedia.org/wiki/Regular_expression).
Forming Regular Expressions
This section explains how to form regular expressions.
- In Innovator, regular expressions are always evaluated within a paragraph.
Example:
Searching for class.*multiple returns the following text, because both words are in the same paragraph.
Searching for class.*attributes does not return this text, because "attributes" is in a separate paragraph.
The class has multiple
- attributes
- methods
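The paragraph-wise evaluation can be imitated in Python by testing each paragraph separately. This is a sketch under the assumption that every line break starts a new paragraph; the function `search_paragraphs` is hypothetical.

```python
import re

text = "The class has multiple\n- attributes\n- methods"

def search_paragraphs(pattern: str, text: str) -> list:
    # Assumption: each line break starts a new paragraph.
    return [p for p in text.split("\n") if re.search(pattern, p)]

print(search_paragraphs(r"class.*multiple", text))    # ['The class has multiple']
print(search_paragraphs(r"class.*attributes", text))  # []
```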
- Each character represents itself, apart from the following special meta characters:
[ ] ( ) { } | ? + - . * ^ $ \
- To search for a meta character itself, mask it with a preceding backslash '\'.
Examples:
surname\\first name returns surname\first name
\[a\] returns [a]
\^hat returns ^hat
dollar\$ returns dollar$
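The masking examples can be tried out with Python's `re` module, which is used here purely for illustration and may differ from Innovator's engine in details. The sample texts are invented.

```python
import re

# (pattern, sample text) pairs; raw strings keep Python from
# interpreting the backslashes itself.
examples = [
    (r"surname\\first name", r"surname\first name"),
    (r"\[a\]", "index [a]"),
    (r"\^hat", "the ^hat sign"),
    (r"dollar\$", "dollar$"),
]

for pattern, sample in examples:
    print(pattern, "->", re.search(pattern, sample) is not None)  # True for each pair

# re.escape masks the meta characters of a literal string automatically.
print(re.escape("dollar$"))  # dollar\$
```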
- The minus sign '-' only has a special meaning if it is in square brackets.
- A set of characters is defined between square brackets []. The expression then stands for one character from this set.
The set can be defined by enumerating its elements and/or by specifying a range in the ASCII table. A range is specified by its first and last characters, separated by a minus sign '-'.
Within square brackets, the other meta characters have no special meaning, i.e. '\', '$', '^', and '.' represent themselves. This means, for example, that the beginning of a line cannot be defined as an element of a set. However, the hat '^' has a different special meaning in one position within square brackets: if it immediately follows the opening square bracket '[', the expression stands for the complement of the specified set.
Examples:
m[aiu]ster returns master, mister or muster
test[1-4] returns test1, test2, test3 or test4
[^A-Za-z] returns all characters apart from letters
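The set examples behave the same way in Python's `re` module, shown here only as an illustration with invented sample texts. One known difference: in Python a backslash keeps its masking role inside square brackets, whereas the description above says it represents itself in Innovator.

```python
import re

# Enumeration of set elements.
print(re.findall(r"m[aiu]ster", "master mister muster mester"))  # ['master', 'mister', 'muster']

# A range given by its first and last character.
print(re.findall(r"test[1-4]", "test0 test1 test4 test5"))       # ['test1', 'test4']

# A leading hat selects the complement of the set.
print(re.findall(r"[^A-Za-z]", "a1-b2"))                         # ['1', '-', '2']

# '.' and '$' represent themselves inside square brackets.
print(re.findall(r"[.$]", "a.b$c"))                              # ['.', '$']
```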
- A period '.' represents any character.
Example:
function\(.\) returns function(a), function(3) ... (the parentheses are meta characters and must be masked)
- The multiplicity of the preceding expression is specified in curly brackets {}.
Examples:
{n} indicates that the expression must occur exactly n times
{n,} indicates that the expression must occur at least n times
{n,m} indicates that the expression must occur at least n times and at most m times
{0,m} indicates that the expression must not occur more than m times
- A question mark '?' indicates that the preceding expression is optional and corresponds to {0,1}.
Example:
D[- ]?[0-9]{5} finds German zip codes prefixed with the country code 'D', 'D ' or 'D-'.
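The multiplicities in curly brackets and the optional question mark can be checked with a short Python sketch. The sample strings are invented, and `fullmatch` is used so that the whole string has to fit the pattern.

```python
import re

# {n,m}: at least n and at most m repetitions of the preceding expression.
for sample in ["a", "aa", "aaa", "aaaa"]:
    print(sample, re.fullmatch(r"a{2,3}", sample) is not None)
# a False, aa True, aaa True, aaaa False

# '?' makes the separator after the country code optional.
zip_code = re.compile(r"D[- ]?[0-9]{5}")
for sample in ["D-90402", "D 90402", "D90402", "D_90402", "D-9040"]:
    print(sample, zip_code.fullmatch(sample) is not None)
# D-90402 True, D 90402 True, D90402 True, D_90402 False, D-9040 False
```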
- An asterisk '*' indicates any number of repetitions of the preceding expression (including none) and corresponds to {0,}.
Examples:
off[a-z]* returns off, offline, office ...
[1-9][0-9]* returns all natural numbers (written without leading zeros)
- A plus '+' indicates that the preceding expression occurs at least once and corresponds to {1,}.
Example:
[ab]+ finds a, b, aa, bbaab etc.
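The difference between the asterisk and the plus can be seen in a small Python sketch. The sample texts are invented, and Python's `re` module serves only as an illustration.

```python
import re

# '*' also accepts zero repetitions, so the bare "off" is found as well.
print(re.findall(r"off[a-z]*", "off offline office on"))  # ['off', 'offline', 'office']

# A first digit from 1-9 followed by any number of further digits.
print(re.fullmatch(r"[1-9][0-9]*", "2024") is not None)   # True
print(re.fullmatch(r"[1-9][0-9]*", "007") is not None)    # False

# '+' requires at least one character from the set.
print(re.findall(r"[ab]+", "a b aa bbaab c"))             # ['a', 'b', 'aa', 'bbaab']
```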
- A hat '^' represents the beginning of a line.
Example:
^variable returns variable only at the beginning of a line
- The dollar sign '$' represents the end of a line.
Example:
term;$ returns term; at the end of a line
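The two line anchors can be illustrated in Python by testing each line on its own. The sample lines are invented; treating every list entry as one line is an assumption made for this sketch.

```python
import re

lines = ["variable x;", "int variable;", "term;", "term; // note"]

# '^' only accepts "variable" at the very beginning of a line.
print([line for line in lines if re.search(r"^variable", line)])  # ['variable x;']

# '$' only accepts "term;" at the very end of a line.
print([line for line in lines if re.search(r"term;$", line)])     # ['term;']
```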
- The vertical line '|' means 'or'. Parentheses separate the alternatives from the remaining text.
Example:
(in|out)put returns input and output
A more complex example for a date in the format DD.MM.YYYY (it does not check the number of days in a given month):
^(0[1-9]|[12][0-9]|3[0-1])\.(0[1-9]|1[0-2])\.(19|20|21)[0-9][0-9]$
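Both the alternation and the date expression can be tried out in Python. The sample strings are invented, and Python's `re` module stands in for Innovator's engine as an illustration only. Note that 31.02.2024 is accepted, because the expression does not know how many days a month has.

```python
import re

# Alternatives grouped in parentheses, followed by the common rest.
text = "input and output"
print([m.group(0) for m in re.finditer(r"(in|out)put", text)])  # ['input', 'output']

# The date expression from above, unchanged.
DATE = re.compile(r"^(0[1-9]|[12][0-9]|3[0-1])\.(0[1-9]|1[0-2])\.(19|20|21)[0-9][0-9]$")

for sample in ["24.12.2023", "31.02.2024", "32.01.2024", "1.1.2024"]:
    print(sample, DATE.search(sample) is not None)
# 24.12.2023 True, 31.02.2024 True, 32.01.2024 False, 1.1.2024 False
```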