16. December 2022 Regex
What’s a Regular Expression (RegEx)?
A regular expression (RegEx) defines a search pattern for character strings using syntactic rules. A familiar use is searching texts for numbers, letters, special characters or complete terms: the regular expression serves as the pattern for finding a specific character string within the text. Regular expressions are also used for validation, e.g. to check email addresses or credit card numbers.
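The two uses named above, searching and validating, can be sketched in a few lines of Python. This is only an illustration using Python's standard `re` module; the sample text, the function name `looks_like_card_number` and the simplified 16-digit rule are assumptions made for the example, not a real credit card check.

```python
import re

text = "Order 4711 was shipped on 16 December 2022 to 3 customers."

# Searching: find every run of digits anywhere in the text.
numbers = re.findall(r"[0-9]+", text)
print(numbers)  # ['4711', '16', '2022', '3']


# Validating: the whole string must match the pattern.
# Hypothetical, simplified rule: exactly 16 digits and nothing else.
def looks_like_card_number(value: str) -> bool:
    return re.fullmatch(r"[0-9]{16}", value) is not None


print(looks_like_card_number("1234567812345678"))  # True
print(looks_like_card_number("1234-5678"))         # False
```

Searching accepts a match anywhere in the text, while validation only succeeds if the complete input fits the pattern.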
How does a Regular Expression work?
A regular expression can be built from a
- singleton set (number or character),
- multi-element set (numbers and/or characters) or
- multi-element set within metacharacters.
The metacharacters within the regular expression describe specific constructions or arrangements of the characters.
Syntax of Regular Expressions and examples
- Single element patterns
- [123456] can be shortened to [1-6]: all digits from 1 to 6 are valid
- [abcdef] can be shortened to [a-f]: all letters from a to f are valid
- [1-24-6]: all digits from 1 to 6 except 3 are valid (the ranges 1-2 and 4-6 combined)
- [a-ce-f]: all letters from a to f except d are valid (the ranges a-c and e-f combined)
- Multi element patterns
- [1-6][a-f]: the first element must be a digit from 1 to 6, the second a letter from a to f
- [1-6][a-fA-D]: the first element must be a digit from 1 to 6, the second a lower case letter from a to f or an upper case letter from A to D
- [7-9]{1,3}[4-6]{0,2}: one to three digits from 7 to 9, followed by zero to two digits from 4 to 6
- Multi element patterns with metacharacters
- validate a house number: [1-9][0-9]?[0-9]?[a-zA-Z]?
- the first element is mandatory and must be a digit from 1 to 9
- the following elements are optional (indicated by ?): up to two more digits from 0 to 9 and one lower or upper case letter (a-z or A-Z)
- validate an email address: ^[\w-\.]+@([\w-]+\.)+[\w-]{2,3}$
- ^ beginning of string
- [\w-\.] matches one character from the set of alphanumeric characters, underscore, hyphen and period
- + matches one or more of the preceding element
- @ matches a literal @
- ([\w-]+\.) groups several elements together and creates a capture group, which can be used to extract a substring or as a backreference
- + matches one or more of the preceding group
- [\w-] matches one character from the set of alphanumeric characters, underscore and hyphen
- {2,3} matches between 2 and 3 elements of [\w-]
- $ end of string
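The patterns listed above can be tried out with a short Python sketch. It uses Python's standard `re` module, which is our choice for illustration rather than anything prescribed here; the helper name `is_valid` and the sample values are made up. Two details differ from the patterns as printed: `re.fullmatch` ties the pattern to the whole string, so ^ and $ are not needed, and the first character set of the email pattern is written `[\w.-]` because Python rejects `[\w-\.]` as an invalid range.

```python
import re


def is_valid(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


# Single element patterns
print(is_valid(r"[1-6]", "4"))       # True
print(is_valid(r"[1-24-6]", "3"))    # False: 3 is excluded
print(is_valid(r"[a-ce-f]", "e"))    # True

# Multi element patterns
print(is_valid(r"[1-6][a-fA-D]", "2C"))            # True
print(is_valid(r"[7-9]{1,3}[4-6]{0,2}", "789"))    # True: trailing 4-6 digits are optional
print(is_valid(r"[7-9]{1,3}[4-6]{0,2}", "7456"))   # False: three digits from 4-6 is one too many

# House number: mandatory digit 1-9, up to two more digits, optional letter
HOUSE_NUMBER = r"[1-9][0-9]?[0-9]?[a-zA-Z]?"
print(is_valid(HOUSE_NUMBER, "12a"))   # True
print(is_valid(HOUSE_NUMBER, "012"))   # False: leading zero

# Email address, with the first set reordered for Python (see note above)
EMAIL = r"[\w.-]+@([\w-]+\.)+[\w-]{2,3}"
print(is_valid(EMAIL, "jane.doe@example.com"))   # True
print(is_valid(EMAIL, "jane.doe@example"))       # False: no period after the @
```

Note that the {2,3} at the end limits the last part of the domain to two or three characters, so an address ending in a longer top-level domain would be rejected by this pattern.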
Tools for creating and testing Regular Expressions
SAP programs:
The example program DEMO_REGEX and its enhancement DEMO_REGEX_TOY make it possible to test search and replace functions by applying regular expressions to texts.
Web Tool:
This RegEx web tool provides explanations of the individual parts of the regular expression in addition to the general test function.
Test function
If the text matches the regular expression, the matching text is highlighted in blue.
Explanation function
The bottom part of the page explains every single part of the regular expression.
Interesting Links
SAP BLOG
https://blogs.sap.com/2021/09/23/regular-expressions-regex-in-modern-abap/
This blog entry summarises the different regular expression syntaxes that can be used in modern ABAP, e.g. the now obsolete POSIX syntax, the PCRE syntax and the XPath and XSD syntax.
RegEx Golf
https://pastafahndung.de/regex-golf/
Try to solve regular expression problems with as few hits as possible! Challenge yourself and train your RegEx expertise.
Lena Giebichenstein & Ivonne Büchner