NEH Institute materials

July 2017

Regular Expressions 2

In the last session we covered simple patterns and repetition, and did some exercises on them using egrep. Today we first cover alternation and grouping, then continue using egrep with more advanced expressions. Later on we will start comparing egrep's REs to Python's.

Alternation

Alternation is the RE equivalent of or. The pattern word|weapon first matches word in About words and other mighty weapons. Applied again, it matches weapon. You can add as many alternatives as you want, e.g. letter|syllable|word|phrase|sentence|paragraph.
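For readers who want to check this outside egrep, here is a minimal sketch using Python's re module (Python's REs are compared with egrep's later in this session). The variable names are only illustrative.

```python
import re

text = "About words and other mighty weapons"

# findall applies the pattern repeatedly, scanning left to right
matches = re.findall(r"word|weapon", text)
print(matches)  # ['word', 'weapon']

# search stops at the first match; start() is its offset in the string
first = re.search(r"word|weapon", text)
print(first.group(), first.start())  # word 6
```

Note that the pattern matches word inside words: alternation by itself says nothing about word boundaries.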

Alternation has the lowest precedence of all RE operators.

Exercise: Find the precedence of the different operator types, e.g. concatenation, repetition and alternation.

Grouping

Since we introduced precedence in the previous section, we also want to be able to change it. This is one of the things grouping does.

Metacharacter Explanation
( starts a group
) ends a group

Precedence examples

String                                Pattern                Match?
word and phrase level                 word|phrase level      Yes, both word and phrase level
walked up to the talking lamp post    ed\b|ing\b             Yes, both ed at the end of walked and ing at the end of talking
word level and phrase level           word|phrase level      Yes, but only word and phrase level (not all of word level)
word level and phrase level           (word|phrase) level    Yes, both word level and phrase level
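The rows of the table can also be checked mechanically. The following is a rough sketch in Python rather than egrep, with the strings and patterns copied from the table. It uses re.finditer and group(0) instead of re.findall, because findall returns only the group's content when a pattern contains a group.

```python
import re

# (string, pattern) pairs copied from the precedence table
rows = [
    ("word and phrase level", r"word|phrase level"),
    ("walked up to the talking lamp post", r"ed\b|ing\b"),
    ("word level and phrase level", r"word|phrase level"),
    ("word level and phrase level", r"(word|phrase) level"),
]

# group(0) is the whole match, regardless of any groups in the pattern
results = [[m.group(0) for m in re.finditer(pattern, string)]
           for string, pattern in rows]

for (string, pattern), found in zip(rows, results):
    print(pattern, "->", found)
# word|phrase level -> ['word', 'phrase level']
# ed\b|ing\b -> ['ed', 'ing']
# word|phrase level -> ['word', 'phrase level']
# (word|phrase) level -> ['word level', 'phrase level']
```

The third and fourth rows differ only in the parentheses: without them, level is concatenated to phrase alone, because concatenation binds more tightly than alternation.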

In addition to using the grouping metacharacters to alter precedence, you can use them for back references, which match the same text that a group matched earlier in the pattern. Some RE implementations have named back references; others only have numbered ones: \1, \2, etc.
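As an illustration of both styles, here is a small sketch in Python, which supports numbered and named back references. The sample sentence and the group name word are made up for this example, and the syntax shown is Python's, not necessarily egrep's.

```python
import re

# Made-up sample text containing two accidentally repeated words
text = "This is is a test of the the pattern"

# Numbered: \1 must match exactly what group 1 matched
numbered = re.search(r"\b(\w+) \1\b", text)
print(numbered.group(0))  # is is

# Named: (?P<word>...) defines the group, (?P=word) refers back to it
doubled = [m.group("word")
           for m in re.finditer(r"\b(?P<word>\w+) (?P=word)\b", text)]
print(doubled)  # ['is', 'the']
```

The leading \b keeps the pattern from matching the is at the end of This followed by the next is.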

Exercise: Find out how back references work in egrep.

Comparing to Python

Without going into actual Python programming we are going to see how the egrep REs compare to Python’s:

Make sure to tick Python in the Flavour list to the left.