Notes: Perl Lab 09

More Regular Expressions

  1. Review: Reserved Characters
  2. Wildcard Matching
  3. Pattern Repetition
  4. Alternation
  5. Grouping

  1. Review: Reserved Characters

    We looked at regular expressions as part of the section on conditionals in Lesson #3 - Regular expression tests. Note that several characters are reserved in regular expressions because they have special meanings: \ | { [ ( ) ^ $ * + ? . To match one of them literally, precede it with a backslash.
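
    As a small illustration of the reserved characters, the sketch below matches a literal period by escaping it with a backslash. It is not part of the original lab; the sub name has_literal_dot and the sample strings are made up for the example.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: true if the string contains a literal period.
    # The backslash removes the special "any character" meaning of the dot.
    sub has_literal_dot {
        my ($string) = @_;
        return $string =~ /\./ ? 1 : 0;
    }

    # Sample strings invented for this sketch.
    foreach my $s ("lab09.pl", "lab09pl") {
        printf "'%s' contains a literal dot\n", $s if has_literal_dot($s);
    }
    ```

    Only 'lab09.pl' is reported. With an unescaped dot (/./) both strings would match, because the dot would then stand for any character.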

  2. Wildcard Matching

    The regular expression "wildcard" is the dot or period character (.). When used in a regular expression, it will match any single character in the string other than the newline. For example, the following Perl statement will match any line that contains at least one character other than the newline, that is, any line other than an empty line:
    if( $line =~ /./) { print "success\n" }
    Or another example:
    if( $string =~ /^...x/) { print "success\n" }
    In the above example, the if condition would be true for any string where x is the fourth character in the line. So, the string "uvwxyz" would match while "vwxyz" and "tuvwx" would fail.
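
    The fourth-character test can be tried against the three strings mentioned above with a short sketch like the following. The sub name fourth_is_x is hypothetical, and the last test string (with a newline as its third character) is an added assumption used to show that the dot does not match a newline.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: true if "x" is the fourth character and the
    # first three characters are not newlines.
    sub fourth_is_x {
        my ($string) = @_;
        return $string =~ /^...x/ ? 1 : 0;
    }

    # Each entry is [ printable label, actual string ].
    my @tests = (
        [ 'uvwxyz', "uvwxyz" ],
        [ 'vwxyz',  "vwxyz"  ],
        [ 'tuvwx',  "tuvwx"  ],
        [ 'ab\nx',  "ab\nx"  ],   # third character is a newline
    );

    foreach my $t (@tests) {
        my ($label, $string) = @$t;
        printf "%-8s %s\n", $label, fourth_is_x($string) ? "match" : "no match";
    }
    ```

    Only "uvwxyz" is reported as a match. The last string fails even though x is its fourth character, because the third dot cannot match the newline.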

    Example lab09_0.pl
     
    #!/usr/bin/perl -w
    
    # Name: Mark Tucker
    # Assignment: Lab09 Example 00
    # Description: An example of regular expression wild cards
    #==========================================================================
    
    # create an array of names
    @list = qw( Fred Wilma Betty Barney Pebbles Dino);
    
    # check each element of the array against a regex pattern
    foreach $k (@list) {
        print "'$k' matches i and o\n" if($k =~ /i.o/);
        print "'$k' matches the e\n" if($k =~ /e.$/);
    }
    
    exit;
    # DONE
    
    

    When executed, the script above produces the following output:
     
    [mark@platypus PERL] ./lab09_0.pl  
    'Fred' matches the e
    'Barney' matches the e
    'Pebbles' matches the e
    'Dino' matches i and o
    [mark@platypus PERL] 
    

  3. Pattern repetition

    There are several ways to repeat part of a pattern within a regular expression. A character followed by the (*) character will match zero or more consecutive instances of that character. On its own this is rarely specific enough to be useful, but it is very useful when combined with the wildcard character (.):
    if( $string =~ /^a.*z$/) { print "success\n" }
    The statement above will match a string of any length (two characters or more) that begins with the character "a" and ends with the character "z". "az" would match just as well as "abz" or "abcdefgwxyz".
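
    A quick way to check this behaviour is to run the pattern over a handful of strings. In the sketch below the sub name starts_a_ends_z is hypothetical, and the last three strings are extra counterexamples that do not appear in the notes.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper wrapping the pattern from the statement above.
    sub starts_a_ends_z {
        my ($string) = @_;
        return $string =~ /^a.*z$/ ? 1 : 0;
    }

    # The first three strings come from the notes; the rest are added here.
    foreach my $s (qw( az abz abcdefgwxyz a abc xaz )) {
        printf "%-12s %s\n", $s, starts_a_ends_z($s) ? "match" : "no match";
    }
    ```

    The first three strings match. "a" and "abc" fail because they do not end in "z", and "xaz" fails because it does not begin with "a".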

    A character followed by the (+) character will be matched if there are one or more consecutive instances of the character found in the string.

    if($string =~ /mo+n/) { print "success\n" }
    The previous Perl statement will print "success" for the strings "moon" or "mon" but not for "mn", "omn" or "man".

    Another method for matching repetition is to specify the number of times the item must be repeated. An example:

    if($string =~ /mo{3}n/) { print "success\n" }
    This statement will match "mooon" but not "mmmoooon" or "moon".
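
    The three forms of repetition can be compared side by side. The following is only a sketch: the pattern /mo*n/ is not used in the notes and is added here for contrast, and the sub name matching_labels and the hash %pattern are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Patterns keyed by a short label. 'o*' is an added variant for contrast;
    # the other two are the patterns from the statements above.
    my %pattern = (
        'o*'   => qr/mo*n/,
        'o+'   => qr/mo+n/,
        'o{3}' => qr/mo{3}n/,
    );

    # Hypothetical helper: labels of the patterns that match the string,
    # in sorted order.
    sub matching_labels {
        my ($string) = @_;
        return grep { $string =~ $pattern{$_} } sort keys %pattern;
    }

    foreach my $s (qw( mn mon moon mooon moooon man )) {
        printf "%-7s %s\n", $s, join(", ", matching_labels($s)) || "none";
    }
    ```

    Here "mn" matches only the (*) form, "mon", "moon" and "moooon" match the (*) and (+) forms, "mooon" matches all three, and "man" matches none.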

    Example lab09_1.pl
     
    #!/usr/bin/perl -w
    
    # Name: Mark Tucker
    # Assignment: Lab09 Example 01
    # Description:  Examples of regular expression repetition
    #==========================================================================
    
    # create an array of animal sounds
    @list = qw( moo cluck baaaa oink bzzzz moomoomoo gobble bark meow);
    
    # check each element of the array against regex patterns
    foreach $k (@list) {
        print "A '$k' matches\n" if($k =~ /o+/);
        print "B '$k' matches \n" if($k =~ /m.*o/);
        print "C '$k' matches \n" if($k =~ /o{2}/);
    }
    
    exit;
    # DONE
    
    

    When executed, the script above produces the following output:
     
    [mark@platypus PERL] ./lab09_1.pl  
    A 'moo' matches
    B 'moo' matches 
    C 'moo' matches 
    A 'oink' matches
    A 'moomoomoo' matches
    B 'moomoomoo' matches 
    C 'moomoomoo' matches 
    A 'gobble' matches
    A 'meow' matches
    B 'meow' matches 
    [mark@platypus PERL] 
    

  4. Alternation

    Sometimes you will want to match a couple of different substrings with the same statement. This may be done with the (|) character.
    if($string =~ /a|b/) { print "success\n" }
    The above statement will match any string which contains the letter "a" or the letter "b".
    if($string =~ /Fred|Wilma/) { print "success\n" }
    Likewise, the statement above will match any string which contains "Fred" or "Wilma".
    Similarly, a choice among several single characters can be specified by placing the candidate characters between square brackets ([]).
    if($string =~ /^[ab]c/) { print "success\n" }
    In this example the condition will match any line that begins with the character "a" or "b", followed by the character "c" in the second column. The following example shows how a range of values can be specified to match a single character.
    if($string =~ /[m-z]$/) { print "success\n" }
    This statement will match any line whose final character is a lowercase letter from m through z.
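
    The three patterns from this section can be exercised together. This is a sketch under assumptions: the sub name patterns_matched, the array @checks and the sample strings are all hypothetical and chosen only to hit each pattern once.

    ```perl
    use strict;
    use warnings;

    # Each entry is [ label, compiled pattern ], using the patterns above.
    my @checks = (
        [ 'Fred|Wilma', qr/Fred|Wilma/ ],
        [ '^[ab]c',     qr/^[ab]c/     ],
        [ '[m-z]$',     qr/[m-z]$/     ],
    );

    # Hypothetical helper: labels of every pattern that matches the string.
    sub patterns_matched {
        my ($string) = @_;
        return map { $_->[0] } grep { $string =~ $_->[1] } @checks;
    }

    # Sample strings invented for this sketch.
    foreach my $s (qw( Fred Wilma Barney Dino bcd cab )) {
        printf "%-7s %s\n", $s, join(", ", patterns_matched($s)) || "none";
    }
    ```

    "Fred" and "Wilma" match only the alternation, "Barney" and "Dino" match only the range (they end in "y" and "o"), "bcd" matches only the bracketed set, and "cab" matches none.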

    Example lab09_2.pl
     
    #!/usr/bin/perl -w
    
    # Name: Mark Tucker
    # Assignment: Lab09 Example 02
    # Description:  Examples of regular expression alternation
    #==========================================================================
    
    # create an array of cartoon character names
    @list = qw( Bugs Daffy Yosemite Marvin Porky Wile Pepe Tweety
                Sylvester Foghorn Elmer );
    
    # check each element of the array against a regex pattern
    foreach $k (@list) {
        # note the trailing "i" in the regex that makes it case insensitive
        print "A '$k' matches this\n" if($k =~ /^[a-e]/i);
        print "B '$k' matches that\n" if($k =~ /in|ee/);
    }
    
    exit;
    # DONE
    
    

    When executed, the script above produces the following output:
     
    [mark@platypus PERL] ./lab09_2.pl  
    A 'Bugs' matches this
    A 'Daffy' matches this
    B 'Marvin' matches that
    B 'Tweety' matches that
    A 'Elmer' matches this
    [mark@platypus PERL] 
    

  5. Grouping

    Groups of characters to match can be specified by grouping them within parentheses "()".
    if($string =~ /^(alpha|beta)/) { print "success\n" }
    For the statement above, the string must begin with "alpha" or "beta" to match.
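
    The parentheses matter here because alternation applies to everything on either side of the (|) character. Without them, the pattern /^alpha|beta/ means "begins with alpha, or contains beta anywhere". The sketch below compares the two forms; the sub names and sample strings are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: the grouped pattern from the statement above.
    sub grouped_match {
        my ($string) = @_;
        return $string =~ /^(alpha|beta)/ ? 1 : 0;
    }

    # Hypothetical helper: the same pattern without the parentheses,
    # so the ^ anchor applies only to "alpha".
    sub ungrouped_match {
        my ($string) = @_;
        return $string =~ /^alpha|beta/ ? 1 : 0;
    }

    # Sample strings invented for this sketch.
    foreach my $s ("alphabet", "betamax", "the beta test", "gamma") {
        printf "%-14s grouped: %-3s ungrouped: %s\n", $s,
            grouped_match($s)   ? "yes" : "no",
            ungrouped_match($s) ? "yes" : "no";
    }
    ```

    The two forms agree on every string except "the beta test", which only the ungrouped pattern accepts because "beta" appears in the middle of the string.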

    Example lab09_3.pl
     
    #!/usr/bin/perl -w
    
    # Name: Mark Tucker
    # Assignment: Lab09 Example 03
    # Description:  Examples of regular expression grouping
    #==========================================================================
    
    # create an array of cartoon character names (plus one ordinary word)
    @list = qw( Bugs Daffy Yosemite Marvin Porky Wile Pepe Tweety
                Sylvester Foghorn Elmer mississippi);
    
    # check each element of the array against a regex pattern
    foreach $k (@list) {
        print "A '$k' matches\n" if($k =~ /[BW](ug|il)./);
        print "B '$k' matches\n" if($k =~ /(iss)+/);
    
    }
    
    exit;
    # DONE
    
    

    When executed, the script above produces the following output:
     
    [mark@platypus PERL] ./lab09_3.pl  
    A 'Bugs' matches
    A 'Wile' matches
    B 'mississippi' matches
    [mark@platypus PERL] 
    


last updated: 18 Mar 2012 13:02