13.3 An Extended Example
Here’s an extended example from Friedl that covers many of the features
described above. The problem is to fashion a regexp that will match all
and only IP addresses or dotted quads, i.e., four numbers separated
by three dots, with each number between 0 and 255. We will use the
commenting mechanism to build the final regexp with clarity. First, a
subregexp n0-255 that matches 0 through 255.
(define n0-255
  "(?x:
  \\d          ;  0 through   9
  | \\d\\d     ; 00 through  99
  | [01]\\d\\d ;000 through 199
  | 2[0-4]\\d  ;200 through 249
  | 25[0-5]    ;250 through 255
  )")
The first two alternates simply get all single- and double-digit numbers. Since 0-padding is allowed, we need to match both 1 and 01. We need to be careful when getting 3-digit numbers, since numbers above 255 must be excluded. So we fashion alternates to get 000 through 199, then 200 through 249, and finally 250 through 255.(9)
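As a quick check of the alternation described above, here is a minimal sketch that anchors the subregexp and tries it on a few numbers. It assumes the pregexp-match procedure from this chapter is available; n0-255-plain and n0-255? are hypothetical names introduced only for this illustration, and the pattern is a comment-free, single-line rewrite of n0-255 using a plain (?: cluster.

```scheme
;; Comment-free rewrite of n0-255 (same five alternates, in the same order).
;; The name n0-255-plain is an assumption made for this sketch.
(define n0-255-plain "(?:\\d|\\d\\d|[01]\\d\\d|2[0-4]\\d|25[0-5])")

;; Hypothetical helper: #t if the whole string s denotes 0 through 255.
;; The ^ and $ anchors force the alternation to consume the entire string.
(define (n0-255? s)
  (if (pregexp-match (string-append "^" n0-255-plain "$") s) #t #f))

(n0-255? "7")     ; ⇒ #t  (single digit)
(n0-255? "007")   ; ⇒ #t  (0-padding is allowed)
(n0-255? "255")   ; ⇒ #t  (upper bound)
(n0-255? "256")   ; ⇒ #f  (no alternate covers 256 through 999)
(n0-255? "1000")  ; ⇒ #f  (four digits)
```

Without the anchors, the first alternate would happily match the leading digit of any number, which is why the subregexp only becomes meaningful once it is embedded in a larger, anchored pattern.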
An IP address is a string that consists of four n0-255s with three dots
separating them.
(define ip-re1
  (string-append
    "^"        ;nothing before
    n0-255     ;the first n0-255,
    "(?x:"     ;then the subpattern of
    "\\."      ;a dot followed by
    n0-255     ;an n0-255,
    ")"        ;which is
    "{3}"      ;repeated exactly 3 times
    "$"        ;with nothing following
    ))
Let’s try it out.
(pregexp-match ip-re1 "1.2.3.4") ⇒ ("1.2.3.4")
(pregexp-match ip-re1 "55.155.255.265") ⇒ #f
which is fine, except that we also have
(pregexp-match ip-re1 "0.00.000.00") ⇒ ("0.00.000.00")
All-zero sequences are not valid IP addresses! Lookahead to the rescue.
Before starting to match ip-re1, we look ahead to ensure we don’t
have all zeros. We could use positive lookahead to ensure there
is a digit other than zero.
(define ip-re
  (string-append
    "(?=.*[1-9])" ;ensure there's a non-0 digit
    ip-re1))
Or we could use negative lookahead to ensure that what’s ahead isn’t composed of only zeros and dots.
(define ip-re
  (string-append
    "(?![0.]*$)" ;not just zeros and dots
                 ;(note: dot is not metachar inside [])
    ip-re1))
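The two lookahead variants are meant to be interchangeable. The following sketch builds both under separate names and checks that they give the same result on a handful of strings. It assumes pregexp-match is available as described in this chapter; the names ip-re-pos, ip-re-neg and agree? are hypothetical, and n0-255 and ip-re1 are restated in a comment-free form so that the fragment stands alone.

```scheme
;; Comment-free restatements of n0-255 and ip-re1 (assumed equivalent
;; to the commented definitions in the text).
(define n0-255 "(?:\\d|\\d\\d|[01]\\d\\d|2[0-4]\\d|25[0-5])")
(define ip-re1 (string-append "^" n0-255 "(?:\\." n0-255 "){3}$"))

;; Hypothetical names for the two variants of ip-re.
(define ip-re-pos (string-append "(?=.*[1-9])" ip-re1)) ;positive lookahead
(define ip-re-neg (string-append "(?![0.]*$)" ip-re1))  ;negative lookahead

;; #t if both variants return the same match (or both return #f) for s.
(define (agree? s)
  (equal? (pregexp-match ip-re-pos s)
          (pregexp-match ip-re-neg s)))

;; Print each sample string followed by whether the variants agree.
(for-each
  (lambda (s)
    (display s)
    (display " ")
    (display (agree? s))
    (newline))
  '("1.2.3.4" "0.0.0.0" "0.00.000.00" "0.0.0.1" "255.255.255.255"))
```

Both lookaheads are placed before the ^ anchor, so they are tested at the start of the string and consume no characters; the match proper still begins at the first digit.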
The regexp ip-re will match all and only valid IP addresses.
(pregexp-match ip-re "1.2.3.4") ⇒ ("1.2.3.4")
(pregexp-match ip-re "0.0.0.0") ⇒ #f
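In practice the regexp is often wrapped in a predicate. The sketch below is one way to do that, assuming pregexp-match is available as in this chapter; ip-address? is a hypothetical name, the definitions are restated without comments so the fragment is self-contained, and the negative-lookahead variant of ip-re is chosen arbitrarily.

```scheme
;; Comment-free restatements of the definitions in the text.
(define n0-255 "(?:\\d|\\d\\d|[01]\\d\\d|2[0-4]\\d|25[0-5])")
(define ip-re1 (string-append "^" n0-255 "(?:\\." n0-255 "){3}$"))
(define ip-re (string-append "(?![0.]*$)" ip-re1))

;; Hypothetical predicate: #t if s is a dotted quad accepted by ip-re.
(define (ip-address? s)
  (if (pregexp-match ip-re s) #t #f))

(ip-address? "1.2.3.4")          ; ⇒ #t
(ip-address? "192.168.001.010")  ; ⇒ #t  (0-padded components are accepted)
(ip-address? "0.0.0.0")          ; ⇒ #f  (rejected by the lookahead)
(ip-address? "1.2.3")            ; ⇒ #f  (only three components)
(ip-address? "1.2.3.4.5")        ; ⇒ #f  (five components)
(ip-address? "256.1.1.1")        ; ⇒ #f  (first component out of range)
```

Because of the ^ and $ anchors, the predicate tests the whole string; it does not find an address embedded in longer text.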
This document was generated on March 31, 2014 using texi2html 5.0.