3SL: Requirements management and model driven systems engineering from concept to creation.

3SL Web-based newsletter for November 2007 [Cradle 5.6]

Regular Expressions

Cradle supports regular expressions as a means to find variable text in places such as:

  • Queries
  • Toolset text editors’ Find dialogues
  • Category recognition strings used by the Source Document Manager

For instance, regular expressions can be used in queries to find all items in which any frame, a specific frame, or any of a list of frames contains text matching the regular expression. For example, to find all items containing two or more capital letters followed by two or more digits, the regular expression would be:

[A-Z][A-Z]+[0-9][0-9]+

As this example shows, regular expressions are a very powerful method to specify variable text, but their syntax may not be familiar to everyone.
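To see what the expression above matches, here is a minimal sketch using Python's re module. Python is used only for illustration: it is not Cradle's own engine, and the sample strings are invented.

```python
import re

# Two or more capital letters followed by two or more digits.
pattern = re.compile(r"[A-Z][A-Z]+[0-9][0-9]+")

# Invented sample texts, standing in for frame contents.
samples = [
    "See REQ42 for details",
    "Item AB1 only",        # only one digit, so no match
    "ref sys100",           # lowercase letters, so no match
    "SYS100 and HW2007",
]

# Map each sample to the list of substrings that the pattern finds in it.
matches = {text: pattern.findall(text) for text in samples}

for text, found in matches.items():
    print(f"{text!r}: {found}")
```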

Here we describe the main elements of regular expressions, together with the Cradle extensions. For further information, please see the Cradle on-line help.

Simple Strings

The simplest regular expression is a simple string, for example:

fred

will match the string fred, and also frederick because it contains fred, but not the string Fred, since Fred starts with an F and not an f. In other words, an ordinary character in a regular expression matches itself.
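The same behaviour can be sketched with Python's re module (an illustration only, not Cradle itself). The helper name contains is our own:

```python
import re

pattern = re.compile("fred")

def contains(text: str) -> bool:
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

print(contains("fred"))       # True
print(contains("frederick"))  # True: fred is a substring
print(contains("Fred"))       # False: F is not f
```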

Special Characters

Some characters have special meanings inside regular expressions:

  • A ^ matches the beginning of a string
  • A $ matches the end of a string
  • A . (period) matches any character (including null) except newline

You can disable these special meanings by preceding the character with a backslash (\).

Therefore the regular expression:

^fred

will match the string fred at the start of a line or at the start of a Cradle attribute value, and the regular expression:

fred$

will match the string fred at the end of a line or at the end of a Cradle attribute value.
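As a rough illustration, the sketch below uses Python's re module, which is not Cradle's engine. In Python, ^ and $ apply to the whole value unless the re.MULTILINE flag is given, in which case they also apply to each line. The sample values are invented.

```python
import re

value = "fred met alfred\nfred left"

# Without MULTILINE, ^ and $ refer to the start and end of the whole value.
start_of_value = re.findall(r"^fred", value)
end_of_value = re.findall(r"fred$", value)      # the value ends with "left"

# With MULTILINE, ^ and $ also match at the start and end of each line.
start_of_lines = re.findall(r"^fred", value, re.MULTILINE)
end_of_lines = re.findall(r"fred$", value, re.MULTILINE)

# A backslash removes the special meaning of the period.
any_char = re.findall(r"3.5", "3.5 and 345")
literal_dot = re.findall(r"3\.5", "3.5 and 345")

print(start_of_value, end_of_value, start_of_lines, end_of_lines)
print(any_char, literal_dot)
```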

Lists

A list is a collection of characters inside [ and ] brackets and matches any character in the list. For example:

[Ff]red

will match the string Fred and the string fred.

A character that is special outside a list loses that meaning inside a list, for example:

[*$]

will match either the character * or the character $.

A list enclosed by [^ and ] will match any character not in the list.
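The three kinds of list described above can be tried out with Python's re module (used here only as an illustration, with invented sample text):

```python
import re

# Either F or f, followed by red.
either_case = re.findall(r"[Ff]red", "Fred and fred and FRED")

# Inside a list, * and $ are ordinary characters.
specials = re.findall(r"[*$]", "a*b$c.d")

# A list starting with ^ matches any character NOT in the list.
non_digits = re.findall(r"[^0-9]", "a1b2")

print(either_case)  # ['Fred', 'fred']
print(specials)     # ['*', '$']
print(non_digits)   # ['a', 'b']
```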

There are some abbreviations for common lists. Inside a list, [:name:] matches any character in the class name, where name is one of:

  • alnum : letters and digits
  • alpha : letters
  • blank : space or tab
  • cntrl : control characters (ASCII < 32 and 127)
  • digit : digits
  • graph : same as print except omits space
  • lower : lowercase letters
  • print : printable characters (ASCII 32 to 126 inclusive)
  • punct : printable characters that are neither space nor alphanumeric
  • space : space, tab, carriage return, newline, vertical tab and form feed
  • upper : uppercase letters
  • xdigit : hexadecimal digits: 0-9, a-f, A-F

For example:

[[:alpha:]]

would match any single letter.
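Not every regular expression engine understands these class names. Python's re module, for instance, does not. The sketch below shows one way to expand some of the names into explicit ASCII ranges before handing the expression to such an engine. The table POSIX_CLASSES and the function expand_posix_classes are our own illustrative names, and the table covers only some of the classes listed above.

```python
import re

# ASCII-only equivalents of some class names, written as list contents.
POSIX_CLASSES = {
    "alnum": "0-9A-Za-z",
    "alpha": "A-Za-z",
    "blank": " \\t",
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}

def expand_posix_classes(pattern: str) -> str:
    """Replace each [:name:] with the explicit range from POSIX_CLASSES.

    An unknown class name raises KeyError.
    """
    return re.sub(r"\[:([a-z]+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)

# One uppercase letter followed by one or more digits.
expanded = expand_posix_classes("[[:upper:]][[:digit:]]+")
print(expanded)                            # [A-Z][0-9]+
print(re.findall(expanded, "A12 b3 C4"))   # ['A12', 'C4']
```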

Ranges

We can specify a range of characters inside the [], so:

[a-z]

will match any single lowercase letter and:

[A-Z]

will match any single uppercase (capital) letter. These ranges can be combined, so to match any letter, the expression would be:

[A-Za-z]

or it could also be:

[a-zA-Z]

To include a ] character in a list, put it first. To include the - character in a list, place it where it cannot form a range, such as first or last, for example:

[-a-z]
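A short sketch of these rules, again using Python's re module purely as an illustration (it follows the same conventions for ] and - inside a list):

```python
import re

text = "a1-B2]"

# Combined ranges: any letter.
letters = re.findall(r"[A-Za-z]", text)

# A leading - cannot form a range, so it stands for itself.
hyphen_or_lower = re.findall(r"[-a-z]", text)

# A ] placed first in the list stands for itself.
bracket_or_a = re.findall(r"[]a]", text)

print(letters)          # ['a', 'B']
print(hyphen_or_lower)  # ['a', '-']
print(bracket_or_a)     # ['a', ']']
```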

Repetitions

You can specify a match for repetitions of a regular expression:

  • * matches zero or more of the previous regular expression
  • + matches one or more of the previous regular expression
  • ? matches zero or one of the previous regular expression

For example:

a*

will match zero or more a characters, and:

[A-Z]+

will match one or more uppercase characters.

Note that this meaning of * differs from what many people expect: we might expect a* to match a followed by zero or more of any character. That is what a* means when finding files on a computer, for example.

The equivalent regular expression would be:

a.*

which means to match an a followed by zero or more of any character.
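The difference between the two meanings of * can be checked in Python, where the re module handles regular expressions and the fnmatch module handles file-style wildcards. This is an illustration only, not Cradle itself. Here fullmatch is used so that the whole string must match the expression.

```python
import re
from fnmatch import fnmatchcase

# As a regular expression, a* means zero or more a characters.
only_as = re.fullmatch(r"a*", "aaa") is not None        # True
empty_ok = re.fullmatch(r"a*", "") is not None          # True: zero repetitions
regex_abc = re.fullmatch(r"a*", "abc") is not None      # False: b and c are not a

# As a file-style wildcard, a* means a followed by anything.
wildcard_abc = fnmatchcase("abc", "a*")                 # True

# The regular expression with the same meaning is a.*
dot_star_abc = re.fullmatch(r"a.*", "abc") is not None  # True

# + needs at least one repetition; ? allows zero or one.
upper_run = re.fullmatch(r"[A-Z]+", "ABC") is not None  # True
optional_u = [w for w in ("color", "colour") if re.fullmatch(r"colou?r", w)]

print(only_as, empty_ok, regex_abc, wildcard_abc, dot_star_abc, upper_run)
print(optional_u)
```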

Context Delimiters

There are some special sequences to help us find strings in particular contexts:

  • \b : matches the empty string at the beginning or end of a word
  • \B : matches the empty string within a word
  • \< : matches the empty string at the beginning of a word
  • \> : matches the empty string at the end of a word
  • \w : matches any word-constituent character
  • \W : matches any character that is not word-constituent

For example, the expression:

\bfred\b

will match the string fred only if it is a whole word, and so will not match the substring fred at the start of the string frederick.
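The sketch below tries some of these sequences in Python's re module, which is not Cradle's engine. Python supports \b, \B, \w and \W but not \< and \>, so the last two lines emulate them with look-around assertions. That emulation is our own assumption about equivalent behaviour.

```python
import re

whole_word = re.compile(r"\bfred\b")

as_word = whole_word.search("ask fred today") is not None   # True
in_frederick = whole_word.search("frederick") is not None   # False
in_alfred = whole_word.search("alfred") is not None         # False

# \B matches only inside a word, so just the fred within alfred is found.
inside_word = re.findall(r"\Bfred", "alfred fred")

# \w+ picks out runs of word-constituent characters.
words = re.findall(r"\w+", "fred, 42!")

# Emulating \< (start of word) and \> (end of word) in Python.
starts_word = re.findall(r"\b(?=\w)fred", "fred alfred")
ends_word = re.findall(r"fred\b(?<=\w)", "frederick alfred")

print(as_word, in_frederick, in_alfred)
print(inside_word, words, starts_word, ends_word)
```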

Cradle Extensions

Cradle provides full support for regular expressions. Cradle also provides some additional syntax:

  • An expression can be preceded by ! to mean not the expression, which means to match everything other than that matched by the regular expression
  • A comma separated list of expressions (any of which can use the ! operation) means to match any of the expressions in the list
  • The value <null> means to match nothing, that is, an empty value. It is another way to specify:

^$

Similarly the expression:

!<null>

means to match anything that is not empty.
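To make the combined effect of these extensions concrete, here is a small sketch of how they could be interpreted. It is our own reading of the description above written in Python, not Cradle's implementation: the function name cradle_style_match is hypothetical, and the naive split on commas means that an expression containing a comma would not be handled.

```python
import re

def cradle_style_match(expression: str, value: str) -> bool:
    """Hypothetical interpretation of the extensions described above.

    - a comma separated list matches if ANY term matches
    - a term starting with ! matches if the rest of the term does NOT match
    - <null> is treated as another way to write ^$
    """
    for term in expression.split(","):   # naive: no commas allowed inside a term
        negated = term.startswith("!")
        if negated:
            term = term[1:]
        if term == "<null>":
            term = "^$"
        found = re.search(term, value) is not None
        if found != negated:             # plain term found, or negated term not found
            return True
    return False

print(cradle_style_match("<null>", ""))               # True: empty value
print(cradle_style_match("!<null>", "some text"))     # True: value is not empty
print(cradle_style_match("fred,^bob", "bob smith"))   # True: second term matches
print(cradle_style_match("!fred", "frederick"))       # False: fred is present
```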

[Copyright © 3SL 2008 | Last Updated: Thu Aug 28th, 2008 ]
Registered office: 2 Highfield Road, Barrow in Furness, Cumbria, LA14 5PA, Registered in England No. 2153654