Wednesday, June 23, 2010

Learning regular expression step by step tutorials: Step 2

If you haven't read the previous post, here is the link:
Step 1 Learning regular expressions

Now we will go a little further with regular expressions. In the previous tutorial, the matches used very simple expressions. In practice, we will need more complex expressions to get an exact match. Special characters, also called metacharacters, give us this capability.

There are 11 characters with special meanings (.NET also treats the opening curly brace { as special when it starts a quantifier):
1. The opening square bracket [
2. The backslash \
3. The caret ^
4. The dollar sign $
5. The period or dot .
6. The vertical bar or pipe symbol |
7. The question mark ?
8. The asterisk or star *
9. The plus sign +
10. The opening round bracket (
11. The closing round bracket ).

Since the above characters have a special meaning, the regex "2+2" does not match the string "2+2 equals 4". The plus sign is a quantifier: '+' matches the preceding character one or more times. So "2+2" means "one or more 2s, followed by a 2". It matches 22, 222, 2222 and so on, but it does not match a single 2 or the literal text 2+2.
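To see the quantifier behaviour described above, here is a small sketch that tries the unescaped pattern against a few inputs. The class name PlusQuantifierDemo and the helper MatchesUnescaped are made up for this illustration; only Regex.IsMatch comes from the .NET library.

```csharp
using System;
using System.Text.RegularExpressions;

static class PlusQuantifierDemo
{
    // Hypothetical helper: true when the unescaped pattern 2+2
    // (one or more 2s followed by a 2) is found anywhere in the input.
    public static bool MatchesUnescaped(string input)
    {
        return Regex.IsMatch(input, @"2+2");
    }

    static void Main()
    {
        string[] inputs = { "22", "222", "2", "2+2 equals 4" };
        foreach (string input in inputs)
        {
            // Prints True for "22" and "222", False for "2" and "2+2 equals 4".
            Console.WriteLine("\"" + input + "\" -> " + MatchesUnescaped(input));
        }
    }
}
```

The last input contains no run of two or more 2s, which is why the plain "2+2" pattern fails on it.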

If you want a special character to be treated as a literal, so that the regex "2+2" from the previous example matches "2+2 equals 4", you need to escape it with a backslash (\). The regex 2\+2 treats the "+" sign as a literal plus, so it now matches the given string.
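A minimal sketch of escaping in C# follows. The class EscapeDemo and the helper MatchesLiteral are hypothetical names for this post. Regex.Escape is a standard .NET method that adds the backslashes for you, which can help when the text to search for comes from user input.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Hypothetical helper: searches for literalText exactly as written,
    // letting Regex.Escape put a backslash before every special character.
    public static bool MatchesLiteral(string input, string literalText)
    {
        string pattern = Regex.Escape(literalText);
        return Regex.IsMatch(input, pattern);
    }

    static void Main()
    {
        string s = "2+2 equals 4";

        // Escaped by hand: \+ is a literal plus sign.
        Console.WriteLine(Regex.IsMatch(s, @"2\+2"));   // True

        // Escaped by the library.
        Console.WriteLine(Regex.Escape("2+2"));         // 2\+2
        Console.WriteLine(MatchesLiteral(s, "2+2"));    // True
    }
}
```

The @ in front of the string makes it a verbatim string, so the backslash reaches the regex engine unchanged.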

Point to remember: be careful when you use a backslash with characters other than the special characters, because a backslash combined with an ordinary character can create a regex token that has a special meaning of its own.
For example:
when "d" is used in combination with "\", it creates the regex token "\d", which matches any single digit from 0 to 9.
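The difference between a plain "d" and the "\d" token can be sketched like this. DigitTokenDemo and FirstDigit are names invented for this example. Note that by default .NET's \d also accepts decimal digits from other Unicode scripts, not only 0 to 9.

```csharp
using System;
using System.Text.RegularExpressions;

static class DigitTokenDemo
{
    // Hypothetical helper: returns the first digit found in the input,
    // or an empty string when the input contains no digit.
    public static string FirstDigit(string input)
    {
        Match m = Regex.Match(input, @"\d");
        return m.Success ? m.Value : "";
    }

    static void Main()
    {
        Console.WriteLine(FirstDigit("2+2 equals 4"));   // 2

        // \d is a token for "a digit", so it does not match the letter d.
        Console.WriteLine(Regex.IsMatch("d", @"\d"));    // False

        // Without the backslash, d is just the letter d.
        Console.WriteLine(Regex.IsMatch("d", "d"));      // True
    }
}
```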

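The following example is a simple demonstration of what we've learnt. Because the "+" in the pattern is not escaped, the program reports that the string does not match: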
using System;
using System.Text.RegularExpressions;

namespace RegexStep2
{
    class Program
    {
        static void Main(string[] args)
        {
            string s = "2+2 equals 4";

            // The unescaped '+' is a quantifier, so this pattern looks for "22", "222", ...
            // and does not match s. Change it to @"2\+2" to match the literal text "2+2".
            string regex = @"2+2";

            if (Regex.IsMatch(s, regex))
            {
                Console.WriteLine("Given string " + s + " matches with regular expression " + regex);
            }
            else
            {
                Console.WriteLine("Given string " + s + " does not match with regular expression " + regex);
            }

            // Keep the console window open until a key is pressed.
            Console.ReadKey();
        }
    }
}

Non-Printable Characters:
We can also use escape sequences to put non-printable characters in our regular expressions. Some examples are as follows:
1. "\t" to match a TAB character.
2. "\n" to match a Line Feed.
3. "\r" to match a Carriage Return.
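Here is a short sketch of these escapes in use. NonPrintableDemo and its helper methods are hypothetical names chosen for this example, and the sample text is made up.

```csharp
using System;
using System.Text.RegularExpressions;

static class NonPrintableDemo
{
    // Hypothetical helper: true when the input contains a TAB character.
    public static bool ContainsTab(string input)
    {
        return Regex.IsMatch(input, @"\t");
    }

    // Hypothetical helper: true when the input contains a Carriage Return.
    public static bool ContainsCarriageReturn(string input)
    {
        return Regex.IsMatch(input, @"\r");
    }

    // Hypothetical helper: counts the Line Feed characters in the input.
    public static int CountLineFeeds(string input)
    {
        return Regex.Matches(input, @"\n").Count;
    }

    static void Main()
    {
        // A made-up sample: a tab, a Windows line ending (\r\n) and a bare \n.
        string text = "name\tvalue\r\nnext line\n";

        Console.WriteLine(ContainsTab(text));             // True
        Console.WriteLine(ContainsCarriageReturn(text));  // True
        Console.WriteLine(CountLineFeeds(text));          // 2
    }
}
```

The sample text is a normal C# string, so the compiler turns \t, \r and \n into real characters there. The patterns are verbatim strings, so the regex engine is the one that interprets \t, \r and \n.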
 




Regards
Sameer Shaik
