Subject : Re: LUG: Regex help

From : Kevin Hunter <hunteke@earlham.[redacted]>

Date : Mon, 05 Jul 2010 05:44:11 -0600


At 12:24am -0600 Sun, 27 Jun 2010, Daniel Underwood wrote:
> I have lines of text such as the following:
>
> 23,45,45,
> 24.45.65
>
> I want to match "periods" not the "commas". I know that I can simply
> [^\,] take the complement of the commas for the above text, but assume
> that's not sufficient. How can I match a period and only a period.
> When I use
>
> ^[0-9]{2}\.
>
> it matches both of the above lines. It matches 23, and 24..

You solved your problem with another solution already, but I thought I'd
follow up on this because I have a hunch about what the problem was: shell
interpolation. There are multiple layers of escaping and interpretation
happening. If you escape something on the command line, the shell consumes
that escape and passes the result, now unescaped, through to the
underlying program. Example:

$ echo -e "T\test"; echo -e "T\\test"; echo -e "T\\\test"
T est # shell passed \t unchanged to echo, which converted it to a tab
T est # shell reduced \\ to \ and passed \t to echo, which converted it to a tab
T\test # shell passed \\t to echo, which reduced \\ to a literal '\'

Contrast now with the use of single quotes, which tell the shell *not*
to do anything to the enclosed argument:

$ echo -e 'T\test'; echo -e 'T\\test'; echo -e 'T\\\test'
T est # echo received \t and converted it to a tab
T\test # echo received \\t and reduced \\ to a literal '\'
T\ est # echo received \\\t, reduced \\ to a literal '\', and
       # converted the remaining \t to a tab
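
Since echo -e adds its own layer of interpretation, it can be hard to tell
which layer did what. One way to look at only the shell's layer is to print
the arguments with printf '%s', which does not interpret backslashes in its
arguments. A rough sketch, assuming a POSIX-style shell such as bash;
show_arg is a name made up for this example, not a standard command:

```shell
# show_arg (hypothetical helper): print each argument exactly as the
# shell delivered it, one per line, wrapped in angle brackets.
show_arg() {
    printf '<%s>\n' "$@"
}

show_arg "T\test"      # prints <T\test>    double quotes leave \t alone
show_arg "T\\test"     # prints <T\test>    double quotes reduce \\ to \
show_arg "T\\\test"    # prints <T\\test>
show_arg 'T\\\test'    # prints <T\\\test>  single quotes change nothing
show_arg T\\\test      # prints <T\test>    unquoted: \\ becomes \, \t becomes t
```

Whatever appears between the angle brackets is what a program like echo or
egrep would receive before applying its own rules.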

So, another solution to your original question:

$ cat test
23,45,45,
24.45.65

$ egrep ^[0-9]{2}\. test # egrep receives '^[0-9]{2}.'
23,45,45,
24.45.65

$ egrep "^[0-9]{2}\." test # egrep receives '^[0-9]{2}\.' (double quotes keep '\' before '.')
24.45.65

$ egrep ^[0-9]{2}\\. test # egrep receives '^[0-9]{2}\.'
24.45.65

$ egrep '^[0-9]{2}\.' test # egrep receives '^[0-9]{2}\.'
24.45.65
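
If you would rather not depend on backslashes surviving the shell at all,
one alternative is to put the period in a bracket expression, [.], which
matches a literal period without any escaping. A small sketch, assuming a
grep that supports -E (the same syntax egrep uses); match_period and the
temporary directory are made up for this example:

```shell
# Recreate the two-line sample file from the message in a temp directory.
tmpdir=$(mktemp -d)
printf '%s\n' '23,45,45,' '24.45.65' > "$tmpdir/test"

# [.] is a bracket expression containing only a period, so it matches a
# literal period and there is no backslash for the shell to strip.
# The single quotes still protect the pattern from shell globbing.
match_period() {
    grep -E '^[0-9]{2}[.]' "$1"
}

match_period "$tmpdir/test"    # prints: 24.45.65
```

The quotes are still worth keeping, because an unquoted [0-9] or [.] is a
glob pattern that the shell may expand if a matching filename exists.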

About clear as mud?

Kevin

