Subject : Re: LUG: Using sed to mass rename files

From : Edward Anderson <nilbus@nilbus.[redacted]>

Date : Wed, 03 Mar 2010 08:23:24 -0500


First, I should say that the easiest way to do this is to use the
rename command (the Perl version, which takes an s/// expression).

rename 's/0000/000/' F0000*

That's about a million times more readable.

As for understanding the sed command, the sed manpage is helpful. If
you run man sed and search for & (using the / command to search),
you'll find it's a special character in s/foo/bar/ replacements.

s/regexp/replacement/
    Attempt to match regexp against the pattern space. If successful,
    replace that portion matched with replacement. The replacement may
    contain the special character & to refer to that portion of the
    pattern space which matched, and the special escapes \1 through \9
    to refer to the corresponding matching sub-expressions in the regexp.
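
To see & and the \1, \2 escapes in isolation, here is a small sketch on
throwaway strings. The inputs ('hello', 'hello world') and the variable
names are made up for illustration and have nothing to do with your
filenames.

```shell
# & stands for the whole portion that matched the regexp.
amp=$(printf '%s\n' 'hello' | sed 's/ell/[&]/')

# \1 and \2 stand for the first and second \( ... \) groups.
grp=$(printf '%s\n' 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')

printf '%s\n' "$amp"   # the matched "ell" wrapped in brackets
printf '%s\n' "$grp"   # the two words swapped
```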

In your command, \(.\) matches the first character, which can be
referenced by \1.
Then . matches the next character, which in your filenames is always 0.
It sits outside any \( \) group, so nothing refers to it later.
Then \(.*\) matches the rest of the filename, which can be referenced by \2.

The replacement string puts it all together: & is the whole match (the
original filename), and \1\2 is every part of the filename except the
2nd character, which was a 0.
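
To check that reading, you can feed one of your filenames through the
same sed expression by hand, without the trailing "| sh", so nothing
is actually renamed. This is only a sketch; the variable names are
mine.

```shell
# One of the sample filenames from the original question.
name='F00001-0708-RG-biasliuyda'

# \1 = "F", the bare . eats the first "0", \2 = "0001-0708-RG-biasliuyda".
cmd=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/mv & \1\2/')

# Show the generated mv command instead of executing it.
printf '%s\n' "$cmd"
```

The generated line is "mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda",
which is what then gets handed to sh in the "To perform" version.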

This is a pretty cryptic way to do this, IMHO. If for some reason the
rename command were not available and you wanted to use sed to do the
rename (or perhaps you were doing something too complex for rename?),
being more explicit in your regex would make it much more readable.
Perhaps something like:

ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh

Being able to see what's actually changing in the
s/search/replacement/ makes it much more readable. Also, it won't keep
stripping characters from your filenames if you accidentally run it
twice.
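
Here is a rough sketch of that second-run difference, using a filename
that has already been renamed once. The -n option and the p flag are my
addition, not part of either command above: they make sed print only
the lines where a substitution happened, so an unmatched filename is
not passed along as if it were a command.

```shell
# A name that has already been renamed once (sample input for illustration).
renamed='F0001-0708-RG-biasliuyda'

# Positional pattern: still matches, so it would drop another character.
positional=$(printf '%s\n' "$renamed" | sed -n 's/\(.\).\(.*\)/mv & \1\2/p')

# Explicit pattern: "F0000" is no longer present, so nothing is printed.
explicit=$(printf '%s\n' "$renamed" | sed -n 's/F0000\(.*\)/mv & F000\1/p')

printf 'positional: %s\n' "$positional"
printf 'explicit:   %s\n' "${explicit:-(no command generated)}"
```

The positional version generates a mv to F001-0708-RG-biasliuyda, while
the explicit version generates nothing at all.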

Edward

On Wed, Mar 3, 2010 at 3:00 AM, Daniel Underwood
<daniel.underwood@ncsu.[redacted]> wrote:
> +-----------+
> | Objective |
> +-----------+
>
> Change these filenames:
>
> F00001-0708-RG-biasliuyda
> F00001-0708-CS-akgdlaul
> F00001-0708-VF-hioulgigl
>
> to these filenames:
>
> F0001-0708-RG-biasliuyda
> F0001-0708-CS-akgdlaul
> F0001-0708-VF-hioulgigl
>
> +------------+
> | Shell Code |
> +------------+
>
> To test:
>
> ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'
>
> To perform:
>
> ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh
>
> +---------------+
> | My Question |
> +---------------+
>
> I don't understand the sed code.  I understand what the substitution
> command
>
> $ sed 's/something/mv'
>
> means.  And I understand regular expressions somewhat.  But I don't
> understand what's happening here:
>
> \(.\).\(.*\)
>
> or here:
>
> & \1\2/
>
> The former, to me, just looks like it means: "a single character,
> followed by a single character, followed by any length sequence of a
> single character"--but surely there's more to it than that.  As far as
> the latter part:
>
> & \1\2/
>
> I have no idea.  I really want to understand this code.  Please help me
> out here, guys.
>
> TIA,
> Daniel
> --
> Daniel Underwood
> North Carolina State University
> Graduate Student - Operations Research
> email: daniel.underwood@ncsu.[redacted]
> phone: XXX.302.3291
> web: http://www4.ncsu.edu/~djunderw/
>
>


