1

I read the article "Using grep and sed to find and replace a string", but how can I extend it to chain multiple greps? For example, I have the following directory/file structure:

dir1/metadata.txt
dir2/metadata.txt

dir1/metadata.txt has

filename1 '= 1.0.0'
filename2 '= 1.0.0'

dir2/metadata.txt has

filename1     '= 1.0.0'
long_filename '= 1.0.0'

In other words, both dir1/metadata.txt and dir2/metadata.txt contain the line filename1 '= 1.0.0', but the amount of whitespace between filename1 and '= 1.0.0' differs between the two files.

Now I want to replace filename1's associated version to '2.0.0' in ALL metadata.txt files so the resulting files look like...

dir1/metadata.txt has

filename1 '= 2.0.0'
filename2 '= 1.0.0'

dir2/metadata.txt has

filename1     '= 2.0.0'
long_filename '= 1.0.0'

I'm trying

find . -name metadata.txt | xargs grep filename1 | sed -i "s/1\.0\.0/2.0.0/g" <some option here>

but I don't know what the "some option here" part should be. Any clues?

6
  • do you need to change all *filename* ( e.g. filename2 and long_filename ) too or only filename1 ? Commented May 2, 2015 at 3:56
  • tivn: Only filename1. filename2 and long_filename remain unchanged Commented May 2, 2015 at 3:59
  • shelter: your command will only change ONE file Commented May 2, 2015 at 4:00
  • sed simply CANNOT operate on strings but see stackoverflow.com/questions/29613304/… for a workaround. Commented May 2, 2015 at 4:52
  • @EdMorton: Good tip in general, but the issue here is not how to generically replace strings with sed, but how to combine find with sed (with specific search and replacement strings, for which the OP has already provided the required escaping). Commented May 2, 2015 at 5:13

2 Answers

4

Try the following:

Linux:

find . -name metadata.txt \
  -exec sed -i "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" {} +

OSX / BSD:

find . -name metadata.txt \
  -exec sed -i '' "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" {} +

Note: Platform-specific commands are required only because GNU sed and BSD sed interpret the nonstandard -i option differently. The option takes a suffix to use for an optional backup of the original file: GNU sed treats that option-argument as optional, whereas BSD sed treats it as mandatory, so an explicit empty string ('') is needed to indicate that no backup file should be created.

  • -exec ... + is a find feature that invokes the specified command with as many matching paths as fit on a single command line. This can result in multiple invocations, but typically results in only one, which makes the invocation efficient.

  • "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" is a POSIX-compliant sed script that matches literal filename1 at the beginning of a line (^), followed by a variable amount of whitespace ([[:space:]]\{1,\}), followed by literal '= 1.0.0, and replaces the 1.0.0 with 2.0.0. Everything before the version number is captured in \(...\) and reinserted via \1, so the original spacing is preserved.

  • Note that if there are metadata.txt files that do not have lines beginning with filename1, they are still rewritten, because sed's -i option blindly "updates" the input files given (read: creates a new file that then replaces the original). If that is undesired, consider John1024's answer.
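
To make the points above concrete, here is a minimal sketch that recreates the two sample files from the question in a scratch directory and runs the Linux (GNU sed) form of the command. The scratch directory created with mktemp -d and the variable name demo are assumptions made for this illustration only; on OSX / BSD the sed call would need -i '' instead of -i.

```shell
#!/bin/sh
# Hypothetical scratch directory for the demo; removed when the script exits.
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1

# Recreate the two sample files from the question.
mkdir dir1 dir2
printf "%s\n" "filename1 '= 1.0.0'" "filename2 '= 1.0.0'" > dir1/metadata.txt
printf "%s\n" "filename1     '= 1.0.0'" "long_filename '= 1.0.0'" > dir2/metadata.txt

# GNU sed form of the command (BSD sed would need: sed -i '' ...).
find . -name metadata.txt \
  -exec sed -i "s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/" {} +

# Show the result: only the filename1 lines change, and spacing is kept.
cat dir1/metadata.txt dir2/metadata.txt
```

Because the leading part of the line is captured and reinserted with \1, the differing amounts of whitespace in the two files survive the edit.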

POSIX-compliance notes:

  • The -exec ... + variant of find's -exec primary has been part of POSIX since 2001 (POSIX.1-2001 / IEEE Std 1003.1-2001 / SUS v3 - see http://pubs.opengroup.org/onlinepubs/009695399/; thanks, @JonathanLeffler)
  • By contrast, sed's -i option for in-place updating is not POSIX-compliant - so you may have to work around that.
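
One possible way to work around the missing -i, sketched here under a few assumptions, is to have find hand the files to a small sh loop that writes sed's output to a temporary file and then copies it back over the original. The temporary-file naming scheme ($f.tmp.$$) and the scratch directory are hypothetical choices for this illustration, and mktemp -d is itself not specified by POSIX; it is used only to set up the demo.

```shell
#!/bin/sh
# Hypothetical scratch directory for the demo; removed when the script exits.
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1

# Recreate the two sample files from the question.
mkdir dir1 dir2
printf "%s\n" "filename1 '= 1.0.0'" "filename2 '= 1.0.0'" > dir1/metadata.txt
printf "%s\n" "filename1     '= 1.0.0'" "long_filename '= 1.0.0'" > dir2/metadata.txt

# Same sed script as in the answer, passed to the inner shell as $1.
script="s/^\(filename1[[:space:]]\{1,\}'= \)1\.0\.0/\12.0.0/"

find . -name metadata.txt -exec sh -c '
  script=$1; shift
  for f do
    tmp="$f.tmp.$$"   # hypothetical temp-file name next to the original
    # Write the edited text to the temp file, then copy it back.
    # Copying with cat (rather than mv) keeps the original inode and permissions.
    sed "$script" "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
  done
' sh "$script" {} +

cat dir1/metadata.txt dir2/metadata.txt
```

Like sed -i, this sketch rewrites every metadata.txt it is given, whether or not the file contains a matching line.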

9 Comments

Use + in place of \;?
According to Apple's man find, its current version of find supports {} +. Of course, there will be other or older BSD systems that don't support it.
Thanks, @John1024: find's -exec .... + is POSIX-compliant (see pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html), but sed's -i is not.
@mklement0 Yes, very good. Plus 1 for both GNU and BSD versions.
Yup: the 'title page' for the POSIX 2004 documentation says: Abstract: The 2004 edition incorporates Technical Corrigendum Number 1 and Technical Corrigendum 2 addressing problems discovered since the approval of the 2001 edition. These are mainly due to resolving integration issues raised by the merger of the Base documents. So there is room to argue that manufacturers who have not gotten around to adding + to find have had all of thirteen years (2002-2014) and bits of 2001 and 2015 in which to fix the issue.
3
find . -name metadata.txt -exec grep -l --null filename1 {} + | xargs -0 sed -i "/^filename1 /{s/'= 1\.0\.0'/'= 2.0.0'/;}"

sed -i will update the timestamp of every file it processes regardless of whether it changes the contents of the file. This is because, in operation, sed -i creates a new file for each file processed and then overwrites the old file with the new file. To limit this, the above code uses grep to select only the files that might need modification and sends only those file names, via a pipeline, to sed -i for the update.

If the timestamp/overwriting issue is not important, consider mklement0's answer which eliminates the need for a pipeline, simplifying the command.

How it works

  • find . -name metadata.txt -exec grep -l --null filename1 {} +

    This produces the list of files named metadata.txt that also contain filename1.

    The --null tells grep to separate file names with the NUL character.

  • xargs -0 sed -i "/^filename1 /{s/'= 1\.0\.0'/'= 2.0.0'/;}"

    This applies sed -i to change in-place the files whose names were returned by the above find command.

    In more detail:

    • /^filename1 /

      This selects lines that start with filename1 followed by a space. This ensures that we match neither sfilename1 nor filename12.

    • s/'= 1\.0\.0'/'= 2.0.0'/

      This changes the version number for the selected lines. (This assumes only one space after the equal sign. If this assumption is not correct, we can easily change it.)

    The -0 option to xargs tells it to expect its input to be a NUL-separated list of file names. This makes the pipeline safe even if the file names include spaces, newlines, or other difficult characters.
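
The effect of the grep pre-filter can be checked with a small sketch like the following, which assumes GNU sed and a scratch directory created with mktemp -d. The directory dir3 and its contents are hypothetical additions for this illustration: the file there never mentions filename1. Since sed -i replaces each file it processes with a newly created one, an unchanged inode number for dir3/metadata.txt suggests that sed never touched it.

```shell
#!/bin/sh
# Hypothetical scratch directory for the demo; removed when the script exits.
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1

mkdir dir1 dir3
printf "%s\n" "filename1 '= 1.0.0'" "filename2 '= 1.0.0'" > dir1/metadata.txt
# dir3 is a hypothetical extra directory whose file never mentions filename1.
printf "%s\n" "other_file '= 1.0.0'" > dir3/metadata.txt

# Record the inode of the file that should be left alone.
before=$(ls -i dir3/metadata.txt)

# The pipeline from the answer (GNU sed form of -i).
find . -name metadata.txt -exec grep -l --null filename1 {} + |
  xargs -0 sed -i "/^filename1 /{s/'= 1\.0\.0'/'= 2.0.0'/;}"

after=$(ls -i dir3/metadata.txt)

cat dir1/metadata.txt
if [ "$before" = "$after" ]; then
  echo "dir3/metadata.txt was not rewritten"
fi
```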

7 Comments

Be aware that while the -exec handles spaces in file names, the xargs won't. You can fix that with the GNU toolchain by using grep -lZ and xargs -0 so that the file names are terminated with a null byte (instead of a newline). Alternatively, you can execute the sed in the -exec option. The downside of using -exec is that it might edit a file which does not contain filename1. That's relatively unlikely to matter, even if there are thousands of files to process, unless there are reasons not to risk modifying the 'last changed time' of the files unless something actually changes.
The 'might modify files that don't need modifying' observation applies to the other answer. I like the simpler 'match the marker; substitute the relevant text on the marked line' operation in the sed script. I don't understand why people insist on using a single line for shell scripts — 'one-liner' is a pejorative term in APL.
@JonathanLeffler Thanks. I updated the answer to include -Z/-0. I kept the grep-to-sed pipeline because the last changed time issue does surprise/confuse users who aren't expecting it. Separately, I found well-written APL scripts to be quite readable. I wouldn't mind if shell tools were rewritten by someone with Ken Iverson's eye for logical consistency.
Note that BSD grep has a -Z option, but it is wholly different from the GNU grep -Z: it makes it work like zgrep (so it searches compressed files too).
@mklement0 I updated the answer to --null, mentioned the issue with timestamps, and linked back to your answer.
