5

I have two files: fileA, with a list of names:

AAAAA 
BBBBB
CCCCC
DDDDD

and fileB, with another list:

111 
222
333
444

and a third file, fileC, with some text:

Hello AAAAA toto BBBBB dear "AAAAA" trird BBBBBB tuizf AAAAA dfdsf CCCCC

I need to find every pattern from fileA in fileC and replace it with the corresponding pattern from fileB. It works! But I realised that fileC contains words like "AAAAA" (in quotes), and those aren't replaced by "111".

I'm doing this, but it doesn't seem to work:

#! /bin/bash
while IFS= read -r lineA && IFS= read -r lineB <&3; do
sed -i -e "s/$lineA/$lineB/g" fileC
done <fileA 3<fileB
  • So you mean you need to replace AAAAA with 111 ? Commented Oct 15, 2013 at 8:21
  • "doesn't seems to work." - What is the output? Commented Oct 15, 2013 at 8:36
  • I tested your solution and it works for me: Hello 111 toto 222 dear 111 trird 222B tuizf 111 dfdsf 333 Commented Oct 15, 2013 at 8:41
  • Maybe you just didn't look in your fileC ( -i ). Commented Oct 15, 2013 at 8:50
  • @PeterDev "AAAAA" in fileC isn't replaced because fileA contains "AAAAA " and not "AAAAA" (notice the trailing space). Commented Oct 15, 2013 at 9:09
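
Following up on the trailing-space comment above, here is a minimal sketch of the original loop with trailing whitespace trimmed from each line before it is handed to sed. It assumes GNU sed (for `-i`), and the temporary directory and sample data are only there to make the example self-contained; they mirror the files in the question.

```shell
#!/bin/bash
# Sketch only: recreate the question's files in a scratch directory (assumed layout).
workdir=$(mktemp -d)
printf 'AAAAA \nBBBBB\nCCCCC\nDDDDD\n' > "$workdir/fileA"   # note the trailing space
printf '111 \n222\n333\n444\n' > "$workdir/fileB"           # note the trailing space
printf 'Hello AAAAA toto BBBBB dear "AAAAA" trird BBBBBB tuizf AAAAA dfdsf CCCCC\n' > "$workdir/fileC"

while IFS= read -r lineA && IFS= read -r lineB <&3; do
    # Remove trailing spaces/tabs so "AAAAA " becomes "AAAAA"
    lineA=${lineA%"${lineA##*[![:space:]]}"}
    lineB=${lineB%"${lineB##*[![:space:]]}"}
    # Skip blank search patterns; an empty regex would make sed fail
    [ -n "$lineA" ] || continue
    sed -i -e "s/$lineA/$lineB/g" "$workdir/fileC"
done < "$workdir/fileA" 3< "$workdir/fileB"

result=$(cat "$workdir/fileC")
echo "$result"
rm -rf "$workdir"
```

With the trimmed patterns the quoted "AAAAA" is matched too. Note that BBBBBB still becomes 222B, because the match is not anchored to word boundaries.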

3 Answers

3

This is a good job for GNU awk:

$ cat replace.awk 
FILENAME=="filea" {
    a[FNR]=$0
    next
}
FILENAME=="fileb" {
    b[a[FNR]]=$0
    next
}
{
    for (i=1;i<=NF;i++) {
        printf "%s%s",(b[$i]?b[$i]:$i),(i==NF?RS:FS)
    }
}

Demo:

$ awk -f replace.awk filea fileb filec
Hello 111 toto 222 dear 111 trird BBBBBB tuizf 111 dfdsf 333
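
One caveat with the `b[$i]?b[$i]:$i` test: as far as I can tell, a replacement that awk evaluates as false (such as 0 or an empty line) is skipped, and merely referencing `b[$i]` creates an empty array entry. A possible variant, sketched below, uses the `in` operator instead. The array names `find` and `replace` and the sample data are assumptions made for this example.

```shell
#!/bin/bash
# Sketch only: small sample files in a scratch directory (assumed data).
workdir=$(mktemp -d)
printf 'AAAAA\nBBBBB\n' > "$workdir/filea"
printf '0\n222\n' > "$workdir/fileb"          # "0" would be skipped by a truthiness test
printf 'Hello AAAAA toto BBBBB\n' > "$workdir/filec"

result=$(awk '
FILENAME == ARGV[1] { find[FNR] = $0; next }            # words to look for
FILENAME == ARGV[2] { replace[find[FNR]] = $0; next }   # word -> replacement
{
    for (i = 1; i <= NF; i++)
        # "in" tests key membership without creating an entry
        printf "%s%s", (($i in replace) ? replace[$i] : $i), (i == NF ? ORS : OFS)
}' "$workdir/filea" "$workdir/fileb" "$workdir/filec")

echo "$result"
rm -rf "$workdir"
```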

A commented version for sehe, which uses ARGV instead of hardcoded file names:

FILENAME==ARGV[1] {              # Read the first file passed in
    find[FNR]=$0                 # Create a hash of words to replace
    next                         # Get the next line in the current file
}
FILENAME==ARGV[2] {              # Read the second file passed in
    replace[find[FNR]]=$0        # Map each word to find to its replacement
    next                         # Get the next line in the current file
}
{                                # Read any other file passed in (i.e third)
    for (i=1;i<=NF;i++) {        # Loop over all fields & do replacement if needed
        printf "%s%s",(replace[$i]?replace[$i]:$i),(i==NF?RS:FS)
    }
}

For replacements that ignore word boundaries:

$ cat replace.awk 
FILENAME==ARGV[1] {
    find[FNR]=$0
    next
}
FILENAME==ARGV[2] {
    replace[find[FNR]]=$0
    next
}
{
    for (word in find)
        gsub(find[word],replace[find[word]])
    print
}

Demo:

$ awk -f replace.awk filea fileb filec
Hello 111 toto 222 dear "111" trird 222B tuizf 111 dfdsf 333

6 Comments

I'm continuously amazed how, after years of exposure, awk manages to make zero lasting impression on my brain. I mean, it always looks like the tool for the job, but I really can't make heads or tails of it (FNR? NF, RS, FS?). Also, why are filea and fileb still on the command line when they're also hardcoded in the script? Just foreign to me.
It's pretty natural to me. You probably already describe data in terms of records and fields, so FS for field separator, RS for record separator and NF for number of fields is pretty sane. You could match files on positions and use ARGV, but using the name is more readable IMO, and of course you still need to pass in the handle of each file.
My script works! But I realised that fileC contains words like "AAAAA" and they aren't replaced by "111". Any idea?
@PeterDev I added a script that does replacements regardless of word boundaries.
Thanks, nice work, but it doesn't seem to work on the quoted "AAAAA": ~/test# awk -f replace.awk fileA fileB fileC 111 toto 222 dear "AAAAA" trird 222B tuizf 111 dfdsf 333
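
For readers who, like the first commenter, find the built-in variables opaque, this tiny sketch prints FNR (record number within the current file), NF (number of fields) and the last field for each input line. The sample input is made up for the example.

```shell
#!/bin/bash
# Sketch only: two made-up input lines, split on whitespace (the default FS).
result=$(printf 'Hello AAAAA toto\ndear BBBBB\n' |
    awk '{ print FNR ": NF=" NF ", last=" $NF }')
echo "$result"
```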
2
sed 's/.*/s/' fileA | paste -d/ - fileA fileB | sed 's/$/\//' | sed -f - fileC

The first version omits the g flag, so it replaces only the first match on each line. A correct and faster version would be:

paste -d/ fileA fileB | sed 's/^/s\//;s/$/\/g/' | sed -f - fileC
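
To see what the second pipeline feeds to `sed -f -`, the generated script can be printed before it is applied. The sketch below assumes fileA and fileB have no trailing spaces and contain no `/` characters; the scratch directory and sample data are only there to keep it self-contained.

```shell
#!/bin/bash
# Sketch only: sample files in a scratch directory (assumed data, no trailing spaces).
workdir=$(mktemp -d)
printf 'AAAAA\nBBBBB\nCCCCC\nDDDDD\n' > "$workdir/fileA"
printf '111\n222\n333\n444\n' > "$workdir/fileB"

# paste joins the two files line by line with "/", then sed wraps each
# pair as s/FIND/REPLACE/g
script=$(paste -d/ "$workdir/fileA" "$workdir/fileB" | sed 's/^/s\//;s/$/\/g/')
echo "$script"
rm -rf "$workdir"
```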

1

A two-phase rocket:

sed -e "$(paste file[AB] | sed 's/\(.*\)\t\(.*\)/s\/\1\/\2\/g;/')" fileC 

What this does is create an ad hoc sed script using paste file[AB] | sed 's/\(.*\)\t\(.*\)/s\/\1\/\2\/g;/':

s/AAAAA/111/g;
s/BBBBB/222/g;
s/CCCCC/333/g;
s/DDDDD/444/g;

And then runs it with fileC as the input.
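
If partial matches such as BBBBBB turning into 222B are not wanted, one option is to wrap each search word in word-boundary anchors. The sketch below assumes GNU sed, where `\<` and `\>` match the start and end of a word, and it assumes the list files contain no `/` characters or trailing spaces. The scratch directory and the variable names are made up for the example.

```shell
#!/bin/bash
# Sketch only: sample files in a scratch directory (assumed data).
workdir=$(mktemp -d)
printf 'AAAAA\nBBBBB\nCCCCC\nDDDDD\n' > "$workdir/fileA"
printf '111\n222\n333\n444\n' > "$workdir/fileB"
printf 'Hello AAAAA toto BBBBB dear "AAAAA" trird BBBBBB tuizf AAAAA dfdsf CCCCC\n' > "$workdir/fileC"

# Turn each "FIND/REPLACE" pair into s/\<FIND\>/REPLACE/g
script=$(paste -d/ "$workdir/fileA" "$workdir/fileB" |
    sed 's/\(.*\)\/\(.*\)/s\/\\<\1\\>\/\2\/g/')

# Apply the generated script; fileC itself is left unchanged
result=$(sed -e "$script" "$workdir/fileC")
echo "$result"
rm -rf "$workdir"
```

Quotes are not word characters, so "AAAAA" inside quotes is still replaced, while BBBBBB is left alone.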

1 Comment

@hipe I didn't notice. Anyways, my version is limited as well (fileA/fileB cannot contain tabs)
