Find and replace pattern of fileA in fileC by fileB pattern

Question

I have two files: fileA with a list of names:

AAAAA 
BBBBB
CCCCC
DDDDD

and another, fileB, with another list:

111
222
333
444

and a third, fileC, with some text:

Hello AAAAA toto BBBBB dear "AAAAA" trird BBBBBB tuizf AAAAA dfdsf CCCCC

I need to replace every pattern from fileA that occurs in fileC with the corresponding pattern from fileB. It works! But I realised that fileC contains quoted words like "AAAAA", and those aren't replaced by "111".

This is what I'm doing, but it doesn't seem to work for the quoted words:

#!/bin/bash
while IFS= read -r lineA && IFS= read -r lineB <&3; do
    sed -i -e "s/$lineA/$lineB/g" fileC
done <fileA 3<fileB

I tested your solution and it works for me: Hello 111 toto 222 dear 111 trird 222B tuizf 111 dfdsf 333
– svante, Commented Oct 15, 2013 at 8:41
@PeterDev "AAAAA" in fileC isn't replaced because fileA contains "AAAAA " and not "AAAAA" (notice the trailing space).
– devnull, Commented Oct 15, 2013 at 9:09
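
The trailing-space diagnosis can be tried out on the sample data. This is only a sketch: it assumes GNU sed (for `-i`), assumes fileB holds 111 to 444 as the answers' output suggests, and uses a temporary directory and a `trimmed` variable that are not part of the original script.

```shell
#!/bin/bash
# Sketch: the question's loop, with trailing whitespace stripped from each
# search pattern before it is used. Sample data only; fileB is an assumption.
tmp=$(mktemp -d)
printf 'AAAAA \nBBBBB\nCCCCC\nDDDDD\n' > "$tmp/fileA"   # trailing space after AAAAA
printf '111\n222\n333\n444\n' > "$tmp/fileB"
printf 'Hello AAAAA toto BBBBB dear "AAAAA" trird BBBBBB tuizf AAAAA dfdsf CCCCC\n' > "$tmp/fileC"

while IFS= read -r lineA && IFS= read -r lineB <&3; do
    # Drop trailing spaces/tabs so "AAAAA " also matches AAAAA followed by a quote
    trimmed=$(printf '%s' "$lineA" | sed 's/[[:space:]]*$//')
    sed -i -e "s/$trimmed/$lineB/g" "$tmp/fileC"
done <"$tmp/fileA" 3<"$tmp/fileB"

result=$(cat "$tmp/fileC")
echo "$result"
```

The patterns are still interpreted as regular expressions, so names containing `/`, `&` or regex metacharacters would need escaping first.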

Community · Accepted Answer · 2017-05-23 12:15:11Z

3

This is a good job for GNU awk:

$ cat replace.awk 
FILENAME=="filea" {
    a[FNR]=$0
    next
}
FILENAME=="fileb" {
    b[a[FNR]]=$0
    next
}
{
    for (i=1;i<=NF;i++) {
        printf "%s%s",(($i in b)?b[$i]:$i),(i==NF?RS:FS)
    }
}

Demo:

$ awk -f replace.awk filea fileb filec
Hello 111 toto 222 dear 111 trird BBBBBB tuizf 111 dfdsf 333

A commented version for sehe, matching the files by position (ARGV) instead of by name:

FILENAME==ARGV[1] {              # Read the first file passed in
    find[FNR]=$0                 # Create a hash of words to replace
    next                         # Get the next line in the current file
}
FILENAME==ARGV[2] {              # Read the second file passed in
    replace[find[FNR]]=$0        # Map each find word to its replacement
    next                         # Get the next line in the current file
}
{                                # Read any other file passed in (i.e third)
    for (i=1;i<=NF;i++) {        # Loop over all field & do replacement if needed
        printf "%s%s",(($i in replace)?replace[$i]:$i),(i==NF?RS:FS)
    }
}

For replacements that ignore word boundaries:

$ cat replace.awk 
FILENAME==ARGV[1] {
    find[FNR]=$0
    next
}
FILENAME==ARGV[2] {
    replace[find[FNR]]=$0
    next
}
{
    for (word in find)
        gsub(find[word],replace[find[word]])
    print
}

Demo:

$ awk -f replace.awk filea fileb filec
Hello 111 toto 222 dear "111" trird 222B tuizf 111 dfdsf 333

edited May 23, 2017 at 12:15 by CommunityBot

answered Oct 15, 2013 at 8:38 by Chris Seymour


6 Comments

sehe Over a year ago

I'm continuously amazed how, after years of exposure, awk manages to make zero lasting impression on my brain. I mean, it always looks like the tool for the job, but I really can't make heads or tails of it (FNR? NF, RS, FS?). Also, why are filea and fileb still on the command line when they're also hardcoded in the script? Just foreign to me.

Chris Seymour Over a year ago

It's pretty natural to me. You probably already describe data in terms of records and fields, so FS for field separator, RS for record separator and NF for number of fields are pretty sane. You could match files by position and use ARGV, but using the name is more readable IMO, and of course you still need to pass in each file on the command line.

Peter Dev Over a year ago

My script works! But I realised that fileC contains words like "AAAAA" and they aren't replaced by "111". Any idea?

Chris Seymour Over a year ago

@PeterDev I added a script that does replacements regardless of word boundaries.

Peter Dev Over a year ago

Thx and nice work, but it doesn't seem to work on the "":
~/test# awk -f replace.awk fileA fileB fileC
111 toto 222 dear "AAAAA" trird 222B tuizf 111 dfdsf 333
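
If the cause here is the trailing space in fileA that devnull pointed out under the question, one possible adjustment is to strip trailing whitespace as the search words are read. This is a sketch on reduced sample data, not the answerer's script; the added `sub()` call and the temporary files are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: the boundary-ignoring awk script, plus one sub() that trims
# trailing blanks from each search word. Sample data only.
tmp=$(mktemp -d)
printf 'AAAAA \nBBBBB\n' > "$tmp/fileA"     # trailing space after AAAAA
printf '111\n222\n' > "$tmp/fileB"
printf 'Hello AAAAA dear "AAAAA" BBBBBB\n' > "$tmp/fileC"

result=$(awk '
FILENAME==ARGV[1] {
    sub(/[ \t]+$/, "")           # strip trailing spaces/tabs from the search word
    find[FNR]=$0
    next
}
FILENAME==ARGV[2] {
    replace[find[FNR]]=$0
    next
}
{
    for (n in find)
        gsub(find[n], replace[find[n]])
    print
}' "$tmp/fileA" "$tmp/fileB" "$tmp/fileC")
echo "$result"
```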


hipe · Accepted Answer · 2013-10-15 09:16:05Z

2

sed 's/.*/s/' fileA | paste -d/ - fileA fileB | sed 's/$/\//' | sed -f - fileC

That version builds commands without the g flag, so only the first match on each line is replaced. The correct and faster version would be:

paste -d/ fileA fileB | sed 's/^/s\//;s/$/\/g/' | sed -f - fileC
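
To see what each pipeline feeds to the final `sed -f -`, the generated scripts can be printed instead of executed. A small sketch with two-line sample files in a temporary directory (the `first` and `second` variable names are illustrative):

```shell
#!/bin/bash
# Sketch: print the sed scripts generated by the two pipelines above.
tmp=$(mktemp -d)
printf 'AAAAA\nBBBBB\n' > "$tmp/fileA"
printf '111\n222\n' > "$tmp/fileB"

# First pipeline: one "s" per line, joined with the two files by "/", then a closing "/"
first=$(sed 's/.*/s/' "$tmp/fileA" | paste -d/ - "$tmp/fileA" "$tmp/fileB" | sed 's/$/\//')

# Second pipeline: join the two files by "/", then add the "s/" prefix and "/g" suffix
second=$(paste -d/ "$tmp/fileA" "$tmp/fileB" | sed 's/^/s\//;s/$/\/g/')

printf '%s\n' "$first" "$second"
```

Both pipelines use `/` as the delimiter, so they assume that neither file contains a `/`.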

edited Oct 15, 2013 at 9:16

answered Oct 15, 2013 at 8:35 by hipe

sehe · Accepted Answer · 2013-10-15 08:37:03Z

1

A two-phase rocket:

sed -e "$(paste file[AB] | sed 's/\(.*\)\t\(.*\)/s\/\1\/\2\/g;/')" fileC

This first creates an ad hoc sed script using paste file[AB] | sed 's/\(.*\)\t\(.*\)/s\/\1\/\2\/g;/':

s/AAAAA/111/g;
s/BBBBB/222/g;
s/CCCCC/333/g;
s/DDDDD/444/g;

and then runs it with fileC as the input.

answered Oct 15, 2013 at 8:37 by sehe

1 Comment

sehe Over a year ago

@hipe I didn't notice. Anyways, my version is limited as well (fileA/fileB cannot contain tabs)
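
One way around the tab limitation might be to build the sed script with awk instead of paste, so that no delimiter is needed to pair the lines. This is a sketch under assumptions: the sample data and the `script.sed` file name are made up, and patterns containing `/`, `&` or regex metacharacters would still need escaping.

```shell
#!/bin/bash
# Sketch: pair fileA and fileB lines by line number in awk, so a tab inside
# a pattern does not break the generated sed script. Sample data only.
tmp=$(mktemp -d)
printf 'AAAAA\nB\tB\n' > "$tmp/fileA"       # second pattern contains a tab
printf '111\n222\n' > "$tmp/fileB"
printf 'x AAAAA y B\tB z\n' > "$tmp/fileC"

# While reading fileA, remember each line; while reading fileB, emit one s///g command
awk 'NR==FNR { find[FNR]=$0; next }
     { printf "s/%s/%s/g\n", find[FNR], $0 }' "$tmp/fileA" "$tmp/fileB" > "$tmp/script.sed"

result=$(sed -f "$tmp/script.sed" "$tmp/fileC")
echo "$result"
```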
