I would like to replace the "." in the middle of column 2 with the string in column 3.

Input file (tab-delimited):

0   AAAAAAAAGTTT.TATAGTAATATA   T   x   HPNK_05032012_new.fna
1   AAAAAAACGACG.ATTTTACAATAC   C   x   HPNK_05032012_new.fna
2   AAAAAAAGCAGG.CATTATCGCTGG   G   x   HPNK_05032012_new.fna
3   AAAAAAAGGAAC.GTGGAACGTTGG   A   x   HPNK_05032012_new.fna
5   AAAAAACACAAC.ATTGAGCAACTT   A   x   HPNK_05032012_new.fna
6   AAAAAACACCCA.CTGTGAAAGAAA   T   x   HPNK_05032012_new.fna
9   AAAAAACGCCAA.GTCAGCTACAAA   C   x   HPNK_05032012_new.fna

Desired output:

0   AAAAAAAAGTTTTTATAGTAATATA   T   x   HPNK_05032012_new.fna
1   AAAAAAACGACGCATTTTACAATAC   C   x   HPNK_05032012_new.fna
2   AAAAAAAGCAGGGCATTATCGCTGG   G   x   HPNK_05032012_new.fna
3   AAAAAAAGGAACAGTGGAACGTTGG   A   x   HPNK_05032012_new.fna
5   AAAAAACACAACAATTGAGCAACTT   A   x   HPNK_05032012_new.fna
6   AAAAAACACCCATCTGTGAAAGAAA   T   x   HPNK_05032012_new.fna
9   AAAAAACGCCAACGTCAGCTACAAA   C   x   HPNK_05032012_new.fna
  • Hi mpapec, your one-liner is just deleting ".", 0 AAAAAAAAGTTTTATAGTAATATA T x HPNK_05032012_new.fna Commented Feb 18, 2014 at 11:21
  • How does it differ from desired output? Commented Feb 18, 2014 at 11:28
  • @mpapec: in the desired output he REPLACES . with the content of the next column Commented Feb 18, 2014 at 13:04
  • @OlivierDulac tnx for comment Commented Feb 18, 2014 at 13:22

3 Answers

Use:

$ awk '{sub(/\./, $3, $2)}1' file
0 AAAAAAAAGTTTTTATAGTAATATA T x HPNK_05032012_new.fna
1 AAAAAAACGACGCATTTTACAATAC C x HPNK_05032012_new.fna
2 AAAAAAAGCAGGGCATTATCGCTGG G x HPNK_05032012_new.fna
3 AAAAAAAGGAACAGTGGAACGTTGG A x HPNK_05032012_new.fna
5 AAAAAACACAACAATTGAGCAACTT A x HPNK_05032012_new.fna
6 AAAAAACACCCATCTGTGAAAGAAA T x HPNK_05032012_new.fna
9 AAAAAACGCCAACGTCAGCTACAAA C x HPNK_05032012_new.fna

This replaces the first . in the second field with the third field, using the sub() function. The trailing 1 is an always-true pattern, so it triggers awk's default action: {print $0}.
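To make the two parts easier to see, here is a minimal sketch of the same idea written out with an explicit print. The variable names line and result are just placeholders for this illustration, and the record is one row from the question, joined with single spaces for simplicity.

```shell
# One sample record from the question, space-separated for simplicity.
line='0 AAAAAAAAGTTT.TATAGTAATATA T x HPNK_05032012_new.fna'

# sub(regex, replacement, target) replaces the first match in target.
# /\./ is a regex literal that matches a literal dot; the explicit
# print $0 does the same job as the trailing 1 in the one-liner.
result=$(printf '%s\n' "$line" | awk '{ sub(/\./, $3, $2); print $0 }')

printf '%s\n' "$result"
```

Note that an & in the replacement string would be expanded to the matched text by sub(); that does not matter here as long as column 3 only holds nucleotide letters.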

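Your question shows spaces between the columns, so the command above splits on whitespace, and the output has single spaces because awk rebuilds the record with the default output separator once a field is modified. If your input uses tabs, set tab as both the input and output field separator: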

awk 'BEGIN{FS=OFS="\t"} {sub(/\./, $3, $2)}1' file
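A quick way to compare the two separator settings is to run both on the same record. This is only a sketch: the variable names input, spaces and tabs are made up for the example, and the record is one tab-delimited row from the question built with printf.

```shell
# Build one tab-delimited record (sample row taken from the question).
input=$(printf '1\tAAAAAAACGACG.ATTTTACAATAC\tC\tx\tHPNK_05032012_new.fna')

# Default separators: modifying $2 makes awk rebuild the record,
# joining the fields with single spaces.
spaces=$(printf '%s\n' "$input" | awk '{ sub(/\./, $3, $2) } 1')

# FS = OFS = "\t": fields are split on tabs and rejoined with tabs,
# so the output stays tab-delimited.
tabs=$(printf '%s\n' "$input" | awk 'BEGIN { FS = OFS = "\t" } { sub(/\./, $3, $2) } 1')

printf '%s\n' "$spaces" "$tabs"
```

The first line printed is space-separated and the second keeps the tabs, which is usually what you want if other tools read the file afterwards.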

12 Comments

In the second column, the . is not removed for me.
@MortezaLSC but does your sample input contain tabs between columns? Note that if not, the second field won't be the second column.
Hmm... I copied what popnard wrote. Yes, it worked great, thank you.
My input file is tab-separated and fedorqui's second command is not working. Why?
To make sure it is tab-separated, @popnard, try running awk 'BEGIN{FS=OFS="\t"} {print $2}' file and check that awk takes the second column as the second field.

perl -lane '$F[1] =~ s/[.]/$F[2]/; print "@F"' file

or shorter,

perl -ape 's/[.]/$F[2]/' file

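Using awk with an empty field separator, so that every character becomes its own field and the original spacing is kept. The field numbers depend on the exact character positions in your file, and splitting on an empty FS is an extension (supported by gawk and mawk), not POSIX behaviour: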

awk '$19=$33' FS="" OFS="" file
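If the character positions are not the same on every line, a variant that looks up the dot instead of hard-coding its position may be safer. The following is a sketch rather than part of the answer above: it assumes tab-delimited input, and the names input, fixed and p are only for illustration.

```shell
# One tab-delimited record (sample row taken from the question).
input=$(printf '3\tAAAAAAAGGAAC.GTGGAACGTTGG\tA\tx\tHPNK_05032012_new.fna')

fixed=$(printf '%s\n' "$input" | awk '
BEGIN { FS = OFS = "\t" }
{
    p = index($2, ".")    # 1-based position of the first dot, 0 if none
    if (p > 0)
        # Rebuild column 2: text before the dot, column 3, text after it.
        $2 = substr($2, 1, p - 1) $3 substr($2, p + 1)
    print
}')

printf '%s\n' "$fixed"
```

Because index() and substr() work on plain strings, no regex is involved, and lines without a dot pass through unchanged.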
