
I've read lots of very helpful posts about adding tabs between every number and letter, or adding/deleting tabs and spaces in other locations within a string, but I'm struggling to adapt those solutions to my current problem, so I figured it was worth adding to the stack! I have a bunch of genetic data that looks like this:

chr1    1324000 1325000tgagggtctgctg...
chr1    1318000 1319000gggactgcagctg...

etc.

Is there a way to add a tab between the last number and the first letter? The lengths of the sequences vary, so the tab isn't always going to be in the same position. Additionally, the first set of numbers don't always end in 0. However, the tab will always be immediately after the last number. I think it's going to be something like:

sed -e 's/\([[0-9]\+]\)/[\t/'

But of course this doesn't work. How can I tell sed to put a tab in this location?

Desired output:

chr1    1324000 1325000  tgagggtctgctg...
chr1    1318000 1319000  gggactgcagctg...
  • Something like this? sed -E 's/([0-9])([acgt])/\1\t\2/' file Commented Dec 16, 2022 at 1:44
  • I spoke too soon... such a noob. Inspecting the file, it looks like this only worked on SOME of the lines. I failed to mention that there are both lowercase and uppercase letters. I have updated the answer. Commented Dec 16, 2022 at 2:09

2 Answers

sed -E 's/([0-9])([acgtACGT])/\1\t\2/' file

works! Thank you Cyrus.
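
If the sequences can contain letters other than a, c, g and t (for example N for an unknown base, which is an assumption about the data and not something stated in the question), the bracket list can be replaced with the POSIX class `[[:alpha:]]`. The sketch below also builds the tab with `printf`, since `\t` in the replacement may not be understood by non-GNU sed implementations. The function name `add_tab` is made up for illustration.

```shell
# Hypothetical helper: put a tab between the first digit that is
# directly followed by a letter and that letter.
add_tab() {
  tab=$(printf '\t')   # a literal tab, so we do not depend on sed's \t
  sed -E "s/([0-9])([[:alpha:]])/\1${tab}\2/" "$@"
}

# Example: read from a pipe (a file name can be passed instead)
printf 'chr1\t1324000\t1325000tgagggtctgctg\n' | add_tab
```

Only the first digit-letter boundary on each line is changed, so this relies on no earlier column having a digit immediately followed by a letter.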

$ sed 's/[^0-9]*$/\t&/' file
chr1    1324000 1325000 tgagggtctgctg...
chr1    1318000 1319000 gggactgcagctg...
  • This will add a tab character to lines which end with a number as well. It is unclear if thread-O/P wants this or if maybe that fringe case might never occur. Commented Dec 17, 2022 at 12:49
  • @bakunin Right, I took the OP's ellipses to mean "more of the same letters". Commented Dec 17, 2022 at 13:03
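
The edge case raised in these comments comes from `[^0-9]*` being able to match the empty string at the end of a line. A possible variant, sketched here with a made-up function name `add_tab_strict`, requires at least one non-digit character so that lines ending in a number are left alone. It still assumes the sequence itself contains no digits.

```shell
# Hypothetical helper: insert a tab before the trailing run of
# non-digit characters, but only if that run is not empty.
add_tab_strict() {
  tab=$(printf '\t')   # a literal tab, so we do not depend on sed's \t
  sed "s/[^0-9][^0-9]*\$/${tab}&/" "$@"
}

# A line with a sequence gets the tab; a line ending in a digit is unchanged
printf 'chr1\t1324000\t1325000tgagggtctgctg\n' | add_tab_strict
printf 'chr1\t1324000\t1325000\n' | add_tab_strict
```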
