3

I want to extract the value that follows B* on each line, stopping before the next * (so *Y* and everything after it is excluded). I tried this command, which I got from a friend, but it doesn't work:

grep -oP 'B.\K[\w\s\d]' < tus.txt | sort -u

tus.txt

~TUS*21424565*4716129*B*222791885833*Y*
~TUS*21470045*4733746*B*36*Y*19-OCT-16**B*2239681
~TUS*21758806*4932668*B*00081907*Y*2707826
~TUS*21758851*4932694*B*00082025*Y*2707871
~TUS*21758862*4932739*B*262105589241-20172-31489016
~TUS*21758767*4932626*B*00081684*Y*2707792
~TUS*21758861*4932693*B*00082024*Y*2707881
~TUS*21758895*4932764*B*4578873831221*Y*
~TUS*21760350*4933404*B*00082603*Y*2708838
~TUS*21759295*4932379*B*00082403*Y*2708332

Desired result:

222791885833
36
00081907
00082025
262105589241-20172-31489016
00081684
00082024
4578873831221
00082603
00082403

3 Answers

10

The input is *-delimited. Get the fifth field:

$ cut -d '*' -f 5 tus.txt
222791885833
36
00081907
00082025
262105589241-20172-31489016
00081684
00082024
4578873831221
00082603
00082403

This matches the desired output in the question. Your command also pipes the result through sort -u, which sorts the values and removes duplicates:

$ cut -d '*' -f 5 tus.txt | sort -u
00081684
00081907
00082024
00082025
00082403
00082603
222791885833
262105589241-20172-31489016
36
4578873831221

If you, for whatever reason, want to sort the original data on this field (not removing duplicates here):

$ sort -t '*' -k5,5 tus.txt
~TUS*21758767*4932626*B*00081684*Y*2707792
~TUS*21758806*4932668*B*00081907*Y*2707826
~TUS*21758861*4932693*B*00082024*Y*2707881
~TUS*21758851*4932694*B*00082025*Y*2707871
~TUS*21759295*4932379*B*00082403*Y*2708332
~TUS*21760350*4933404*B*00082603*Y*2708838
~TUS*21424565*4716129*B*222791885833*Y*
~TUS*21758862*4932739*B*262105589241-20172-31489016
~TUS*21470045*4733746*B*36*Y*19-OCT-16**B*2239681
~TUS*21758895*4932764*B*4578873831221*Y*
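
If the B marker is not guaranteed to sit in the fourth field (an assumption; it always does in the sample data), a small awk loop could print the field after the first B on each line instead of relying on a fixed position. This is only a sketch, and the function name first_after_b is made up for illustration:

```shell
# Print the field that follows the first field equal to "B" on each line.
# Reads the named files, or standard input when no file is given.
first_after_b() {
    awk -F'[*]' '{
        for (i = 1; i < NF; i++) {
            if ($i == "B") { print $(i + 1); break }
        }
    }' "$@"
}

# Two lines taken from tus.txt; the first one contains B twice.
printf '%s\n' \
    '~TUS*21470045*4733746*B*36*Y*19-OCT-16**B*2239681' \
    '~TUS*21758862*4932739*B*262105589241-20172-31489016' | first_after_b
```

Because the loop stops at the first B, the second B* on the first line is ignored and the hyphenated value on the second line is printed whole.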
4

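Your command almost works. The bracket expression [\w\s\d] matches only a single character, so you need to add a * or + after it. Note that this pattern stops at the hyphen in 262105589241-20172-31489016 and also picks up the value after the second B* (2239681) on the second line: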

$ grep -oP 'B.\K[\w\s\d]+' tus.txt | sort -u
00081684
00081907
00082024
00082025
00082403
00082603
222791885833
2239681
262105589241
36
4578873831221

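Or, more simply, match everything up to the next * (this keeps the hyphenated value whole, but still picks up 2239681):

$ grep -oP 'B\*\K[^*]*' tus.txt | sort -u
00081684
00081907
00082024
00082025
00082403
00082603
222791885833
2239681
262105589241-20172-31489016
36
4578873831221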

Or, use awk to print the fifth *-separated field, which also avoids the extra match from the second B*:

$ awk -F'[*]' '{print $5}' tus.txt | sort -u
00081684
00081907
00082024
00082025
00082403
00082603
222791885833
262105589241-20172-31489016
36
4578873831221
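
To see why the quantifier matters in the first grep command of this answer, here is a minimal sketch, assuming GNU grep built with PCRE support (-P). The function names are hypothetical and the input line is a shortened stand-in for the sample data:

```shell
# Original pattern: the bracket expression matches exactly one character.
one_char() { grep -oP 'B.\K[\w\s\d]'; }

# With "+", the bracket expression matches the whole run of characters.
whole_run() { grep -oP 'B.\K[\w\s\d]+'; }

printf '%s\n' '~TUS*1*2*B*36*Y*' | one_char    # prints 3
printf '%s\n' '~TUS*1*2*B*36*Y*' | whole_run   # prints 36
```

The \K only discards the "B*" prefix from the reported match; how much is printed after it depends entirely on the quantifier.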
3

Use the following approach:

grep -Po '(?<=\*B\*)[^*]+' tus.txt | sort -u

The output:

00081684
00081907
00082024
00082025
00082403
00082603
222791885833
2239681
262105589241-20172-31489016
36
4578873831221

Note that sort -u reorders the initial grep output. The extra value 2239681 comes from the second *B* on the second line of tus.txt.
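
If the original order matters, one possible alternative to sort -u is to drop duplicates with awk while keeping the first occurrence of each value. This is a sketch rather than part of the command above, and the function name unique_in_order is made up for illustration:

```shell
# Print each distinct input line once, in order of first appearance.
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++ is true.
unique_in_order() {
    awk '!seen[$0]++' "$@"
}

printf '%s\n' 36 00081907 36 222791885833 | unique_in_order
# prints:
# 36
# 00081907
# 222791885833
```

It can replace sort -u at the end of any of the pipelines here, for example `cut -d '*' -f 5 tus.txt | unique_in_order`.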
