
I need to pull out the 7-digit number, the 5-digit number, and the 9-digit number (which includes a hyphen, e.g. 1-12345678).

example string:

Headers
User ID: 12345, Claim No: 1234567, Invoice No: 1-12345678
Claim No: 1234567,User ID: 12345, Invoice No: 1-12345678
Invoice No: 1-12345678, Claim No: 1234567,User ID: 12345

I have tried the formula below for the 7-digit number, but it pulls the first 7 consecutive digits it finds rather than the number that is exactly 7 digits long.

=MID(Q6,SEARCH(REPT(0,7),CONCAT(IFERROR(MID(Q6,SEQUENCE(LEN(Q6)),1)*0,2))),7)

  • Are all your strings like the sample string? A sample of one is rarely representative of a data set. Commented Oct 7 at 13:30
  • They would be similar, but yes they will differ. However, if there's a 7 digit number for example I want to extract it regardless of the rest of the string Commented Oct 7 at 13:37

4 Answers

You could try using the following formula:

=CHOOSECOLS(TEXTSPLIT(J2, {" ",", "}), 3, 6, 9)

Or,

=INDEX(TEXTSPLIT(J2, {" ",", "}), {3,6,9})

Or,

=TEXTAFTER(TEXTSPLIT(J2, ","), " ", -1)

Updated Formula based on Edited Post by OP:

=TEXTSPLIT(J2, TEXTSPLIT(J2, HSTACK("-", SEQUENCE(, 10)-1), , 1), , 1)

The third alternative above (the TEXTAFTER formula) also works.

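There is another way: if you put the labels (User ID, Claim No, Invoice No) in a header row, here K1:M1, each value lands under its own header and the output stays aligned even when the order changes or an item is missing. Enter the formula in K2, then fill across and down: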

=IFERROR(TOCOL(TEXTAFTER(TEXTSPLIT($J2, ","), K$1&": "), 2), "")


15 Comments

Thank you for this, I have tested all 3 but each one has an issue when the string changes. The order of the values can be different and sometimes some of them can be missing
Refer the updated version, it seems to work!
Thank you this is amazing, is there any way to dictate that it spits them out in the same order? So 5 digit number in first column, 7 digit in second etc (even if there is no 5 digit number then it still puts the 7 digit in the second column)
Thanks, this is splitting per line where as I need it per column, the other formula is still being caught out by missing values/ order
I'm not sure what you mean; it splits by column, not by row. This is the updated formula I posted: =IFERROR(TOCOL(TEXTAFTER(TEXTSPLIT($J2, ","), K$1&": "), 2), "") It references the headers. You can also change TOCOL() to TOROW(). The formula needs to be copied down and across.
do you mind showing me a screenshot in the following comments
Unfortunately it won't allow me to upload an image. But the formula is spitting the results into cells K2, K3, and K4. Ideally I would like K2, L2, M2
See the screenshot it will split to K2, L2 and M2. Do you have those headers, as I have suggested?
I was able to add my screenshot to your answer, wondering is it a setting issue?
Check the updated one
Getting difficult to follow now :D So =IFERROR(TOCOL(TEXTAFTER(TEXTSPLIT($J2, ","), K$1&": "), 2), "") still is doing the same thing on my machine
If you need it row-wise, it would be: =IFERROR(TOCOL(TEXTAFTER(TEXTSPLIT($J$2, , ","), INDEX({"User ID: ","Claim No: ","Invoice No: "}, ROW(ZZ1))), 2),"") For column-wise output the posted formula should work. You don't have the headers; check the screenshot.
I can see what you mean now, it is now working column wise. but is not extracting correctly. It is combining invoice numbers and claim ids
I think I can work with this though, thank you for all your help
Post your workbook link using onedrive, let me check. You are not correctly following! You clearly dont have the headers like I have suggested

Could try some regex like this:

=IFERROR(REGEXEXTRACT(A1,{"User ID:\s*([0-9]+)","Claim No:\s*([0-9]+)","Invoice No:\s*([0-9-]+)"},2),"")

I have tried to make it fairly flexible by not being sensitive to the order of items, allowing for missing items, and allowing for a variable amount of space between the tag (User ID etc.) and the associated value.
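To check the label-based patterns outside Excel, the same idea can be sketched in Python. This is only an illustration: the dictionary keys and the function name are made up for the example, and Python's re module stands in for Excel's regex engine, which may differ in other details.

```python
import re

# Patterns copied from the formula above; the keys are illustrative names.
PATTERNS = {
    "user_id": r"User ID:\s*([0-9]+)",
    "claim_no": r"Claim No:\s*([0-9]+)",
    "invoice_no": r"Invoice No:\s*([0-9-]+)",
}


def extract_by_label(text):
    """Return the captured value for each label, or "" if the label is absent."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        result[name] = match.group(1) if match else ""
    return result


# Order does not matter, and a missing item simply comes back empty.
print(extract_by_label("Claim No: 1234567,User ID: 12345"))
```

Because each pattern is searched independently, the position of an item in the string has no effect on which slot it fills.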

1 Comment

Thanks for this, unfortunately my labels aren't consistent in my actual data set

If your examples are exhaustive (i.e. you don't have a 3-long value that is sometimes included, or occasionally exclude the 7-long value, etc), then this should work:

=LET(vals, TRIM(TEXTAFTER(TEXTSPLIT(A2, ","), ":")), 
    lens, MAP(vals, LAMBDA(v, LEN(v))),
    SORTBY(vals, lens))

It splits the text into columns based on commas; discards colons and anything before them; eliminates any starting/trailing spaces; then sorts the results by Length (meaning that it outputs the 5, 7, and then 10-long strings in that order)
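The steps just described can be sketched in Python to see the intermediate values. This is a rough equivalent, not a translation of the formula: the function name is hypothetical, and pieces without a colon are skipped here, whereas TEXTAFTER would return an error for them.

```python
def values_sorted_by_length(text):
    """Split on commas, keep what follows the first colon, trim, sort by length."""
    parts = text.split(",")
    # Assumption: pieces with no colon are ignored rather than raising an error.
    vals = [part.split(":", 1)[1].strip() for part in parts if ":" in part]
    return sorted(vals, key=len)


print(values_sorted_by_length("Invoice No: 1-12345678, Claim No: 1234567,User ID: 12345"))
```

Sorting by length only gives a fixed column order when every row contains the same set of lengths; a row with a missing item shifts the remaining values left.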

If you don't want all the items, and/or if you have various other values that might appear, then you can try this instead:

=LET(target, 7,
    vals, TRIM(TEXTAFTER(TEXTSPLIT(A2, ","), ":")),
    lens, MAP(vals, LAMBDA(v,LEN(v))),
    FILTER(vals, lens=target,NA()))

This is very similar, except it will only return the value(s) whose length(s) matches the target length.
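A Python sketch of the length filter, under the same assumptions as the formula (comma-separated items, value after the first colon). The function name is hypothetical, and an empty list here plays the role of the #N/A that FILTER returns when nothing matches.

```python
def values_with_length(text, target):
    """Return the trimmed values whose character count equals target."""
    vals = [part.split(":", 1)[1].strip() for part in text.split(",") if ":" in part]
    return [v for v in vals if len(v) == target]


row = "User ID: 12345, Claim No: 1234567, Invoice No: 1-12345678"
print(values_with_length(row, 7))
print(values_with_length(row, 3))
```

Note that the test is on character count only, so any 7-character value would pass, whether or not it consists of digits.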


This regex returns the 5-digit, 7-digit, and 1-8 (hyphenated) numbers, in that order. Each pattern uses a negative lookbehind (?<!\d) and a negative lookahead (?!\d) so that a run of digits only matches when it is exactly the required length.

=REGEXEXTRACT(A1,{"(?<!\d)\d{5}(?!\d)","(?<!\d)\d{7}(?!\d)","(?<!\d)\d-\d{8}(?!\d)"})

Enter the formula in B1 and fill down.

A                                                          B      C        D
User ID: 12345, Claim No: 1234567, Invoice No: 1-12345678  12345  1234567  1-12345678
Claim No: 1234567,User ID: 12345, Invoice No: 1-12345678   12345  1234567  1-12345678
Invoice No: 1-12345678, Claim No: 1234567,User ID: 12345   12345  1234567  1-12345678
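The left-hand guard matters for exact-length digit runs. As a minimal sketch in Python (the function name and the guard_left flag are invented for the demonstration, and Python's re is assumed to treat these lookarounds the same way Excel does), compare the 7-digit search with and without the lookbehind on a row where the invoice number comes first:

```python
import re


def find_digit_run(text, n, guard_left=True):
    """Find the first run of n digits not followed by another digit.

    With guard_left=True the run must also not be preceded by a digit.
    """
    left = r"(?<!\d)" if guard_left else ""
    match = re.search(left + r"\d{%d}(?!\d)" % n, text)
    return match.group() if match else ""


row = "Invoice No: 1-12345678, Claim No: 1234567,User ID: 12345"

print(find_digit_run(row, 7))                    # 1234567
print(find_digit_run(row, 7, guard_left=False))  # 2345678, the tail of the invoice number
```

Without the lookbehind, the last seven digits of the 8-digit invoice part satisfy the pattern, so the wrong value is returned whenever the invoice number appears before the claim number.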
