I have a complex string from which I need to pull single words and/or multiple words.
Here is the string:
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="5" yahoo:created="2013-07-28T18:37:23Z" yahoo:lang="en-US"><diagnostics><publiclyCallable>true</publiclyCallable><user-time>145</user-time><service-time>141</service-time><build-version>38483</build-version></diagnostics><results><Result xmlns="urn:yahoo:cate">**RED**</Result><Result xmlns="urn:yahoo:cate">**GREEN**</Result><Result xmlns="urn:yahoo:cate">**BLUE**</Result><Result xmlns="urn:yahoo:cate">**A, E, I, O, U **</Result><Result xmlns="urn:yahoo:cate">**SOMETIMES Y**</Result></results></query><!-- total: 145 -->
(I really wish that wouldn't scroll, since it makes it difficult to see the entire picture)
Anyway, I need to be able to pull out the:
RED
GREEN
BLUE
A, E, I, O, U
SOMETIMES Y
++++ btw, I tried to make those values BOLD in the big string, but they show up with asterisks instead. Disregard the asterisks; they are not part of the string. I'm leaving them in, though, since they make the values easier to find when you look at the entire string. ++++
My goal is to turn that complex string into this:
RED|GREEN|BLUE|A, E, I, O, U|SOMETIMES Y
My preference is to do this on the sheet level using a single nested function (or a combination of multiple functions if necessary).
Failing that, a script version would be preferable to nothing.
I've been at this for hours using SPLIT, FIND, SUBSTITUTE, and a few other things that I tried on a whim, just to try everything. But I've now reached the point where I can't think clearly about this anymore, and I'm hoping that someone can put me on a path for how to attack this logically.
I'm truly stumped (and frustrated).
==========================================
I said that I'd post the sheet-level solution if I figured it out, so here it is:
=mid(substitute(substitute(regexreplace(mid(A1,find("<Result",A1),find("</query",A1)-find("<Result",A1)),"<.*?>+","-"),"--","|"),"-","|"),2,len(substitute(substitute(regexreplace(mid(A1,find("<Result",A1),find("</query",A1)-find("<Result",A1)),"<.*?>+","-"),"--","|"),"-","|"))-2)
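One caveat on the formula above: it uses "-" as a temporary placeholder for the stripped tags, so any value that itself contains a hyphen would get split at that hyphen. For the script route mentioned earlier, here is a minimal sketch of what a custom function might look like. The name extractResults is made up for illustration, and the sketch assumes the values sit directly inside <Result> tags and never contain a "<" themselves. It also trims each value, so "A, E, I, O, U " comes out without the trailing space.

```javascript
/**
 * Sketch only: collects the text inside every <Result>...</Result> pair
 * and joins the values with "|". extractResults is a hypothetical name.
 */
function extractResults(xml) {
  var values = [];
  // Capture whatever sits between the opening and closing Result tags.
  // [^<]* assumes a value never contains a "<" of its own.
  var pattern = /<Result\b[^>]*>([^<]*)<\/Result>/g;
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    // Trim stray whitespace around each value before collecting it.
    values.push(match[1].trim());
  }
  return values.join('|');
}
```

If this is saved in the spreadsheet's script editor, it should be callable from a cell as =extractResults(A1), assuming the long string is in A1 as in the formula above.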