I have an array of strings and I want to extract only what's inside <>.

<div class=\"name\" title=\"&quot;User&quot; <John Appleseed>\">
<div class=\"name\" title=\"&quot;User&quot; <Bill Gates>\">

So the result I expect is ["John Appleseed", "Bill Gates"]

  • "Is there any way to do this?" – just ask yourself: do you see any logical pattern which you as a human would use to distinguish <John Appleseed> from <div> if you didn't know HTML? Commented Sep 23, 2019 at 10:35
  • Off the top of my head, there are three ways here: 1) <John Appleseed> is inside quotes. Will that always be the case? If so, you could make use of that. 2) You could make a list of known HTML tags to distinguish between actual and seeming HTML tags. 3) Parse the HTML and then find strings inside tags (which then aren't real HTML tags). Commented Sep 23, 2019 at 10:38
  • @LinusGeffarth - I was able to collect all div classes. Now I have an array of strings like '<div class=\"name\" title=\"&quot;User&quot; <John Appleseed>\">'. How do I extract only what's inside "<>", which in this case is John Appleseed? Commented Sep 23, 2019 at 11:01
  • Try <[\w\s]+> as starting point using regexpal.com (or similar). Depending on how similar the elements in the array are, that may be it already. Commented Sep 23, 2019 at 11:04
  • @LinusGeffarth the pattern you provided is not valid in Swift Commented Sep 23, 2019 at 11:26
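
On the regex suggestion in the comments: the pattern <[\w\s]+> is usable from Swift as far as I can tell, but inside an ordinary string literal each backslash has to be doubled, which may be why it looked invalid. Below is a minimal sketch using Foundation's NSRegularExpression with a capture group added around the character class. The helper name bracketedNames(in:) and the rows array are made up for illustration. Note that a plain tag such as <div> would also match this pattern, so it only suits input shaped like the sample strings.

```swift
import Foundation

// Hypothetical helper: returns every run of word characters and whitespace
// that is enclosed directly in angle brackets.
func bracketedNames(in text: String) -> [String] {
    // Backslashes are doubled because this is an ordinary Swift string literal.
    guard let regex = try? NSRegularExpression(pattern: "<([\\w\\s]+)>") else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        // Capture group 1 holds the text between "<" and ">".
        Range(match.range(at: 1), in: text).map { String(text[$0]) }
    }
}

// Sample input copied from the question.
let rows = [
    "<div class=\"name\" title=\"&quot;User&quot; <John Appleseed>\">",
    "<div class=\"name\" title=\"&quot;User&quot; <Bill Gates>\">"
]
let names = rows.flatMap { bracketedNames(in: $0) }
print(names)
```

The outer <div ...> is skipped because the "=" and quote characters after "class" are not covered by [\w\s], so the match cannot reach a closing ">".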

1 Answer

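If you have already filtered out the correct rows and every string has the same structure, you can use lastIndex(of:) and firstIndex(of:) to find the inner <> pair and then extract the substring between them. This relies on the last "<" coming before the first ">", as it does in the sample strings; otherwise forming the range would trap at runtime.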

if let first = str.lastIndex(of:"<"), let last = str.firstIndex(of:">") {
    let name = String(str[str.index(after: first)..<last])
}

Example

let strings = ["<div class=\"name\" title=\"&quot;User&quot; <John Appleseed>\">", "<div class=\"name\" title=\"&quot;User&quot; <Bill Gates>\">"]

for str in strings {
  if let first = str.lastIndex(of:"<"), let last = str.firstIndex(of:">") {
    let name = String(str[str.index(after: first)..<last])
    print(name)
  }
}

produces

John Appleseed
Bill Gates
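
If the rows are not guaranteed to have exactly this shape, a slightly more defensive variant is to look for the ">" only after the last "<", so the range bounds can never end up in the wrong order. This is just a sketch under that assumption; the function name innerBracketText is made up for illustration.

```swift
// Hypothetical helper: returns the text between the last "<" and the first ">"
// that follows it, or nil when no such pair exists.
func innerBracketText(_ str: String) -> String? {
    guard let open = str.lastIndex(of: "<") else { return nil }
    let afterOpen = str.index(after: open)
    // Search only the part of the string that comes after the last "<".
    guard let close = str[afterOpen...].firstIndex(of: ">") else { return nil }
    return String(str[afterOpen..<close])
}

let sample = "<div class=\"name\" title=\"&quot;User&quot; <John Appleseed>\">"
if let name = innerBracketText(sample) {
    print(name)
}
```

Because a Substring shares indices with its base string, the index found in the slice can be used directly on str.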
