I'm trying to analyze a string and decompose it into clear values: taskValue and timeValue.

var str1 = "20 minutes (to do)/(for) some kind of task"
// other possibilities
var str2 = "1 hour 30 minutes for some kind of task"
var str3 = "do some kind of task for 1 hour"

How can I apply multiple regexes in one function? Maybe something like an array of regexes:

["[0-9]{1,} minutes", 
 "[0-9] hour", 
 "[0-9] hour, [0-9]{1,} minutes",
  ...]

The values returned from the function aren't clean: the task name is left with "of ...", "for ...", "to ...", etc.

Can you give me advice on how to improve it? Maybe it's possible to do some machine learning with MLKit? How can I add a couple of regex patterns? Or should I check manually whether the string contains certain things?

// check it out
import Foundation

var str = "20 minutes to do some kind of task"
func decompose(_ inputText: String) -> (time: String, taskName: String) {

    let pattern = "[0-9]+ minutes"
    let regexOptions: NSRegularExpression.Options = [.caseInsensitive]
    // NSRange is measured in UTF-16 code units, so derive it from the string itself.
    let range = NSRange(inputText.startIndex..., in: inputText)

    var time = ""
    var taskName = inputText

    let regex = try! NSRegularExpression(pattern: pattern, options: regexOptions)
    if let match = regex.firstMatch(in: inputText, options: [], range: range),
       // Convert the NSRange back to String indices instead of offsetting by raw integers.
       let matchRange = Range(match.range, in: inputText) {

        time = String(inputText[matchRange])
        taskName.removeSubrange(matchRange)

    } else {
        print("No match.")
    }

    return (time, taskName)
}

print(decompose(str))

Overall, I'm looking to learn how to do text analysis on the premise that we know the subject matter beforehand.

To do what you're looking to do and not just handle the simplest cases, you probably want to look into "natural language processing". There's an Apple framework for it and quite a few articles/tutorials around that explain how to start using it. Commented May 9, 2021 at 5:20

1 Answer

Use capture groups:

(\d+)\s*minute|(\d+)\s*hour

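Then check which group matched and use the captured value as needed: if the first group matched, you have minutes; otherwise, the second group holds hours. For inputs that contain both units, such as "1 hour 30 minutes", iterate over all matches rather than only the first one.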
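The group check described above can be sketched in Swift roughly as follows. This is only an illustration, not part of the original answer: the helper name totalMinutes(in:) and the choice to sum everything into minutes are assumptions made for the example.

```swift
import Foundation

// Hypothetical helper (an assumption for illustration): adds up every
// "<n> minute" / "<n> hour" occurrence in the text as a number of minutes.
func totalMinutes(in text: String) -> Int? {
    let pattern = #"(\d+)\s*minute|(\d+)\s*hour"#
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return nil
    }
    // NSRange counts UTF-16 code units, so build it from the string itself.
    let fullRange = NSRange(text.startIndex..., in: text)

    var total = 0
    var found = false
    for match in regex.matches(in: text, options: [], range: fullRange) {
        // Group 1 participates only when the "minute" alternative matched;
        // otherwise its range is NSNotFound and the conversion yields nil.
        if let r = Range(match.range(at: 1), in: text), let n = Int(text[r]) {
            total += n
            found = true
        } else if let r = Range(match.range(at: 2), in: text), let n = Int(text[r]) {
            total += n * 60
            found = true
        }
    }
    return found ? total : nil
}
```

Returning nil when nothing matched lets the caller distinguish "no time given" from "0 minutes".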

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  minute                   'minute'
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  hour                     'hour'
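To also get a clean task name, as the question asks, one possible approach is to remove the matched time phrases and then strip leftover filler words from both ends. The sketch below is an assumption-laden illustration, not the answerer's method: the helper name splitTimeAndTask, the non-capturing variant of the pattern, and the filler-word list ("for", "to", "do") are all made up for the example and would need tuning for real input.

```swift
import Foundation

// Hypothetical helper (an assumption for illustration): separates the time
// phrases from the rest of the text and trims filler words at the edges.
func splitTimeAndTask(_ input: String) -> (time: String, task: String) {
    let timePattern = #"\d+\s*(?:minute|hour)s?"#
    guard let regex = try? NSRegularExpression(pattern: timePattern,
                                               options: [.caseInsensitive]) else {
        return ("", input)
    }
    let fullRange = NSRange(input.startIndex..., in: input)
    let matches = regex.matches(in: input, options: [], range: fullRange)

    // Collect the time phrases in order of appearance.
    let time = matches
        .compactMap { Range($0.range, in: input) }
        .map { String(input[$0]) }
        .joined(separator: " ")

    // Replace each time phrase with a space, then split into words.
    var words = regex
        .stringByReplacingMatches(in: input, options: [], range: fullRange,
                                  withTemplate: " ")
        .split(separator: " ")
        .map(String.init)

    // Assumed filler list; only words at the start or end are dropped.
    let fillers: Set<String> = ["for", "to", "do"]
    while let first = words.first, fillers.contains(first.lowercased()) {
        words.removeFirst()
    }
    while let last = words.last, fillers.contains(last.lowercased()) {
        words.removeLast()
    }
    return (time, words.joined(separator: " "))
}
```

Fillers are removed only at the edges so that words inside the task name, such as the "of" in "some kind of task", are preserved. A fixed word list will not cover free-form phrasing, which is where the natural language processing suggestion from the comment becomes relevant.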