I'm using a Swift function that successfully loads data from a text file into a Double array, but it is slow. Is there a faster way to load numeric data directly, without going through the String initializer? Or are there any other suggestions to speed this up?

func arrayFromContentsOfFileWithPath(path: String) -> [Double]? {
    do {
        let content = try String(contentsOfFile:path, encoding: NSUTF8StringEncoding)
        let stringArray = content.componentsSeparatedByString("\n").map{
            $0.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
        }
        return stringArray.map{Double($0)}.flatMap{$0}
    } catch _ as NSError {
        return nil
    }
}

EDIT: To quantify things a bit, the data file is 10000 samples and the load time is 0.183 s for a single load (according to a measureBlock in my unit tests). In comparison, MATLAB loads the file in 0.033 s. Here are the first few samples of the data:

   8.1472369e-01
   9.0579194e-01
   1.2698682e-01
   9.1337586e-01
   6.3235925e-01
   9.7540405e-02
   2.7849822e-01
   5.4688152e-01
   9.5750684e-01
   9.6488854e-01

UPDATE: Following @appzYourLife's advice to combine the mappings (I used .flatMap{Double($0)}) and to use a Release build, the load time is now 0.119 s. Much better, but still about 4x the time of MATLAB, which was very unexpected.
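For reference, the combined mapping mentioned in the update might look roughly like the sketch below. It is written against newer Swift, where the optional-filtering flatMap was renamed compactMap (Swift 4.1), and the helper name parseLines is made up for illustration. It takes the already-loaded file content so it can be tried without a file on disk.

```swift
import Foundation

// Hypothetical helper (not from the original post): trims and converts each
// line in a single pass instead of building an intermediate array of
// trimmed strings first.
func parseLines(_ content: String) -> [Double] {
    return content.components(separatedBy: "\n").compactMap {
        // Double(String) returns nil for empty or malformed lines,
        // and compactMap drops those nil results.
        Double($0.trimmingCharacters(in: .whitespacesAndNewlines))
    }
}

// Usage with the first few samples from the data file.
let sampleLines = "   8.1472369e-01\n   9.0579194e-01\n   1.2698682e-01\n"
print(parseLines(sampleLines))
```

This still creates one String per line, so it only removes the extra array pass; it does not avoid the per-line string handling itself.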

  • Why do you think the String initializer is what's slow? Commented Feb 13, 2016 at 13:50
  • @matt Not much else there; what else do you think might be causing it? Commented Feb 13, 2016 at 13:53
  • @Rogare: What is the size of the input file, the time of execution and the hardware you are using? Commented Feb 13, 2016 at 13:55
  • I don't think anything. I would use Instruments and see. Also please show your actual input file content if you want real help. Commented Feb 13, 2016 at 13:55
  • @appzYourLife Thanks for the comment, please see new edits Commented Feb 13, 2016 at 14:01

1 Answer

You can read data quite fast with NSScanner(). The scanDouble() method skips leading whitespace, so no intermediate strings or arrays are needed:

func arrayFromContentsOfFileWithPath(path: String) -> [Double]? {
    do {
        let content = try String(contentsOfFile:path, encoding: NSUTF8StringEncoding)
        let scanner = NSScanner(string: content)
        var doubleArray = [Double]()
        var value = 0.0
        while scanner.scanDouble(&value) {
            doubleArray.append(value)
        }
        return doubleArray
    } catch _ as NSError {
        return nil
    }
}

In my test, reading 10,000 samples in Release configuration takes 0.0034 seconds, compared to 0.077 seconds with your code. That is an improvement by a factor of more than 20.

Update for Swift 3:

func arrayFromContentsOfFileWithPath(path: String) -> [Double]? {
    guard let content = try? String(contentsOfFile:path, encoding: .utf8) else {
        return nil
    }
    let scanner = Scanner(string: content)
    var doubleArray = [Double]()
    var value = 0.0
    while scanner.scanDouble(&value) {
        doubleArray.append(value)
    }
    return doubleArray
}
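
On more recent SDKs (macOS 10.15 / iOS 13 and later, as far as I know) the inout variant of scanDouble is marked deprecated in favor of one that returns an optional. A minimal sketch of the same loop with that API follows. The helper name doubles(from:) and the atPath: label are my own choices rather than anything from the code above, and the parsing is split out so it can be tried on a string without a file.

```swift
import Foundation

// Hypothetical helper: scans whitespace-separated numbers from text that
// has already been loaded.
func doubles(from content: String) -> [Double] {
    let scanner = Scanner(string: content)
    var result = [Double]()
    // By default the scanner skips whitespace and newlines before each
    // value; scanDouble() returns nil once no further number can be read.
    while let value = scanner.scanDouble() {
        result.append(value)
    }
    return result
}

// Same behavior as the function above: nil if the file cannot be read.
func arrayFromContentsOfFile(atPath path: String) -> [Double]? {
    guard let content = try? String(contentsOfFile: path, encoding: .utf8) else {
        return nil
    }
    return doubles(from: content)
}

// Usage with the first few samples from the question.
let sampleText = "   8.1472369e-01\n   9.0579194e-01\n   1.2698682e-01\n"
print(doubles(from: sampleText))
```

As with the versions above, scanning stops at the first token that is not a number, so a malformed line silently truncates the result instead of being skipped.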

1 Comment

Yup, I'm seeing 180 ms -> 9 ms on my end as well, so a 20x improvement. Fantastic, thanks! Seems like all that mapping was the bottleneck.
