0

I have a following code. It contains getPointAndPos function that needs to be as fast as possible:

struct Point {
   let x: Int
   let y: Int
}

struct PointAndPosition {
   let pnt: Point
   let pos: Int
}

class Elements {
    var points: [Point]

    init(points: [Point]) {
        self.points = points
    }

    func addPoint(x: Int, y: Int) {
        points.append(Point(x: x, y: y))
    }

    func getPointAndPos(pos: Int) -> PointAndPosition? {
        guard pos >= 0 && points.count > pos else {
            return nil
        }
        return PointAndPosition(pnt: points[pos], pos: pos)
    }
}

However, due to Swift memory management it is not fast at all. I used to use dictionary, but it was even worse. This function is heavily used in the application, so it is the main bottleneck now. Here are the profiling results for getPointAndPos function:

Profiler

As you can see it takes ~4.5 seconds to get an item from array, which is crazy. I tried to follow all performance optimization techniques that I could find, namely:

  • Using Array instead of Dictionary
  • Using simple types as Array elements (struct in my case)

It helped, but it is not enough. Is there a way to optimize it even further considering that I do not change elements from array after they are added?

UPDATE #1:

As suggested I replaced [Point] array with [PointAndPosition] one and removed optionals, which made the code 6 times faster. Also, as requested providing the code which uses getPointAndPos function:

private func findPoint(el: Elements, point: PointAndPosition, curPos: Int, limit: Int, halfLevel: Int,  incrementFunc: (Int) -> Int) -> PointAndPosition? {
    guard curPos >= 0 && curPos < el.points.count else {
        return nil
    }
    // get and check point here
    var next = curPos
    
    while true {
        let pnt = el.getPointAndPos(pos: next)
        if checkPoint(pp: point, pnt: pnt, halfLevel: halfLevel) {
            return pnt
        } else {
            next = incrementFunc(next)
            if (next != limit) {
                continue //then findPoint next limit incrementFunc
            }
            break
        }
    }
    return nil
}

Current implementation is much faster, but ideally I need to make it 30 times faster than it is now. Not sure if it is even possible. Here is the latest profiling result:

enter image description here

1
  • Just tried it. It adds to performance degradation. Looks better, but slower. Commented Mar 16, 2021 at 19:50

1 Answer 1

3

I suspect you're creating a PointAndPosition and then immediately throwing it away. That's the thing that's going to create a lot of memory churn. Or you're creating a lot of duplicate PointAndPosition values.

First make sure that this is being built in Release mode with optimizations. ARC can often remove a lot of unnecessary retains and releases when optimized.

If getPointAndPos has to be as fast as possible, then the data should be stored in the form it wants, which is an array of PointAndPosition:

class Elements {
    var points: [PointAndPosition]

    init(points: [Point]) {
        self.points = points.enumerated().map { PointAndPosition(pnt: $0.element, pos: $0.offset) }
    }

    func addPoint(x: Int, y: Int) {
        points.append(PointAndPosition(pnt: Point(x: x, y: y), pos: points.endIndex))
    }

    func getPointAndPos(pos: Int) -> PointAndPosition? {
        guard pos >= 0 && points.count > pos else {
            return nil
        }
        return points[pos]
    }
}

I'd take this a step further and reduce getPointAndPos to this:

    func getPointAndPos(pos: Int) -> PointAndPosition {
        points[pos]
    }

If this is performance critical, then bounds checks should already have been done, and you shouldn't need an Optional here.

I'd also be very interested in the code that calls this. That may be more the issue than this code. It's possible you're calling getPointAndPos more often than you need to. (Though getting rid of the struct creation will make that less important.)

Sign up to request clarification or add additional context in comments.

5 Comments

As requested, provided the code which uses getPointAndPos function in the update and added suggested optimization results. Thanks a lot! It made the code much faster. Would be glad if you can suggest how to optimize it it even further. Appreciate your help!
This code is probably ok. Optimizing it further would depend on the nature of checkPoint and incrementFunc and how many points you're talking about. For example, there may be better algorithms for implementing checkPoint or pre-computing some or all of it. This algorithm might be amenable to Accelerate methods to perform more checks in parallel. But if it's fast enough for your purposes, I don't see any red flags here. One thing that might help is to rethink passing incrementFunc if it's always something like $0 + n. In that case, just pass n and inline the math.
Function/closure calls are expensive if they really happen. But the compiler often inlines things so the function call doesn't happen, so you have to test on Release builds to see what's happening. But if I had to pick just one thing for performance-critical code: organize your data so that your algorithms become trivial. Wise (and sometimes clever) data organization is probably the single most important tool in high-performance code.
Increment functions are very simple, it is either {v in return v - 1} or {v in return v + 1}. checkPoint function is quite simple as well return pp.pnt.x - n == pnt.x || pp.pnt.x + n == pnt.x || pp.pnt.y - n == pnt.y || pp.pnt.y + n == pnt.y. The number of points in array is about 1000, but can be more in a rare cases.
1000 is pretty small, so may not matter unless you're calling this a lot, but it would be faster (and clearer) to get rid of the increment function, and just use whether curPos > limit to decide what direction to increment.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.