
I want the average pixel value of the entire image in the feed from AVCaptureVideoDataOutput, and I'm currently grabbing each frame and looping over its pixels to sum them.

I was wondering if there's a more efficient way to do this on the GPU/OpenGL, given that this is a parallelisable image-processing task. (Perhaps a heavy Gaussian blur, then read the central pixel value?)

One specific requirement is a high-precision result that takes advantage of the large number of pixels being averaged. Note the CGFloat result below.

Current Swift 2 code:

Edit: Added an implementation with CIAreaAverage, as suggested below by Simon. The two paths are selected by the useGPU bool.

func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {

    var redmean: CGFloat = 0.0
    var greenmean: CGFloat = 0.0
    var bluemean: CGFloat = 0.0

    if (useGPU) {
            let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
            let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)
            let filter = CIFilter(name: "CIAreaAverage")
            filter!.setValue(cameraImage, forKey: kCIInputImageKey)
            let outputImage = filter!.valueForKey(kCIOutputImageKey) as! CIImage!

            let ctx = CIContext(options:nil)
            let cgImage = ctx.createCGImage(outputImage, fromRect:outputImage.extent)

            let rawData: NSData = CGDataProviderCopyData(CGImageGetDataProvider(cgImage))!
            let pixels = UnsafePointer<UInt8>(rawData.bytes)
            let bytes = UnsafeBufferPointer<UInt8>(start: pixels, count: rawData.length)
            var BGRA_index = 0
            // The CIAreaAverage output is a single BGRA pixel, so this loop runs four times.
            for pixel in bytes {
                switch BGRA_index {
                case 0:
                    bluemean = CGFloat (pixel)
                case 1:
                    greenmean = CGFloat (pixel)
                case 2:
                    redmean = CGFloat (pixel)
                case 3:
                    break
                default:
                    break
                }
                BGRA_index += 1

            }
     } else {
            let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
            CVPixelBufferLockBaseAddress(imageBuffer!, 0)

            let baseAddress = CVPixelBufferGetBaseAddressOfPlane(imageBuffer!, 0)
            let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer!)
            let width = CVPixelBufferGetWidth(imageBuffer!)
            let height = CVPixelBufferGetHeight(imageBuffer!)
            let colorSpace = CGColorSpaceCreateDeviceRGB()

            let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.PremultipliedFirst.rawValue).rawValue | CGBitmapInfo.ByteOrder32Little.rawValue

            let context = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, bitmapInfo)
            let imageRef = CGBitmapContextCreateImage(context)
            CVPixelBufferUnlockBaseAddress(imageBuffer!, 0)
            let data:NSData = CGDataProviderCopyData(CGImageGetDataProvider(imageRef))!
            let pixels = UnsafePointer<UInt8>(data.bytes)
            let bytes = UnsafeBufferPointer<UInt8>(start:pixels, count:data.length)
            var redsum: CGFloat = 0
            var greensum: CGFloat = 0
            var bluesum: CGFloat = 0
            var BGRA_index = 0
            for pixel in bytes {
                switch BGRA_index {
                case 0:
                    bluesum += CGFloat(pixel)
                case 1:
                    greensum += CGFloat(pixel)
                case 2:
                    redsum += CGFloat(pixel)
                case 3:
                    // alphasum += UInt64(pixel)
                    break
                default:
                    break
                }
                BGRA_index += 1
                if BGRA_index == 4 { BGRA_index = 0 }
            }
            // Four bytes per pixel, so divide by the pixel count, not the byte count.
            let pixelCount = CGFloat(bytes.count / 4)
            redmean = redsum / pixelCount
            greenmean = greensum / pixelCount
            bluemean = bluesum / pixelCount
        }

    print("R:\(redmean) G:\(greenmean) B:\(bluemean)")
}
  • If you're handed a surface from that API (I'm not familiar with it), you should be able to feed it through OpenGL's explicit mipmap generation. It'll then proceed to average quarter-resolution mipmaps in succession down to the final LOD: 1x1. That last LOD is your average. I don't know how it's implemented on iOS or OS X though, so performance might be the same or worse. Commented Sep 23, 2015 at 7:49
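To make the comment's arithmetic concrete, here is a CPU sketch (in current Swift, not the question's Swift 2) of what mipmap generation computes: repeatedly averaging 2x2 blocks until one value remains. For a power-of-two square grayscale image each level is exact, so the 1x1 level equals the overall mean. The function name is illustrative, not from any API.

```swift
// CPU sketch of what GPU mipmap generation computes: repeatedly
// average 2x2 blocks until a single value remains. For an N x N
// power-of-two grayscale image the final 1x1 level is the exact mean.
func mipmapAverage(_ image: [[Double]]) -> Double {
    var level = image
    while level.count > 1 {
        let half = level.count / 2
        var next = [[Double]](repeating: [Double](repeating: 0, count: half),
                              count: half)
        for y in 0..<half {
            for x in 0..<half {
                // Each output pixel is the mean of a 2x2 input block.
                next[y][x] = (level[2*y][2*x]     + level[2*y][2*x + 1]
                            + level[2*y + 1][2*x] + level[2*y + 1][2*x + 1]) / 4
            }
        }
        level = next
    }
    return level[0][0]
}
```

On the GPU each level is built in parallel, which is where the speedup over a serial pixel loop would come from.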

3 Answers


The issue, and the reason for the poor performance of your CIAreaAverage filter, is the missing definition of the input extent. As a consequence, the output of the filter has the same size as the input image, so you loop over a full-size image instead of a 1-by-1-pixel image, and execution takes about as long as your initial version.

As described in the documentation of CIAreaAverage, you can specify an inputExtent parameter. How this can be done in Swift is shown in this answer to a similar question:

    let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)
    let extent = cameraImage.extent
    let inputExtent = CIVector(x: extent.origin.x, y: extent.origin.y, z: extent.size.width, w: extent.size.height)
    let filter = CIFilter(name: "CIAreaAverage", withInputParameters: [kCIInputImageKey: cameraImage, kCIInputExtentKey: inputExtent])!
    let outputImage = filter.outputImage!

If you want to squeeze out even more performance, you can ensure that you reuse your CIContext, instead of recreating it for each captured frame.
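A minimal sketch of that reuse, in the question's Swift 2 style (the property name sharedCIContext and the function name are illustrative, not from the question):

```swift
import CoreImage

// Sketch: create the CIContext once and reuse it for every frame,
// instead of allocating a new one inside captureOutput.
let sharedCIContext = CIContext(options: nil)

func averagePixelImage(cameraImage: CIImage) -> CGImage {
    let extent = cameraImage.extent
    let inputExtent = CIVector(x: extent.origin.x, y: extent.origin.y,
                               z: extent.size.width, w: extent.size.height)
    let filter = CIFilter(name: "CIAreaAverage",
                          withInputParameters: [kCIInputImageKey: cameraImage,
                                                kCIInputExtentKey: inputExtent])!
    let outputImage = filter.outputImage!
    // Reusing the context avoids per-frame setup cost.
    return sharedCIContext.createCGImage(outputImage, fromRect: outputImage.extent)
}
```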


1 Comment

This works great, but I found that this leaks memory if not enclosed in an autoreleasepool
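A sketch of the wrapping this comment describes, using the question's Swift 2 delegate signature; the elided filter work stands in for the CIAreaAverage code above:

```swift
func captureOutput(captureOutput: AVCaptureOutput!,
                   didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
                   fromConnection connection: AVCaptureConnection!) {
    // Wrapping the per-frame Core Image work in an autoreleasepool
    // releases the temporary CGImage/NSData objects after each frame,
    // instead of letting them accumulate across the capture session.
    autoreleasepool {
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)
        // ... run CIAreaAverage and read back the single pixel here ...
    }
}
```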

There's a Core Image filter that does this very job, CIAreaAverage, which returns a single-pixel image that contains the average color for the region of interest (your region of interest will be the entire image).

FYI, I have a blog post that discusses applying Core Image filters to a live camera feed here. In a nutshell, the filter requires a CIImage, which you can create inside captureOutput based on sampleBuffer:

let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)

...and it's that cameraImage you'll need to pass to CIAreaAverage.

Cheers,

Simon

5 Comments

Good suggestion! I've just implemented it, and just about to add the code to the question. I'd love to get float precision out, and currently this method gives me int. Any ideas?
Interestingly this appears to run slower than my manual loop. I am running this on an iPhone 5, so it may be related to the relative GPU performance on the 5
Ugh - sorry about that. I know some of the Core Image filters are Metal backed and Metal on a 5s can be a little ropey. Let me know if your previous question still holds or if you've given up on that approach.
I'd like to keep trying this approach. I'll be getting a 6S soon, and it's OK for me to require newer devices for this app. If you have any idea for float, I'd appreciate your thoughts. Cheers
Generally speaking, when you pass kCGBitmapFloatComponents as an option to CGBitmapContextCreate, you get floating point data. You need to orchestrate the other parameters like bit depth etc. too, obviously.
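A sketch of what this comment describes, in the question's Swift 2 style: a 1x1 bitmap context with 32-bit float components to receive the averaged pixel. The parameter combination shown is one plausible choice and is not verified here; supported float pixel formats vary by platform.

```swift
// Hypothetical: a 1x1 destination whose components are 32-bit floats,
// so drawing the CIAreaAverage result into it yields float channel
// values instead of quantised UInt8s.
var floatPixel = [Float](count: 4, repeatedValue: 0)          // RGBA
let bitmapInfo = CGBitmapInfo.FloatComponents.rawValue |
                 CGImageAlphaInfo.PremultipliedLast.rawValue
let floatContext = CGBitmapContextCreate(&floatPixel,
                                         1, 1,                // 1x1 pixels
                                         32,                  // bits per component
                                         16,                  // 4 floats per row
                                         CGColorSpaceCreateDeviceRGB(),
                                         bitmapInfo)
// Draw the averaged CGImage into floatContext; floatPixel then holds
// the average as Floats.
```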

If you had your data as floating-point values, you could use vDSP_meanv from the Accelerate framework.

If that's not an option, try arranging the data so that the optimizer can use SIMD instructions. I don't have a good recipe for that; it has been a trial-and-error exercise for me, but certain rearrangements of the code stand a better chance than others. For example, I would try removing the switch from the loop. SIMD will vectorize your calculations, and in addition you can use multithreading via GCD by processing each row of the image data on a separate core...
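For reference, a sketch of the vDSP_meanv approach on interleaved BGRA floats (written in current Swift; assumes the pixel data has already been converted to [Float], and the function name is illustrative). A stride of 4 walks a single channel, and the starting offset selects which one:

```swift
import Accelerate

// Sketch: per-channel means of interleaved BGRA float data using
// vDSP_meanv(input, stride, output, count).
func channelMeans(pixels: [Float]) -> (blue: Float, green: Float, red: Float) {
    let pixelCount = vDSP_Length(pixels.count / 4)
    var blue: Float = 0, green: Float = 0, red: Float = 0
    pixels.withUnsafeBufferPointer { buf in
        let base = buf.baseAddress!
        vDSP_meanv(base,     4, &blue,  pixelCount)  // B at offset 0
        vDSP_meanv(base + 1, 4, &green, pixelCount)  // G at offset 1
        vDSP_meanv(base + 2, 4, &red,   pixelCount)  // R at offset 2
    }
    return (blue, green, red)
}
```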

3 Comments

vDSP_meanv looks great... I started implementing that but got distracted by @simon-gladman's suggestion. If you think vDSP would be a quicker route, I'll happily persevere
I would start with Simon's suggestion. Looks really cool. vDSP is CPU accelerated only, btw.
How do you use vDSP_meanv?
