0

In VBA, conveniently, programmers can access Excel functions via the Application.WorksheetFunction object. To save the programmer defining everything as spreadsheet-esque 2D arrays, the aggregate functions like WorksheetFunction.Average() will accept 1D arrays as well.

However, I've just spent the past few hours trying to pin down a unit test failure in a complex bit of statistical code, and I've found that the behaviour of the aggregate functions on 1D arrays changes once you go past an array length of 65536 (i.e. the max value of an Integer in VBA). In particular, the following code constructs the series 1,2,3,4,5,... and then takes the average, which should equal (n+1)/2:

Dim arr1D() As Double, arr2D() As Double, kk As Long

Debug.Print "Arr length", "True mean", "Excel mean (1D)", "Excel mean (2D)"
For kk = 65535 To 65538
    ReDim arr1D(1 To kk), arr2D(1 To kk, 1 To 1)
    For ii = 1 To kk
        arr1D(ii) = ii
        arr2D(ii, 1) = ii
    Next ii ' both arrays now contain 1, 2, 3, ...
    Debug.Print kk, (kk + 1) / 2, WorksheetFunction.Average(arr1D), WorksheetFunction.Average(arr2D)
Next kk

It produces this output - note the difference depending on whether the array is 1D or 2D:

Arr length  True mean  Excel mean (1D)  Excel mean (2D)
 65535       32768      32768            32768 
 65536       32768.5    32768.5          32768.5 
 65537       32769      1                32769 
 65538       32769.5    1.5              32769.5 

Similar weird behaviour is seen with other aggregate functions - for example StDev_S similarly rolls over at 65538 items, and outright crashes at 65537. In the words of my generation: nani the f***?

Excel version is: Microsoft® Excel® for Microsoft 365 MSO (Version 2205 Build 16.0.15225.20028) 32-bit. A possible related question is here. An obvious workaround is to implement my own aggregate functions - which is what I'm currently doing - but that doesn't answer the question of why the above code is doing what it does.

5
  • "...single-dimensional arrays are passed back to Excel as a row of columns...", cf Charles WIlliams' response here: stackoverflow.com/questions/11237195/… Commented Jul 15, 2022 at 12:08
  • This seems related to a previously described limitation of WorksheetFunction.Transpose when transposing a 1D array wherein only the Only the first uBound(1D_array) mod 2^16 elements will be returned to the transposed array. And, btw, the upper limit of the VBA integer data type is 32767. Commented Jul 15, 2022 at 12:20
  • @JosWoolley Does that really apply? Excel-365 has a limit 16,384 columns, yet 1D arrays larger than that can be successfully processed. Commented Jul 15, 2022 at 12:29
  • @JosWoolley But the array never gets passed back to Excel; it gets consumed by the aggregate function which, in this case, is passed to Debug.Print. So the question of whether the single dimension reflects a row or a column should never come up. Commented Jul 15, 2022 at 14:46
  • @RonRosenfeld Good point re the VBA integer type - that suggests that the problem is an unsigned int in the underlying C code. Which I kinda expected given that this has all the symptoms of a classic case of integer wraparound. Commented Jul 15, 2022 at 14:50

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.