Using go version go1.25.3 darwin/arm64.
The implementation below is a simplified version of the actual one.
type WaitObject struct{ c chan struct{} }

func StartNewTestObject(d time.Duration) *WaitObject {
	obj := &WaitObject{c: make(chan struct{})}
	go func() {
		time.Sleep(d)
		close(obj.c)
	}()
	return obj
}

// WaitAllObjects is the target for benchmarking.
func WaitAllObjects(objs ...*WaitObject) {
	for _, obj := range objs {
		<-obj.c
	}
}
I created helper functions that build multiple WaitObject values by calling the above StartNewTestObject function with different values for d, and then benchmarked the above WaitAllObjects function on the generated WaitObject values.
To get accurate benchmark results, I am using b.StopTimer() and b.StartTimer() inside the b.Loop() loop, so that each iteration runs on new WaitObject values.
However, the results show more allocations than expected, and it seems that the only way to get the expected number of allocations is to use an inaccurate benchmark (one that generates the WaitObject values once and reuses them on every iteration).
This is the benchmark code:
func BenchmarkWaitTestObjects3(b *testing.B) {
	for _, tc := range waitObjectCases {
		b.Run(tc.name, func(b *testing.B) {
			helperBenchmarkWaitTestObjects3(b, tc)
		})
	}
}

func helperBenchmarkWaitTestObjects3(b *testing.B, tc waitObjectCase) {
	b.ReportAllocs()
	for b.Loop() {
		b.StopTimer()
		objs := startWaitObjects(b, tc) // helper function to create `WaitObject` values.
		b.StartTimer()
		WaitAllObjects(objs...)
	}
}
This produced the below results:
goos: darwin
goarch: arm64
cpu: Apple M2
BenchmarkWaitTestObjects3
BenchmarkWaitTestObjects3/500ns
BenchmarkWaitTestObjects3/500ns-8 60702 19406 ns/op 7241 B/op 75 allocs/op
What am I missing? How can I benchmark only my WaitAllObjects function on newly generated objects?
The full code is available here: https://go.dev/play/p/hBtyRNWAEkA?v=
EDIT:
As noted by @Mr_Pink in the comments, I wasn't waiting for all the goroutines to be started before starting the timer and benchmarking my target function.
So, I updated the functions that start the goroutines to make sure all goroutines are started before the benchmarking begins.
They're now as follows:
func StartNewTestObject(d time.Duration, wg *sync.WaitGroup) *WaitObject {
	wo := NewWaitObject()
	go func() {
		wg.Done()
		time.Sleep(d)
		close(wo.c)
	}()
	return wo
}

func startWaitObjects(tb testing.TB, c waitObjectCase) []*WaitObject {
	tb.Helper()
	wg := sync.WaitGroup{}
	wg.Add(c.n)
	objs := make([]*WaitObject, 0, c.n)
	for range c.n {
		objs = append(objs, StartNewTestObject(c.d, &wg))
	}
	wg.Wait()
	return objs
}
With this change, the benchmark produces results much closer to what I expected:
goos: darwin
goarch: arm64
cpu: Apple M2
BenchmarkWaitTestObjects3
BenchmarkWaitTestObjects3/500ns
BenchmarkWaitTestObjects3/500ns-8 183475 6753 ns/op 13 B/op 0 allocs/op
The updated code is available here: https://go.dev/play/p/QaGOnmDSmTd
The goroutine started by StartNewTestObject may not actually be dispatched until long after the function returns, so its startup work can get caught by the memory comparison used to count allocations.