confusing concurrency and performance issue in Go

Question

now I start learning Go language by watching this great course. To be clear for years I write only on PHP and concurrency/parallelism is new for me, so I little confused by this.

In this course, there is a task to create a program which calculates factorial with 100 computations. I went a bit further and to comparing performance I changed it to 10000 and for some reason, the sequential program works same or even faster than concurrency.

Here I'm going to provide 3 solutions: mine, teachers and sequential

My solution:

package main

import (
 "fmt"
)

func gen(steps int) <-chan int{
     out := make(chan int)
     go func() {
         for j:= 0; j <steps; j++ {
             out <- j
         }
         close(out)
      }()
      return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int) 
    go func() {
        for n := range in {
            out <-  fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n;i>0;i-- {
        total *=i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for n:= range factorial(gen(10)) {
            fmt.Println(n)
        }
     }
}

execution time:

real 0m6,356s
user 0m3,885s
sys 0m0,870s

Teacher solution: package main

import (
 "fmt"
)

func gen(steps int) <-chan int{
    out := make(chan int)
    go func() {
        for i := 0; i < steps; i++ {
            for j:= 0; j <10; j++ {
                out <- j
            }
        }
        close(out)
    }()
    return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int) 
    go func() {
        for n := range in {
            out <-  fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n;i>0;i-- {
        total *=i
    }
    return total
}

func main() {
    steps := 10000
    for n:= range factorial(gen(steps)) {
        fmt.Println(n)
    }
}

execution time:

real 0m2,836s
user 0m1,388s
sys 0m0,492s

Sequential:

package main

import (
 "fmt"
)

func fact(n int) int {
    total := 1
    for i := n;i>0;i-- {
        total *=i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for j:= 0; j <10; j++ {
            fmt.Println(fact(j))
        }
    }
}

execution time:

real 0m2,513s
user 0m1,113s
sys 0m0,387s

So, as you can see the sequential solution is fastest, teachers solution is in the second place and my solution is third.

First question: why the sequential solution is fastest? And second why my solution is so slow? if I understanding correctly in my solution I'm creating 10000 goroutines inside gen and 10000 inside factorial and in teacher solution, he is creating only 1 goroutine in gen and 1 in factorial. My so slow because I'm creating too many unneeded goroutines?

GilesW · Accepted Answer · 2018-04-20 09:41:59Z

2

It's the difference between concurrency and parellelism - your's, you teachers and the sequential are progressively less concurrent in design but how parallel they are depends on number of CPU cores and there is a set up and communication cost associated with concurrency. There are no asynchronous calls in the code so only parallelism will improve speed.

This is worth a look: https://blog.golang.org/concurrency-is-not-parallelism

Also, even with parallel cores speedup will be dependent on nature of the workload - google Amdahl's law for explanation.

answered Apr 20, 2018 at 9:41

GilesW

6153 silver badges7 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Bogdan Dubyk Over a year ago

yeah, I know that concurrency and parallelism are not the same. Btw as I know the current version of go using more than one core by default, isn't it? If so I think it should work faster, no?

GilesW Over a year ago

Yes It can still be slower on more than one core if the cost of setting up the goroutines and communicating between them is high relative to the benefits of the parallel computation

GilesW Over a year ago

I can't be more specific relative to the code as the first two code samples in the question appear to be the same :)

Bogdan Dubyk Over a year ago

hmm, great thank, I think I need some time and more practice to figure out where and when is better to use parallelism. And I think the main idea of this example is just to show how the program can work concurrently and not to show a performance increase? But teacher explain question why we need goroutines to calculate factorials? is to make a program work faster with a lot of computations, which is not true in this case and that is weird.

Bogdan Dubyk Over a year ago

well, they are not same, you can see how much time we need for each of them to be executed. One took 6 seconds and other one 3 seconds

peterSO · Accepted Answer · 2018-04-20 13:15:41Z

Let's start with some fundamental benchmarks for factorial computation.

$ go test -run=! -bench=. factorial_test.go 
goos: linux
goarch: amd64
BenchmarkFact0-4            1000000000           2.07 ns/op
BenchmarkFact9-4            300000000            4.37 ns/op
BenchmarkFact0To9-4         50000000            36.0 ns/op
BenchmarkFact10K0To9-4          3000        384069 ns/op
$

The CPU time is very small, even for 10,000 iterations of factorials zero through nine.

factorial_test.go:

package main

import "testing"

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

var sinkFact int

func BenchmarkFact0(b *testing.B) {
    for N := 0; N < b.N; N++ {
        j := 0
        sinkFact = fact(j)
    }
}

func BenchmarkFact9(b *testing.B) {
    for N := 0; N < b.N; N++ {
        j := 9
        sinkFact = fact(j)
    }
}

func BenchmarkFact0To9(b *testing.B) {
    for N := 0; N < b.N; N++ {
        for j := 0; j < 10; j++ {
            sinkFact = fact(j)
        }
    }
}

func BenchmarkFact10K0To9(b *testing.B) {
    for N := 0; N < b.N; N++ {
        steps := 10000
        for i := 0; i < steps; i++ {
            for j := 0; j < 10; j++ {
                sinkFact = fact(j)
            }
        }
    }
}

Let's look at the time for the sequential program.

$ go build -a sequential.go && time ./sequential
real    0m0.247s
user    0m0.054s
sys     0m0.149s

Writing to the terminal is obviously a major bottleneck. Let's write to a sink.

$ go build -a sequential.go && time ./sequential > /dev/null
real    0m0.070s
user    0m0.049s
sys     0m0.020s

It's still a lot more than the 0m0.000000384069s for the factorial computation.

sequential.go:

package main

import (
    "fmt"
)

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for j := 0; j < 10; j++ {
            fmt.Println(fact(j))
        }
    }
}

Attempts to use concurrency for such a trivial amount of parallel work are likely to fail. Go goroutines and channels are cheap, but they are not free. Also, a single channel and a single terminal are the bottleneck, the limiting factor, even when writing to a sink. See Amdahl's Law for parallel computing. See Concurrency is not parallelism.

$ go build -a teacher.go && time ./teacher > /dev/null
real    0m0.123s
user    0m0.123s
sys     0m0.022s

$ go build -a student.go && time ./student > /dev/null
real    0m0.135s
user    0m0.113s
sys     0m0.038s

teacher.go:

package main

import (
    "fmt"
)

func gen(steps int) <-chan int {
    out := make(chan int)
    go func() {
        for i := 0; i < steps; i++ {
            for j := 0; j < 10; j++ {
                out <- j
            }
        }
        close(out)
    }()
    return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

func main() {
    steps := 10000
    for n := range factorial(gen(steps)) {
        fmt.Println(n)
    }
}

student.go:

package main

import (
    "fmt"
)

func gen(steps int) <-chan int {
    out := make(chan int)
    go func() {
        for j := 0; j < steps; j++ {
            out <- j
        }
        close(out)
    }()
    return out
}

func factorial(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- fact(n)
        }
        close(out)
    }()
    return out
}

func fact(n int) int {
    total := 1
    for i := n; i > 0; i-- {
        total *= i
    }
    return total
}

func main() {
    steps := 10000
    for i := 0; i < steps; i++ {
        for n := range factorial(gen(10)) {
            fmt.Println(n)
        }
    }
}

thanks, that exactly what I think. It's too trivial task for creating goroutines

Collectives™ on Stack Overflow

confusing concurrency and performance issue in Go

2 Answers 2

5 Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

5 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related