
Is there a specific language implementation in Kotlin which differs from another language's implementation of coroutines?

  • What does it mean that a coroutine is like a lightweight thread?
  • What is the difference?
  • Are Kotlin coroutines actually running in parallel (concurrently)?
  • Even in a multi-core system, is there only one coroutine running at any given time?

Here I'm starting 100,000 coroutines. What happens behind this code?

for (i in 0..100000) {
    async(CommonPool) {
        // Run long-running operations
    }
}

3 Answers

119

What does it mean that a coroutine is like a lightweight thread?

A coroutine, like a thread, represents a sequence of actions that execute concurrently with other coroutines (threads).

What is the difference?

A thread is directly linked to a native thread in the underlying OS (operating system) and consumes a considerable amount of resources. In particular, it consumes a lot of memory for its stack. That is why you cannot just create 100k threads: you are likely to run out of memory. Switching between threads involves the OS kernel dispatcher, which is a pretty expensive operation in terms of CPU cycles consumed.

A coroutine, on the other hand, is purely a user-level language abstraction. It does not tie up any native resources and, in the simplest case, uses just one relatively small object in the JVM heap. That is why it is easy to create 100k coroutines. Switching between coroutines does not involve the OS kernel at all. It can be as cheap as invoking a regular function.
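To make the cost difference concrete, here is a minimal sketch that starts 100,000 coroutines and waits for all of them. It assumes the current kotlinx.coroutines API (runBlocking, async, Dispatchers.Default) is on the classpath; sumWithCoroutines is a hypothetical helper name used only for illustration.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical helper: starts `count` coroutines that each suspend briefly,
// then sums the values they return.
fun sumWithCoroutines(count: Int): Long = runBlocking {
    (1..count)
        .map { i ->
            async(Dispatchers.Default) {
                delay(100)   // suspends the coroutine without blocking a thread
                i.toLong()
            }
        }
        .awaitAll()
        .sum()
}

fun main() {
    // 100,000 coroutines share a small pool of threads.
    println(sumWithCoroutines(100_000))   // prints 5000050000
}
```

Trying the same thing with 100,000 real threads would most likely hit memory or OS limits, which is the point of the comparison above.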

Are Kotlin coroutines actually running in parallel (concurrently)? Even in a multi-core system, is there only one coroutine running at any given time?

A coroutine can be either running or suspended. A suspended coroutine is not associated with any particular thread, but a running coroutine runs on some thread (using a thread is the only way to execute anything inside an OS process). Whether different coroutines all run on the same thread (and thus may use only a single CPU in a multicore system) or on different threads (and thus may use multiple CPUs) is purely in the hands of the programmer who is using coroutines.
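As a rough illustration of that choice, the sketch below runs the same coroutines on two different dispatchers and records which threads they ran on. It assumes the current kotlinx.coroutines API; threadsUsed is a hypothetical helper, not part of the library.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: runs `count` coroutines on `dispatcher` and returns
// the names of the threads that executed them.
fun threadsUsed(dispatcher: CoroutineDispatcher, count: Int): Set<String> {
    val names = ConcurrentHashMap.newKeySet<String>()
    runBlocking {
        repeat(count) {
            launch(dispatcher) { names += Thread.currentThread().name }
        }
    }   // runBlocking returns only after all launched children have finished
    return names
}

fun main() {
    // All coroutines confined to one thread: concurrency without parallelism.
    Executors.newSingleThreadExecutor().asCoroutineDispatcher().use { single ->
        println(threadsUsed(single, 1_000).size)   // prints 1
    }
    // A shared pool: the count depends on the machine, so no output is claimed here.
    println(threadsUsed(Dispatchers.Default, 1_000).size)
}
```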

In Kotlin, dispatching of coroutines is controlled via the coroutine context. You can read more about it in the Guide to kotlinx.coroutines.

Here I'm starting 100,000 coroutines. What happens behind this code?

Assuming that you are using the launch function and the CommonPool context from the kotlinx.coroutines project (which is open source), you can examine their source code in that project.

The launch function just creates a new coroutine, while CommonPool dispatches coroutines to ForkJoinPool.commonPool(), which does use multiple threads and thus executes on multiple CPUs in this example.

The code that follows the launch invocation in {...} is called a suspending lambda. What it is, how suspending lambdas and functions are implemented (compiled), and how standard library functions and classes such as startCoroutine, suspendCoroutine, and CoroutineContext work are all explained in the corresponding Kotlin coroutines design document.
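For a feel of those low-level pieces, here is a small sketch that uses only the Kotlin standard library (kotlin.coroutines), without kotlinx.coroutines. The names parked, awaitValue, and runDemo are hypothetical and exist only for this illustration.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical slot that holds the continuation of the suspended coroutine.
var parked: Continuation<Int>? = null

// Suspends the caller and stores its continuation instead of resuming it.
suspend fun awaitValue(): Int = suspendCoroutine { cont -> parked = cont }

fun runDemo(): List<String> {
    val log = mutableListOf<String>()
    val block: suspend () -> Int = {   // a suspending lambda
        log += "started"
        val value = awaitValue()       // suspension point
        log += "resumed with $value"
        value * 2
    }
    // Runs the lambda on the current thread until its first suspension.
    block.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        log += "completed with ${result.getOrThrow()}"
    })
    log += "suspended"                 // we get control back while the coroutine waits
    parked!!.resume(21)                // resuming is an ordinary method call
    return log
}

fun main() {
    runDemo().forEach(::println)
    // prints: started, suspended, resumed with 21, completed with 42 (one per line)
}
```

No thread is created or blocked here; suspending and resuming are plain function calls on a continuation object.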


5 Comments

So roughly speaking, does that mean starting a coroutine is similar to adding a job to a thread queue, where the thread queue is controlled by the user?
Yes. It can be a queue for a single thread or a queue for a thread pool. You can view coroutines as a higher-level primitive that lets you avoid manually (re)submitting continuations of your business logic to the queue.
So doesn't that mean that when we run multiple coroutines in parallel, it is not true parallelism if the number of coroutines is much bigger than the number of threads in the queue? If that's the case, then this sounds really similar to Java's Executor. Is there any relationship between these two?
That is not different from threads. If the number of threads is larger than the number of physical cores, then it is not true parallelism. The difference is that threads are scheduled onto cores preemptively, while coroutines are scheduled onto threads cooperatively.
The second and third links are broken (404).
93

Since I used coroutines only on JVM, I will talk about the JVM backend. There are also Kotlin Native and Kotlin JavaScript, but these backends for Kotlin are out of my scope.

So let's start by comparing Kotlin coroutines to other languages' coroutines. Basically, you should know that there are two types of coroutines: stackless and stackful. Kotlin implements stackless coroutines, which means that a coroutine doesn't have its own stack, and that limits a little bit what a coroutine can do. You can read a good explanation here.

Examples:

  • Stackless: C#, Scala, Kotlin
  • Stackful: Quasar, Javaflow

What does it mean that a coroutine is like a lightweight thread?

It means that a coroutine in Kotlin doesn't have its own stack, doesn't map onto a native thread, and doesn't require context switching on a processor.

What is the difference?

Thread - preemptive multitasking (usually). Coroutine - cooperative multitasking.

Thread - managed by OS (usually). Coroutine - managed by a user.

Are Kotlin coroutines actually running in parallel (concurrently)?

It depends. You can run each coroutine in its own thread, or you can run all coroutines in one thread or some fixed thread pool.

More about how coroutines execute is here.

Even in a multi-core system, is there only one coroutine running at any given time?

No, see the previous answer.

Here I'm starting 100,000 coroutines. What happens behind this code?

Actually, it depends. But assume that you write the following code:

fun main(args: Array<String>) {
    for (i in 0..100000) {
        async(CommonPool) {
            delay(1000)
        }
    }
}

This code finishes instantly, because nothing waits for the results of the async calls.

So let's fix this:

fun main(args: Array<String>) = runBlocking {
    for (i in 0..100000) {
        val job = async(CommonPool) {
            delay(1)
            println(i)
        }

        job.join()
    }
}

When you run this program, Kotlin will create about 2 * 100000 instances of Continuation, which will take a few dozen MB of RAM, and in the console you will see the numbers from 0 to 100000.

So let’s rewrite this code in this way:

fun main(args: Array<String>) = runBlocking {

    val job = async(CommonPool) {
        for (i in 0..100000) {
            delay(1)
            println(i)
        }
    }

    job.join()
}

What do we achieve now? Now we create only 100,001 instances of Continuation, and this is much better.

Each created Continuation will be dispatched and executed on CommonPool (which is a static instance of ForkJoinPool).
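A note for readers on newer library versions: as far as I know, CommonPool was later removed from kotlinx.coroutines, Dispatchers.Default took its place, and builders such as launch and async are now called on a CoroutineScope. Under those assumptions, a rough modern equivalent of the idea above might look like this (runAll is a hypothetical name):

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: starts `count` coroutines on the default pool,
// waits for all of them, and returns how many finished.
fun runAll(count: Int): Int = runBlocking {
    val finished = AtomicInteger()
    val jobs = List(count) {
        launch(Dispatchers.Default) {   // child of the runBlocking scope
            delay(1)
            finished.incrementAndGet()
        }
    }
    jobs.joinAll()                      // start everything first, then wait once
    finished.get()
}

fun main() {
    println(runAll(100_000))   // prints 100000
}
```

Starting all coroutines before joining lets them overlap, whereas calling join() inside the loop runs them one after another.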

8 Comments

Great answer, but I'd suggest to make one important correction. The coroutines in Kotlin used to be stackless in initial pre-release preview, but were actually released in Kotlin 1.1 with support for suspension at any stack depth, just like in Quasar, for example. For those who are familiar with Quasar, it is quite easy to see 1-to-1 correspondence between Quasar's throws SuspendExecution and Kotlin's suspend modifier. The implementation details are quite different, of course, but user experience is quite similar.
You are also welcome to check out details on the actual implementation of Kotlin coroutines in the corresponding design document.
Frankly, I don't know what the term "stackful coroutine" means. I have not seen any formal/technical definition of this term and I've seen different people using it in completely contradictory ways. I'd avoid using the term "stackful coroutine" altogether. What I can say for sure, and what is easy to verify, is that Kotlin coroutines are way closer to Quasar and are very much unlike C#. Putting Kotlin coroutines into the same bin as C# async does not seem right regardless of your particular definition of the term "stackful coroutine".
I'd classify coroutines in various languages in the following way: C#, JS, etc have future/promise-based coroutines. Any asynchronous computation in these languages must return some kind of future-like object. It is not really fair to call them stackless. You can express async computations of any depth, it is just syntactically and implementation-wise inefficient with them. Kotlin, Quasar, etc have suspension/continuation-based coroutines. They are strictly more powerful, because they can be used with future-like objects or without them, using suspending functions alone.
Ok. Here is a good paper that gives background on coroutines and gives more-or-less precise definition of "stackful coroutine": inf.puc-rio.br/~roberto/docs/MCC15-04.pdf It implies that Kotlin implements stackful coroutines.
3

Here I provide my understanding of threads and coroutines as a Kotlin programmer.
Note that the new Java virtual threads are similar to coroutines in Kotlin.

Thread

A thread is a sequence of programming language statements such as assignments, calculations, ifs, fors, whiles, function calls, etc. (each of which compiles to one or more CPU instructions) that the CPU executes sequentially, one after the other. Statements in one thread CAN run concurrently or in parallel with respect to statements in other threads (if there are any other threads).

In other words, you tell the OS that this block of code (a bunch of statements like any other ordinary code, maybe tens or thousands of lines, creating objects, calling functions, those functions calling other functions, changing variables, having loops, etc.) can be executed concurrently or in parallel with respect to code that is not in the block.

So, if we have threads t1, t2, and t3 in the process, the operating system may execute one or more statements of t1, then switch to t2 and continue where it left off last time, then switch to t3 and continue where it left off last time, and then switch back to t1 again. This process continues (pun intended) until either there are no more statements in the threads or the process is terminated/killed.

Parallel vs concurrent

Parallel is simultaneous: in a given moment, two or more things are executing at the same time.
Concurrent is rapid switching between two or more things so that they appear to be parallel: in a given moment, only a single thing is executing.

If the CPU is multicore, the OS may execute threads in parallel/simultaneously instead of concurrently (repeatedly switching between them).

Coroutine

A coroutine is exactly like a thread in that it is a sequence of statements that run one after the other within the coroutine itself, but CAN run concurrently with respect to statements in other coroutines (if there are any other coroutines).

However, coroutines run on top of threads, and instead of the operating system being in charge of switching between them, it is the programming language runtime that switches between coroutines. So, if the coroutine c1 is executing in thread t1, the runtime may switch to execute c2 and then c3 and so on while the OS has not yet even switched from t1 to t2. The OS does not even know about coroutines; it sees just a thread of statements (pun intended), some of which belong to c1 and some to c2, etc.
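The switching described above can be sketched on a single thread with nothing but the standard library. This example uses the iterator { } builder, which is built on the same suspension mechanism, and a tiny round-robin loop standing in for the runtime's scheduler. The names worker and interleave are hypothetical.

```kotlin
// Hypothetical worker: a coroutine that logs one step, then gives up control.
fun worker(name: String, steps: Int, log: MutableList<String>): Iterator<Unit> = iterator {
    for (step in 1..steps) {
        log += "$name step $step"
        yield(Unit)   // suspension point: control returns to the loop below
    }
}

// A toy scheduler: resumes each coroutine in turn until all have finished.
fun interleave(): List<String> {
    val log = mutableListOf<String>()
    val ready = ArrayDeque(listOf(worker("c1", 2, log), worker("c2", 2, log)))
    while (ready.isNotEmpty()) {
        val current = ready.removeFirst()
        if (current.hasNext()) {      // runs the coroutine up to its next yield
            current.next()
            ready.addLast(current)    // not finished yet, queue it again
        }
    }
    return log
}

fun main() {
    interleave().forEach(::println)
    // prints: c1 step 1, c2 step 1, c1 step 2, c2 step 2 (one per line)
}
```

Everything here runs on the main thread, so from the OS's point of view there is only one thread of statements.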

Comparison and the difference

  • The operating system keeps track of each thread by a data structure called TCB. The language runtime keeps track of each coroutine with a data structure called continuation.
  • Each OS thread has its own stack in memory, which can take up to several megabytes, while each coroutine continuation takes at most several kilobytes. So, it is much cheaper to create thousands of coroutines than thousands of threads.
  • I think that switching between threads (which is done by OS) is more expensive and time consuming than switching between coroutines (which is done by language runtime).
  • Coroutines are non-blocking. It means when you do some IO (e.g. read from disk), instead of being blocked and put aside by the OS, coroutines suspend and free their underlying thread so the thread can execute other code from other coroutines (the mechanism that the coroutine uses to know whether what it is waiting for is ready so it can resume is probably by polling or by providing a callback or by OS interrupts).
  • Coroutines have structured concurrency (more control over their lifetime and cancellation).
    In Kotlin, via CoroutineScope.
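The last two points can be sketched together. The example below assumes the current kotlinx.coroutines API; cancelledChildren is a hypothetical helper. The children wait in delay without blocking a thread, and cancelling the parent cancels all of them.

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: starts `childCount` children under one parent, cancels
// the parent, and returns how many children observed the cancellation.
fun cancelledChildren(childCount: Int): Int = runBlocking {
    val cancelled = AtomicInteger()
    val parent = launch {
        repeat(childCount) {
            launch {                      // child of `parent` through its scope
                try {
                    delay(60_000)         // suspended; no thread is blocked
                } catch (e: CancellationException) {
                    cancelled.incrementAndGet()
                    throw e               // rethrow so cancellation completes normally
                }
            }
        }
    }
    delay(100)                            // let the children start and suspend
    parent.cancelAndJoin()                // cancels the parent and all its children
    cancelled.get()
}

fun main() {
    println(cancelledChildren(3))   // prints 3
}
```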
