Goroutines

Concurrency in Go

Go has an X*Processors:Y*Threads:Z*Goroutines execution model. Since Go 1.5 the runtime by default uses all available logical CPUs to run goroutines (earlier versions defaulted to one); this can be changed by setting GOMAXPROCS.
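A minimal sketch of how the current setting can be inspected and changed from code. The helper name schedulerInfo is made up for this note; runtime.NumCPU and runtime.GOMAXPROCS are the standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo reports the number of logical CPUs and the current
// GOMAXPROCS value. Passing 0 to runtime.GOMAXPROCS only queries it.
func schedulerInfo() (cpus, procs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, procs := schedulerInfo()
	fmt.Println("logical CPUs:", cpus)
	fmt.Println("GOMAXPROCS:", procs)

	// Restrict Go code to one processor; the previous value is returned.
	prev := runtime.GOMAXPROCS(1)
	fmt.Println("previous:", prev, "current:", runtime.GOMAXPROCS(0))
}
```

The same limit can be set without touching the code through the GOMAXPROCS environment variable.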

One process has many threads, and one thread can serve many goroutines. In this sense goroutines are implemented in a way similar to actors.

In Go, concurrency has three elements:

goroutines (execution)
channels (communication)
select (coordination)

In Go you can create shared state, for example by modifying a global variable, but why would you? The normal way is for goroutines to communicate through channels. Then only one goroutine has access to the variable at any given moment.

When values are passed around on channels and never actively shared, data races cannot occur, by design. To encourage this way of thinking we have reduced it to a slogan: Do not communicate by sharing memory; instead, share memory by communicating.
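A small sketch of the slogan in practice. The function name sumOfSquares is hypothetical: each worker hands its result over a channel, and only the calling goroutine ever touches the running total.

```go
package main

import "fmt"

// sumOfSquares computes the sum of squares without any shared variable:
// workers send results on a channel and only the caller updates total.
func sumOfSquares(values []int) int {
	results := make(chan int)
	for _, v := range values {
		go func(v int) {
			results <- v * v // hand the value over instead of writing shared state
		}(v)
	}
	total := 0
	for range values {
		total += <-results // one receive per started worker
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares([]int{1, 2, 3, 4})) // 30
}
```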

Channels in Go - Go 101

Concurrency is a property of the code; parallelism is a property of the running program.

Goroutine

A goroutine has a simple model: it is a function executing concurrently with other goroutines in the same address space. It is lightweight, costing little more than the allocation of stack space. And the stacks start small, so they are cheap, and grow by allocating (and freeing) heap storage as required.

An executable Go program does have at least one goroutine; the goroutine that calls the main function is known as the main goroutine.
Goroutines are multiplexed onto multiple OS threads so if one should block, such as while waiting for I/O, others continue to run.
Goroutine creation is faster than thread creation, because you aren’t creating an operating system–level resource.
Switching between goroutines is faster than switching between threads because it happens entirely within the process, avoiding operating system calls that are (relatively) slow.
The only difference between a normal function call and a goroutine is that a goroutine is created with the go statement.
When the call completes, the goroutine exits, silently.
As covered in the concurrency lesson, all goroutines are anonymous: a goroutine does not have an identity.
Goroutines have a dynamic stack size: the stack starts small and grows over time.
The scheduler is able to optimize its decisions because it is part of the Go process. The scheduler works with the network poller, detecting when a goroutine can be unscheduled because it is blocking on I/O. It also integrates with the garbage collector, making sure that work is properly balanced across all of the operating system threads assigned to your Go process.

Running goroutines don’t stop a program from exiting. A Go program exits when the main function returns. Any goroutines still running in the background quietly stop.
An unrecovered panic in any goroutine crashes the whole application. A panic inside a goroutine must be handled with defer and recover() in that same goroutine; a recover() in another goroutine, including main, will not catch it.
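A sketch of handling a panic inside the goroutine that raised it. The helper safeGo is a hypothetical name: it runs a function in a new goroutine and turns a panic into an error delivered on a channel.

```go
package main

import "fmt"

// safeGo runs fn in a new goroutine. If fn panics, the deferred function
// recovers in that same goroutine and reports the panic as an error.
func safeGo(fn func()) <-chan error {
	errc := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		defer func() {
			if r := recover(); r != nil {
				errc <- fmt.Errorf("recovered: %v", r)
				return
			}
			errc <- nil
		}()
		fn()
	}()
	return errc
}

func main() {
	fmt.Println(<-safeGo(func() { panic("boom") })) // recovered: boom
	fmt.Println(<-safeGo(func() {}))                // <nil>
}
```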

When you create a new context, you need to know when it will be cancelled.
When you start a new goroutine, you need to know when it will stop.

Goroutines vs Threads

A more detailed comparison of goroutines and threads:

Thread: OS threads are managed by the kernel and have hardware dependencies.
Goroutine: Goroutines are managed by the Go runtime and have no hardware dependencies.

Thread: OS threads generally have a fixed stack size, commonly cited as 1-2 MB (the exact default depends on the OS).
Goroutine: Goroutines typically start with a 2 KB stack.

Thread: The stack size is fixed when the thread is created and cannot grow.
Goroutine: The stack size is managed at run time and can grow up to 1 GB (on 64-bit systems), which is made possible by allocating and freeing heap storage.

Thread: There is no easy communication medium between threads, and inter-thread communication has high latency.
Goroutine: Goroutines use channels to communicate with other goroutines with low latency.

Thread: Threads have identity. A TID identifies each thread in a process.
Goroutine: Goroutines do not have any identity. Go was designed this way because it does not offer TLS (thread-local storage).

Thread: Threads have significant setup and teardown costs, as a thread has to request a lot of resources from the OS and return them once it is done.
Goroutine: Goroutines are created and destroyed by the Go runtime. These operations are very cheap compared to threads because the runtime already maintains a pool of threads for goroutines. The OS is not aware of goroutines.

Thread: Threads are preemptively scheduled. The cost of switching between threads is high, as the scheduler needs to save and restore more than 50 registers and other state. This can be quite significant when there is rapid switching between threads.
Goroutine: Goroutines are scheduled by the Go runtime, cooperatively at well-defined points and, since Go 1.14, also preemptively. In a cooperative goroutine switch only 3 registers need to be saved or restored.

Executing goroutine

The go statement starts a new goroutine:

go f(x, y, z)

The evaluation of x, y, and z happens in the current goroutine and the execution of f happens in the new goroutine.

package main

import "fmt"

func f(from string) {
    for i := 0; i < 3; i++ {
        fmt.Println(from, ":", i)
    }
}

func main() {
    f("direct")

    go f("goroutine")

    // You can also start a goroutine for an anonymous
    // function call.
    go func(msg string) {
        fmt.Println(msg)
    }("going")

    var input string
    fmt.Scanln(&input)
    fmt.Println("done")
}

// Possible output (the order of the "goroutine" and "going" lines may vary):
// direct : 0
// direct : 1
// direct : 2
// goroutine : 0
// going
// goroutine : 1
// goroutine : 2
// <enter>
// done
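The point that arguments are evaluated in the current goroutine can be checked with a small sketch. The function name snapshot is hypothetical: the variable is changed right after the go statement, yet the new goroutine still sees the old value.

```go
package main

import "fmt"

// snapshot starts a goroutine with x as an argument and then changes x.
// The argument was already evaluated by the go statement in the caller,
// so the goroutine reports the old value.
func snapshot() (seen, current int) {
	out := make(chan int)
	x := 1
	go func(v int) { out <- v }(x) // x is evaluated here, in the caller
	x = 2                          // does not affect v
	seen = <-out
	return seen, x
}

func main() {
	seen, current := snapshot()
	fmt.Println(seen, current) // 1 2
}
```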

Goroutines run in the same address space, so access to shared memory must be synchronized. The sync package provides useful primitives for this, such as sync.Mutex and sync.WaitGroup, although you won't need them much in Go because channels cover most cases.
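For the cases where shared memory is used anyway, here is a minimal sketch with sync.Mutex. The counter type and countTo function are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// counter guards n with a mutex so that concurrent increments are safe.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// countTo starts the given number of goroutines, each incrementing the
// counter once, and returns the final value after all have finished.
func countTo(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait() // all increments happen before this returns
	return c.n
}

func main() {
	fmt.Println(countTo(100)) // 100
}
```

Without the mutex the increments would race, and the race detector (go run -race) would report it.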

Goroutine lifecycle

A goroutine has a simpler lifecycle than an OS thread. It can be doing one of the following:

Executing: The goroutine is scheduled on an M (an OS thread) and is executing its instructions.
Runnable: The goroutine is waiting to be moved to the executing state.
Waiting: The goroutine is stopped, pending the completion of something such as a system call or a synchronization operation (for example, acquiring a mutex).

The Go runtime handles two kinds of queues of runnable goroutines: one local queue per P (a logical processor, of which there are GOMAXPROCS) and a global queue shared among all the Ps.

Since Go 1.14, the Go scheduler is preemptive: when a goroutine has been running for a specific amount of time (10 ms), it is marked preemptible and can be context-switched off to be replaced by another goroutine. This forces a long-running job to share CPU time.

If the workload is CPU-bound, a best practice is to size the number of worker goroutines according to GOMAXPROCS. GOMAXPROCS is a variable that sets the maximum number of OS threads that can execute Go code simultaneously.
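One way to rely on GOMAXPROCS for CPU-bound work is to split the input into that many chunks. This is only a sketch; parallelSum is a hypothetical name, and summing integers stands in for real CPU-heavy work.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelSum splits nums into at most GOMAXPROCS chunks and sums each
// chunk in its own goroutine, then adds up the partial results.
func parallelSum(nums []int) int {
	workers := runtime.GOMAXPROCS(0) // query only, does not change the value
	chunk := (len(nums) + workers - 1) / workers
	partial := make(chan int, workers)
	started := 0
	for start := 0; start < len(nums); start += chunk {
		end := start + chunk
		if end > len(nums) {
			end = len(nums)
		}
		started++
		go func(part []int) {
			sum := 0
			for _, n := range part {
				sum += n
			}
			partial <- sum
		}(nums[start:end])
	}
	total := 0
	for i := 0; i < started; i++ {
		total += <-partial
	}
	return total
}

func main() {
	nums := make([]int, 100)
	for i := range nums {
		nums[i] = i + 1
	}
	fmt.Println(parallelSum(nums)) // 5050
}
```

Starting more goroutines than GOMAXPROCS for purely CPU-bound work does not add parallelism; it only adds scheduling overhead.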

The Go scheduler is also capable of work stealing: a processor whose local queue is empty can steal runnable goroutines from another processor's queue.

Execution order

In Go, once the main function returns, the program terminates. Any goroutines that were launched and are still running at this time will also be terminated by the Go runtime. When you write concurrent programs, it’s best to cleanly terminate any goroutines that were launched prior to letting the main function return.

Writing programs that can cleanly start and shut down helps reduce bugs and prevents resources from being corrupted.
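A common way to let goroutines finish before main returns is sync.WaitGroup. The sketch below uses a hypothetical runAll function; each goroutine writes only to its own slot of the result slice, so no further locking is needed.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// runAll processes every job in its own goroutine and returns only
// after all of them have finished.
func runAll(jobs []string) []string {
	results := make([]string, len(jobs))
	var wg sync.WaitGroup
	for i, job := range jobs {
		wg.Add(1)
		go func(i int, job string) {
			defer wg.Done()
			results[i] = strings.ToUpper(job) // each goroutine owns slot i
		}(i, job)
	}
	wg.Wait() // block until every goroutine has called Done
	return results
}

func main() {
	fmt.Println(runAll([]string{"a", "b", "c"})) // [A B C]
}
```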

Scheduling

Rapid switching between goroutines is possible and is more efficient than switching between threads. A goroutine runs on one thread at a time. Before Go 1.14 scheduling was purely cooperative, so another goroutine was scheduled only when the current one blocked or reached a scheduling point; since Go 1.14 a long-running goroutine can also be preempted. If a goroutine on a thread blocks, say while waiting for user input, another goroutine is scheduled in its place. A goroutine can block on one of the following conditions:

network input
sleeping
channel operation
blocking on primitives in the sync package

Only non-sleeping goroutines are considered for scheduling. For example, if main sleeps for 10 milliseconds, it won’t be scheduled again until those 10 milliseconds have passed.

Goroutine scope

The goroutine that owns a channel should:

Instantiate the channel.
Perform writes, or pass ownership to another goroutine.
Close the channel.
Encapsulate the previous three things in this list and expose them via a reader channel.

By assigning these responsibilities to channel owners, a few things happen:

Because we’re the one initializing the channel, we remove the risk of deadlocking by writing to a nil channel.
Because we’re the one initializing the channel, we remove the risk of panicking by closing a nil channel.
Because we’re the one who decides when the channel gets closed, we remove the risk of panicking by writing to a closed channel.
Because we’re the one who decides when the channel gets closed, we remove the risk of panicking by closing a channel more than once.
We wield the type checker at compile time to prevent improper writes to our channel.
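The ownership rules above can be sketched as follows. The names produce and collect are hypothetical: the owner creates, writes to, and closes the channel, and callers only ever get a receive-only view of it.

```go
package main

import "fmt"

// produce owns the channel: it instantiates it, performs the writes,
// closes it, and exposes it to the caller as a receive-only channel.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // closed exactly once, by the owner
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

// collect is a consumer: it can only receive, so the compiler rejects
// any attempt to send on or close the channel here.
func collect(in <-chan int) []int {
	var out []int
	for v := range in {
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(collect(produce(3))) // [0 1 2]
}
```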

Closing goroutines

Whenever you launch a goroutine function, you must make sure that it will eventually exit. Unlike variables, the Go runtime can’t detect that a goroutine will never be used again. If a goroutine doesn’t exit, its stack and everything it references stay in memory, and if it is still runnable the scheduler will keep periodically giving it time to do nothing, which slows down your program. This is called a goroutine leak.

The done channel pattern provides a way to signal a goroutine that it’s time to stop processing.

It uses a channel to signal that it’s time to exit. Let’s look at an example where we pass the same data to multiple functions, but only want the result from the fastest function:

func searchData(s string, searchers []func(string) []string) []string {
	done := make(chan struct{})
	result := make(chan []string)
	for _, searcher := range searchers {
		go func(searcher func(string) []string) {
			select {
			case result <- searcher(s):
			case <-done:
			}
		}(searcher)
	}
	r := <-result
	close(done)
	return r
}
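The same pattern also works for stopping a goroutine that would otherwise run forever. This is a sketch with hypothetical names (countUntilDone, firstN): the generator keeps producing numbers until done is closed, and draining the output channel confirms that it has exited.

```go
package main

import "fmt"

// countUntilDone emits 0, 1, 2, ... until done is closed, then closes
// its output channel and exits.
func countUntilDone(done <-chan struct{}) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; ; i++ {
			select {
			case out <- i:
			case <-done:
				return
			}
		}
	}()
	return out
}

// firstN takes n values from the generator and then tells it to stop.
func firstN(n int) []int {
	done := make(chan struct{})
	nums := countUntilDone(done)
	got := make([]int, 0, n)
	for i := 0; i < n; i++ {
		got = append(got, <-nums)
	}
	close(done)
	for range nums {
		// Drain until the generator closes out, which means it has exited.
	}
	return got
}

func main() {
	fmt.Println(firstN(3)) // [0 1 2]
}
```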

Go Memory Model

Creating a goroutine happens before the goroutine’s execution begins. Conversely, the exit of a goroutine isn’t guaranteed to happen before any event.
A send on a channel happens before the corresponding receive from that channel completes.
Closing a channel happens before a receive of this closure.
A receive from an unbuffered channel happens before the send on that channel completes.

i := 0
ch := make(chan struct{})
go func() {
    i = 1
    <-ch
}()
ch <- struct{}{} // Blocks until the goroutine receives; i = 1 happens before that receive
fmt.Println(i)
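The rule about closing a channel can be illustrated in the same way. In this sketch (publish is a hypothetical name) the write to msg happens before the close, and the close happens before the receive that observes it, so the read is safe without a mutex.

```go
package main

import "fmt"

// publish writes msg in a goroutine and signals readiness by closing
// a channel. The receive in the caller returns only after the close,
// so the write to msg is guaranteed to be visible.
func publish() string {
	var msg string
	ready := make(chan struct{})
	go func() {
		msg = "hello" // happens before close(ready)
		close(ready)
	}()
	<-ready // happens after close(ready)
	return msg
}

func main() {
	fmt.Println(publish()) // hello
}
```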

Context package

Understanding the context package (Medium)

We create a Context object in one goroutine and pass it to the other goroutine.
The other goroutine retrieves the signal channel from the context using Done() and starts its work.
Once the Done channel closes, the goroutine stops whatever it was doing and immediately returns.
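The three steps above can be sketched with context.WithCancel. The names worker and runWorker are hypothetical; the context calls are from the standard library.

```go
package main

import (
	"context"
	"fmt"
)

// worker sends increasing numbers on ticks until ctx is cancelled,
// then returns the reason reported by ctx.Err().
func worker(ctx context.Context, ticks chan<- int) error {
	for i := 0; ; i++ {
		select {
		case <-ctx.Done(): // the Done channel was closed
			return ctx.Err()
		case ticks <- i:
		}
	}
}

// runWorker starts worker, reads two ticks, cancels the context and
// returns the error the worker stopped with.
func runWorker() error {
	ctx, cancel := context.WithCancel(context.Background())
	ticks := make(chan int)
	errc := make(chan error, 1)
	go func() { errc <- worker(ctx, ticks) }()
	<-ticks
	<-ticks
	cancel() // closes ctx.Done()
	return <-errc
}

func main() {
	fmt.Println(runWorker()) // context canceled
}
```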

A Context can be time-dependent. It can close the signal channel after some definite time. We can specify a deadline or a timeout after which the Context object should close the signal channel.
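A timeout can be sketched with context.WithTimeout. The helper waitOrTimeout is hypothetical, and time.After stands in for real work; durations are given in milliseconds to keep the example short.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitOrTimeout simulates work that takes workMs milliseconds and gives
// up if the context's timeout of timeoutMs milliseconds expires first.
func waitOrTimeout(timeoutMs, workMs int) error {
	timeout := time.Duration(timeoutMs) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer even if the work finishes first

	select {
	case <-time.After(time.Duration(workMs) * time.Millisecond):
		return nil // the work finished in time
	case <-ctx.Done():
		return ctx.Err() // the Done channel was closed by the timeout
	}
}

func main() {
	fmt.Println(waitOrTimeout(500, 10)) // <nil>
	fmt.Println(waitOrTimeout(10, 500)) // context deadline exceeded
}
```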

When you create a Context from another Context object, the two form a parent-child relationship. If the parent Context closes its Done channel, the child’s Done channel is automatically closed, and so are the Done channels of any Context objects derived from the child. Cancelling a child does not cancel its parent.
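A short sketch of how cancellation propagates from a parent to a derived context but not in the other direction. The function names afterParentCancel and afterChildCancel are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// afterParentCancel cancels the parent and returns the child's error.
func afterParentCancel() error {
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := context.WithCancel(parent)
	defer cancelChild()

	cancelParent()
	<-child.Done() // closed because the parent was cancelled
	return child.Err()
}

// afterChildCancel cancels only the child and returns the parent's error.
func afterChildCancel() error {
	parent, cancelParent := context.WithCancel(context.Background())
	defer cancelParent()
	_, cancelChild := context.WithCancel(parent)

	cancelChild()
	return parent.Err() // the parent is unaffected
}

func main() {
	fmt.Println(afterParentCancel()) // context canceled
	fmt.Println(afterChildCancel())  // <nil>
}
```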

An empty context is a Context that has no values, no deadline, and is never canceled. The context.Background() function returns a default empty Context.

Use context Values only for request-scoped data that transits processes and APIs. That means you should not use it to pass dependencies that exist outside of the lifetime of a request — like loggers, template caches and your database connection pool — to your middleware and handlers.
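A sketch of passing request-scoped data through a context. The key type requestIDKey and the helpers are hypothetical names; using an unexported key type is the usual way to avoid collisions between packages.

```go
package main

import (
	"context"
	"fmt"
)

// requestIDKey is an unexported key type, so no other package can
// accidentally read or overwrite this value.
type requestIDKey struct{}

// withRequestID returns a copy of ctx that carries the request ID.
func withRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey{}, id)
}

// requestID extracts the request ID, reporting whether one was set.
func requestID(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(requestIDKey{}).(string)
	return id, ok
}

// roundTrip stores id in a context and reads it back, as a handler
// further down the call chain would.
func roundTrip(id string) (string, bool) {
	return requestID(withRequestID(context.Background(), id))
}

// lookupMissing reads from a context that never had an ID set.
func lookupMissing() (string, bool) {
	return requestID(context.Background())
}

func main() {
	fmt.Println(roundTrip("abc-123")) // abc-123 true
	_, ok := lookupMissing()
	fmt.Println(ok) // false
}
```

Dependencies such as loggers or connection pools are better passed as explicit function or struct parameters.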

