23

The "Go maps in action" entry in the Go blog states:

Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex.

However, one common way to access maps is to iterate over them with the range keyword. It is not clear if for the purposes of concurrent access, execution inside a range loop is a "read", or just the "turnover" phase of that loop. For example, the following code may or may not run afoul of the "no concurrent r/w on maps" rule, depending on the specific semantics / implementation of the range operation:

 var testMap map[int]int
 testMapLock := make(chan bool, 1)
 testMapLock <- true
 testMapSequence := 0

...

 func WriteTestMap(k, v int) {
    <-testMapLock
    testMap[k] = v
    testMapSequence++
    testMapLock<-true
 }

 func IterateMapKeys(iteratorChannel chan int) error {
    <-testMapLock
    defer func() { 
       testMapLock <- true
    }
    mySeq := testMapSequence
    for k, _ := range testMap {
       testMapLock <- true
       iteratorChannel <- k
       <-testMapLock
       if mySeq != testMapSequence {
           close(iteratorChannel)
           return errors.New("concurrent modification")
       }
    }
    return nil
 }

The idea here is that the range "iterator" is open when the second function is waiting for a consumer to take the next value, and the writer is not blocked at that time. However, it is never the case that two reads in a single iterator are on either side of a write - this is a "fail fast" iterator, the borrow a Java term.

Is there anything anywhere in the language specification or other documents that indicates if this is a legitimate thing to do, however? I could see it going either way, and the above quoted document is not clear on exactly what consititutes a "read". The documentation seems totally quiet on the concurrency aspects of the for/range statement.

(Please note this question is about the currency of for/range, but not a duplicate of: Golang concurrent map access with range - the use case is completely different and I am asking about the precise locking requirement wrt the 'range' keyword here!)

3
  • According to this maps can have multiple concurrent readers - no writes. (groups.google.com/forum/#!msg/golang-nuts/HpLWnGTp-n8/…) Commented Nov 6, 2016 at 7:55
  • Of course, the very first paragraph I posted says as much. That does not answer my question. I need to know if the entire body section of the for-range statement is considered a "read", or just the start/ends of the iterations of that loop. Please do look at my code above. Commented Nov 6, 2016 at 13:33
  • I did look. And I wasn't sure. That's why I just put a comment for then going & investigating - and I've found what's described in second answer. Commented Nov 7, 2016 at 6:35

2 Answers 2

23

You are using a for statement with a range expression. Quoting from Spec: For statements:

The range expression is evaluated once before beginning the loop, with one exception: if the range expression is an array or a pointer to an array and at most one iteration variable is present, only the range expression's length is evaluated; if that length is constant, by definition the range expression itself will not be evaluated.

We're ranging over a map, so it's not an exception: the range expression is evaluated only once before beginning the loop. The range expression is simply a map variable testMap:

for k, _ := range testMap {}

The map value does not include the key-value pairs, it only points to a data structure that does. Why is this important? Because the map value is only evaluated once, and if later pairs are added to the map, the map value –evaluated once before the loop– will be a map that still points to a data structure that includes those new pairs. This is in contrast to ranging over a slice (which would be evaluated once too), which is also only a header pointing to a backing array holding the elements; but if elements are added to the slice during the iteration, even if that does not result in allocating and copying over to a new backing array, they will not be included in the iteration (because the slice header also contains the length - already evaluated). Appending elements to a slice may result in a new slice value, but adding pairs to a map will not result in a new map value.

Now on to iteration:

for k, v := range testMap {
    t1 := time.Now()
    someFunction()
    t2 := time.Now()
}

Before we enter into the block, before the t1 := time.Now() line k and v variables are holding the values of the iteration, they are already read out from the map (else they couldn't hold the values). Question: do you think the map is read by the for ... range statement between t1 and t2? Under what circumstances could that happen? We have here a single goroutine that is executing someFunc(). To be able to access the map by the for statement, that would either require another goroutine, or it would require to suspend someFunc(). Obviously neither of those happen. (The for ... range construct is not a multi-goroutine monster.) No matter how many iterations there are, while someFunc() is executed, the map is not accessed by the for statement.

So to answer one of your questions: the map is not accessed inside the for block when executing an iteration, but it is accessed when the k and v values are set (assigned) for the next iteration. This implies that the following iteration over the map is safe for concurrent access:

var (
    testMap         = make(map[int]int)
    testMapLock     = &sync.RWMutex{}
)

func IterateMapKeys(iteratorChannel chan int) error {
    testMapLock.RLock()
    defer testMapLock.RUnlock()
    for k, v := range testMap {
        testMapLock.RUnlock()
        someFunc()
        testMapLock.RLock()
        if someCond {
            return someErr
        }
    }
    return nil
}

Note that unlocking in IterateMapKeys() should (must) happen as a deferred statement, as in your original code you may return "early" with an error, in which case you didn't unlock, which means the map remained locked! (Here modeled by if someCond {...}).

Also note that this type of locking only ensures locking in case of concurrent access. It does not prevent a concurrent goroutine to modify (e.g. add a new pair) the map. The modification (if properly guarded with write lock) will be safe, and the loop may continue, but there is no guarantee that the for loop will iterate over the new pair:

If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next.

The write-lock-guarded modification may look like this:

func WriteTestMap(k, v int) {
    testMapLock.Lock()
    defer testMapLock.Unlock()
    testMap[k] = v
}

Now if you release the read lock in the block of the for, a concurrent goroutine is free to grab the write lock and make modifications to the map. In your code:

testMapLock <- true
iteratorChannel <- k
<-testMapLock

When sending k on the iteratorChannel, a concurrent goroutine may modify the map. This is not just an "unlucky" scenario, sending a value on a channel is often a "blocking" operation, if the channel's buffer is full, another goroutine must be ready to receive in order for the send operation to proceed. Sending a value on a channel is a good scheduling point for the runtime to run other goroutines even on the same OS thread, not to mention if there are multiple OS threads, of which one may already be "waiting" for the write lock in order to carry out a map modification.

To sum the last part: you releasing the read lock inside the for block is like yelling to others: "Come, modify the map now if you dare!" Consequently in your code encountering that mySeq != testMapSequence is very likely. See this runnable example to demonstrate it (it's a variation of your example):

package main

import (
    "fmt"
    "math/rand"
    "sync"
)

var (
    testMap         = make(map[int]int)
    testMapLock     = &sync.RWMutex{}
    testMapSequence int
)

func main() {
    go func() {
        for {
            k := rand.Intn(10000)
            WriteTestMap(k, 1)
        }
    }()

    ic := make(chan int)
    go func() {
        for _ = range ic {
        }
    }()

    for {
        if err := IterateMapKeys(ic); err != nil {
            fmt.Println(err)
        }
    }
}

func WriteTestMap(k, v int) {
    testMapLock.Lock()
    defer testMapLock.Unlock()
    testMap[k] = v
    testMapSequence++
}

func IterateMapKeys(iteratorChannel chan int) error {
    testMapLock.RLock()
    defer testMapLock.RUnlock()
    mySeq := testMapSequence
    for k, _ := range testMap {
        testMapLock.RUnlock()
        iteratorChannel <- k
        testMapLock.RLock()
        if mySeq != testMapSequence {
            //close(iteratorChannel)
            return fmt.Errorf("concurrent modification %d", testMapSequence)
        }
    }
    return nil
}

Example output:

concurrent modification 24
concurrent modification 41
concurrent modification 463
concurrent modification 477
concurrent modification 482
concurrent modification 496
concurrent modification 508
concurrent modification 521
concurrent modification 525
concurrent modification 535
concurrent modification 541
concurrent modification 555
concurrent modification 561
concurrent modification 565
concurrent modification 570
concurrent modification 577
concurrent modification 591
concurrent modification 593

We're encountering concurrent modification quite often!

Do you want to avoid this kind of concurrent modification? The solution is quite simple: don't release the read lock inside the for. Also run your app with the -race option to detect race conditions: go run -race testmap.go

Final thoughts

The language spec clearly allows you to modify the map in the same goroutine while ranging over it, this is what the previous quote relates to ("If map entries that have not yet been reached are removed during iteration.... If map entries are created during iteration..."). Modifying the map in the same goroutine is allowed and is safe, but how it is handled by the iterator logic is not defined.

If the map is modified in another goroutine, if you use proper synchronization, The Go Memory Model guarantees that the goroutine with the for ... range will observe all modifications, and the iterator logic will see it just as if "its own" goroutine would have modified it – which is allowed as stated before.

Sign up to request clarification or add additional context in comments.

13 Comments

Thanks, this answers my question I think. To clear two things up - 1) I am not expecting the"concurrent modification" to never happen in my code - of course it can /will! I am expecting the error to be handled without a concurrent read/write fatal error or possible race condition (ie. well-defined "fail fast" semantics) "Come, modify the map now if you dare!" is exactly right - this is probably not idiomatic in Go but is very much the way Java maps work. The iterator must detect another thread taking up the dare and refuse to emit any more iterated values, because the iterator is "broken".
Second, I definitely "unlock" before returning the error. The way I use the buffered bool channel there, the read "locks" and the write "unlocks". When the iteratorChannel <- k is blocked, the critical section is unlocked, since testMapLock <- true already unlocked the critical - other goroutines may then modify the map and "break" all of the "open" iterators. This is exactly the behavior I want.
To be clear: "do you think the map is read by the fo/range statement between t1 and t2" - that's the crux of this. I want to make sure the answer is "no". One can see many ways that what you are calling the "pointer" to the map (a C-implemented hash_iter*, really) carries state that may be corrupted / produce undefined behavior, if you look at hashmap.c. Add to that the weirdness of /the runtime trying to detect concurrent r/w/ - of course all kinds of complexities here. I want to know I'll always see my "concurrent modification" and never a segfault or that "fatal error".
(the race cond analyzer on my code has always run cleanly - or of course I wouldn't have even posted this question! but just because it runs cleanly doesn't mean there is not an undetected possible race condition. I need the specification (or a v. painful code audit of Go source) to assure me there isn't. The race detector is best-effort, and not exact.)
@BadZen "Second, I definitely "unlock" before returning the error." I don't think you do. As you wrote, the read "locks" and the write "unlocks". Inside your for, you first write (unlock), then you do something (deliver k), then you read (lock). After the lock you test something with if, and if condition is true, you return without unlocking. This seems to me in case of error your last action is "lock", so it remains locked.
|
3

The unit of concurrent access for a for range loop over a map is the map. Go maps in action.

A map is a dynamic data structure that changes for inserts, updates and deletes. Inside the Map Implementation. For example,

The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0. For statements, The Go Programming Language Specification

Reading a map with a for range loop with interleaved inserts, updates and deletes is unlikely to be useful.

Lock the map:

package main

import (
    "sync"
)

var racer map[int]int

var race sync.RWMutex

func Reader() {
    race.RLock() // Lock map
    for k, v := range racer {
        _, _ = k, v
    }
    race.RUnlock()
}

func Write() {
    for i := 0; i < 1e6; i++ {
        race.Lock()
        racer[i/2] = i
        race.Unlock()
    }
}

func main() {
    racer = make(map[int]int)
    Write()
    go Write()
    Reader()
}

Don't lock after the read -- fatal error: concurrent map iteration and map write:

package main

import (
    "sync"
)

var racer map[int]int

var race sync.RWMutex

func Reader() {
    for k, v := range racer {
        race.RLock() // Lock after read
        _, _ = k, v
        race.RUnlock()
    }
}

func Write() {
    for i := 0; i < 1e6; i++ {
        race.Lock()
        racer[i/2] = i
        race.Unlock()
    }
}

func main() {
    racer = make(map[int]int)
    Write()
    go Write()
    Reader()
}

Use the Go Data Race Detector. Read Introducing the Go Race Detector.

7 Comments

Reading from my buffered bool channel "locks" the critical. Writing to it "unlocks" - I am not locking after the read.
The question is if iterations - specifically what parts of the iteration loop - count as "reads" or not wrt map concurrency. Now, from what you pasted: " If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped" - it sounds like /none/ of the iteration process is a read, and it is explicitly threadsafe (in so far as operations are atomic and have well-defined semantics).
But, your claim that the "fatal error" is produced contradicts this. So, my question still stands - when or at what point in a map iteration is the map "read"? (If your first code example - and mine - are valid, the answer is "only the start/end of the loop cycle implies a map read". I think this is probably the case! But I do not see this stated explicitly, anywhere. I want to be certain.)
@BadZen: Your code doesn't compile and doesn't run: How to create a Minimal, Complete, and Verifiable example..
There isn't an MCVE because there isn't an unexpected behavior I'm saying. I'm asking a conceptual question about Go map concurrency. Just for you though: play.golang.org/p/ZkWWvZZ2sv
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.