Go Concurrent Test Patterns
This guide provides some useful patterns for writing tests for concurrent code. For the basic Go testing cheatsheet, see Go Unit Testing.
Out-of-order Results
Say we have a test:
func Test_GetLatestUsers(t *testing.T) {
	expected := []string{"Alice", "Bob", "Charlie"}
	srv := NewUserServer(
		httptest.NewServer(UserRouter()),
	)
	defer srv.Close()
	for _, user := range expected {
		go PostCreateUser(srv.Addr, user)
	}
	time.Sleep(time.Second)
	assert.Equal(t, expected, srv.Users)
}
There are a number of things wrong with this test, but let’s start with the fundamental one: it will fail often.
There are no guarantees about the order in which the new goroutines will run, so assert.Equal will often compare lists that are in a different order.
A better approach is to anticipate that the users will be out of order and check that the srv.Users list is equivalent:
// ...
time.Sleep(time.Second)
assert.Len(t, srv.Users, len(expected))
for _, user := range expected {
	assert.Contains(t, srv.Users, user)
}
If you don’t use Stretchr’s testify you can always create your own helper for checking list equivalence:
func Equivalent(t *testing.T, expected, result []string) {
	t.Helper()
	if len(expected) != len(result) {
		t.Fatal("length mismatch")
	}
	for _, exp := range expected {
		found := false
		for _, res := range result {
			if res == exp {
				found = true
				break
			}
		}
		if !found {
			// Fatalf, not Fatal, so the format verbs are expanded.
			t.Fatalf("%v not found in %v", exp, result)
		}
	}
}
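One caveat with a helper like this is that it does not account for duplicates: ["a", "a", "b"] and ["a", "b", "b"] have the same length and every expected element is found, so they would be treated as equivalent. Below is a minimal sketch of a stricter check, assuming it is acceptable to sort copies of both slices. The name sameElements is hypothetical and not part of any library.

```go
package main

import (
	"fmt"
	"sort"
)

// sameElements reports whether a and b hold the same strings with the same
// multiplicities, ignoring order. It sorts copies, so the inputs are not modified.
func sameElements(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(sameElements([]string{"Bob", "Alice"}, []string{"Alice", "Bob"})) // true
	fmt.Println(sameElements([]string{"a", "a", "b"}, []string{"a", "b", "b"}))   // false
}
```

In a test, the boolean result can be wrapped in a helper that calls t.Fatalf, in the same way as Equivalent.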
Race conditions
Say we have a counter which increments each time our fake service is called:
func Test_SetConfig(t *testing.T) {
	sendCalled := 0
	s := NewConfigurator(
		&FakeConfiguratorService{
			onSend: func(k, v string) {
				sendCalled++ // unsynchronized write from several goroutines
			},
		},
	)
	configs := map[string]string{
		"A": "a",
		"B": "b",
		"C": "c",
	}
	for k, v := range configs {
		go s.SendConfig(k, v)
	}
	time.Sleep(time.Second)
	assert.Equal(t, 3, sendCalled)
}
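Every now and then, this test will fail. This is because of a data race on sendCalled++: several goroutines read and write the counter without synchronization, so an increment can be lost.
To detect the data race, run the tests with the race detector enabled: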
go test . -race
And you should see something like:
==================
WARNING: DATA RACE
Read at 0x00c0003ba010 by goroutine 7:
# ...
Previous write at 0x00c0003ba010 by goroutine 9:
my-test.Test_SetConfig.func1()
To fix the data race, use a mutex:
func Test_SetConfig(t *testing.T) {
	mu := sync.Mutex{}
	sendCalled := 0
	s := NewConfigurator(
		&FakeConfiguratorService{
			onSend: func(k, v string) {
				// Guard the counter so only one goroutine updates it at a time.
				mu.Lock()
				defer mu.Unlock()
				sendCalled++
			},
		},
	)
	configs := map[string]string{
		"A": "a",
		"B": "b",
		"C": "c",
	}
	for k, v := range configs {
		go s.SendConfig(k, v)
	}
	time.Sleep(time.Second)
	// Reads must hold the lock too.
	mu.Lock()
	defer mu.Unlock()
	assert.Equal(t, 3, sendCalled)
}
Running go test . -race should no longer result in DATA RACE warnings.
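For a plain counter, the sync/atomic package is an alternative to a mutex. The following is a minimal standalone sketch, not the test above: countCalls and its onSend callback are hypothetical stand-ins for the fake service, and the sketch waits with sync.WaitGroup (covered in the next section) rather than sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countCalls invokes a callback from n goroutines and returns how many
// calls were recorded. The counter is updated with atomic operations,
// so no mutex is needed around the increment.
func countCalls(n int) int64 {
	var calls int64
	onSend := func() {
		atomic.AddInt64(&calls, 1)
	}

	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			onSend()
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&calls)
}

func main() {
	fmt.Println(countCalls(100)) // 100
}
```

A mutex remains the better fit once the callback has to update more than one value together.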
Remove arbitrary sleep
To check that a concurrent test won’t randomly fail, one technique is to run it many times. For example:
go test . -run=Test_GetLatestUsers -count=1000
The problem with this is that we use time.Sleep(time.Second) to prevent the test from exiting before all of our goroutines are done. This means the command above would take about 17 minutes. We aren’t guaranteed that one second is enough time either!
A better pattern is to use a wait group. If you aren’t familiar with sync.WaitGroup, it has three methods:
Add(n) adds n to the counter of goroutines to wait for
Done() decrements that counter by one
Wait() blocks until the counter reaches zero
For Test_GetLatestUsers the test will look something like:
func Test_GetLatestUsers(t *testing.T) {
	expected := []string{"Alice", "Bob", "Charlie"}
	wg := sync.WaitGroup{}
	wg.Add(len(expected))
	srv := NewUserServer(
		httptest.NewServer(UserRouter()),
	)
	defer srv.Close()
	for _, user := range expected {
		// Pass user as an argument so each goroutine gets its own copy
		// (needed before Go 1.22, where the loop variable was shared).
		go func(user string) {
			defer wg.Done()
			PostCreateUser(srv.Addr, user)
		}(user)
	}
	wg.Wait()
	assert.Len(t, srv.Users, len(expected))
	for _, user := range expected {
		assert.Contains(t, srv.Users, user)
	}
}
With this change, 1000 runs of the test take under 1 second instead of about 1000 seconds.
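The pattern can be tried in isolation with a minimal sketch that replaces the HTTP server with an in-memory slice. The function collect and its mutex-guarded slice are hypothetical stand-ins for PostCreateUser and srv.Users, not part of the code under test.

```go
package main

import (
	"fmt"
	"sync"
)

// collect appends each name from its own goroutine and returns once every
// goroutine has finished. The order of the result is not deterministic.
func collect(names []string) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)
	wg.Add(len(names))
	for _, name := range names {
		go func(name string) {
			defer wg.Done()
			// The slice is shared, so appends are guarded by the mutex.
			mu.Lock()
			defer mu.Unlock()
			out = append(out, name)
		}(name)
	}
	wg.Wait()
	return out
}

func main() {
	got := collect([]string{"Alice", "Bob", "Charlie"})
	fmt.Println(len(got)) // 3
}
```

Because wg.Wait returns only after the last Done call, the caller can read the slice without sleeping.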
Testing for reasonable execution time
The wait group above lets the test pass even when the goroutines take much longer than expected to execute.
To add a timing expectation, we use a channel and a select statement to detect when the goroutines take too long.
func Test_GetLatestUsers(t *testing.T) {
	// ...
	for _, user := range expected {
		go func(user string) {
			defer wg.Done()
			PostCreateUser(srv.Addr, user)
		}(user)
	}
	// done is closed once every goroutine has called wg.Done.
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done: // Expected
	case <-time.After(time.Second):
		t.Fatal("test timeout")
	}
	assert.Len(t, srv.Users, len(expected))
	for _, user := range expected {
		assert.Contains(t, srv.Users, user)
	}
}
This ensures that if a goroutine takes too long, the test fails after at most one second.
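If several tests need the same wait-with-timeout logic, it can be pulled into a small helper. Below is a minimal sketch of that idea. The names waitTimeout and runWithTimeout are hypothetical, and the timings in main are arbitrary values chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitTimeout waits for wg and reports whether it finished within d.
// On timeout the helper goroutine stays blocked in wg.Wait until the
// group eventually completes, which is usually acceptable in a test.
func waitTimeout(wg *sync.WaitGroup, d time.Duration) bool {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

// runWithTimeout starts one goroutine per task and reports whether all of
// them finished within d.
func runWithTimeout(d time.Duration, tasks ...func()) bool {
	var wg sync.WaitGroup
	wg.Add(len(tasks))
	for _, task := range tasks {
		go func(task func()) {
			defer wg.Done()
			task()
		}(task)
	}
	return waitTimeout(&wg, d)
}

func main() {
	quick := func() {}
	slow := func() { time.Sleep(500 * time.Millisecond) }
	fmt.Println(runWithTimeout(time.Second, quick, quick))        // true
	fmt.Println(runWithTimeout(50*time.Millisecond, quick, slow)) // false
}
```

In a test, a false result would typically be followed by t.Fatal("test timeout") on the test goroutine.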
Avoiding failures outside of the test routine
Test failures should not be triggered from new goroutines: t.Fatal and t.Fatalf must be called from the goroutine running the test function. For example, don’t do this:
func Test_GetLatestUsers(t *testing.T) {
	// ...
	for _, user := range expected {
		go func(user string) {
			err := PostCreateUser(srv.Addr, user)
			if err != nil {
				// Wrong: Fatalf is called outside the test goroutine.
				t.Fatalf("error, %v", err)
			}
			wg.Done()
		}(user)
	}
	// ...
}
If the test exits early (e.g. on a timeout) and a child goroutine then calls t.Fatalf, it can cause a panic in the test suite.
Instead, send errors to an error channel, which we listen on in the select statement:
func Test_GetLatestUsers(t *testing.T) {
	// ...
	// Buffered so that no goroutine blocks when reporting its error.
	errCh := make(chan error, len(expected))
	for _, user := range expected {
		go func(user string) {
			defer wg.Done()
			err := PostCreateUser(srv.Addr, user)
			if err != nil {
				errCh <- err
			}
		}(user)
	}
	// ...
	select {
	case <-done: // Expected
	case err := <-errCh:
		t.Fatalf("error, %v", err)
	case <-time.After(time.Second):
		t.Fatal("test timeout")
	}
	// ...
}
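One subtlety with a select like this: when done and errCh are both ready, Go picks one of the ready cases at random, so an error can occasionally go unnoticed. Below is a minimal sketch of one way to handle that, checking the buffered error channel once more after done fires. The names firstError, okTask, failTask and errTimeout are hypothetical stand-ins introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errTimeout is returned when the tasks do not finish in time.
var errTimeout = errors.New("timeout waiting for tasks")

// okTask and failTask are hypothetical stand-ins for calls such as PostCreateUser.
func okTask() error   { return nil }
func failTask() error { return errors.New("create user failed") }

// firstError runs every task in its own goroutine. It returns the first
// error received, errTimeout if the tasks do not finish within d, or nil
// if all of them succeed.
func firstError(d time.Duration, tasks ...func() error) error {
	// Buffered so that no goroutine blocks when sending its error.
	errCh := make(chan error, len(tasks))
	var wg sync.WaitGroup
	wg.Add(len(tasks))
	for _, task := range tasks {
		go func(task func() error) {
			defer wg.Done()
			if err := task(); err != nil {
				errCh <- err
			}
		}(task)
	}

	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()

	select {
	case err := <-errCh:
		return err
	case <-time.After(d):
		return errTimeout
	case <-done:
	}
	// done may win the select even though an error is already buffered,
	// so check the channel once more without blocking.
	select {
	case err := <-errCh:
		return err
	default:
		return nil
	}
}

func main() {
	fmt.Println(firstError(time.Second, okTask, okTask))   // <nil>
	fmt.Println(firstError(time.Second, okTask, failTask)) // create user failed
}
```

In a test, the returned error would be passed to t.Fatalf on the test goroutine, which keeps every failure call out of the child goroutines.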