Go Concurrent Test Patterns
This guide provides some useful patterns for writing test for concurrent code. For the basic go testing cheatsheet see Go Unit Testing
Out-of-order Results
Say we have a test:
func Test_GetLatestUsers() {
expected := {"Alice", "Bob", "Charlie"}
srv := NewUserServer(
httptest.NewServer(UserRouter()),
)
defer srv.Close()
for _, user := range expected{
go PostCreateUser(srv.Addr, user)
}
time.Sleep(time.Second)
assert.Equal(t, expected, srv.Users)
}
There are a number of things wrong with the test; but let’s start with the fundamental one - it will fail often.
There is no guarantees around the order in which the new go routines will be executed, so the assert.Equal will often be asserting on out of order lists.
A better approach is to anticipate that the users will be out of order, and check that the srv.Users list is equivalent:
//...
time.Sleep(time.Second)
assert.Len(t, srv.Users, len(expected))
for _, user := range expected {
assert.Contains(t, srv.Users, user)
}
If you don’t use Stretchr’s testify you can always create your own helper for checking list equivalence:
func Equivalent(t *testing.T, expected, result []string) {
t.Helper()
if (len(expected) != len(result)) {
t.Fatal("length mismatch")
}
for _, exp := range expected {
found := false
for _, res := range result {
if res == exp {
found = true
break
}
}
if !found {
t.Fatal("%v not found in %v", exp, result)
}
}
}
Race conditions
Say we have a counter which increments each time our fake service is called:
func Test_SetConfig() {
sendCalled := 0
s := NewConfigurator(
&FakeConfiguratorService{
onSend: func(k, v string) {
sendCalled++
}
},
)
configs := map[string]string{
"A": "a",
"B": "b",
"C": "c",
}
for k, v := range configs {
go s.SendConfig(k, v)
}
time.Sleep(time.Second)
assert.Equal(t, 3, sendCalled)
}
Every now and then, this test will fail. This is because of a data race on sendCalled++.
To detect the data race, you can run line:
go test . -race
And you should see something like:
==================
WARNING: DATA RACE
Read at 0x00c0003ba010 by goroutine 7:
# ...
Previous write at 0x00c0003ba010 by goroutine 9:
my-test.Test_SetConfig.func1()
To fix the data race, use a mutex:
func Test_SetConfig() {
mu := sync.Mutex{}
sendCalled := 0
s := NewConfigurator(
&FakeConfiguratorService{
onSend: func(k, v string) {
mu.Lock()
defer mu.Unlock()
sendCalled++
}
},
)
configs := map[string]string{
"A": "a",
"B": "b",
"C": "c",
}
for k, v := range configs {
go s.SendConfig(k, v)
}
time.Sleep(time.Second)
mu.Lock()
defer mu.Unlock()
assert.Equal(t, 3, sendCalled)
}
Running go test . -race should no longer result in DATA RACE warnings.
Remove arbitrary sleep
To really test that our concurrent tests won’t randomly fail, one technique is to run it many times. For example:
go test . -run=Test_GetLatestUsers -count=1000
The problem with this is that we use time.Sleep(time.Second) to prevent the test from exiting before all of our go routines are done. This means the above all would take 17 minutes. We aren’t guaranteed that one second is enough time either!
A better pattern is using wait groups. If you aren’t familiar with sync.WaitGroup, it has three methods:
Add(n)addsnto awaitcounterDone()adds1to adonecounterWait()blocks untildone == waitcounter
For Test_GetLatestUsers the test will look something like:
func Test_GetLatestUsers() {
expected := {"Alice", "Bob", "Charlie"}
wg := sync.Waitgroup{}
wg.Add(len(expected))
srv := NewUserServer(
httptest.NewServer(UserRouter()),
)
defer srv.Close()
for _, user := range expected{
go func(){
PostCreateUser(srv.Addr, user)
wg.Done()
}()
}
wg.Wait()
assert.Len(t, srv.Users, len(expected))
for _, user := range expected {
assert.Contains(t, srv.Users, user)
}
}
With this logic, the execution time of 1000 tests reduces from 1000 seconds to under 1 second.
Testing for reasonable execution time
Adding the wait group above can allow a test to pass even when the go routines take much longer than expected to execute.
To add this timing expectation, we use a channel and select statement to detect failure.
func Test_GetLatestUsers() {
// ...
for _, user := range expected{
go func(){
PostCreateUser(srv.Addr, user)
wg.Done()
}()
}
done := chan(struct{})
go func() {
wg.Wait()
close(done)
}()
select{
case <-done: // Expected
case <-time.After(time.Second):
t.Fatal("test timeout")
}
assert.Len(t, srv.Users, len(expected))
for _, user := range expected {
assert.Contains(t, srv.Users, user)
}
}
This ensures that if a go routine takes too long, the longest the test will take to fail is only one second.
Avoiding failures outside of the test routine
Test failures should not be triggered in new go routines. For example, don’t do this:
func Test_GetLatestUsers() {
// ...
for _, user := range expected{
go func(){
err := PostCreateUser(srv.Addr, user)
if err != nil {
t.Fatalf("error, %v", err)
}
wg.Done()
}()
}
//...
}
If the test exits early (eg. timeout), but a child go routine calls t.Fatalf, it can cause a panic in the test suite.
Use an error channel, which we listen to on the select statement:
func Test_GetLatestUsers() {
// ...
errCh := make(chan error, len(expected))
for _, user := range expected{
go func(){
err := PostCreateUser(srv.Addr, user)
if err != nil {
errCh <- err
}
wg.Done()
}()
}
//...
select{
case <-done: // Expected
case err := <-errCh:
t.Fatalf("error, %v", err)
case <-time.After(time.Second):
t.Fatal("test timeout")
}
//...
}