
Introducing the Go Race Detector

26 June 2013

Introduction

Race conditions are among the most insidious and elusive programming errors. They typically cause erratic and mysterious failures, often long after the code has been deployed to production. While Go's concurrency mechanisms make it easy to write clean concurrent code, they don't prevent race conditions. Care, diligence, and testing are required. And tools can help.

We're happy to announce that Go 1.1 includes a race detector, a new tool for finding race conditions in Go code. It is currently available for Linux, OS X, and Windows systems with 64-bit x86 processors.

The race detector is based on the C/C++ ThreadSanitizer runtime library, which has been used to detect many errors in Google's internal code base and in Chromium. The technology was integrated with Go in September 2012; since then it has detected 42 races in the standard library. It is now part of our continuous build process, where it continues to catch race conditions as they arise.

How it works

The race detector is integrated with the go tool chain. When the -race command-line flag is set, the compiler instruments all memory accesses with code that records when and how the memory was accessed, while the runtime library watches for unsynchronized accesses to shared variables. When such "racy" behavior is detected, a warning is printed. (See this article for the details of the algorithm.)
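To make "unsynchronized accesses to shared variables" concrete, here is a minimal sketch. It is not taken from the detector's documentation or test suite; the function names racyCount and safeCount are made up for this illustration. The first function increments a shared counter from several goroutines with no synchronization, and the second guards the same counter with a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount (hypothetical name) increments a shared counter from several
// goroutines without any synchronization between them.
func racyCount(goroutines, perGoroutine int) int {
	var wg sync.WaitGroup
	count := 0
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				count++ // unsynchronized read-modify-write: a data race
			}
		}()
	}
	wg.Wait()
	return count
}

// safeCount (hypothetical name) does the same work, but every access to
// the counter happens while holding a mutex.
func safeCount(goroutines, perGoroutine int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				mu.Lock()
				count++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println("racy:", racyCount(4, 1000)) // result is not guaranteed
	fmt.Println("safe:", safeCount(4, 1000))
}
```

Built with the -race flag, a program like this should produce a data race warning pointing at the count++ line in racyCount, while safeCount should run without a report.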

Because of its design, the race detector can detect race conditions only when they are actually triggered by running code, which means it's important to run race-enabled binaries under realistic workloads. However, race-enabled binaries can use ten times the CPU and memory, so it is impractical to enable the race detector all the time. One way out of this dilemma is to run some tests with the race detector enabled. Load tests and integration tests are good candidates, since they tend to exercise concurrent parts of the code. Another approach using production workloads is to deploy a single race-enabled instance within a pool of running servers.
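As a rough illustration of a workload that exercises concurrent code paths, the sketch below runs readers and writers against a shared structure at the same time. The cache type and the exercise function are hypothetical and exist only for this example; in a real project the same idea would normally live in a test function run with go test -race.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical type under test: a map guarded by a mutex.
type cache struct {
	mu sync.RWMutex
	m  map[string]int
}

func newCache() *cache { return &cache{m: make(map[string]int)} }

func (c *cache) Set(k string, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func (c *cache) Get(k string) (int, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

// exercise starts one writer and one reader goroutine per worker, all
// operating on the same keys, and returns once every goroutine is done.
func exercise(c *cache, workers, keys int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(2)
		go func(w int) { // writer
			defer wg.Done()
			for k := 0; k < keys; k++ {
				c.Set(fmt.Sprintf("key-%d", k), w)
			}
		}(w)
		go func() { // reader
			defer wg.Done()
			for k := 0; k < keys; k++ {
				c.Get(fmt.Sprintf("key-%d", k))
			}
		}()
	}
	wg.Wait()
}

func main() {
	c := newCache()
	exercise(c, 8, 100)
	_, ok := c.Get("key-0")
	fmt.Println("key-0 present:", ok)
}
```

If the locking in Set or Get were removed, a run of this workload under the race detector would be expected to flag the concurrent map accesses, because the racy code paths are actually executed.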

Using the race detector

The race detector is fully integrated with the Go tool chain. To build your code with the race detector enabled, just add the -race flag to the command line:

$ go test -race mypkg    # test the package
$ go run -race mysrc.go  # compile and run the program
$ go build -race mycmd   # build the command
$ go install -race mypkg # install the package

To try out the race detector for yourself, fetch and run this example program:

$ go get -race golang.org/x/blog/support/racy
$ racy

Examples

Here are two examples of real issues caught by the race detector.

Example 1: Timer.Reset

The first example is a simplified version of an actual bug found by the race detector. It uses a timer to print a message after a random duration between 0 and 1 second. It does so repeatedly for five seconds. It uses time.AfterFunc to create a Timer for the first message and then uses the Reset method to schedule the next message, re-using the Timer each time.

package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	start := time.Now()
	var t *time.Timer
	t = time.AfterFunc(randomDuration(), func() {
		fmt.Println(time.Now().Sub(start))
		t.Reset(randomDuration())
	})
	time.Sleep(5 * time.Second)
}

func randomDuration() time.Duration {
	return time.Duration(rand.Int63n(1e9))
}

This looks like reasonable code, but under certain circumstances it fails in a surprising way:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x8 pc=0x41e38a]

goroutine 4 [running]:
time.stopTimer(0x8, 0x12fe6b35d9472d96)
    src/pkg/runtime/ztime_linux_amd64.c:35 +0x25
time.(*Timer).Reset(0x0, 0x4e5904f, 0x1)
    src/pkg/time/sleep.go:81 +0x42
main.func·001()
    race.go:14 +0xe3
created by time.goFunc
    src/pkg/time/sleep.go:122 +0x48

What's going on here? Running the program with the race detector enabled is more illuminating:

==================
WARNING: DATA RACE
Read by goroutine 5:
  main.func·001()
     race.go:14 +0x169

Previous write by goroutine 1:
  main.main()
      race.go:15 +0x174

Goroutine 5 (running) created at:
  time.goFunc()
      src/pkg/time/sleep.go:122 +0x56
  timerproc()
     src/pkg/runtime/ztime_linux_amd64.c:181 +0x189
==================

The race detector shows the problem: an unsynchronized read and write of the variable t from different goroutines. If the initial timer duration is very small, the timer function may fire before the main goroutine has assigned a value to t and so the call to t.Reset is made with a nil t.

To fix the race condition we change the code to read and write the variable t only from the main goroutine:

package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	start := time.Now()
	reset := make(chan bool)
	var t *time.Timer
	t = time.AfterFunc(randomDuration(), func() {
		fmt.Println(time.Now().Sub(start))
		// Ask the main goroutine to reset the timer.
		reset <- true
	})
	for time.Since(start) < 5*time.Second {
		// Wait until the timer function has run.
		<-reset
		// t is read and written only by the main goroutine.
		t.Reset(randomDuration())
	}
}

func randomDuration() time.Duration {
	return time.Duration(rand.Int63n(1e9))
}

Here the main goroutine is wholly responsible for setting and resetting the Timer t and a new reset channel communicates the need to reset the timer in a thread-safe way.

A simpler but less efficient approach is to avoid reusing timers.

Example 2: ioutil.Discard

The second example is more subtle.

The ioutil package's Discard object implements io.Writer, but discards all the data written to it. Think of it like /dev/null: a place to send data that you need to read but don't want to store. It is commonly used with io.Copy to drain a reader, like this:

io.Copy(ioutil.Discard, reader)

Back in July 2011 the Go team noticed that using Discard in this way was inefficient: the Copy function allocates an internal 32 kB buffer each time it is called, but when used with Discard the buffer is unnecessary since we're just throwing the read data away. We thought that this idiomatic use of Copy and Discard should not be so costly.

The fix was simple. If the given Writer implements a ReadFrom method, a Copy call like this:

io.Copy(writer, reader)

is delegated to this potentially more efficient call:

writer.ReadFrom(reader)
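The delegation can be observed with a small program. This is only a sketch: countingSink, plainReader, and copyToSink are hypothetical names invented for the example, and the sink uses a private per-call buffer. The source is wrapped in plainReader so that it exposes nothing but Read; otherwise io.Copy might prefer an optional method on the reader instead.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countingSink (hypothetical) discards everything written to it and
// records whether io.Copy reached it through ReadFrom.
type countingSink struct {
	usedReadFrom bool
}

func (s *countingSink) Write(p []byte) (int, error) { return len(p), nil }

// ReadFrom drains r using a buffer that belongs to this call only.
func (s *countingSink) ReadFrom(r io.Reader) (int64, error) {
	s.usedReadFrom = true
	buf := make([]byte, 512)
	var n int64
	for {
		m, err := r.Read(buf)
		n += int64(m)
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
	}
}

// plainReader hides any optional methods of the wrapped reader, leaving
// only Read visible to io.Copy.
type plainReader struct{ io.Reader }

// copyToSink copies s into a countingSink and reports the byte count and
// whether ReadFrom was used.
func copyToSink(s string) (int64, bool) {
	sink := &countingSink{}
	n, err := io.Copy(sink, plainReader{strings.NewReader(s)})
	if err != nil {
		panic(err)
	}
	return n, sink.usedReadFrom
}

func main() {
	n, viaReadFrom := copyToSink("hello, gopher")
	fmt.Println(n, viaReadFrom) // prints "13 true"
}
```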

We added a ReadFrom method to Discard's underlying type, which has an internal buffer that is shared between all its users. We knew this was theoretically a race condition, but since all writes to the buffer should be thrown away we didn't think it was important.

When the race detector was implemented it immediately flagged this code as racy. Again, we considered that the code might be problematic, but decided that the race condition wasn't "real". To avoid the "false positive" in our build we implemented a non-racy version that is enabled only when the race detector is running.

But a few months later Brad encountered a frustrating and strange bug. After a few days of debugging, he narrowed it down to a real race condition caused by ioutil.Discard.

Here is the known-racy code in io/ioutil, where Discard is a devNull that shares a single buffer between all of its users.

var blackHole [4096]byte // shared buffer

func (devNull) ReadFrom(r io.Reader) (n int64, err error) {
    readSize := 0
    for {
        readSize, err = r.Read(blackHole[:])
        n += int64(readSize)
        if err != nil {
            if err == io.EOF {
                return n, nil
            }
            return
        }
    }
}

Brad's program includes a trackDigestReader type, which wraps an io.Reader and records the hash digest of what it reads.

type trackDigestReader struct {
    r io.Reader
    h hash.Hash
}

func (t trackDigestReader) Read(p []byte) (n int, err error) {
    n, err = t.r.Read(p)
    t.h.Write(p[:n])
    return
}

For example, it could be used to compute the SHA-1 hash of a file while reading it:

tdr := trackDigestReader{r: file, h: sha1.New()}
io.Copy(writer, tdr)
fmt.Printf("File hash: %x", tdr.h.Sum(nil))

In some cases there would be nowhere to write the data—but still a need to hash the file—and so Discard would be used:

io.Copy(ioutil.Discard, tdr)

But in this case the blackHole buffer isn't just a black hole; it is a legitimate place to store the data between reading it from the source io.Reader and writing it to the hash.Hash. With multiple goroutines hashing files simultaneously, each sharing the same blackHole buffer, the race condition manifested itself by corrupting the data between reading and hashing. No errors or panics occurred, but the hashes were wrong. Nasty!

func (t trackDigestReader) Read(p []byte) (n int, err error) {
    // the buffer p is blackHole
    n, err = t.r.Read(p)
    // p may be corrupted by another goroutine here,
    // between the Read above and the Write below
    t.h.Write(p[:n])
    return
}

The bug was finally fixed by giving a unique buffer to each use of ioutil.Discard, eliminating the race condition on the shared buffer.

Conclusions

The race detector is a powerful tool for checking the correctness of concurrent programs. It will not issue false positives, so take its warnings seriously. But it is only as good as your tests; you must make sure they thoroughly exercise the concurrent properties of your code so that the race detector can do its job.

What are you waiting for? Run "go test -race" on your code today!
