Good evening

Recently I encountered a curious case with slices in Go: an obscure behavior that can silently corrupt the data in a slice. You might think that the capacity of a slice can be set to an arbitrary value with no noticeable consequences, but this case shows that the difference can be drastic. Or should I say, catastrophic?

I assume that you are already familiar with the basics of slice internals.

The example

Let’s initialize the slice. It has zero length and zero capacity. To be explicit, I state both length and capacity in the call.

package main

import (
	"fmt"
)

func main() {
	ar := make([]int64, 0, 0)
	fmt.Println(ar)
}

Now let’s create another slice from ar and append a value to it.

package main

import (
	"fmt"
)

func main() {
	ar := make([]int64, 0, 0)
	fmt.Println(ar)

	arDeriv1 := ar
	arDeriv1 = append(arDeriv1, 0)
	fmt.Println(ar, arDeriv1)
}

As you know, right after the assignment arDeriv1 shares its underlying array with ar (an array of length 0, since the capacity is 0). The result is

[]
[] [0]

Nothing unexpected so far.

Let’s create another slice, also from ar, and append a different value to it.

package main

import (
	"fmt"
)

func main() {
	ar := make([]int64, 0, 0)
	fmt.Println(ar)

	arDeriv1 := ar
	arDeriv1 = append(arDeriv1, 0)
	fmt.Println(ar, arDeriv1)

	arDeriv2 := ar
	arDeriv2 = append(arDeriv2, 1)
	fmt.Println(ar, arDeriv1, arDeriv2)
}

The result is

[]
[] [0]
[] [0] [1]

Okay, now we see that the contents of arDeriv1 remain unchanged. This is because when we perform append, the runtime checks whether there is enough space in the underlying array to hold another value. In this case there isn’t, since the capacity is 0. So it allocates a bigger array, copies the existing contents there (here there is nothing to copy) and puts the new value into the bigger array. This means that by the end of the program all three slices point to different addresses.
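The check described above boils down to comparing length with capacity. Here is a minimal sketch of that rule; appendGrows is a hypothetical helper introduced only for illustration, not something the runtime exposes.

```go
package main

import "fmt"

// appendGrows (hypothetical helper) reports whether appending one element
// to s forces a new backing array, which happens when len(s) == cap(s).
func appendGrows(s []int64) bool {
	return len(s) == cap(s)
}

func main() {
	ar := make([]int64, 0, 0)
	fmt.Println(len(ar), cap(ar), appendGrows(ar)) // 0 0 true

	arDeriv1 := append(ar, 0)
	// The length is now 1; the new capacity is chosen by the runtime.
	fmt.Println(len(arDeriv1), cap(arDeriv1))
}
```

As long as the helper returns true for ar, every append on a slice derived from ar gets its own array.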

To prove this claim we can use some reflect and unsafe magic to get the actual addresses of the underlying arrays.

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	ar := make([]int64, 0, 0)
	fmt.Println(ar, (*reflect.SliceHeader)(unsafe.Pointer(&ar)).Data)

	arDeriv1 := ar
	arDeriv1 = append(arDeriv1, 0)
	fmt.Println(ar, arDeriv1, (*reflect.SliceHeader)(unsafe.Pointer(&arDeriv1)).Data)

	arDeriv2 := ar
	arDeriv2 = append(arDeriv2, 1)
	fmt.Println(ar, arDeriv1, arDeriv2, (*reflect.SliceHeader)(unsafe.Pointer(&arDeriv2)).Data)
}

The result is

[] 18399184
[] [0] 824634286112
[] [0] [1] 824634327040

The numbers themselves don’t matter; they are just arbitrary addresses. What matters is that they are all different, which means each derived slice had its own array allocated for it. That makes things quite safe.
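A side note on the reflect.SliceHeader cast used above: that type is marked deprecated as of Go 1.20, which also added unsafe.SliceData. A rough equivalent of the same check, assuming Go 1.20 or newer, might look like this (dataPtr is a hypothetical helper name):

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataPtr (hypothetical helper) returns the address of the backing array
// of s. It relies on unsafe.SliceData, so it assumes Go 1.20 or newer.
func dataPtr(s []int64) unsafe.Pointer {
	return unsafe.Pointer(unsafe.SliceData(s))
}

func main() {
	ar := make([]int64, 0, 0)
	arDeriv1 := append(ar, 0)
	arDeriv2 := append(ar, 1)

	// With zero capacity, each append allocated its own array.
	fmt.Println(dataPtr(arDeriv1) == dataPtr(arDeriv2)) // false
}
```

Comparing the pointers instead of printing them makes the outcome independent of the actual addresses.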

The recipe for disaster

Now, let’s make just one tiny change. Let’s set the initial capacity to 1, like this: ar := make([]int64, 0, 1).

Here comes the result

[] 824634368000
[] [0] 824634368000
[] [1] [1] 824634368000

Boom! The data in arDeriv1 is affected by what we did with arDeriv2. Moreover, we see that the arrays’ addresses are all the same now.

That is because the runtime no longer needs to allocate a new array to store the new value. When we created arDeriv2 and appended a value to it, the value was simply written into the spare slot of the existing array, with no new allocation. But arDeriv1 had done exactly the same thing before, so the value it appended was sitting in that very slot, and it got overwritten.
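To see both behaviors side by side without the address printing, here is a condensed sketch. The deriveTwice function is a hypothetical wrapper around the same steps as the programs above, with the capacity passed in as a parameter.

```go
package main

import "fmt"

// deriveTwice (hypothetical helper) derives two slices from an empty slice
// of the given capacity and appends a different value to each of them.
func deriveTwice(capacity int) (first, second []int64) {
	ar := make([]int64, 0, capacity)
	first = append(ar, 0)
	second = append(ar, 1)
	return first, second
}

func main() {
	a, b := deriveTwice(0)
	fmt.Println(a, b) // [0] [1]

	a, b = deriveTwice(1)
	fmt.Println(a, b) // [1] [1]
}
```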

A real-life case

Suppose you have a slice initialized by a third-party library, and you don’t have time to dig into the library, so you don’t know the slice’s capacity. Then you want to create a local version of the slice and add a zero to it if it is empty (you want the original to remain the same, that’s why you create this local version).

	localAr := ar
	if len(localAr) == 0 {
		localAr = append(localAr, 0)
	}

If you ever create another derivative slice, like localAr2 := ar and append to it, you can never be sure whether the contents of localAr remain the same or are overwritten by the append on localAr2.

That’s why I prefer to explicitly create a new slice containing a single zero, without any appends.

	localAr := ar
	if len(localAr) == 0 {
		localAr = []int64{0}
	}

Or you can simply copy the slice, if it fits your needs. In my case it was inefficient, because if ar contained any values, I didn’t really need a copy; I could just read from the original array.
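Yet another option, which I did not use in my case, is a full slice expression that limits the capacity of the local slice to its length. Any append on it then has to allocate, while reads still go to the original array without copying. A minimal sketch, where detached is a hypothetical helper name:

```go
package main

import "fmt"

// detached (hypothetical helper) returns a view of s whose capacity equals
// its length, so the next append on the result must allocate a new array.
func detached(s []int64) []int64 {
	return s[:len(s):len(s)]
}

func main() {
	// Spare capacity, as a third-party library might hand it to us.
	ar := make([]int64, 0, 1)

	localAr := append(detached(ar), 0)
	localAr2 := append(detached(ar), 1)

	fmt.Println(localAr, localAr2) // [0] [1]
}
```

On Go 1.21 and newer, slices.Clip from the standard library does the same trimming.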

In conclusion

The case discussed here is not surprising, provided you are aware of the slice internals. But during periods of heavy development such obscurities may elude you and make their way into your code. Regrettably, no language is free of those. And the higher the level of the language, the more such cases arise.

Thanks for tuning in!