Recently I encountered a curious case with slices in Golang: an obscure behavior that can lead to data corruption within a slice. You might think that the capacity of a slice can be set to an arbitrary value with no real consequences, but this case shows that the difference can be drastic. Or should I say, catastrophic?
I assume that you are already familiar with the basics of slice internals.
Let’s initialize the slice. It has zero length and zero capacity. For the sake of explicitness I state both length and capacity.
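A minimal sketch of that initialization could look like the following. The element type int64 is taken from the make call shown later in this post; the helper name newEmpty is only there for illustration.

```go
package main

import "fmt"

// newEmpty returns a slice with zero length and zero capacity.
// Both arguments are spelled out on purpose, even though make([]int64, 0)
// would produce the same result.
func newEmpty() []int64 {
	return make([]int64, 0, 0)
}

func main() {
	ar := newEmpty()
	fmt.Println(len(ar), cap(ar)) // 0 0
}
```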
Now let’s create another slice from ar and append a value to it.
As you know, this slice points to the same underlying array (of length 0, since the capacity is 0). The result is
  
Nothing unexpected so far.
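As a rough sketch of this step, assuming the appended value is 1 (the concrete value is my choice for illustration, as is the helper name deriveAndAppend):

```go
package main

import "fmt"

// deriveAndAppend copies the slice header of ar into a new variable
// and appends v to that copy, leaving the caller's ar header untouched.
func deriveAndAppend(ar []int64, v int64) []int64 {
	arDeriv1 := ar
	arDeriv1 = append(arDeriv1, v)
	return arDeriv1
}

func main() {
	ar := make([]int64, 0, 0)
	arDeriv1 := deriveAndAppend(ar, 1)
	fmt.Println(ar, arDeriv1) // [] [1]
}
```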
Let’s create another slice, also from ar, and append a different value to it.
The result is
     
Okay, now we see that the contents of arDeriv1 remain unchanged. This is possible because when we perform append, the runtime checks whether there is enough space in the underlying array to put another value into it. In this case there isn’t, since the capacity is 0. So it creates another array with a bigger length, copies the initial contents there (in this case it doesn’t, because there is nothing in there) and puts the new value into the bigger array. That means that by the end of the program all three slices point to totally different addresses.
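The whole zero-capacity scenario can be sketched in one place. The values 1 and 2 and the helper name appendTwice are assumptions made for the example.

```go
package main

import "fmt"

// appendTwice derives two slices from the same zero-capacity slice
// and appends a different value to each of them.
func appendTwice() (ar, arDeriv1, arDeriv2 []int64) {
	ar = make([]int64, 0, 0)

	arDeriv1 = ar
	arDeriv1 = append(arDeriv1, 1) // no room in ar's array: a new one is allocated

	arDeriv2 = ar
	arDeriv2 = append(arDeriv2, 2) // no room again: yet another array is allocated

	return ar, arDeriv1, arDeriv2
}

func main() {
	ar, arDeriv1, arDeriv2 := appendTwice()
	fmt.Println(ar, arDeriv1, arDeriv2) // [] [1] [2]
}
```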
To prove this claim we can use some unsafe magic found here to get the actual addresses of the underlying arrays.
The result is
 18399184   824634286112    824634327040
The numbers themselves don’t matter, they are just arbitrary addresses. The most important thing here is that they are all different, which means the derived slices have their own arrays allocated for them. That makes things quite safe.
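One way to obtain such addresses is sketched below. It assumes Go 1.20 or newer, where unsafe.SliceData is available, and it is not necessarily the helper referred to above; dataAddr is a name I made up. The printed numbers will differ from run to run and from machine to machine.

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataAddr returns the address of the array backing s as a plain integer.
// It is meant for printing and comparing only, never for dereferencing.
func dataAddr(s []int64) uintptr {
	return uintptr(unsafe.Pointer(unsafe.SliceData(s)))
}

func main() {
	ar := make([]int64, 0, 0)
	arDeriv1 := append(ar, 1)
	arDeriv2 := append(ar, 2)

	// With zero initial capacity the three addresses are expected to differ.
	fmt.Println(dataAddr(ar), dataAddr(arDeriv1), dataAddr(arDeriv2))
}
```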
The recipe for disaster
Now, let’s make just one tiny change. Let’s set the initial capacity to 1, like this: ar := make([]int64, 0, 1).
Here comes the result
 824634368000   824634368000    824634368000
Boom! The data in arDeriv1 is affected by what we did with arDeriv2. Moreover, we see that the arrays’ addresses are all the same now.
That is because the runtime no longer needs to allocate a new array to store a new value. So when we created arDeriv2 and appended a new value, it simply expanded into the existing array without any new allocations. But arDeriv1 had been in the same situation before that: the value appended to it was already sitting in that array, and it got overwritten.
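Here is a small sketch of the overwrite, again assuming the appended values are 1 and 2. Instead of printing raw addresses it compares the addresses of the first elements directly, which needs no unsafe code; overwriteDemo is an illustrative name.

```go
package main

import "fmt"

// overwriteDemo derives two slices from a slice with spare capacity 1
// and appends a different value to each of them.
func overwriteDemo() (arDeriv1, arDeriv2 []int64) {
	ar := make([]int64, 0, 1)

	arDeriv1 = ar
	arDeriv1 = append(arDeriv1, 1) // fits: written into slot 0 of ar's array

	arDeriv2 = ar
	arDeriv2 = append(arDeriv2, 2) // fits too: slot 0 is written again

	return arDeriv1, arDeriv2
}

func main() {
	arDeriv1, arDeriv2 := overwriteDemo()
	fmt.Println(arDeriv1, arDeriv2)             // [2] [2]
	fmt.Println(&arDeriv1[0] == &arDeriv2[0]) // true
}
```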
A real-life case
Suppose you have a slice initialized by a third-party library and you don’t have time to dig into it and see its capacity; so you don’t know it. Then you want to create a local version of the slice and add a zero to it if it is empty (you want the original to remain the same, that’s why you create this local version).
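The pattern described above might look roughly like this. fromLibrary is a hypothetical stand-in for the third-party call, and the capacity it uses is an assumption that the caller is not supposed to know about.

```go
package main

import "fmt"

// fromLibrary is a hypothetical stand-in for a third-party function.
// The caller has no idea what capacity the returned slice has.
func fromLibrary() []int64 {
	return make([]int64, 0, 1)
}

// localVersion returns ar itself when it holds values, or a one-element
// slice containing zero, built with append, when ar is empty.
func localVersion(ar []int64) []int64 {
	localAr := ar
	if len(localAr) == 0 {
		localAr = append(localAr, 0)
	}
	return localAr
}

func main() {
	ar := fromLibrary()
	localAr := localVersion(ar)
	fmt.Println(ar, localAr) // [] [0]
}
```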
If you ever create another derivative slice, like localAr2 := ar, and append to it, you can never be sure whether the contents of localAr remain the same or are overwritten by the append on localAr2.
That’s why I prefer to simply create a slice holding the zero value explicitly, without any appends.
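To make the difference concrete, here is a sketch that puts the append-based variant next to the explicit one. The capacity of 1, the value 42 and the helper names withAppend and withLiteral are all assumptions chosen for the demonstration.

```go
package main

import "fmt"

// withAppend builds the local slice by appending to ar (the risky variant).
func withAppend(ar []int64) []int64 {
	localAr := ar
	if len(localAr) == 0 {
		localAr = append(localAr, 0)
	}
	return localAr
}

// withLiteral builds the local slice from a fresh literal when ar is empty,
// so it never touches ar's spare capacity.
func withLiteral(ar []int64) []int64 {
	if len(ar) == 0 {
		return []int64{0}
	}
	return ar
}

func main() {
	ar := make([]int64, 0, 1) // spare capacity we are not supposed to know about

	risky := withAppend(ar)
	safe := withLiteral(ar)

	localAr2 := ar
	localAr2 = append(localAr2, 42) // lands in the same slot that risky uses

	fmt.Println(risky, safe, localAr2) // [42] [0] [42]
}
```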
Or you can simply copy the slice, if it fits your needs. In my case it was inefficient, because if ar contained any values, I didn’t really need a copy; I could just read from the initial array.
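For completeness, the copy-based alternative could be sketched like this. The helper name cloned is illustrative; the price is one allocation and one pass over the data on every call, even when the original would have been fine to read from.

```go
package main

import "fmt"

// cloned returns an independent copy of ar with its own backing array.
func cloned(ar []int64) []int64 {
	c := make([]int64, len(ar))
	copy(c, ar)
	return c
}

func main() {
	ar := []int64{1, 2, 3}
	localAr := cloned(ar)
	localAr[0] = 100 // does not affect ar

	fmt.Println(ar, localAr) // [1 2 3] [100 2 3]
}
```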
The case discussed here is not surprising provided you are aware of the slice internals. But during periods of heavy development such obscurities may elude you and make their way into your code. Regrettably, no language is free of those. And the higher the level of the language, the more such cases arise.
Thanks for tuning in!