We’re up to the second release candidate for Go 1.22, which should be released quite soon. In my last blog post, I wrote about my work on reflect.TypeFor for Go 1.22. This time, I’ll be writing about how I proposed and implemented slices.Concat.

Here is the signature for slices.Concat:

// Concat returns a new slice concatenating the passed in slices.
func Concat[S ~[]E, E any](slices ...S) S

It’s a pretty straightforward function to have in a generic slices library, and in fact, I proposed it way back in May of 2021. In the subsequent discussion of what to add to the slices package, it got a soft rejection,

Resolved: decided not to add Concat in the initial set. We can always reconsider if there is significant evidence.

For context, the slices package itself was not added to the Go standard library until Go 1.21. Before then, it was only available in a preview form as the external golang.org/x/exp/slices package. The slices package was deliberately kept small and focused, and the emphasis was on getting real world experience before adding it permanently to the standard library. Eventually, however, I wrote a new proposal for slices.Concat, and it got approved by the proposal review group, so I implemented it as well.

One thing that makes me nervous about the current slices package is that it is too easy to make software that is accidentally quadratic through misuse of O(N) functions in a loop. Before slices.DeleteFunc was added, I did a quick scan of Go code listed on SourceGraph, and found that about a quarter of the uses of slices.Delete were done in a loop, which can lead to quadratic performance slow downs as the slice is scanned and copied over and over again.

This kind of performance problem is insidious because with small slices, the performance effect may be negligible, but a slice that is only ten times larger can take one hundred times longer to process, which leads to programs that work fine in testing but choke in production. What I like about slices.Concat is that the performance implications should be relatively clear to the code reader. Programmers who are tempted to use append or slices.Insert in a loop to join a bunch of slices together will hopefully notice that slices.Concat exists and use this instead.

There is something a little odd in the implementation of slices.Concat that might need an explanation if you don’t already know why it’s there. In order to efficiently join the slices together, slices.Concat does one allocation of a new slice long enough to hold all the concatenated parts, instead of appending as it goes along and risking having multiple allocations as the new slice is built. But the code for getting the lengths of all the parts has a seemingly redundant check in it:

size := 0
for _, s := range slices {
	size += len(s)
	if size < 0 {
		panic("len out of range")

The reason to check if the size is less than zero on each loop is to test for an overflow of the length of the slice. If the number of items in the concatenated slices is too large, it might wrap around and become negative.

Overflow bugs can be fairly insidious. Way back in 2006, Josh Bloch wrote about a overflow bug in a common implementation of binary search that wasn’t noticed for about two decades:

This bug can manifest itself for arrays whose length (in elements) is 230 or greater (roughly a billion elements). This was inconceivable back in the ’80s, when Programming Pearls was written, but it is common these days at Google and other places. In Programming Pearls, Bentley says “While the first binary search was published in 1946, the first binary search that works correctly for all values of n did not appear until 1962.” The truth is, very few correct versions have ever been published, at least in mainstream programming languages.

Hopefully, slices.Concat won’t ever suffer from this particular overflow problem, but testing this code presents an interesting challenge. How do you create a slice with close to the length required for an overflow on a 64-bit machine without using exabytes of RAM for testing? The answer is to use a slice of struct{}, which takes up no additional memory beyond its slice header.

It’s worth noting that if you only want to concatenate two slices, there’s no need to use slices.Concat. You can use the built in append function with the ... spread operator instead:

concat := append(s1, s2...)

Early versions of the Concat proposal included a destination slice argument, like append. (Concat(dest []T, ss ...[]T) []T.) Why doesn’t the final version of slices.Concat have a destination argument to allow users to reuse an existing slice as backing?

The issue goes back to what is called the problem of aliasing. If you concatenate onto a nil slice or some slice with insufficient capacity to hold all the concatenated pieces, then there won’t be any problem because a new slice has to be created with all the parts copied onto it. But what if you are concatenating the parts of a slice onto itself? Take this example:

s := []int{1, 2, 3, 4}
_ = slices.ConcatWithDestination(s[:0], s[3:4], s[2:3], s[1:2], s[0:1])
// What is s now?

A naïve implementation of slices.ConcatWithDestination would clobber the 1 at the start of the slice with 4 before copying it onto the end of the slice, so that you end up with 4, 3, 3, 4 instead of 4, 3, 2, 1 as intended.

As it happens, slices.Insert and slices.Replace had this problem too. The fix isn’t too pretty. The code ends up using the unsafe package to check if the slices overlap. If they do, it avoids an allocation by rotating the values into place:

// The hard case - v overlaps c or d. We can't just shift up
// the data because we'd move or clobber the values we're trying
// to insert.
// So instead, write v on top of d, then rotate.
copy(s[n:], v)

// Now we have
// s: aaaaaaaabbbbccccccccvvvv
//            ^   ^       ^   ^
//            i  i+m      n  n+m

rotateRight(s[i:], m)

// Now we have
// s: aaaaaaaavvvvbbbbcccccccc
//            ^   ^       ^   ^
//            i  i+m      n  n+m
// That's the result we want.
return s

Keith Randall wrote a version of slices.Concat that would make the alias checking work, and it added even more code than slices.Insert or slices.Replace because of all the ways the slices might potentially overlap when concatenating.

In the end, it was decided that just always returning a new slice would keep the implementation of slices.Concat simpler and help prevent any issues with accidentally aliasing a slice and getting unexpected results, so the destination slice argument was dropped.

David Chase gave a good explanation of why preventing slice aliasing is important on a past episode of Go Time.

I have a real soft spot for Fortran. And the thing that makes Fortran fast is just like one tiny, little thing, and it’s usually true in programs. And that has to do with when you would, say, pass a pair of slices to a function, Fortran says – you can pretend they [don’t] overlap. If they overlap, it’s not Fortran. […]

But what this does is it lets you do vectorization, just willy-nilly; all sorts of vectorization transformations and parallelization transformations, and reordering. And this is sort of like the key to why Fortran is so fast.

If a slice has no aliases, then the compiler and the CPU don’t have to worry that writing to one part of the slice will affect another slice, so the reads and writes can be done in any order, which can make it much faster. Unfortunately, Go has no way to represent aliasing in its type system, so the compiler can’t just assume that a particular slice has no aliases. It is something to think about changing if there is ever a Go 2 language with a stronger type system.

That’s it for slices.Concat. Subscribe to learn the story behind cmp.Or, coming soon.