Ruby's Array: Plus vs. Push
07 Sep 2011

Most of the time Ruby lets us remain ignorant of how the language is implemented. But every so often, knowing a little about what’s going on under the hood leads to better decisions in code. Take populating arrays.

You’ve likely used +, <<, push, or concat. Regardless of which you prefer, why do you use the one you do? And when do you use it over the others? If you’re not sure, then this post is for you.

Array#+

ary + other_ary

[1,2,3] + [4,5,6] # => [1,2,3,4,5,6]

This code is deceptively simple and elegant, but it’s actually the worst performer of the four. According to the rdoc, Array#+ returns a new array built by concatenating the two arrays. Building this brand new array is what causes the performance hit. Under the hood Ruby has some work to do:

determine new array length as ary.length + other_ary.length
construct, allocate memory for new array given the new array length
copy contents of ary into new array
copy contents of other_ary into new array

Allocating memory for the new array and then copying the contents of both arrays are expensive operations. The larger the arrays, the more expensive it gets, and if you do this in a loop you’ll be in a heap of trouble.
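To make the cost concrete, here is a small sketch showing that Array#+ leaves both operands alone and hands back a different object. The build_with_plus helper is a hypothetical name used only for illustration; it shows the loop pattern this section warns about.

```ruby
# Array#+ leaves both operands untouched and returns a third array.
ary = [1, 2, 3]
other_ary = [4, 5, 6]
combined = ary + other_ary

combined               # => [1, 2, 3, 4, 5, 6]
ary                    # => [1, 2, 3]
combined.equal?(ary)   # => false, combined is a different object

# Hypothetical helper: builds an array with + in a loop. Every pass
# allocates a fresh array and copies everything accumulated so far.
def build_with_plus(n)
  result = []
  n.times { |i| result = result + [i] }
  result
end

build_with_plus(3)     # => [0, 1, 2]
```

The result is correct, but the loop creates n throwaway arrays along the way.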

When to use it

Use Array#+ when you really want a new array to work with, but avoid using it in loops to build up arrays. There are better methods suited for that.

Array#push

ary.push *other_ary

ary = [1,2,3]
other_ary = [4,5,6]
ary.push *other_ary # => [1,2,3,4,5,6]

What this code lacks in aesthetic beauty it makes up for in performance. The splat operator is there so the example behaves like the Array#+ example. Array#push behaves differently from Array#+: it appends its argument(s) to the receiver rather than constructing a new array.

Since Array#push doesn’t construct a new array, it’s not nearly as expensive to use. Here’s how it works under the hood:
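A short sketch of that difference, showing that push mutates and returns the receiver, and what happens when the splat is left off:

```ruby
ary = [1, 2, 3]
other_ary = [4, 5, 6]
returned = ary.push(*other_ary)   # the splat passes 4, 5, 6 as separate arguments

ary                    # => [1, 2, 3, 4, 5, 6]
returned.equal?(ary)   # => true, push returns the receiver itself

# Without the splat, other_ary is appended as one nested element.
[1, 2, 3].push(other_ary)   # => [1, 2, 3, [4, 5, 6]]
```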

for each argument in the given arguments
  if array is at or over capacity
    increase array capacity by 50%
  
  append argument to array

There’s no memory allocation going on unless the array is at or over capacity, in which case Ruby allocates additional memory and increases the array’s capacity by 50%. This makes it much more performant than Array#+.
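To get a feel for how rarely that reallocation happens, here is a toy model of the grow-by-50% policy described above. It is not Ruby's actual implementation: the simulated_regrowths name and the starting capacity of 16 are assumptions made purely for illustration.

```ruby
# Hypothetical model, not MRI's code: counts how many times the
# capacity has to grow when n items are appended one at a time and
# each growth adds 50% to the current capacity.
def simulated_regrowths(n, initial_capacity = 16)
  capacity = initial_capacity
  length = 0
  regrowths = 0

  n.times do
    if length >= capacity
      capacity += [capacity / 2, 1].max   # grow by 50%, at least one slot
      regrowths += 1
    end
    length += 1                           # append the item
  end

  regrowths
end

simulated_regrowths(16)      # => 0, everything fits in the initial capacity
simulated_regrowths(1_000)   # => 11
```

Under these assumptions, a thousand appends trigger only eleven reallocations, whereas building the same array with Array#+ in a loop would allocate a new array on every pass.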

When to use it

Use Array#push when you don’t actually need a new array to work with. It’s safe to use in loops to build up an array, and it doesn’t have the memory or CPU burdens that come with Array#+.

Array#<<

ary << value

ary = [1,2,3]
ary << 4 # => [1,2,3,4]
ary << 5 # => [1,2,3,4,5]
ary << 6 # => [1,2,3,4,5,6]

The shift operator is similar to Array#push. Under the hood it’s actually the exact same implementation except for one difference: it doesn’t support a variable number of arguments. This means you can’t do the following:

ary << *other_ary # WON'T WORK

Using << seems to be idiomatic Ruby in the circles I travel in, and I see it used more often than push unless something is getting splatted.
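Two behaviors worth seeing in a quick sketch: << returns the receiver, so calls can be chained, and handing it an array appends that array as a single element rather than its contents.

```ruby
ary = [1, 2, 3]
ary << 4 << 5          # << returns the receiver, so calls can be chained
ary                    # => [1, 2, 3, 4, 5]

# Passing an array to << appends it as one nested element.
nested = [1, 2, 3] << [4, 5]
nested                 # => [1, 2, 3, [4, 5]]
nested.length          # => 4
```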

When to use it

Use Array#<< when you don’t need a new array to work with. Just like its Array#push counterpart, it’s safe to use in loops and it doesn’t have the memory or CPU burdens that come with Array#+.

Array#concat

ary.concat other_ary

ary = [1,2,3]
other_ary = [4,5,6]
ary.concat other_ary # => [1,2,3,4,5,6]

Lastly, Array#concat can be used to concatenate another array onto the receiver. This is similar to Array#+ except that concat doesn’t return a new array; it modifies self.
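A brief sketch of that in-place behavior, with push plus splat alongside for comparison:

```ruby
ary = [1, 2, 3]
other_ary = [4, 5, 6]
returned = ary.concat(other_ary)

ary                    # => [1, 2, 3, 4, 5, 6]
returned.equal?(ary)   # => true, concat returns the receiver itself
other_ary              # => [4, 5, 6], the argument is left unchanged

# push with a splat ends up with the same contents.
via_push = [1, 2, 3]
via_push.push(*other_ary)
via_push == ary        # => true
```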

Underneath it all, the implementation for concat is much lengthier than one might think (it is done using a splice implementation), but it really boils down to the following:

determine new length as ary.length + other_ary.length
if new length makes ary over capacity
  increase ary capacity
copy other_ary into ary

The memory copying makes a single Array#concat call more expensive than appending a single element with Array#push or Array#<<, but cheaper than Array#+. The larger other_ary is, the worse Array#concat performs, since its size determines how much memory has to be allocated and copied.

One nice thing about Array#concat is that it handles increasing the capacity of ary more intelligently than, say, Array#push. It determines what the new length of ary should be and resizes only once. Array#push checks each time it appends an item, so it may resize the array more than once per call if you take advantage of its variable-length arguments.

When to use it

Use Array#concat when you want to combine two arrays but don’t need a brand new array. It’s faster and less memory intensive than Array#+, and it will be on par with or faster than Array#push if you’re resorting to the splat operator.
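If you’d like to check these claims on your own Ruby version, here is a rough timing sketch using the standard library’s Benchmark module. The time_append_strategies helper, the round count, and the chunk size are all hypothetical choices for illustration; actual numbers will vary with the interpreter, version, and machine.

```ruby
require 'benchmark'

# Hypothetical helper: appends `chunk` to an accumulator `rounds` times
# using each of the four approaches and returns the elapsed seconds
# for each one.
def time_append_strategies(rounds, chunk)
  {
    plus:   Benchmark.realtime { a = []; rounds.times { a = a + chunk } },
    push:   Benchmark.realtime { a = []; rounds.times { a.push(*chunk) } },
    shovel: Benchmark.realtime { a = []; rounds.times { chunk.each { |x| a << x } } },
    concat: Benchmark.realtime { a = []; rounds.times { a.concat(chunk) } }
  }
end

timings = time_append_strategies(2_000, [1, 2, 3, 4, 5])
timings.each { |name, seconds| puts format('%-7s %.5f s', name, seconds) }
```

Treat the output as a rough comparison rather than a rigorous benchmark; a single run like this is noisy, so repeat it a few times before drawing conclusions.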