Benchmarking various methods of accumulating a set / unique array in ruby

John Bachir's Code Blog

July 24, 2022

I recently benchmarked a few ways to accumulate a set in ruby. Here are my findings and methodology - apologies for messiness, I'm just sharing what I came up.

This is with ruby 3.1. I also ran under 2.7, I believe the set case was actually faster, but I may be misremembering.

Results

Time

                      user     system      total        real
set               0.095938   0.000404   0.096342 (  0.096341)
array after       0.063998   0.002452   0.066450 (  0.066508)
array in place   17.238500   0.939414  18.177914 ( 18.188532)

Memory Allocations

                 set    28.848k memsize (     0.000  retained)
                         3.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
         array after     8.040k memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
      array in place     8.072B memsize (     0.000  retained)
                         2.000M objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

Comparison:
         array after:       8040 allocated
                 set:      28848 allocated - 3.59x more
      array in place: 8071798800 allocated - 1003955.07x more

Process Memory

array_after
before (base): 42.95
memory above base:
  largest during iteration: 869.63
  average during iteration: 433.52
  after final array is produced: 877.47

set
before (base): 43.41
memory above base:
  largest during iteration: 865.65
  average during iteration: 432.5
  after final array is produced: 865.65

array_in_place
before (base): 42.98
memory above base:
  largest during iteration: 3312.82
  average during iteration: 1546.41
  after final array is produced: 593.1

Discussion

Time

Array in place is very very slow, the others are very fast and in practice probably identical for most workloads, with array unique being the fastest, perhaps because it does less redundant work, at the expense of memory usage.

Memory Allocations

Array in place allocates 1 million times more objects than set because it needs to create the wrapper" array for each element. In practice the total impact on process memory would not be this much because GC would frequently run (which adds cost and is an additional reason this is a bad solution).

Set uses 3.6x more allocations than array after, I don't have a theory why, didn't look at the code.

Process Memory

I tried a variety of things but wasn't able to get consistent or sensible results out of this. Lack of control of the ruby memory allocator made it difficult. Hopefully allocations measurement above is sufficient to compare memory use.

Conclusions

array_after is the most common approach and is also the fastest and is probably Fine for most applications.

Set uses more total memory AND is slower than array_after. This does not make sense to me, I would have expected memory to be lower. Maybe benchmark-memory doesn't actually provide peak memory?

array_in_place uses the most memory and is also the slowest.

Of course array_after might use much much more memory than array_in_place depending on the data / application. (if building up a 100megabyte array and then after uniq! it's only 0.1 megabyte, then it's using much more memory in the loop than it ultimately needs).

Code

require 'set'

# so we can control when GC happens, which I believe is:
#  - valuable for the time measurements being more consistent, and realistic for a post-warmup app
#  - irrelevant for the allocations measurements
#  - chaotic for the process memory measurements
GC.disable

@data = []
1_000_000.times{ @data << rand(1..1_000) }

# Things we are not able to measure
# - average and max memory or allocation usage while uniq! is running
# - average and max memory or allocation usage while |= [d] or << d are happening (hopefully woudn't change conclusions... but maybe!?)
# - while @s.to_a is happening

############################
# The Three Implementations
#
# each ends with nil because without that, benchmark showed slight variations
# in certain scenarios when not expected (such as whether or not doing to_a at end of set)
# i'm guessing because under the hood benchmark is doing somethign with the return result, such as
# inspect
############################

# Typical simple implementation: build up an array, then call uniq on it afterward
def array_after
  a2 = []
  @data.each do |d|
    yield if block_given?
    a2 << d
  end
  a2.uniq!
  nil
end

# Build in Array, unique on each iteration
# Array elements remain unique on each iteration, theoretically has
# total memory benefits over uniqueing after. But, cost of creating an Array on each iteration comes into play.
def array_in_place
  a = []
  @data.each do |d|
    yield if block_given?
    a |= [d]
  end
  nil
end

# Use Set, which is made for just this purpose
def set
  s = Set.new
  @data.each do |d|
    yield if block_given?
    s << d
  end
  set_a = @s.to_a
  nil
end

#######
# Time
#######
puts "\nTime"
require 'benchmark'
5.times{GC.compact}

Benchmark.bmbm(15) do |x|
  x.report("set") do
    set
  end

  x.report("array after") do
    array_after
  end

  x.report("array in place") do
    array_in_place
  end
end


#####################
# Memory Allocations
#####################
puts "\nMemory Allocations"
GC.enable
require 'benchmark-memory'
GC.disable
5.times{GC.compact}
Benchmark.memory do |x|
  x.compare!

  x.report("set") do
    set
  end

  x.report("array after") do
    array_after
  end

  x.report("array in place") do
    array_in_place
  end
end if false

exit

#################
# Process Memory
# To run these, i disabled all above tests, and only ran one at a time, sometimes with
# RUBY_GC_HEAP_GROWTH_FACTOR=1.1 ruby benchmark.rb
#################
puts "\nProcess Memory"

def print_mem_stats(m)
  mem = GetProcessMem.new
  mem_samples = []
  puts "\n#{m}"

  5.times{GC.compact}

  base = mem.mb.round(2)
  self.send(m){ mem_samples << mem.mb }
  after = mem.mb.round(2)

  puts "before (base): #{base}"
  puts "memory above base:"
  puts "\tlargest during iteration: #{(mem_samples.max - base).round(2)}"
  puts "\taverage during iteration: #{((mem_samples.sum(0.0) / mem_samples.size) - base).round(2)}"
  puts "\tafter final array is produced: #{(after - base).round(2)}"
end

require 'get_process_mem'
5.times{GC.compact}

print_mem_stats(:array_after)
print_mem_stats(:set)
print_mem_stats(:array_in_place)

puts @data.count # to trick GC into not thinking it's done, so that "after above base" comparison is more meaningful
# 5.times{GC.compact}
# puts "cleaned up: #{@mem.mb.round(2)}"
# puts "equal? #{@set_a.sort == @a.sort && @a.sort == @a2.sort}"

John Bachir's Code Blog

Subscribe to this blog

Benchmarking various methods of accumulating a set / unique array in ruby

Results

Time

Memory Allocations

Process Memory

Discussion

Time

Memory Allocations

Process Memory

Conclusions

Code

Share this article with friends