
30 Days of Code #10

In which I try to optimize for performance and miss the mark, but relearn how to use benchmarking to prove that I missed the mark.

This problem, Sum of Parts, tells us to be aware of performance when thinking about our solution because there will be large data sets.

The problem asks us to walk through an array and, for each element, find the sum of that element and every element after it, storing each sum in a result array. The final sum is 0 because there are no more elements left to sum. Given an empty array, the result is [0].
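To make the expected output concrete, here is a naive sketch (my own, not one of the submitted solutions) that recomputes each suffix sum from scratch:

```ruby
# Naive reference: sum every suffix directly (O(n^2)), just to pin down
# the expected shape of the output. Requires Ruby 2.6+ for ls[i..].
def parts_sums_naive(ls)
  (0..ls.size).map { |i| ls[i..].sum }
end

p parts_sums_naive([0, 1, 3, 6, 10]) # => [20, 20, 19, 16, 10, 0]
p parts_sums_naive([])               # => [0]
```

This is far too slow for the large data sets the kata warns about, but it gives the faster versions something to be checked against.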

There are many ways to solve this problem, but I seemed to remember another student telling me that it was faster to create an Array object up front and transform it in place than to push elements onto an Array object one at a time. I decided to try solving the problem that way:

def parts_sums(ls)
  sum = 0
  results = Array.new(ls.size + 1, 0)
  # Build the running sum back-to-front via negative indices, then flip it
  results.map.with_index { |n, i| i == 0 ? 0 : sum += ls[-i] }.reverse
end
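A quick sanity check on a small input helps show what the negative indexing is doing: ls[-1] is the last element, ls[-2] the second-to-last, so the running sum builds back-to-front before the reverse.

```ruby
def parts_sums(ls)
  sum = 0
  results = Array.new(ls.size + 1, 0)
  results.map.with_index { |n, i| i == 0 ? 0 : sum += ls[-i] }.reverse
end

# For [1, 2, 3]: map produces [0, 3, 5, 6] (0, then 3, 3+2, 3+2+1),
# and reversing yields the suffix sums front-to-back.
p parts_sums([1, 2, 3]) # => [6, 5, 3, 0]
p parts_sums([])        # => [0]
```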

This worked, but other people’s solutions seemed to indicate that it wasn’t worth the trouble. I decided to relearn how to do benchmark tests and test other solutions against my own to see which was faster.

def parts_sums2(ls)
  result = [ls.sum]
  ls.each { |el| result << result.last - el }
  result
end

def parts_sums3(ls)
  ls.inject([ls.sum]) { |s, n| s << s.last - n }
end

def parts_sums4(ls)
  sum = ls.sum
  sum_of_parts = [sum]
  i = 1
  
  while i <= ls.size
    sum_of_parts.push(sum - ls[i - 1])
    sum -= ls[i - 1]
    i += 1
  end
  
  sum_of_parts
end

def generate_test_arr
  result = []
  rand(1000).times { result << rand(1000) }
  result
end
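One wrinkle worth noting: rand(1000) can return 0, so generate_test_arr occasionally yields an empty array. A one-line equivalent that guarantees at least one element (my own variant, not part of the original benchmark) would be:

```ruby
# Array.new with a block fills each slot by calling the block;
# rand(1..1000) guarantees the array is never empty.
def generate_test_arr
  Array.new(rand(1..1000)) { rand(1000) }
end

arr = generate_test_arr
```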

require 'benchmark'

n = 10_000

Benchmark.bm do |benchmark|

  benchmark.report("version 1") do
    n.times do
      parts_sums(generate_test_arr)
    end
  end

  benchmark.report("version 2") do
    n.times do
      parts_sums2(generate_test_arr)
    end
  end

  benchmark.report("version 3") do
    n.times do
      parts_sums3(generate_test_arr)
    end
  end

  benchmark.report("version 4") do
    n.times do
      parts_sums4(generate_test_arr)
    end
  end

end

I decided that this probably wasn't as accurate as it could be, since I was generating a new Array object on every iteration and the arrays could be different sizes. I switched to a single array so that every version was measured against the same input.

Benchmark.bm do |benchmark|
  test = generate_test_arr
  
  benchmark.report("version 1") do
    n.times do
      parts_sums(test)
    end
  end

  benchmark.report("version 2") do
    n.times do
      parts_sums2(test)
    end
  end

  benchmark.report("version 3") do
    n.times do
      parts_sums3(test)
    end
  end

  benchmark.report("version 4") do
    n.times do
      parts_sums4(test)
    end
  end

end
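Another thing I could have reached for here is Benchmark#bmbm, which runs a rehearsal pass before the measured pass so that GC and allocation warm-up skew the timings less. A sketch, using version 2 as the example:

```ruby
require 'benchmark'

def parts_sums2(ls)
  result = [ls.sum]
  ls.each { |el| result << result.last - el }
  result
end

test = Array.new(1000) { rand(1000) }
n = 10_000

# bmbm runs every report twice: a rehearsal pass, then the measured pass.
Benchmark.bmbm do |benchmark|
  benchmark.report("version 2") { n.times { parts_sums2(test) } }
end
```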

Here are my results for three different runs:

       user     system      total        real
version 1  0.188000   0.000000   0.188000 (  0.219793)
version 2  0.140000   0.000000   0.140000 (  0.129404)
version 3  0.172000   0.015000   0.187000 (  0.237928)
version 4  0.156000   0.000000   0.156000 (  0.151522)

version 1  1.156000   0.000000   1.156000 (  1.180984)
version 2  0.765000   0.047000   0.812000 (  0.890359)
version 3  1.000000   0.000000   1.000000 (  1.034462)
version 4  0.860000   0.000000   0.860000 (  0.946641)

version 1  0.891000   0.031000   0.922000 (  0.943057)
version 2  0.609000   0.047000   0.656000 (  0.768835)
version 3  0.781000   0.016000   0.797000 (  0.812925)
version 4  0.657000   0.000000   0.657000 (  0.784343)

Across the runs, my version (version 1) is usually the worst performer, and the second worst is version 3. The best is version 2, which is a little faster than version 4. I decided to change version 4 to see whether swapping the #push method for the shovel operator << (which is also just a method in Ruby) would make any difference.

def parts_sums4(ls)
  sum = ls.sum
  sum_of_parts = [sum]
  i = 1
  
  while i <= ls.size
    sum_of_parts << (sum - ls[i - 1])
    sum -= ls[i - 1]
    i += 1
  end
  
  sum_of_parts
end

That change appears to make version 4 faster than version 2 now.

       user     system      total        real
version 1  0.937000   0.015000   0.952000 (  0.984466)
version 2  0.578000   0.063000   0.641000 (  0.726278)
version 3  0.875000   0.000000   0.875000 (  0.890843)
version 4  0.563000   0.031000   0.594000 (  0.619923)

       user     system      total        real
version 1  1.000000   0.016000   1.016000 (  1.079614)
version 2  0.672000   0.031000   0.703000 (  0.702127)
version 3  1.000000   0.000000   1.000000 (  1.053134)
version 4  0.594000   0.063000   0.657000 (  0.659660)

       user     system      total        real
version 1  0.718000   0.000000   0.718000 (  0.729323)
version 2  0.485000   0.031000   0.516000 (  0.627555)
version 3  0.640000   0.000000   0.640000 (  0.637833)
version 4  0.407000   0.031000   0.438000 (  0.456607)

I am still not sure why this is, but it is interesting that the while loop from version 4 seems to beat the #each iterator in version 2. One Stack Overflow answer to a similar question suggested that yielding to a block on every iteration costs more than a plain while loop.
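To isolate just the loop construct, a minimal micro-benchmark (my own sketch; results will vary by Ruby version and machine) could compare the same summing work done with #each and with while:

```ruby
require 'benchmark'

# Same work done two ways, to isolate the cost of yielding to a block.
def sum_each(arr)
  sum = 0
  arr.each { |el| sum += el }
  sum
end

def sum_while(arr)
  sum = 0
  i = 0
  while i < arr.size
    sum += arr[i]
    i += 1
  end
  sum
end

arr = Array.new(1000) { rand(1000) }
n = 10_000

Benchmark.bm do |x|
  x.report("each + block") { n.times { sum_each(arr) } }
  x.report("while loop")   { n.times { sum_while(arr) } }
end
```

If the Stack Overflow explanation is right, the while version should come out ahead even though both loops do identical arithmetic.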