This problem, Sum of Parts, tells us to be aware of performance when thinking about our solution because there will be large data sets.
The problem asks us to iterate over an array, taking the sum of each element and everything after it, and storing each sum in a new array; the final sum is 0 because there are no elements left to add. Given an empty array, the result is [0].
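To make the expected output concrete, here is a naive sketch (using a sample input assumed for illustration) that simply recomputes the tail sum at every index:

```ruby
# Naive sketch: for each index, sum the remaining tail of the array.
# This is O(n^2), so it would be too slow for the kata's large inputs,
# but it pins down the expected behavior.
def parts_sums_naive(ls)
  (0..ls.size).map { |i| ls[i..].sum }
end

parts_sums_naive([0, 1, 3, 6, 10]) # => [20, 20, 19, 16, 10, 0]
parts_sums_naive([])               # => [0]
```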
There are many ways to solve this problem, but I remembered another student telling me that it was faster to create an Array object up front and transform it in place than to push elements onto an Array one at a time. I decided to try solving the problem that way:
def parts_sums(ls)
  sum = 0
  results = Array.new(ls.size + 1, 0)
  results.map.with_index { |n, i| i == 0 ? 0 : sum += ls[-i] }.reverse
end
This worked, but other people’s solutions seemed to indicate that it wasn’t worth the trouble. I decided to relearn how to do benchmark tests and test other solutions against my own to see which was faster.
def parts_sums2(ls)
  result = [ls.sum]
  ls.each { |el| result << result.last - el }
  result
end

def parts_sums3(ls)
  ls.inject([ls.sum]) { |s, n| s << s.last - n }
end

def parts_sums4(ls)
  sum = ls.sum
  sum_of_parts = [sum]
  i = 1
  while i <= ls.size
    sum_of_parts.push(sum - ls[i - 1])
    sum -= ls[i - 1]
    i += 1
  end
  sum_of_parts
end
def generate_test_arr
  result = []
  rand(1000).times { result << rand(1000) }
  result
end
require 'benchmark'

n = 10_000
Benchmark.bm do |benchmark|
  benchmark.report("version 1") do
    n.times do
      parts_sums(generate_test_arr)
    end
  end
  benchmark.report("version 2") do
    n.times do
      parts_sums2(generate_test_arr)
    end
  end
  benchmark.report("version 3") do
    n.times do
      parts_sums3(generate_test_arr)
    end
  end
  benchmark.report("version 4") do
    n.times do
      parts_sums4(generate_test_arr)
    end
  end
end
I decided that this was probably not as accurate as it could be, since I was generating a new Array object on every call and the arrays could be different sizes. To make sure I was comparing the same work each time, I switched to a single array:
Benchmark.bm do |benchmark|
  test = generate_test_arr
  benchmark.report("version 1") do
    n.times do
      parts_sums(test)
    end
  end
  benchmark.report("version 2") do
    n.times do
      parts_sums2(test)
    end
  end
  benchmark.report("version 3") do
    n.times do
      parts_sums3(test)
    end
  end
  benchmark.report("version 4") do
    n.times do
      parts_sums4(test)
    end
  end
end
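Another option for reducing noise is Benchmark.bmbm, which runs a rehearsal pass before the measured pass so that warm-up costs and garbage collection from the first reports don't skew the later ones. A sketch of how it could be used here, with version 2's logic inlined as a lambda so the snippet stands alone:

```ruby
require 'benchmark'

# Version 2's logic, inlined as a lambda for a self-contained example.
parts_sums2 = lambda do |ls|
  result = [ls.sum]
  ls.each { |el| result << result.last - el }
  result
end

test = Array.new(500) { rand(1000) }

# bmbm runs each report twice: a rehearsal pass, then the measured pass.
Benchmark.bmbm do |benchmark|
  benchmark.report("version 2") { 1_000.times { parts_sums2.call(test) } }
end
```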
Here are my results for three different runs:

                user     system      total        real
version 1   0.188000   0.000000   0.188000 (  0.219793)
version 2   0.140000   0.000000   0.140000 (  0.129404)
version 3   0.172000   0.015000   0.187000 (  0.237928)
version 4   0.156000   0.000000   0.156000 (  0.151522)

                user     system      total        real
version 1   1.156000   0.000000   1.156000 (  1.180984)
version 2   0.765000   0.047000   0.812000 (  0.890359)
version 3   1.000000   0.000000   1.000000 (  1.034462)
version 4   0.860000   0.000000   0.860000 (  0.946641)

                user     system      total        real
version 1   0.891000   0.031000   0.922000 (  0.943057)
version 2   0.609000   0.047000   0.656000 (  0.768835)
version 3   0.781000   0.016000   0.797000 (  0.812925)
version 4   0.657000   0.000000   0.657000 (  0.784343)
In each run, my version (version 1) is usually the worst performer, with version 3 second worst. Version 2 is the best, slightly ahead of version 4. I decided to change version 4 to see whether swapping the #push method for the shovel operator (<<, which is also a method in Ruby) makes any difference.
def parts_sums4(ls)
  sum = ls.sum
  sum_of_parts = [sum]
  i = 1
  while i <= ls.size
    sum_of_parts << (sum - ls[i - 1])
    sum -= ls[i - 1]
    i += 1
  end
  sum_of_parts
end
With that change, version 4 now appears faster than version 2.
user system total real
version 1 0.937000 0.015000 0.952000 ( 0.984466)
version 2 0.578000 0.063000 0.641000 ( 0.726278)
version 3 0.875000 0.000000 0.875000 ( 0.890843)
version 4 0.563000 0.031000 0.594000 ( 0.619923)
user system total real
version 1 1.000000 0.016000 1.016000 ( 1.079614)
version 2 0.672000 0.031000 0.703000 ( 0.702127)
version 3 1.000000 0.000000 1.000000 ( 1.053134)
version 4 0.594000 0.063000 0.657000 ( 0.659660)
user system total real
version 1 0.718000 0.000000 0.718000 ( 0.729323)
version 2 0.485000 0.031000 0.516000 ( 0.627555)
version 3 0.640000 0.000000 0.640000 ( 0.637833)
version 4 0.407000 0.031000 0.438000 ( 0.456607)
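To isolate the change, a micro-benchmark of just the two append methods might look like this (a sketch; both produce identical arrays, and any speed difference between them tends to be small, so results like these are easy to over-read):

```ruby
require 'benchmark'

n = 1_000_000
# Compare Array#push against the shovel operator << in isolation.
Benchmark.bm(6) do |benchmark|
  benchmark.report("push") do
    arr = []
    n.times { |i| arr.push(i) }
  end
  benchmark.report("<<") do
    arr = []
    n.times { |i| arr << i }
  end
end
```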
I am still not sure why this is, but it is interesting that the while loop in version 4 seems to be faster than the #each iterator in version 2. One Stack Overflow answer to a similar question suggested that yielding to a block on every iteration costs more than a bare while loop.
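That hypothesis is easy to probe with a stripped-down comparison that does the same summing work both ways (a sketch; absolute numbers will vary by Ruby version and machine):

```ruby
require 'benchmark'

arr = Array.new(1_000) { rand(1000) }
each_sum = 0
while_sum = 0

Benchmark.bm(6) do |benchmark|
  # Sum via #each: one block yield per element.
  benchmark.report("each") do
    1_000.times do
      each_sum = 0
      arr.each { |el| each_sum += el }
    end
  end
  # Sum via a bare while loop: no block yields at all.
  benchmark.report("while") do
    1_000.times do
      while_sum = 0
      i = 0
      while i < arr.size
        while_sum += arr[i]
        i += 1
      end
    end
  end
end
# Both styles compute the same sum; only the iteration overhead differs.
```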