Quantcast
Channel: Hacker News 50
Viewing all articles
Browse latest Browse all 9433

Ruby is too slow for programming competitions - Clif Reeder

$
0
0

Comments:"Ruby is too slow for programming competitions - Clif Reeder"

URL:http://blog.clifreeder.com/blog/2013/04/21/ruby-is-too-slow-for-programming-competitions/


A quick update: a number of people over at Hacker News have pointed out inefficenies in my solution, and various better ways of solving this. At this point, I think it’s fair to say the tldr; is that while my inefficent solution was prohibitevly slow in Ruby, Go could run it with ease. If you are like me and don’t always come up with the best solutions, you may want to consider other languages too.

This past weekend, I participated in the qualification round of Google Code Jam 2013. It was year is my third time participating, and the third time that I have used Ruby as my primary language.

Since I have no prior experience in actually being competitive in programming competitions and I use Ruby at my day job (it’s one of the that drew me there), I’d never given my language choice much of a second thought. Sure, I knew Ruby wasn’t going to be zomg fast, but I always assumed that if I chose the right solution and wrote in an efficient manner (memoizing, storing values/lookups that would be used later, limiting search spaces, etc), my ability to write code quickly and concisely mattered more than sheer processing speed.

I was wrong.

Problem C of the 2013 qualification round was called ‘Fair and Square’, and can be simplified as such:

Given two numbers X and Y, return how many numbers in that range
(inclusive) are palindromes, and also the square of a palindrome.

This seemed pretty straight forward. Obviously, I was going to need a function to determine if a given number was a palindrome, so I benchmarked a method that determined by converting to a string, and one that used ‘math’, because I heard computers do that really fast.

Rehearsal -------------------------------------------------------------
Integer palindrome method 7.700000 0.000000 7.700000 ( 7.705190)
String palindrome method 4.890000 0.010000 4.900000 ( 4.907374)
--------------------------------------------------- total: 12.600000sec
 user system total real
Integer palindrome method 7.770000 0.000000 7.770000 ( 7.773088)
String palindrome method 4.720000 0.010000 4.730000 ( 4.726223)

Source code for this benchmark

I’d been burned in the past in Code Jam problems by manipulating strings when I could have been doing math, so this was a little surprising to me. But hey, numbers don’t lie, and I know I need to be writing efficient Ruby, so this is good to know.

In short order, I had a Ruby program that passed the sample input, as well as the small input.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 def string_palindrome?(num) s = num.to_s s == s.reverse end ARGF.readline count = 0 ARGF.each do |input| count += 1 found = 0 start, finish = input.split(" ").map(&:to_i) sqrt_start = Math.sqrt(start).to_i sqrt_finish = Math.sqrt(finish).to_i (sqrt_start..sqrt_finish).each do |x| if string_palindrome?(x) square = x ** 2 if string_palindrome?(square) && (start..finish).cover?(square) found += 1 end end end puts "Case ##{count}: #{found}" end

I figured that this had a good chance of working. I knew I’d chosen the faster of two options for checking palindromes, and that looking at the square root of the range boundaries cut the search space down a lot.

The first large input for this problem consisted of 10000 test cases with a range of 1 to 1014 (file here). Code Jam requires you run the input and upload the correct output within 8 minutes - it took my implementation 53 minutes to run.

This was pretty surprising to me, so lets dig into why. A ruby-prof output of a partial run of the input:

 %self total self wait child calls name
 49.89 1549.912 984.824 0.000 565.087 326068740 Object#string_palindrome?
 21.47 1974.150 423.798 0.000 1550.352 499 Range#each
 16.34 322.536 322.536 0.000 0.000 326069738 Fixnum#to_s
 12.29 242.552 242.552 0.000 0.000 326068740 String#reverse
 0.02 0.409 0.409 0.000 0.000 557447 Fixnum#**

~50% of execution time spent in string_palindrome? isn’t great, but makes sense. At least we know it would be worst with integer_palindrome?? Much more troubling is that Range#each is 21.47% of the execution time. Extrapolating a bit, that means that from the original runtime of 50 minutes, 10.5 minutes was spent incrementing numbers. There’s no way this is possible, right? Sanity check time, and I’ll even optimize a little by not using Range:

1 2 3 4 5 6 7 8 9 10 11 12 file = File.new(ARGV[0], "r") file.gets count = 0 while (input = file.gets) start, finish = input.split(" ").map(&:to_i) sqrt_start = Math.sqrt(start).to_i sqrt_finish = Math.sqrt(finish).to_i x = sqrt_start while(x <= sqrt_finish) do x += 1 end end

And the output…

real 5m42.952s
user 5m42.787s
sys 0m0.153s

Welp. Five and a half minutes, just to iterate numbers we need. Presumably, almost all of time is just spend incrementing and comparing integers. There should be no GC, but maybe Fixnum to Bignum promotion is afoot.

Obviously, something had to change, so I tried rewriting the program in Go (more on that experience at a later date).

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 package main import ( "fmt" "math" ) func palindrome(num int) bool { var n, reverse, dig int n = num reverse = 0 for (num > 0){ dig = num % 10 reverse = reverse * 10 + dig num = num / 10 } return n == reverse } func main() { var cases int fmt.Scan(&cases) for i := 0; i < cases; i++ { var found, start, finish, sqrt_start, sqrt_finish, square int fmt.Scan(&start, &finish) sqrt_start = int(math.Sqrt(float64(start))) sqrt_finish = int(math.Sqrt(float64(finish))) for j := sqrt_start; j <= sqrt_finish; j++ { if palindrome(j){ square = j*j if palindrome(square) && square >= start && square <= finish { found += 1 } } } fmt.Print("Case #", (i + 1), ": ", found, "\n") } }

While I haven’t gotten around to making the output of this pass the problem, it’s doing the same thing as the Ruby version, except it is FAST.

real 0m0.740s
user 0m0.482s
sys 0m0.189s

I’ve always known that as an interpreted language Ruby was going to be behind on the performance curve, but I am surprised that something so simple as iterating over an integer range could take minutes. In the future, I’ll pass on Ruby for time sensitive programming competitions.


Viewing all articles
Browse latest Browse all 9433

Trending Articles