sane_timeout: a replacement for Ruby's standard library Timeout

Ruby's Timeout library had a serious problem before 1.9: it would sometimes fail to time out. This was solved by system-timer (see the readme for more background). But in 1.9, we finally have a Timeout that reliably times out, joy. However, it still has some problems.

As some quick background, here is the basic usage:

require 'timeout'

begin
  Timeout.timeout(1) do
    # spending time doing, say, an IO operation across the network
    sleep 2
  end
rescue Timeout::Error
  puts "Getting the big file took too long. Try again later."
end

output

➔ ruby code.rb
Getting the big file took too long. Try again later.

Timeout.timeout synchronously runs the block passed to it. If the block finishes within the time specified (1 second in the example), the program continues on. If it does not, Timeout.timeout raises an exception (Timeout::Error), which you can then catch and handle.
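To make the two outcomes concrete, here is a minimal sketch. The variable names `fast` and `slow` are made up for illustration; the only library call used is Timeout.timeout itself.

```ruby
require 'timeout'

# The block finishes in time, so Timeout.timeout returns the block's value.
fast = Timeout.timeout(1) { :done }

# The block runs too long, so Timeout::Error is raised in the calling code,
# where we can rescue it.
slow = begin
  Timeout.timeout(0.1) do
    sleep 1
    :done
  end
rescue Timeout::Error => e
  "timed out: #{e.message}"
end

puts fast.inspect
puts slow
```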

Handy!

But the library has problems. Take a look at the line that kills the inner block when it times out. The problem is the behavior of Thread#raise: it raises the exception inside the thread at whatever point of execution the thread happens to be in. I elaborate more on this behavior in this blog post.
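Here is a small sketch of Thread#raise on its own, without Timeout involved. The thread and variable names are made up for illustration. The worker is in the middle of a sleep when the main thread raises into it, and the exception surfaces right there, inside the worker's own begin/rescue.

```ruby
# Illustrative only: Thread#raise interrupts the target thread wherever it is.
worker = Thread.new do
  begin
    sleep 5 # stands in for slow work
    :finished
  rescue RuntimeError => e
    # The exception raised from the main thread lands here,
    # in the middle of the sleep above.
    "interrupted: #{e.message}"
  end
end

# Wait until the worker is actually inside sleep.
sleep 0.01 until worker.status == "sleep"

worker.raise(RuntimeError, "stop now")
result = worker.value
puts result
```

The worker never asked to be interrupted and has no say in where the exception appears. That is the behavior the rest of this post is about.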

This affects the usage of Timeout.timeout in two ways.

First problem: it raises the error at whatever point the inner code was in when the timeout occurred

This makes for a confusing stack trace. Note how, across the invocations below, the top line of the stack trace (where we are used to seeing "where the problem is") changes.

require 'timeout'

Timeout.timeout(rand) do # random number between 0 and 1 (seconds)
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
  sleep 0.1
end

output

➔ ruby code.rb
code.rb:5:in `sleep': execution expired (Timeout::Error)
from code.rb:5:in `block in <main>'
from code.rb:3:in `<main>'
➔ ruby code.rb
code.rb:6:in `sleep': execution expired (Timeout::Error)
from code.rb:6:in `block in <main>'
from code.rb:3:in `<main>'
➔ ruby code.rb
code.rb:10:in `sleep': execution expired (Timeout::Error)
from code.rb:10:in `block in <main>'
from code.rb:3:in `<main>'
➔ ruby code.rb
code.rb:4:in `sleep': execution expired (Timeout::Error)
from code.rb:4:in `block in <main>'
from code.rb:3:in `<main>'

This makes debugging more difficult, because whatever point the inner code happened to be in may or may not correlate with what's causing the problem. If we make a call to a database and wrap it in Timeout.timeout, then we want to know "Did this database call take too long?", not "What random line of code inside the database driver was running when we ran out of time?".

Second problem: if the inner code rescues Exception, the outer code will never receive any error

It's inadvisable to rescue from Exception. Code should typically only rescue StandardError, or specific other errors appropriate to the context. Mike Perham covers the issue here.

But the reality is that some code, for reasons bad, questionable, or good, will sometimes rescue from Exception. If you run this code within Timeout.timeout, it affects the semantics of the inner error handling and completely defeats the outer error handling.

require 'timeout'

def process_foos(error_to_rescue)
  begin
    # -> process Foos
    # -> if we run out of Foos, raise
    sleep 2
  rescue error_to_rescue
    # -> email the manager that we ran out of Foos
    puts <<-MESSAGE
      There was a problem. The problem is we ran out of Foos.
      That is definitely what the problem was.
      Don't worry, we emailed the Foo manager and elegantly carried on
      into the outer context.
    MESSAGE
  end
end

begin
  puts "Calling some poorly-written code."
  Timeout.timeout(1){ process_foos(Exception) }
rescue Timeout::Error
  puts "Processing the Foos took too long."
end

begin
  puts "\nCalling some well-written code."
  # even better would be FooError
  Timeout.timeout(1){ process_foos(StandardError) }
rescue Timeout::Error
  puts "Processing the Foos took too long."
end

output

➔ ruby code.rb
Calling some poorly-written code.
      There was a problem. The problem is we ran out of Foos.
      That is definitely what the problem was.
      Don't worry, we emailed the Foo manager and elegantly carried on
      into the outer context.

Calling some well-written code.
Processing the Foos took too long.

In the first example, the inner code rescues from Exception, so

  1. the error message is incorrect
  2. the outer error never happens and the context calling Timeout.timeout is never informed that the code timed out!

The second example is as things should be.

To fix this problem, I've created a fork of Ruby's Timeout: sane_timeout. sane_timeout never raises an exception within the inner code block, and always raises Timeout::Error if the code times out. Here's the diff between Ruby's Timeout and sane_timeout. Here are some tests showing the problems with Ruby's Timeout. Here are the tests for sane_timeout.
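To illustrate the general idea, here is one way to get those two properties. This is a sketch under my own assumptions, not necessarily how sane_timeout is implemented, and `join_timeout` is a hypothetical name. It runs the block in a separate thread, waits for it with Thread#join, and raises Timeout::Error from the calling thread rather than from inside the block.

```ruby
require 'timeout'

# Hypothetical helper for illustration; not sane_timeout's actual API.
def join_timeout(seconds)
  worker = Thread.new { yield }
  if worker.join(seconds) # returns the thread if it finished, nil if time ran out
    worker.value
  else
    worker.kill # on MRI this is not catchable by a rescue clause in the block
    raise Timeout::Error, "execution expired"
  end
end

inner_rescued = false

outcome = begin
  join_timeout(0.2) do
    begin
      sleep 1
    rescue Exception
      # With this approach the inner rescue never sees the timeout.
      inner_rescued = true
    end
  end
  :completed
rescue Timeout::Error
  :timed_out
end

puts outcome.inspect
puts inner_rescued.inspect
```

The outer code gets Timeout::Error even though the inner code rescues Exception, and the stack trace points at the raise in `join_timeout` instead of a random line in the block. The tradeoffs are that the block runs in a different thread (so thread-local state is not shared) and that Thread#kill still stops the work at an arbitrary point, running only its ensure clauses.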

There are still some things to iron out -- most notably, what Ruby Timeout is trying to do with the inner error handling/filtering, and whether sane_timeout should hack it out completely.

Take a look and let me know what you think in the comments below or on Twitter!

John Bachir, Co-Founder and CTO of Medstro
