20160323

Ruby MySQL Concurrency

I lamented the other day that there is a paucity of information out there about Ruby+MySQL. Really, I was able to find basically one decent site, and even that seemed to gloss over some details in the area of concurrency and transactions. I wanted to know a bit more about how this works, so I ran some experiments.

Maybe these aren't really questions about Ruby so much as questions about MySQL itself, but I don't know what the Ruby interface is hiding, so I did some quick tests to see how it behaves. I simply had my Ruby program open two database connections at the same time, then drove them to do various things in different orders to see what happened.

For all of these tests, I created two connections to the same database and called db.autocommit(false) on each to force them into transaction mode. For data, I used a simple table that just stored a number along with an ID.
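
For reference, the setup looked something like this. The connection details are placeholders, and the table schema is my reconstruction from the queries below, so treat it as a sketch rather than a verbatim listing.

require 'mysql'

# Two independent connections to the same database.
db1 = Mysql.connect('localhost', 'testuser', 'testpass', 'testdb')
db2 = Mysql.connect('localhost', 'testuser', 'testpass', 'testdb')

# Transactions require InnoDB, so make sure the table uses that engine.
db1.query('CREATE TABLE IF NOT EXISTS test_concurrency (id INT PRIMARY KEY, val INT) ENGINE=InnoDB;')
db1.query('INSERT INTO test_concurrency (id, val) VALUES (1, 0), (2, 0);')

# Turn off autocommit so each connection runs in an explicit transaction.
db1.autocommit(false)
db2.autocommit(false)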

Read Before Commit

The basic flow of steps is to prepare and execute one or more statements, then commit to persist them to the database. The prepare step doesn't touch any data; it only sets up some objects, so we don't need to worry about interleaving it. What if we do a read while another transaction has executed but not committed a write?

# Connection 1 updates the row but doesn't commit yet.
pst1 = db1.prepare('UPDATE test_concurrency SET val = val+1 WHERE id = 1;')
# Connection 2 reads that same row while the update is pending.
pst2 = db2.prepare('SELECT val FROM test_concurrency WHERE id = 1;')

pst1.execute()
pst2.execute()

# Commit the read transaction, then pull the value out of the result.
db2.commit()
val2 = nil
pst2.each() { |row|
    val2 = row[0]
}
$log.log("val2 = #{val2}")

db1.commit()

The output is val2 = 0, which is the initial value. Both connections start out reading the same value, and the concurrent update didn't conflict with the read. This makes sense: InnoDB's default REPEATABLE READ isolation gives each transaction a consistent snapshot, so the SELECT sees the value from before the uncommitted UPDATE instead of blocking on it.
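
As a quick sanity check on that snapshot behavior (an extra step I'd tack on, not part of the run above): once both transactions have committed, a fresh SELECT on the second connection starts a new transaction and should see the incremented value.

pst3 = db2.prepare('SELECT val FROM test_concurrency WHERE id = 1;')
pst3.execute()
pst3.each() { |row|
    $log.log("after both commits, val = #{row[0]}")  # should now be 1
}
db2.commit()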

Deadlock Writes

What if we try to do two updates at the same time?

pst1 = db1.prepare('UPDATE test_concurrency SET val = val+1 WHERE id = 1;')
pst1.execute()

# The second transaction tries to update the same row before the first commits.
pst2 = db2.prepare('UPDATE test_concurrency SET val = val+1 WHERE id = 1;')
pst2.execute()

db1.commit()
db2.commit()

After waiting for what seemed like an eternity (in reality about 60 seconds, which I guess is around a thousand years in Internet Time), I got an exception like this:

Mysql error! 2006 MySQL server has gone away
db_conc_test.rb:91:in `commit': MySQL server has gone away (Mysql::Error)

This failed on db1.commit(). It seems like the two transactions were fighting over the same row: the second UPDATE blocked waiting for the row lock held by the first, and after the long wait the whole thing fell over with this singularly unhelpful error.

Lifting the Deadlock

I looked up the error on the Internet, trying to figure out why such an opaque error was showing up, and after such a long delay. All the results were unhelpful, mostly from people trying to do really large inserts and updates that were expected to take a long time to complete. That's of course not my usage here.

Somehow I stumbled upon the variable innodb_lock_wait_timeout. It defaults to 50 seconds, which lines up with the roughly-a-minute wait I saw, so I tried reducing it to 5.

pst1 = db1.prepare('set innodb_lock_wait_timeout = 5;')
pst1.execute()
db1.commit()

# proceed to do the deadlock writes
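
If you want to double-check what the timeout is actually set to, you can ask the server directly. This uses the gem's plain query interface rather than a prepared statement:

res = db1.query("SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';")
res.each { |row|
    $log.log("#{row[0]} = #{row[1]}")
}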

Now the error changed! And it failed after--you guessed it--five seconds. Yay.

Mysql error! 1205 Lock wait timeout exceeded; try restarting transaction
db_conc_test.rb:86:in `execute': Lock wait timeout exceeded; try restarting transaction (Mysql::Error)

Seems like a silly default, for the lock wait timeout to be longer than whatever timeout killed the connection in the first test, but whatever. This also made it fail at the pst2.execute() call, not the commit call, which makes sense: the execute is where the second transaction actually sits waiting for the row lock.
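
Since the error message literally tells you to try restarting the transaction, the obvious move is to catch it and retry. Here's a minimal sketch of how I'd wrap the update; the retry count is arbitrary, and the errno 1205 check is just to make sure we only retry on lock wait timeouts:

attempts = 0
begin
    pst = db1.prepare('UPDATE test_concurrency SET val = val+1 WHERE id = 1;')
    pst.execute()
    db1.commit()
rescue Mysql::Error => e
    raise unless e.errno == 1205  # only retry lock wait timeouts
    db1.rollback()
    attempts += 1
    retry if attempts < 3
    raise
end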

Lock Granularity

All the tests so far were updating the same row of the table. What about updating independent rows?

pst1 = db1.prepare('UPDATE test_concurrency SET val = val+1 WHERE id = 1;')
pst1.execute()

# A different row this time, which means a different row lock.
pst2 = db2.prepare('UPDATE test_concurrency SET val = val+1 WHERE id = 2;')
pst2.execute()

db1.commit()
db2.commit()

Exactly the same series of statements that got stuck before, but since we're updating different rows of the table, everything completes immediately. Cool. That's useful.

So if you want this transactional safety, not surprisingly, you have to get an update involved; plain reads don't block anything. But the locking granularity is per-row, so that's pretty good. The findings aren't terribly surprising, but I hope at least this sample code is a little useful. The main reason I'm posting this is that I couldn't find much out there about the "server gone away" error and how I fixed it.
