MySQL data sharding using Spock Proxy

Yesterday at the Silicon Valley MySQL Meetup, Frank of Spock.com talked about Spock Proxy. Spock Proxy is a fork of MySQL Proxy, built to meet the data sharding needs of Spock.com, the people search engine.

Here are some highlights:

Spock.com’s web interface is built on Rails, and they use ActiveRecord as their O-R layer for MySQL data access.
Spock has around 1,000 web servers running Rails, and they connect to the MySQL masters and slaves through Spock Proxy.
Spock Proxy looks like a normal MySQL server to its clients, except that it transparently talks to other MySQL servers behind it. At Spock they use 4 masters and 4 slaves, each with its own Spock Proxy.
Each web server keeps one connection open to the Spock Proxy, while the proxy itself may hold hundreds of pooled connections.
The proxy tokenizes each SQL statement and figures out the target shard for the query, so the query must include a shard_key. The shard_key for each table is recorded in a Universal DB, which also stores the dictionary of partitioned tables, each shard's hostname/user/password, the key ranges, and the range for auto_incremented columns.
It currently supports only range-based partitioning. A lot of partitioning elsewhere is done by hashing, but adding that should not be a big deal.
The current alpha version is very much tailored to Spock’s internal needs, but I’m sure people will pick it up and generalize it.
Unsupported query constructs (such as inner queries, GROUP BY, and multi-table joins) may not throw exceptions, so don't count on an error to flag them. DDL statements are also not supported.
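
To make the range-based routing a bit more concrete, here is a minimal Python sketch of how a directory of key ranges could map a shard_key to a shard. This is my own illustration, not Spock Proxy's code: the Shard and ShardDirectory names, the hostnames, and the range boundaries are all made up, and the real Universal DB also tracks table names, credentials, and auto_increment ranges, which I've left out.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """One entry of a hypothetical shard directory (fields are assumptions)."""
    low: int    # inclusive lower bound of the shard_key range
    high: int   # inclusive upper bound of the shard_key range
    host: str   # made-up hostname of the MySQL server holding this range


class ShardDirectory:
    """Maps a shard_key to the shard whose range contains it."""

    def __init__(self, shards):
        # Sort by lower bound so a binary search can find the candidate range.
        self._shards = sorted(shards, key=lambda s: s.low)
        self._lows = [s.low for s in self._shards]

    def lookup(self, shard_key):
        # Find the last shard whose lower bound is <= shard_key.
        i = bisect.bisect_right(self._lows, shard_key) - 1
        if i < 0 or shard_key > self._shards[i].high:
            raise KeyError(f"no shard covers shard_key={shard_key}")
        return self._shards[i]


# Illustrative ranges only; the talk did not give actual boundaries.
directory = ShardDirectory([
    Shard(1, 1_000_000, "db1.example.internal"),
    Shard(1_000_001, 2_000_000, "db2.example.internal"),
    Shard(2_000_001, 3_000_000, "db3.example.internal"),
])

target = directory.lookup(1_500_000)
print(target.host)
```

Switching to hash-based partitioning would mostly mean replacing lookup with something like a modulo over the shard list, which is probably why it was described as an easy change.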

Tags: Mysql, shard, spock proxy

This entry was posted on Tuesday, August 12th, 2008 at 10:38 am and is filed under Mysql.
