API 2.0: Under the Hood
With the release of Vodspot comes the public beta of our new API service. I’m here to talk a little bit about some of the technology that powers the new API core.
The API is in a unique position in the Vodpod system. The old API was built into our main Vodpod Rails application, which has dozens of required libraries and a hefty memory footprint. We wanted the new API to be as small, fast, and modular as possible. In fact, since we have no need for templating, URL routing, formatting helpers, or even a full controller architecture, a large MVC framework would have been overkill.
Ramaze was a natural choice, for three reasons:
- It’s built on Rack, a Ruby webserver interface which is quickly becoming the way to glue together Ruby web applications. Rack provides pretty much all the request processing we need to handle HTTP endpoints, and allowed us to write dedicated middleware for error handling and statistics.
- Ramaze is compact. We didn’t need to pull in a templating engine, layout system, or file server to get the job done. It doesn’t get in your way when it comes to requiring files or setting up the database, which means we were able to build the API as both a library for inclusion in other applications (such as Laminate), and as a full-fledged web application.
- It provides a slew of useful builtins without much cruft. We took advantage of easy-to-configure Rack middleware, support for basically all Ruby HTTP servers (we chose Thin), centralized logging, controller aspects, and awesome Memcache support.
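To make the middleware idea concrete, here is a minimal sketch of a statistics middleware in the Rack style. It relies only on the Rack calling convention (an object that responds to call(env) and returns a status, headers, and body), so no gems are needed to run it. The names RequestStats and HELLO_APP are made up for illustration and are not taken from our codebase.

```ruby
# A Rack-style middleware is any object that responds to #call(env) and
# returns [status, headers, body]. This sketch needs no gems.
class RequestStats
  attr_reader :counts, :total_time

  def initialize(app)
    @app = app
    @counts = Hash.new(0) # responses seen, keyed by HTTP status
    @total_time = 0.0     # cumulative seconds spent in the wrapped app
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    @total_time += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @counts[status] += 1
    [status, headers, body]
  end
end

# Hypothetical endpoint standing in for the real API.
HELLO_APP = lambda do |env|
  [200, { 'Content-Type' => 'text/plain' }, ["hello #{env['PATH_INFO']}"]]
end

stats = RequestStats.new(HELLO_APP)
stats.call({ 'PATH_INFO' => '/videos' })
stats.counts # => {200=>1}
```

Because the middleware only wraps call, it can sit in front of any Rack-compatible application without that application knowing about it.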
For database work, we chose Sequel, a Ruby ORM with great support for sharding and other multiple-DB setups. Sequel is a perfect match for our API because it models queries as Datasets: extensible objects encapsulating all the information about a query with chainable modification methods. That lets us take our basic Models and filter, sort, combine, and otherwise transmogrify the SQL queries in several ways, all before any queries are made.
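The dataset idea is easy to show in miniature. The class below is a toy stand-in written for this post, not Sequel itself: ToyDataset and its methods are assumptions for illustration, and it takes conditions as raw strings, which real code should never do. What it shares with the real thing is the shape: every modifier returns a new object, and no SQL exists until you ask for it.

```ruby
# Toy stand-in for a dataset: NOT Sequel, just the same idea in miniature.
class ToyDataset
  def initialize(table, filters = [], order = nil, limit = nil)
    @table = table
    @filters = filters
    @order = order
    @limit = limit
    freeze # a dataset never changes once built
  end

  # Each modifier returns a new dataset; the receiver is left untouched.
  def where(condition)
    ToyDataset.new(@table, @filters + [condition], @order, @limit)
  end

  def order(column)
    ToyDataset.new(@table, @filters, column, @limit)
  end

  def limit(count)
    ToyDataset.new(@table, @filters, @order, count)
  end

  # SQL is only generated when asked for.
  def sql
    query = "SELECT * FROM #{@table}"
    query += " WHERE #{@filters.join(' AND ')}" unless @filters.empty?
    query += " ORDER BY #{@order}" if @order
    query += " LIMIT #{@limit}" if @limit
    query
  end
end

videos = ToyDataset.new(:videos)
recent = videos.where('user_id = 1').order('created_at DESC').limit(5)
recent.sql # => "SELECT * FROM videos WHERE user_id = 1 ORDER BY created_at DESC LIMIT 5"
videos.sql # => "SELECT * FROM videos"
```

Since the base dataset is never modified, one basic query can safely be shared and specialized by many callers.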
We took advantage of the Sequel plugin system to write custom serializers for XML and JSON support, using the Ruby JSON library and LibXML2 bindings. We also implemented much of the request processing code as methods on Sequel datasets: HTTP and Laminate function parameters like sort and tags are implemented across all our models as dataset methods.
Moreover, Sequel’s powerful query language let us express things that we previously had to write as SQL by hand under ActiveRecord: many-through-many associations across n tables, filterable unions on multiple copies of the same dataset, and arbitrarily grouped tagging joins. All of this makes for more modular, reusable code.
These models are glued together by the API core, which provides the functions essential to Vodspot: videos(), user(), and so forth. All responses from the core are Sequel Models or Arrays (with a few extensions), supporting #to_json, #to_xml, and #to_hash. The appropriate serialization method is used by the caller, depending on their needs.
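As a rough sketch of that contract, the example below shows an object that defines #to_hash and gets #to_json and #to_xml from a mixin, with the caller choosing the format. Serializable, Video, and render are hypothetical names; our real serializers are Sequel plugins built on the JSON library and LibXML2, whereas this version uses only the standard library and builds XML by hand.

```ruby
require 'json'

# Hypothetical mixin: anything that defines #to_hash gains #to_json and #to_xml.
module Serializable
  def to_json(*args)
    to_hash.to_json(*args)
  end

  def to_xml(root = self.class.name.downcase)
    fields = to_hash.map { |key, value| "<#{key}>#{xml_escape(value)}</#{key}>" }
    "<#{root}>#{fields.join}</#{root}>"
  end

  private

  # Escape the characters that would break element content.
  def xml_escape(value)
    value.to_s.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
  end
end

# Hypothetical record standing in for a model.
class Video
  include Serializable
  attr_reader :id, :title

  def initialize(id, title)
    @id = id
    @title = title
  end

  def to_hash
    { 'id' => id, 'title' => title }
  end
end

# The caller picks whichever serialization it needs.
def render(result, format)
  case format
  when :json then result.to_json
  when :xml  then result.to_xml
  else result.to_hash
  end
end

video = Video.new(1, 'Intro & Outro')
render(video, :json) # => "{\"id\":1,\"title\":\"Intro & Outro\"}"
render(video, :xml)  # => "<video><id>1</id><title>Intro &amp; Outro</title></video>"
```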
Laminate, our templating system for Vodspot, is actually bound directly to the API core via rufus-lua. Return values are transformed into Lua tables with #to_hash and handed to the templating engine for rendering.
HTTP requests are where Ramaze comes in: when configured as an app, the API sets up a Ramaze controller on top of the API core. It essentially transforms URLs like /users/spencer/collection/spencerpod/videos?tags=python&offset=5 into calls like videos('spencer', 'spencerpod', :tags => 'python', :offset => 5). We’re using the aspect helper to handle user identification via api_key and auth_key, and to check the cache. We use Ramaze’s provides function to format responses depending on the path extension, and to populate the cache.
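The URL-to-call translation can be sketched in plain Ruby. This is not the Ramaze controller itself: Core, dispatch, the single hard-coded route, and the choice of JSON as the default format are all assumptions made to keep the example small and runnable.

```ruby
require 'uri'

# Hypothetical stand-in for the API core; the real function builds a query.
module Core
  def self.videos(user, collection, options = {})
    { user: user, collection: collection, options: options }
  end
end

# Turn a request URL into a core call plus a response format.
def dispatch(url)
  uri = URI.parse(url)
  extension = File.extname(uri.path) # ".json", ".xml", or ""
  path = uri.path.delete_suffix(extension)

  match = path.match(%r{\A/users/([^/]+)/collection/([^/]+)/videos\z})
  raise ArgumentError, "no route for #{path}" unless match

  # Query parameters become the options hash, with symbol keys.
  options = URI.decode_www_form(uri.query.to_s).to_h.transform_keys(&:to_sym)
  options[:offset] = Integer(options[:offset]) if options.key?(:offset)

  # Assumption: fall back to JSON when the path has no extension.
  format = extension.empty? ? :json : extension.delete_prefix('.').to_sym
  [Core.videos(match[1], match[2], options), format]
end

call, format = dispatch('/users/spencer/collection/spencerpod/videos?tags=python&offset=5')
call   # => {:user=>"spencer", :collection=>"spencerpod", :options=>{:tags=>"python", :offset=>5}}
format # => :json
```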
Finally, some lightweight Rack middleware responds to errors that are raised anywhere in the application, formatting them based on the request path.
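A minimal sketch of such a middleware follows. ErrorFormatter and FAILING_APP are illustrative names, and answering every error with a 500 and defaulting to JSON are simplifying assumptions rather than a description of our actual behavior.

```ruby
require 'json'

# Rack-style middleware that turns any raised error into a formatted response.
class ErrorFormatter
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    # Pick the error format from the extension on the request path.
    if File.extname(env['PATH_INFO'].to_s) == '.xml'
      body = "<error>#{escape(e.message)}</error>"
      type = 'application/xml'
    else
      body = JSON.generate({ 'error' => e.message })
      type = 'application/json'
    end
    [500, { 'Content-Type' => type }, [body]]
  end

  private

  def escape(text)
    text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
  end
end

# Hypothetical app that always fails, to exercise the middleware.
FAILING_APP = ->(_env) { raise ArgumentError, 'unknown user' }

app = ErrorFormatter.new(FAILING_APP)
app.call({ 'PATH_INFO' => '/users/nobody.xml' })
# => [500, {"Content-Type"=>"application/xml"}, ["<error>unknown user</error>"]]
```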
There’s our API architecture in a nutshell. You can see it in action at http://api.vodpod.com/v2.
