Rewriting Thinking Sphinx: Middleware, Glazes and Panes
Time to discuss more changes to Thinking Sphinx with the v3 releases - this time, the much improved extensibility.
There have been a huge number of contributors to Thinking Sphinx over the years, and each of their commits are greatly appreciated. Sometimes, though, the pull requests that come in cover extreme edge cases, or features that are perhaps only useful to the committer. But running your own hacked version of Thinking Sphinx is not cool, and then you’ve got to keep an especially close eye on new commits, and merge them in manually, and… blergh.
So instead, we now have middleware, glazes and panes.
Middleware
The middleware pattern is pretty well-established in the Ruby community, thanks to Rack - but it’s started to crop up in other libraries too (such as Mike Perham’s excellent Sidekiq).
In Thinking Sphinx, middleware classes are used to process search requests. The default set of middleware are as follows:
ThinkingSphinx::Middlewares::StaleIdFilter
adding an attribute filter to hide search results that are known to not match any ActiveRecord objects.ThinkingSphinx::Middlewares::SphinxQL
generates the SphinxQL query to send to Sphinx.ThinkingSphinx::Middlewares::Geographer
modifies the SphinxQL query with geographic co-ordinates if they’re provided via the:geo
option.ThinkingSphinx::Middlewares::Inquirer
sends the constructed SphinxQL query through to Sphinx itself.ThinkingSphinx::Middlewares::UTF8
ensures all string values returned by Sphinx are encoded as UTF-8.ThinkingSphinx::Middlewares::ActiveRecordTranslator
translates Sphinx results into their corresponding ActiveRecord objects.ThinkingSphinx::Middlewares::StaleIdChecker
notes any Sphinx results that don’t have corresponding ActiveRecord objects, and retries the search if they exist.ThinkingSphinx::Middlewares::Glazier
wraps each search result in a glaze if there’s any panes set for the search (read below for an explanation on this).
Each middleware does its thing, and then passes control through to the
next one in the chain. If you want to create your own middleware, your
class must respond to two instance methods: initialize(app)
and
call(contexts)
.
If you subclass from ThinkingSphinx::Middlewares::Middleware
you’ll
get the first for free. contexts
is an array of search context
objects, which provide access to each search object along with the raw
search results and other pieces of information to note between
middleware objects. Middleware are written to handle multiple search
requests, hence why contexts is an array.
If you’re looking for inspiration on how to write your own middleware, have a look through the source
- and here’s an extra example I put together when considering approaches to multi-tenancy.
Glazes and Panes
Sometimes it’s useful to have pieces of metadata associated with each search result - and it could be argued the cleanest way to do this is to attach methods directly to each ActiveRecord instance that’s returned by the search.
But inserting methods on objects on the fly is, let’s face it, pretty damn ugly. But that’s precisely what older versions of Thinking Sphinx do. I’ve never liked it, but I’d never spent the time to restructure things to work around that… until now.
There are now a few panes available to provide these helper methods:
ThinkingSphinx::Panes::AttributesPane
provides a method calledsphinx_attributes
which is a hash of the raw Sphinx attribute values. This is useful when your Sphinx attributes hold complex values that you don’t want to re-calcuate.ThinkingSphinx::Panes::DistancePane
provides the identicaldistance
andgeodist
methods returning the calculated distance between lat/lng geographical points (and is added automatically if the:geo
option is present).ThinkingSphinx::Panes::ExcerptsPane
provides access to anexcerpts
method which you can then chain any call to a method on the search result - and get an excerpted value returned.ThinkingSphinx::Panes::WeightPane
provides the weight method, returning Sphinx’s calculated relevance score.
None of these panes are loaded by default - and so the search results you’ll get are the actual ActiveRecord objects. You can add specific panes like so:
# For every search
ThinkingSphinx::Configuration::Defaults::PANES << ThinkingSphinx::Panes::WeightPane
# Or for specific searches:
search = ThinkingSphinx.search('pancakes')
search.context[:panes] << ThinkingSphinx::Panes::WeightPane
When you do add at least pane into the mix, though, the search result gets wrapped in a glaze object. These glaze objects direct any methods called upon themselves with the following logic:
- If the search result responds to the given method, send it to that search result.
- Else if any pane responds to the given method, send it to the pane.
- Otherwise, send it to the search result anyway.
This means that your ActiveRecord instances take priority – so pane methods don’t overwrite your own code. It also allows for method_missing metaprogramming in your models (and ActiveRecord itself) – but otherwise, you can get access to the useful metadata Sphinx can provide, without monkeypatching objects on the fly.
If you’re writing your own panes, the only requirement is that the initializer must accept three arguments: the search context, the underlying search result object, and a hash of the raw values from Sphinx. Again, the source code for the panes is not overly complex - so have a read through that for inspiration.
I’m always keen to hear about any middleware or panes other people write
- so please, if you do make use of either of these approaches, let me know!