Freelancing Gods 2015

03 Oct 2007

A Thoughtful Sphinx

In one of the projects I’ve been working on lately, I’ve needed to implement a decent search, so I looked at both Ferret and Sphinx. I ended up choosing the latter, although I’m not sure why: perhaps just to be different (most people I spoke to are using Ferret), or perhaps because the setup seemed easier.

The next step was to pick a Sphinx plugin to work with. Ultrasphinx seemed to have a good set of features (particularly pagination), and supported fields from associations within indexes – something critical for what we were doing.

Unfortunately, grabbing fields from associations wasn’t that easy, and the SQL generated for the Sphinx configuration file was overly complex. I could (and did) change the config file manually, but that throws away half the usefulness of the plugin.

So, since I had some spare time, I wrote my own plugin. Much like Rails, it favours convention over configuration. Perhaps a little too much so right now, but I do plan to make it more flexible. Installation is the same as any other plugin:

script/plugin install

An example of defining indexes (within a model class):

define_index do |index|
  index.includes(:first_name, :last_name)
end

To index the data, just use the thinkingsphinx:index rake task (aliased to ts:in) – which will also generate the configuration file on the fly. My goal is to make changing the configuration file manually unnecessary – making the index task build the configuration file helps enforce this.
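As a rough illustration of what generating the configuration "on the fly" can mean, here is a minimal sketch that turns a table name and a list of indexed fields into a Sphinx source and index stanza. It is not the plugin's actual code: the method name sphinx_stanzas, the _core suffix, and the db/sphinx path are all assumptions made for the example, and the database connection settings a real source needs are left out.

```ruby
# Hypothetical sketch, NOT the plugin's real implementation.
# Builds a minimal Sphinx "source" and "index" stanza from a table name and
# the fields to index. Connection settings (sql_host, sql_user, and so on)
# are omitted, so the output is not a complete configuration file.
def sphinx_stanzas(table, fields, index_path)
  name    = "#{table}_core"                        # illustrative naming only
  columns = (["id"] + fields.map(&:to_s)).join(", ") # document id comes first

  <<~CONF
    source #{name}
    {
      type      = mysql
      sql_query = SELECT #{columns} FROM #{table}
    }

    index #{name}
    {
      source = #{name}
      path   = #{index_path}/#{name}
    }
  CONF
end

puts sphinx_stanzas("people", [:first_name, :last_name], "db/sphinx")
```

The point of the sketch is only that the field list declared in the model is enough to derive the SQL, which is why hand-editing the file should not be needed.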

And to search:

# Searching particular fields
… => {:name => "Pat"})
# Or searching all fields
…"Pat")
# Pagination is built in
…"Pat", :page => (params[:page] || 1))

Paginated results can also be used by the will_paginate helper from the plugin of the same name. Current documentation can be found on this site.
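For readers unfamiliar with page-based pagination, the sketch below shows how a page number such as params[:page] || 1 can be turned into the offset and limit that a paged query needs. The helper page_window and the default of 20 results per page are assumptions for illustration, not something taken from the plugin.

```ruby
# Hypothetical helper, not part of the plugin: converts a page number
# into the offset/limit pair that a paged query needs.
def page_window(page, per_page = 20)
  page = [page.to_i, 1].max # nil, "" and "0" all fall back to page 1
  { :offset => (page - 1) * per_page, :limit => per_page }
end

params = { :page => "3" } # stand-in for a Rails params hash
p page_window(params[:page] || 1)
```

Clamping to a minimum of 1 means a missing or malformed page parameter simply shows the first page rather than raising an error.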

I managed to use ActiveRecord’s join and associations code, which kept my plugin reasonably lean. For interactions with Sphinx’s searchd daemon, I did look at Dmytro Shteflyuk’s Ruby Sphinx Client API, but the non-Ruby-like syntax irritated me, so again, I coded my own, heavily influenced by the original though (i.e. he did all the hard work, not me).

There’s no support yet for updating the index pseudo-incrementally (the lack of true incremental updates is a limitation of Sphinx). If I don’t feel the incremental updating works well enough, then I may switch to Ferret, which might lead to a Thinking Ferret plugin, perhaps. We’ll just have to wait and see.

Nov 14th 2007 – Update: I’ve just released the internal Sphinx client as its own library – Riddle.


4 responses to this article

27 Aug 2008
Hans said:

I am in total awe of this plugin. Nice work! It made my messy ultrasphinx configs moot, and let me get access to weightings and other stuff that ultrasphinx made less easy.

What is stumping me, however, is this:

I have multiple rails apps running on one server, and need to tell each one how to search against their own indices built on their own database. I tried using the sphinx.yml file to specify a port, but that was epic failure.

Each site ends up searching against the same indices, so the search results aren’t accurate for the sites.

How do I configure each client to run against one searchd (or, to run against multiple searchd’s – I have the overhead to handle it on this server)? I am going to presume that it is somewhere in the sphinx.yml, but alas, I am finding myself hamstrung by this. Even RTFM’ing hasn’t helped a ton. :-)


28 Aug 2008
pat said:

Hans: do you have the settings in config/sphinx.yml set into each environment, like database.yml? If so, and still no luck, let’s continue the conversation on the google group.

22 Sep 2008
mauro catenacci said:

Hi Pat,

I’m new to thinking_sphinx. just switched from the ferret engine. Quick question in reference to the delta property. Do you have to add a new boolean field in the database called delta? Also I’ve read one of your posts stating that there might be issues with the delta updates on rails 2.1. Thanks.

26 Sep 2008
pat said:

Hi Mauro

Yes, a boolean column called delta is required to use the delta indexes. As far as I know, it should work without problems in 2.1 now.


About Freelancing Gods

Freelancing Gods is written by Pat, who works as a web developer in Melbourne, Australia, specialising in Ruby on Rails.

In case you're wondering what the likely content here will be about (besides code), keep in mind that Pat is passionate about the internet, music, politics, comedy, bringing people together, and making a difference. And pancakes.

His ego isn't as bad as you may think. Honest.


All original content on this site is available through a Creative Commons by-nc-sa licence.