Freelancing Gods 2014

27 Dec 2007

Updates for Sphinx 0.9.8r985

Another quick Sphinx post – Riddle is updated to support Sphinx’s latest release (0.9.8r985), and Thinking Sphinx now has that new version of Riddle as well.

I’ve not tested any of this with the recently released Ruby 1.9 yet, though (but it’s on my list of things to do).

Also, thank you to Joost Hietbrink (again) and Jonathan Conway for their patches to Thinking Sphinx – very much appreciated.

02 Dec 2007

Sphinx-related Updates

Two Sphinx-related tidbits:

Riddle

Riddle now has a tag in SVN for the 0.9.8-r909 release of Sphinx – not that there were really any functional changes compared to r871, besides the two extra match modes (Full Scan and Extended 2 – the latter isn’t going to hang around for long anyway).

Thinking Sphinx

As well as supporting the above version of Sphinx, I’ve now added some brief documentation to Thinking Sphinx that discusses attributes, sorting and delta indexes. To summarise kinda-briefly:

Attributes

Attributes are defined in the define_index block as follows:

define_index do |index|
  index.has.created_at
  index.has.updated_at
  # Field definitions go here
end

They can only be from the indexed model (not associations), and in line with Sphinx’s limitations, must be either integers, floats or timestamps.
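To make that restriction concrete, here is a minimal sketch in plain Ruby that picks out which columns of a model could serve as attributes. The column hash, the constant and the method name are all hypothetical, and treating Rails' :datetime and :timestamp column types as Sphinx timestamps is my assumption rather than something taken from the plugin's code:

```ruby
# Column types assumed to map onto Sphinx's integer, float and timestamp
# attributes. This list is an illustration, not Thinking Sphinx's own.
SPHINX_ATTRIBUTE_TYPES = [:integer, :float, :datetime, :timestamp].freeze

# Given a hash of column name => column type, return the names of the
# columns that could be used as attributes.
def attribute_candidates(columns)
  columns.select { |_name, type| SPHINX_ATTRIBUTE_TYPES.include?(type) }.keys
end

# Hypothetical columns for an Invoice model.
invoice_columns = {
  :id         => :integer,
  :title      => :string,
  :total      => :float,
  :created_at => :datetime,
  :updated_at => :datetime
}

candidates = attribute_candidates(invoice_columns)
# candidates holds :id, :total, :created_at and :updated_at; the string
# column :title is left out and would have to be a field instead.
```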

Sorting

Ties in closely with attributes – as that’s all Sphinx will let you order by. Use it in the same way as you would in a find call:

Invoice.search :conditions => "expensive", :order => "created_at DESC"

Same approach works for the :include parameter (although this has nothing to do with Sphinx itself).
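Since only attributes can be sorted on, an :order value is only useful if it names one. The following is a rough sketch of that check in plain Ruby; parse_order is a hypothetical helper of my own, not part of Thinking Sphinx, and it only handles a single "column DIRECTION" clause:

```ruby
# Split an order clause such as "created_at DESC" into an attribute name
# and a direction, and refuse anything that is not a known attribute.
# Hypothetical helper for illustration only.
def parse_order(order, attributes)
  name, direction = order.split
  direction = (direction || "ASC").upcase

  unless attributes.include?(name.to_sym)
    raise ArgumentError, "#{name} is not an attribute, so Sphinx cannot sort by it"
  end
  unless %w[ASC DESC].include?(direction)
    raise ArgumentError, "unknown sort direction: #{direction}"
  end

  [name.to_sym, direction]
end

attributes = [:created_at, :updated_at]
newest_first = parse_order("created_at DESC", attributes)
# newest_first is the pair :created_at and "DESC"
```

Passing "title DESC" with the same attribute list raises an ArgumentError, which mirrors the rule that a plain field cannot be used for ordering.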

Delta Indexes

Delta indexes track changes to model records between proper indexes (ie: from the rake task thinking_sphinx:index) – all they require is a boolean field in the model’s table called delta, and for delta indexing to be enabled as follows:

define_index do |index|
  index.delta = true
  # Fields and attributes go here
end

The one catch – at this point, delta indexes are one step off current, as they get indexed before the current transaction to the database is committed. This will get better soon, thanks to some help from Joost Hietbrink and his colleagues at YelloYello – once I find some free time, I’ll get that working much more neatly.
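For anyone unfamiliar with the idea, here is a small in-memory simulation of how a core index and a delta index work together. It is only a sketch of the concept: Record and DeltaSketch are hypothetical names, nothing here talks to Sphinx, and it does not reproduce the transaction timing problem mentioned above.

```ruby
# Stand-in for a database row with a boolean delta column.
Record = Struct.new(:id, :text, :delta)

class DeltaSketch
  def initialize(records)
    @records = records
    @core    = {}   # built by the full index run
    @delta   = {}   # small index holding only the flagged records
  end

  # Full reindex: everything goes into the core index, the delta flags
  # are cleared and the delta index is emptied.
  def full_index
    @core = @records.map { |record| [record.id, record.text] }.to_h
    @records.each { |record| record.delta = false }
    @delta = {}
  end

  # Called when a record changes: flag it and rebuild only the delta index.
  def record_changed(record)
    record.delta = true
    flagged = @records.select { |candidate| candidate.delta }
    @delta = flagged.map { |candidate| [candidate.id, candidate.text] }.to_h
  end

  # Search both indexes; delta entries override stale core entries.
  def search(term)
    @core.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end
end

records = [Record.new(1, "cheap invoice", false),
           Record.new(2, "expensive invoice", false)]
index = DeltaSketch.new(records)
index.full_index

records[0].text = "expensive invoice, revised"
stale = index.search("expensive")   # only record 2, the core index is out of date
index.record_changed(records[0])
fresh = index.search("expensive")   # records 1 and 2, thanks to the delta index
```

The point of the design is that rebuilding the delta index only touches the flagged rows, so it stays cheap until the next full run folds everything back into the core index.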

14 Nov 2007

Sphinx's Riddle

Edit: I’ve changed the Subversion reference to Github, and it’s worth noting that Riddle works with Sphinx 0.9.8, 0.9.9 and 1.10-beta at the time of writing (January 2011). Original post continues below:

Built out of the work I’ve done for Thinking Sphinx (which has just got basic support for delta indexes, attributes and sorting – although the documentation doesn’t reflect that), I’ve extracted a new Ruby client that communicates with Sphinx, which I’ve named Riddle.

I’m not going to delve into the code here – because I’m not expecting it to be that useful to many people (and I just wrote examples in the documentation – go read that instead!) – but I’m very happy with how it’s ended up, and it’s got some level of specs to give it a thorough test. It’s also compatible with the most recent release of Sphinx (0.9.8 r871). Should you wish to poke around with it, just clone it from Github:

git clone \
  git://github.com/freelancing-god/riddle.git

It’s also being used in Evan Weaver’s UltraSphinx plugin, which I’m pretty pleased about.

29 Oct 2007

Sphinx Quick Fix

Here’s one small filesystem tweak that’s been handy as I’ve been slowly rebuilding my development environment on Leopard over the last couple of days. It’s to get Sphinx working – there were no problems with compilation or installation, but when I ran searchd or indexer, it complained about not finding the MySQL libraries:

dyld: Library not loaded: /usr/local/mysql/lib/mysql/libmysqlclient.15.dylib
  Referenced from: /usr/local/bin/indexer
  Reason: image not found

Now, the expected file path is incorrect – it shouldn’t have the second ‘mysql’. My attempts to change that with various configuration flags didn’t work, so I cheated, and added the folder as a symbolic link:

sudo ln -s /usr/local/mysql/lib /usr/local/mysql/lib/mysql

Suggestions of a cleaner solution always welcome.
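To show why the link does the trick, here is a small Ruby sketch that recreates the layout inside a temporary directory instead of touching /usr/local. The directory names mirror the paths above, but the empty stand-in library file and the symlink_demo method are purely illustrative:

```ruby
require "tmpdir"
require "fileutils"

# Recreate mysql/lib in a temporary directory, then check whether the
# path with the extra "mysql" segment resolves before and after linking.
def symlink_demo
  Dir.mktmpdir do |root|
    lib = File.join(root, "mysql", "lib")
    FileUtils.mkdir_p(lib)
    FileUtils.touch(File.join(lib, "libmysqlclient.15.dylib"))

    # The path the binaries ask for, with the second "mysql" in it.
    expected = File.join(lib, "mysql", "libmysqlclient.15.dylib")

    before = File.exist?(expected)
    File.symlink(lib, File.join(lib, "mysql"))   # same idea as ln -s above
    after = File.exist?(expected)

    [before, after]
  end
end

before, after = symlink_demo
# before is false and after is true: lib/mysql now points back at lib,
# so the library is reachable under both paths.
```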

03 Oct 2007

A Thoughtful Sphinx

In one of the projects I’ve been working on lately, I’ve needed to implement a decent search – and so I looked at both Ferret and Sphinx. I ended up choosing the latter, although I’m not sure why – perhaps just to be different (most people I spoke to are using Ferret), or perhaps because the setup seemed easier.

The next step was to pick a Sphinx plugin to work with. Ultrasphinx seemed to have a good set of features (particularly pagination), and supported fields from associations within indexes – something critical for what we were doing.

Unfortunately, grabbing fields from associations wasn’t that easy – and the SQL generated for the Sphinx configuration file was overly complex. I could (and did) change the config file manually, but that defeats half the purpose of the plugin.

So, since I had some spare time, I wrote my own plugin. Much like Rails, it favours convention over configuration – perhaps a little too much so at this point, but I do plan to make it more flexible at some point. Installation is the same as any other plugin:

script/plugin install \
  http://rails-oceania.googlecode.com/svn/patallan/thinking_sphinx

An example of defining indexes (within a model class):

define_index do |index|
  index.includes.email
  index.includes(:first_name, :last_name).as.name
  index.includes.tags.key.as.tags
  index.includes.articles.content.as.article_content
end
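If the chained syntax looks like magic, here is one way such a DSL could be put together in plain Ruby, using method_missing to record each call. This is a toy of my own for illustration and almost certainly differs from the plugin's internals; FieldDefinition, FieldChain, IndexSketch and define_index_sketch are all hypothetical names.

```ruby
# What one index.includes... line boils down to.
FieldDefinition = Struct.new(:columns, :path, :alias_name)

# Records whatever is chained after includes. Inheriting from BasicObject
# keeps names such as name or id free to be treated as columns.
class FieldChain < BasicObject
  def initialize(definition)
    @definition   = definition
    @expect_alias = false
  end

  # The call that follows as is taken as the alias.
  def as
    @expect_alias = true
    self
  end

  def method_missing(name, *_args)
    if @expect_alias
      @definition.alias_name = name
      @expect_alias = false
    else
      @definition.path << name
    end
    self
  end
end

class IndexSketch
  attr_reader :fields

  def initialize
    @fields = []
  end

  def includes(*columns)
    definition = FieldDefinition.new(columns, [], nil)
    @fields << definition
    FieldChain.new(definition)
  end
end

def define_index_sketch
  index = IndexSketch.new
  yield index
  index.fields
end

fields = define_index_sketch do |index|
  index.includes.email
  index.includes(:first_name, :last_name).as.name
  index.includes.tags.key.as.tags
  index.includes.articles.content.as.article_content
end
# fields[2].path is [:tags, :key] and fields[2].alias_name is :tags
```

Each recorded path is enough to work out which associations to join and which column to pull, which is roughly the information an index definition needs.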

To index the data, just use the thinking_sphinx:index rake task (aliased to ts:in) – which will also generate the configuration file on the fly. My goal is to make changing the configuration file manually unnecessary – making the index task build the configuration file helps enforce this.

And to search:

# Searching particular fields
User.search(:conditions => {:name => "Pat"})
# Or searching all fields
User.search("Pat")
# Pagination is built in
User.search("Pat", :page => (params[:page] || 1))

Paginated results can also be used by the will_paginate helper from the plugin of the same name. Current documentation can be found on this site.
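Under the hood, pagination comes down to turning a page number into an offset and a limit for the search request. A minimal sketch of that arithmetic, where page_window is a hypothetical helper and the page size of 20 is just an assumed default:

```ruby
# Assumed page size for the sketch; not read from any plugin setting.
DEFAULT_PER_PAGE = 20

# Convert a page number (possibly nil or a string from params) into the
# offset and limit for a search request. Pages below 1 are treated as 1.
def page_window(page, per_page = DEFAULT_PER_PAGE)
  page = [page.to_i, 1].max
  { :offset => (page - 1) * per_page, :limit => per_page }
end

first_page = page_window(nil)   # offset 0, limit 20
third_page = page_window("3")   # offset 40, limit 20
```

Calling to_i on the input means a missing or non-numeric page parameter falls back to the first page instead of raising.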

I managed to use ActiveRecord’s join and associations code, which kept my plugin reasonably lean. For interactions with Sphinx’s searchd daemon, I did look at Dmytro Shteflyuk’s Ruby Sphinx Client API, but the non-ruby-like syntax irritated me, so again, I coded my own – heavily influenced by the original though (ie: he did all the hard work, not me).

There’s no support yet for updating the index pseudo-incrementally (incremental updating is a limitation of Sphinx). If I don’t feel like the incremental updating works well enough, then I may switch to Ferret – which might lead to a Thinking Ferret plugin, perhaps. We’ll just have to wait and see.

Nov 14th 2007 – Update: I’ve just released the internal Sphinx client as its own library – Riddle.


About Freelancing Gods

Freelancing Gods is written by Pat Allan, who works as a web developer in Melbourne, Australia, specialising in Ruby on Rails.

In case you're wondering what the likely content here will be about (besides code), keep in mind that Pat is passionate about the internet, music, politics, comedy, bringing people together, and making a difference. And pancakes.

His ego isn't as bad as you may think. Honest.


All original content on this site is available through a Creative Commons by-nc-sa licence.