Thinking Sphinx Reborn
So, over the last month or so I’ve been working hard on rewriting Thinking Sphinx – and it’s now time to release those changes publicly. The site’s now got a brief quickstart page and a detailed usage page beyond the rdoc files, and there will be more added over the coming weeks.
A quick overview of what’s shiny and new:
Better index definition syntax
This part was reworked many times, finally ending up as something I’m pretty happy with:
define_index do
indexes [first_name, last_name], :as => :name, :sortable => true
indexes email, location
indexes [posts.content, posts.subject], :as => :posts
end
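If you’re wondering how bare names like first_name and posts.content can work inside that block, here is a toy sketch of the general technique in plain Ruby. It is not Thinking Sphinx’s actual code: ColumnRef, IndexBuilder and define_toy_index are hypothetical names made up for this illustration, and all it does is record what each indexes call was given.

```ruby
# Toy stand-in for a define_index-style DSL. NOT Thinking Sphinx's real code;
# every class and method name here is hypothetical.

# A reference to a column, possibly reached through associations.
class ColumnRef
  attr_reader :path

  def initialize(*path)
    @path = path
  end

  # Chained calls such as posts.content extend the path by one step.
  def method_missing(name, *args)
    return super if !args.empty? || name.to_s.start_with?("to_")
    ColumnRef.new(*path, name)
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end
end

# Collects the field definitions made inside the block.
class IndexBuilder
  attr_reader :fields

  def initialize
    @fields = []
  end

  # An array of columns becomes one combined field; separate arguments
  # become separate fields.
  def indexes(*columns, **options)
    columns.each do |column|
      refs = column.is_a?(Array) ? column : [column]
      @fields << {
        name: options[:as] || refs.first.path.last,
        columns: refs.map { |ref| ref.path.join(".") },
        sortable: options.fetch(:sortable, false)
      }
    end
  end

  # Unknown bare names inside the block are treated as column names.
  def method_missing(name, *args)
    return super if !args.empty? || name.to_s.start_with?("to_")
    ColumnRef.new(name)
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end
end

# Evaluates the block against a fresh builder and returns the fields.
def define_toy_index(&block)
  builder = IndexBuilder.new
  builder.instance_eval(&block)
  builder.fields
end

fields = define_toy_index do
  indexes [first_name, last_name], :as => :name, :sortable => true
  indexes email, location
  indexes [posts.content, posts.subject], :as => :posts
end
```

The block from above runs unchanged against the toy builder, which ends up with four fields: name, email, location and posts.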
Polymorphic association support in indexes
When you’re drilling down into your associations for relevant field data, it’s now safe to use polymorphic associations – Thinking Sphinx is pretty smart about figuring out which models to look at. Make sure you put table indexes on your _type columns though.
MVA Support
Multi-Value Attributes now work nice and cleanly – so you can tie an array of integers to any record.
Multi-Model Searching
Just like before, you can search for records of a specific model. This time around though, you can also search across all your models – and the results still use will_paginate if it’s installed.
ThinkingSphinx::Search.search "help"
Better Filter Support
It was kinda in there to start with, but now it’s much smarter – and it all goes into the conditions hash, just like a find call:
User.search :conditions => {:role_id => 5}
Article.search :conditions => {:author_ids => [12, 24, 48]}
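Sphinx filters an attribute against a set of integer values, which is why a single value and an array can share the same conditions syntax. A minimal sketch of that normalisation in plain Ruby follows. The conditions_to_filters helper is hypothetical and written for this post only; it is not how the plugin is implemented.

```ruby
# Hypothetical helper, not part of Thinking Sphinx: turn a conditions hash
# into [attribute, values] pairs, the shape a Sphinx set filter works with.
def conditions_to_filters(conditions)
  conditions.map do |attribute, value|
    # A lone value is wrapped so every filter carries an array.
    values = value.is_a?(Array) ? value : [value]
    [attribute.to_s, values.map { |v| Integer(v) }]
  end
end

role_filter   = conditions_to_filters({ :role_id => 5 })
author_filter = conditions_to_filters({ :author_ids => [12, 24, 48] })
```

Here role_filter holds a single pair for role_id with the one-element array [5], and author_filter holds a pair for author_ids with all three ids.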
Sorting by Fields
As you may have noticed in the first code block of this post, you can mark fields as :sortable – this uses Sphinx’s string attributes to create a matching attribute that acts as a sort index for the field. When specifying the search options though, you can just use the field’s name – Thinking Sphinx knows what you’re talking about.
User.search "John", :order => :name
User.search "Smith", :order => "name DESC"
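To make the field-to-attribute mapping concrete, here is a small plain-Ruby sketch of how an :order value could be rewritten to point at the companion attribute. Both the resolve_order helper and the "_sort" suffix are assumptions for illustration, not something this post specifies, and ascending is assumed as the default direction.

```ruby
# Hypothetical helper: rewrite an :order option so that sortable field names
# point at their companion sort attributes. The "_sort" suffix is an assumed
# naming convention used only for this illustration.
def resolve_order(order, sortable_fields)
  order.to_s.split(",").map do |clause|
    column, direction = clause.strip.split(/\s+/, 2)
    column = "#{column}_sort" if sortable_fields.include?(column.to_sym)
    # Assumes ascending order when no direction is given.
    [column, (direction || "ASC").upcase].join(" ")
  end.join(", ")
end

by_symbol  = resolve_order(:name, [:name])
by_string  = resolve_order("name DESC", [:name])
untouched  = resolve_order("created_at DESC", [:name])
```

With those assumptions, the symbol form and the string form both resolve to the name_sort attribute, while a column that isn’t a sortable field passes through as written.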
Even More
I’m so eager to share this new release that there’s probably a few things that need a bit more documentation – that will appear both on the Thinking Sphinx site and here on the blog. I’m planning on writing some articles that provide a solid overview to Sphinx in general – which will hopefully be some help no matter what plugin you use – and then dive into some regular ‘recipes’ of Thinking Sphinx usage, and some detailed posts of the cool new features as well.
Also in the pipeline is Merb support – just for ActiveRecord initially, but I’d love to get it talking to DataMapper as well.
Update: Jonathan Conway’s got a branch working in Merb and Rails – needless to say, I’ll be updating trunk with his patch as soon as possible.
Comments
22 responses to this article
Pat, congratulations on releasing a new version of Thinking Sphinx. I’m looking forward to using it in my current project.
Could you give the option for DISTINCT in GROUP_CONCAT?
For example:
CAST(GROUP_CONCAT(DISTINCT `vendor_details`.`group_id` SEPARATOR ' ') AS CHAR)
Thuva: Thank you :)
Chris: Great that you’re enjoying the plugin. Why do you need the distinct though? I think I know a pretty easy way for me to add it in, I’m just not (yet) sure why it’s needed.
Does Thinking Sphinx support eager loading of associated models? Similar to the :include option in ActiveRecord find calls?
Rob: Yes, if you specify :include in your search calls, it will be respected.
The one caveat – since Sphinx returns an array of ids, each model has to be loaded in a separate call to the database, so the full speed benefits of :include aren’t quite reached.
ie: Think of it like
User.find(:first, :include => [:posts])
instead of
User.find(:all, :include => [:posts])
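The difference between the two loading styles can be sketched with an in-memory stand-in for the database. Nothing below is Thinking Sphinx code: FakeTable, load_one_by_one and load_in_batch are hypothetical names, and the batched version is only one way the ids Sphinx returns could be turned into records with a single query while keeping Sphinx’s ordering.

```ruby
# In-memory stand-in for a database table that counts how often it is hit.
# All names here are hypothetical and exist only for this illustration.
class FakeTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # { id => record }
    @query_count = 0
  end

  # One query per record.
  def find_one(id)
    @query_count += 1
    @rows.fetch(id)
  end

  # One query for many records, returned in table order, not the order asked for.
  def find_many(ids)
    @query_count += 1
    @rows.select { |id, _record| ids.include?(id) }.values
  end
end

# One lookup per id returned by the search.
def load_one_by_one(table, ids)
  ids.map { |id| table.find_one(id) }
end

# A single lookup, then records are put back into the search's order.
def load_in_batch(table, ids)
  by_id = table.find_many(ids).to_h { |record| [record[:id], record] }
  ids.map { |id| by_id[id] }.compact
end

rows = {
  1 => { :id => 1, :name => "Ann" },
  2 => { :id => 2, :name => "Bob" },
  3 => { :id => 3, :name => "Cat" }
}
sphinx_ids = [3, 1] # ids in relevance order, as a search might return them

slow_table = FakeTable.new(rows)
slow_result = load_one_by_one(slow_table, sphinx_ids)

fast_table = FakeTable.new(rows)
fast_result = load_in_batch(fast_table, sphinx_ids)
```

Both approaches return the same records in the same order; the first needs one query per id, the second needs one query in total.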
I’m loving this plugin! Most excellent work!
I have one “query” related to searching a selected group of indexes.
It would be great if you could supply a list of class names (like UltraSphinx does).
For example:
ThinkingSphinx::Search.search('query', { :class_names => ['Article', 'Product', 'Customer'] })
Looking at ThinkingSphinx::Search it seems to be capable of handling one index or all indexes, but not a selected group of indexes.
Is this a feature you have planned or thought about? Just curious… ;o)
Oliver: It’s not something I’d ever thought of – but easy enough to add. I’ll see if I can get it working over the next few days.
Thanks for the feedback :)
I’ve been reading about Thinking Sphinx a little and so far it looks promising (especially as a Ferret replacement). I’d suggest you write a comprehensive tutorial on Thinking Sphinx usage in Rails. Similar to the one Rails Envy did for Ferret (with highlighting, field boosting, pagination, etc).
Hi Paul – writing some detailed tutorials is definitely on my list of things to do. I’m hoping to start with a solid introduction to Sphinx first (by this weekend would be nice), and then I’ll move on to how to use Thinking Sphinx.
This looks very interesting. I have narrowed down my options to Thinking Sphinx and Ultrasphinx. Does Ultrasphinx still have more features with this latest release? If so, what does it have that Thinking Sphinx doesn’t? In particular, faceting is very important for my application.
Hi Brian
At this stage, if you need faceting, go with Ultrasphinx. I know Patrick Lenz is working on a patch for faceting with Thinking Sphinx (check out his git tree on GitHub)... I don’t have the time to focus on it myself at the moment, so until he’s done, it looks like Ultrasphinx is what suits your needs best.
Beyond that though, I think the two plugins are pretty close in regards to features.
My needs for faceting are still a few months out, so something in the works might be ok. There must be some fundamental difference between Ultrasphinx and Thinking Sphinx, otherwise there would be no need for multiple plugins. What are the two approaches, and what are the benefits/problems with each approach? I’d love to have the time to simply try both of them out and learn for myself, but unfortunately I have too many other things on my plate right now. A brief explanation would be much appreciated. Thanks.
I feel there are two main differences – the first is how you define indexes. I think my syntax is clean and obvious, and also allows you to get data through associations very easily.
The second is approach – I aim to have everything managed through Ruby and a bit of YAML. That is, you don’t need to edit the Sphinx configuration file yourself – the plugin sets it all up for you. I also aim for a very convention-over-configuration setup – if you don’t want to, you don’t need to worry about file paths – it puts it all in your application’s directory, and handles multiple environments fine.
Caveat with all this: obvious bias, plus I haven’t spent any time using Ultrasphinx for a good 8 months or so.
Hi. Just tried your Sphinx plugin (third plugin this last week) and stumbled upon some problems.
1. Your plugin seems to be MySQL only? Found a “mysql” in plugins/thinking_sphinx/lib/thinking_sphinx/configuration.rb and replaced it with pgsql to make it work with my database.
2. ... however, the SQL string that gets generated won’t work. Postgres doesn’t like your “´”, and after deleting those it gives me the error that tbl_archive.collect doesn’t exist. Which is true. Why does it try to get a column that doesn’t exist? What is “collect”?
Would love to get some answers :)
Does Eager Loading work with Thinking Sphinx?
Cheri – to some extent it does – see the comments above between myself and Rob Olsen. I do have plans to support it fully though – hopefully they’ll be implemented in the next week or two.
Great plugin! Do all the indexing features of Thinking Sphinx negate the need to add indices to my tables in sql?
Hi Cheri – no, you still need indexes in your database tables. Sphinx uses MySQL to extract the data, and indexes on your tables make those queries faster.
Also – there are still plenty of times when you’ll need to use normal find calls. Sphinx doesn’t replace that.
Thanks, Pat. So if I have a simple request, such as selecting an entry based on a primary id, it’s better to use find_by_sql and not sphinx, correct?
