- Basic Searching
- Field Conditions
- Attribute Filters
- Application-Wide Search
- Match Modes
- Ranking Modes
- Field Weights
- Search Results Information
- Dynamic Attributes
- Searching for Object Ids
- Returning Raw Sphinx Results
- Search Counts
- Avoiding Nil Results
- Empty multi-value attributes
- Automatic Wildcards
- Batched Searches
- Advanced Options
Sphinx does have some reserved characters (including the @ character), so you may need to escape your query terms.
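As a rough illustration of what escaping involves, the sketch below backslash-escapes characters that Sphinx’s extended query syntax treats as operators. The escape_sphinx_query helper and its character list are assumptions made for this example, not Thinking Sphinx’s own implementation; the Sphinx documentation has the definitive list of reserved characters.

```ruby
# Hypothetical helper: prefix each reserved character with a backslash.
# The character list below is an assumption for illustration only.
def escape_sphinx_query(query)
  reserved = /[()|\-!@~"\/^$\\<>&=?]/
  query.gsub(reserved) { |character| "\\#{character}" }
end

# The @ would otherwise be read as a field operator.
puts escape_sphinx_query('chef@example.com')
```

The block form of gsub is used so the backslash in the replacement is not itself interpreted as a back-reference.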
Please note that Sphinx paginates search results, and the default page size is 20. You can find more information further down in the pagination section.
To focus a query on a specific field, you can use the :conditions option - much like in ActiveRecord (back before Rails 3, anyway):
You can combine both field-specific queries and generic queries too:
Please keep in mind that Sphinx does not support SQL comparison operators - it has its own query language. The :conditions option must be a hash, with each key a field and each value a string.
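A minimal sketch of a field-focused search combined with a generic term. The Article model and its subject and content fields are hypothetical; the guard only exists so the snippet can be loaded outside a Rails application.

```ruby
# Each key is an indexed field, each value a string of search terms.
# (Article, subject and content are assumed names for this example.)
conditions = {:subject => 'pancakes', :content => 'maple syrup'}

if defined?(Article)
  # 'tasty' is matched against every field; each condition is matched
  # only against its own field.
  results = Article.search('tasty', :conditions => conditions)
end
```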
Filters on attributes can be defined using a similar syntax, but using the :with option.
Filters have the advantage over focusing on specific fields in that they accept arrays and ranges:
And of course, you can mix and match global terms, field-specific terms, and filters:
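The following sketch mixes a global term, a field-specific term, and attribute filters passed through :with. The Article model and the subject, author_id, and created_at names are assumptions for illustration.

```ruby
one_week = 7 * 24 * 60 * 60 # seconds

# Filters accept single values, arrays (any of these values) and ranges.
filters = {
  :author_id  => [12, 34, 56],
  :created_at => (Time.now - one_week)..Time.now
}

if defined?(Article)
  Article.search(
    'pancakes',
    :conditions => {:subject => 'breakfast'},
    :with       => filters
  )
end
```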
If you wish to exclude specific attribute values, then you can specify them using :without:
For matching multiple values in a multi-value attribute, :with doesn’t quite do what you want. Give :with_all a try instead:
You can also perform combination AND and OR matches with :with_all using nested arrays:
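To make the nested-array behaviour concrete, here is a plain-Ruby rendering of the rule as I understand it: the outer elements are ANDed together, and the values inside a nested array are ORed. The matches_all? helper is hypothetical and only models the matching logic; Article and tag_ids are assumed names.

```ruby
# (1 OR 2) AND 3
tag_filter = [[1, 2], 3]

# Hypothetical helper: would a record with these MVA values pass the filter?
def matches_all?(record_values, filter)
  filter.all? do |entry|
    Array(entry).any? { |value| record_values.include?(value) }
  end
end

puts matches_all?([2, 3], tag_filter) # has 2 (covers "1 OR 2") and 3
puts matches_all?([1, 2], tag_filter) # lacks 3

if defined?(Article)
  Article.search('pancakes', :with_all => {:tag_ids => tag_filter})
end
```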
You can use all the same syntax to search across all indexed models in your application:
This search will return all objects that match, no matter what model they are from, ordered by relevance (unless you specify a custom order clause, of course). Don’t expect references to attributes and fields to work perfectly if they don’t exist in all the models.
If you want to limit global searches to a few specific models, you can do so with the :classes option:
If you want to limit searches to specific indices, you can do this with the :indices option (at both a global and model-specific level):
Standard indices will have the _core suffix, and there will also be an equivalent with the _delta suffix if deltas are enabled for the index in question.
Sphinx paginates search results by default. Indeed, there’s no way to turn it off (but you can request really big pages should you wish). The parameters for pagination in Thinking Sphinx are exactly the same as Will Paginate:
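A short sketch of the Will Paginate style parameters, assuming a hypothetical Article model. In a controller the page number would usually come from the request parameters rather than a literal.

```ruby
# :page is 1-based; :per_page overrides the default page size of 20.
page_options = {:page => 2, :per_page => 42}

if defined?(Article)
  articles = Article.search('pancakes', page_options)
end
```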
The output of search results can be used with Will Paginate’s view helper as well, just to keep things nice and easy.
Pagination can also be used in combination with Kaminari.
Thinking Sphinx v3 and newer use Sphinx’s SphinxQL for querying, and that always uses the extended match mode, which is covered in detail in the Sphinx documentation.
Sphinx also has a few different ranking modes (again, the Sphinx documentation is the best source of information on these). They can be set using the :ranker option.
Ranking modes include the following (though the definitive list is in the Sphinx documentation):
:proximity_bm25
The default ranking mode, which combines both phrase proximity and BM25 ranking (see below).
:bm25
A statistical ranking mode, similar to most other full-text search engines.
:none
No ranking - every result has a weight of 1.
:wordcount (since 0.9.9rc1)
Ranks results purely on the number of times the keywords are found in a document. Field weights are taken into account.
:proximity (since 0.9.9rc1)
Ranks documents by raw proximity value.
:matchany (since 0.9.9rc1)
Returns rankings calculated in the same way as a match mode of :any.
:fieldmask (since 0.9.9rc2)
Returns rankings as a 32-bit mask with the N-th bit corresponding to the N-th field, numbering from 0. The bit will only be set when any of the keywords match the respective field. If you want to know which fields match your search for each document, this is the only way.
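The fieldmask value can be decoded with ordinary bit operations. The sketch below is plain Ruby; the matched_fields helper and the field list are hypothetical, and the list must be in the same order as the fields in your index for the bits to line up.

```ruby
# Hypothetical helper: return the names of the fields whose bit is set.
# Bit 0 corresponds to the first field, bit 1 to the second, and so on.
def matched_fields(mask, field_names)
  field_names.each_with_index
             .select { |_name, position| mask[position] == 1 }
             .map(&:first)
end

fields = ['subject', 'content', 'author_name'] # assumed index field order

# 0b101 has bits 0 and 2 set.
p matched_fields(0b101, fields)
```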
By default, Sphinx sorts by how relevant it believes the documents to be to the given search keywords. However, you can also sort by attributes (and fields flagged as sortable) or custom mathematical expressions.
Sorting expressions are much like SQL’s ORDER BY clause - an attribute followed by a direction:
If you supply an attribute as a symbol, it’s presumed you want them in ascending order:
If you want to use a custom expression to define your sorting order, you need to declare that as a dynamic attribute:
And as shown in the above example, Sphinx’s calculated ranking is available via the weight() function. If you want to refer to it directly when sorting, you need to give it an alias:
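The three sorting styles described above might look like the following. The Article model and created_at attribute are assumptions, and the alias w is an arbitrary name chosen for this sketch.

```ruby
# An attribute followed by a direction, as in SQL's ORDER BY.
order_by_string = {:order => 'created_at DESC'}

# A symbol is treated as ascending order.
order_by_symbol = {:order => :created_at}

# Sphinx's ranking must be aliased in :select before it can be sorted on.
order_by_weight = {:select => '*, weight() as w', :order => 'w DESC'}

if defined?(Article)
  Article.search('pancakes', order_by_string)
  Article.search('pancakes', order_by_symbol)
  Article.search('pancakes', order_by_weight)
end
```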
Sphinx has the ability to weight fields with differing levels of importance. You can set this using the :field_weights option in your searches:
You don’t need to specify all fields - any field without an explicit value keeps the default weighting of 1.
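For instance, a search that favours matches in some fields over others could be sketched like this. The Article model, the field names, and the weight values are all assumptions for illustration.

```ruby
# Fields left out of the hash (for example, content) stay at weight 1.
field_weights = {:subject => 10, :tags => 6}

if defined?(Article)
  Article.search('pancakes', :field_weights => field_weights)
end
```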
If you’d like the same custom weightings to apply to all searches, it’s best to set these through a default Sphinx scope. If you’re using a version prior to 3.0, you can specify these defaults in your index definition (see below), but given this is something related to searching rather than indexing, a default scope is a more appropriate option.
Search Results Information
If you’re building your own pagination output, then you can find out the statistics of your search using the following accessors:
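As a sketch, the accessors I would expect here are total_entries, total_pages, current_page, and per_page, matching Will Paginate’s naming. The plain-Ruby page_count helper below is hypothetical and only shows how the page total relates to the other two numbers; the real value may be lower when Sphinx caps the number of matches it returns.

```ruby
# Hypothetical helper: number of pages needed for a given result count.
def page_count(total_entries, per_page)
  (total_entries / per_page.to_f).ceil
end

puts page_count(45, 20)

if defined?(Article)
  articles = Article.search('pancakes', :per_page => 20)
  puts "Page #{articles.current_page} of #{articles.total_pages} " \
       "(#{articles.total_entries} matches, #{articles.per_page} per page)"
end
```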
It’s possible for Sphinx searches to have generated attributes as part of a request, which can then be used for filtering or grouping (or just returned for use by your own application). This is done with the :select option - which behaves very similarly to a SQL SELECT clause.
Unless you’re returning raw Sphinx results, you must include all standard attributes (the "*" at the start of the :select option) to ensure records can be translated to ActiveRecord instances.
Grouping / Clustering
Sphinx allows you to group search records that share a common attribute, which can be useful when you want to show aggregated collections. For example, if you have a set of posts and they are all part of a category and have a category_id, you could group your results by category id and show a set of all the categories matched by your search, as well as all the posts. You can read more about it in the official Sphinx documentation.
For grouping to work, you need to pass in the :group_by option.
Searching posts, for example:
By default, this will return your Post objects, but one per category_id. If you want to sort by how many posts each category contains, you can pass in :order_group_by:
Once you have the grouped results, you can enumerate by each result along with the group value, the number of objects that matched that group value, or both, using the following methods respectively:
Sphinx’s SphinxQL syntax only allows for grouping on a single attribute - but that attribute can be generated in the SELECT part of the query itself:
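A sketch of grouping, first on a stored attribute and then on one generated in the select clause. The Post model, its category_id and created_at attributes, and the year alias are assumptions; the enumerator name each_with_group_and_count is how I recall it from Thinking Sphinx v3, so treat it as unverified.

```ruby
# Group on an existing attribute.
by_category = {:group_by => :category_id}

# Group on an attribute generated in the SELECT part of the query.
by_year = {:select => '*, YEAR(created_at) AS year', :group_by => :year}

if defined?(Post)
  Post.search(by_category).each_with_group_and_count do |post, group, count|
    puts "Category #{group}: #{count} posts (for example, post #{post.id})"
  end

  Post.search(by_year)
end
```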
Searching for Object Ids
If you would like just the primary key values returned, instead of instances of ActiveRecord objects, you can use all the same search options in a call to search_for_ids.
Returning Raw Sphinx Results
If you’d rather get the raw Sphinx results back from a search call instead of ActiveRecord instances, use the RAW_ONLY middleware stack:
This is particularly useful when you want computed values from Sphinx without needing to instantiate model instances.
If you just want the number of matches, instead of the matched objects themselves, then you can use the search_count method (which accepts all the same arguments as a normal search call). If you’re searching globally, then use the ThinkingSphinx.count method.
Avoiding Nil Results
Thinking Sphinx tries its hardest to make sure Sphinx knows when records are deleted, but sometimes stale objects slip through the gaps. To get around this, Thinking Sphinx has the option of retrying searches.
To enable this, you can set :retry_stale to true, and Thinking Sphinx will make up to three tries at retrieving a full result set that has no nil values. If you want to change the number of tries, set :retry_stale to an integer.
And obviously, this can be quite an expensive call (as it instantiates objects each time), but it provides a better end result in some situations.
Empty multi-value attributes
When you have a multi-value attribute in an index and you want to find records where those attributes are empty, you need to add a dynamic attribute with the length of the MVA, and then filter by that.
In this example, the Article model can have many authors, and has an MVA for the author ids (either has author_ids, :type => :integer, :multi => true for real-time indices, or has authors.id, :as => :author_ids for SQL-backed indices).
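Under those assumptions, the search could be sketched as follows. The author_count alias is an arbitrary name chosen for this example, and LENGTH() on a multi-value attribute depends on your Sphinx version supporting it.

```ruby
# Generate the MVA length as a dynamic attribute, then filter on it.
empty_authors = {
  :select => '*, LENGTH(author_ids) AS author_count',
  :with   => {:author_count => 0}
}

if defined?(Article)
  Article.search(empty_authors)
end
```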
If you’d like your search keywords to be wildcards for every search, you can use the :star option, which automatically prepends and appends wildcard stars to each word.
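To show the effect, the plain-Ruby add_wildcards helper below is a hypothetical simplification of what happens to the query; the real behaviour also has to cope with operators and escaped characters. Article is an assumed model.

```ruby
# Hypothetical simplification: wrap every word in wildcard stars.
def add_wildcards(query)
  query.split.map { |word| "*#{word}*" }.join(' ')
end

puts add_wildcards('pancakes waffles')

if defined?(Article)
  Article.search('pancakes waffles', :star => true)
end
```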
If you want to manage auto-wildcarding in a more controlled fashion there’s the ThinkingSphinx::Query.wildcard method.
It is possible to collect multiple searches together to send to Sphinx in one go, via a ThinkingSphinx::BatchedSearch object.
Keep in mind that if you’re testing this in a Rails console, the inspection of a search results set populates the data immediately, which would make this fail. An easy way around this is to add ; "" at the end of lines that involve search calls. For example:
One limitation to note is that there is no way for batched searches to reference each other. The key advantage here is just to save on the roundtrip requests going to Sphinx and back.
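A sketch of a batched search, assuming the ThinkingSphinx::BatchedSearch class exposes a searches collection and a populate method as I recall them from v3; the Article model and the query terms are hypothetical.

```ruby
queries = ['pancakes', 'waffles']

if defined?(ThinkingSphinx) && defined?(Article)
  batch = ThinkingSphinx::BatchedSearch.new

  # Searches are lazy, so adding them here does not contact Sphinx yet.
  queries.each { |query| batch.searches << Article.search(query) }

  # One round trip sends every collected search.
  batch.populate

  batch.searches.each { |search| puts search.total_entries }
end
```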
If you construct a query that Sphinx cannot understand, or if the connection fails, an instance of ThinkingSphinx::SphinxError will be raised.
Some specific types of errors are given specific subclasses, such as ThinkingSphinx::ParseError. The message in any of these errors will give you more detail on what’s gone wrong.
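One way to handle these errors is to wrap searches in a small helper. The search_safely method below is a hypothetical wrapper, not part of Thinking Sphinx; it returns an empty array when a Sphinx error occurs and re-raises anything else.

```ruby
# Hypothetical wrapper: swallow Sphinx errors, re-raise everything else.
def search_safely
  yield
rescue StandardError => error
  sphinx_error = defined?(ThinkingSphinx::SphinxError) &&
                 error.is_a?(ThinkingSphinx::SphinxError)
  raise unless sphinx_error

  warn "Sphinx search failed: #{error.message}"
  []
end

if defined?(Article)
  # to_a forces the lazy search to run inside the wrapper.
  articles = search_safely { Article.search('pancakes').to_a }
end
```

Because searches are lazy, the block needs to force the query to run (here with to_a); otherwise the error would surface later, outside the rescue.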
Thinking Sphinx accepts the following advanced Sphinx arguments:
If you want to set additional arguments for the underlying SQL call when translating Sphinx results into ActiveRecord objects (such as :order), you can put these within the :sql option.
And finally - to avoid lazily loading search results and make sure Thinking Sphinx processes the search query immediately, use the :populate option.
This is particularly useful to ensure exceptions are raised where you expect them.