Freelancing Gods 2014

26 Apr 2008

Sphinx: A Primer

On Thursday night I presented to the Melbourne Ruby Group about Sphinx – first with a non-Ruby perspective, and then using Ruby, and more specifically Rails. I’ll be presenting again at the Sydney group in a couple of weeks, but I am also adapting the talk to a few blog posts – to allow a bit more detail in a few doses.

First up: Sphinx itself. Why should you read this? Because understanding Sphinx will help you make smarter use of whichever library (Ruby or otherwise) you pick. It might also teach you some things you had no idea about (ie: this is the article I should have read when I started using Sphinx).

What is Sphinx?

Sphinx is a search engine. You feed it documents, each with a unique identifier and a bunch of text, and then you can send it search terms, and it will tell you the most relevant documents that match them. If you’re familiar with Lucene, Ferret or Solr, it’s pretty similar to those systems. You get the daemon running, your data indexed, and then using a client of some sort, start searching.

When indexing your data, Sphinx talks directly to your data source itself – which must be one of MySQL, PostgreSQL, or XML files – which means it can be very fast to index (if your SQL statements aren’t too complex, anyway).

Sphinx Structure

A Sphinx daemon (the process known as searchd) can talk to a collection of indexes, and each index can have a collection of sources. Sphinx can be directed to search a specific index, or all of them, but you can’t limit the search to a specific source explicitly.

Each source tracks a set of documents, and each document is made up of fields and attributes. While in other areas of software you could use those two terms interchangeably, they have distinct meanings in Sphinx (and thus require their own sections in this post).

Fields

Fields are the content for your search queries – so if you want words tied to a specific document, you’d better make sure they’re in a field in your source. They are only string data – you could have numbers and dates and such in your fields, but Sphinx will only treat them as strings, nothing else.

Attributes

Attributes are used for sorting, filtering and grouping your search results. Their values do not get paid any attention by Sphinx for search terms, though, and they’re limited to the following data types: integers, floats, datetimes (as Unix timestamps – and thus integers anyway), booleans, and strings. Take note that string attributes are converted to ordinal integers, which is especially useful for sorting, but not much else.
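
To make the ordinal idea a bit more concrete, here’s a rough sketch in plain Ruby. This is not how Sphinx does it internally (I’m assuming a simple rank-in-sorted-order scheme, and string_ordinals is a made-up helper), but it shows why the resulting integers are handy for sorting and not much else:

```ruby
# Toy illustration only: map each distinct string to its rank in sorted order.
# The ranking scheme is an assumption, not Sphinx's actual implementation.
def string_ordinals(values)
  ranks = {}
  values.uniq.sort.each_with_index { |value, index| ranks[value] = index + 1 }
  values.map { |value| ranks[value] }
end

names    = ["smith", "allan", "jones", "allan"]
ordinals = string_ordinals(names)

# Sorting by the integers gives the same order as sorting by the strings,
# but the integers themselves say nothing about the original text.
sorted_by_ordinal = names.zip(ordinals).sort_by { |_, ordinal| ordinal }.map(&:first)
```

Once you only have the integers, you can’t search or filter on the original text, which is why a string attribute doesn’t replace a field.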

Multi-Value Attributes

There is also support in Sphinx to handle arrays of attributes for a single document – which go by the name of multi-value attributes. Currently (Sphinx version 0.9.8rc2) only integers are supported, so this isn’t quite as flexible as normal attributes, but it’s worth keeping in mind.

Filters

Filters are useful with attributes to limit your searches to certain sets of results – for example, limiting a forum post search to entries by a specific user id. Sphinx’s filters accept arrays or ranges – so if filtering by a single value, just put that in an array. The range filters are particularly useful for getting results from a certain time span.
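
If it helps to see fields, attributes and filters side by side, here’s a small in-memory sketch in plain Ruby. Everything in it (the Document struct, the search method, the sample data) is hypothetical and has nothing to do with Sphinx’s real API; it just mimics the behaviour described above, where search terms are matched against fields and filters are applied to attributes as either arrays or ranges:

```ruby
# Hypothetical stand-in for an index: fields are searched,
# attributes are only ever filtered on.
Document = Struct.new(:id, :fields, :attributes)

# A filter is either a range or an array of accepted values.
def attribute_matches?(value, filter)
  case filter
  when Range then filter.cover?(value)
  else Array(filter).include?(value)
  end
end

def search(documents, term, filters = {})
  documents.select do |doc|
    in_fields = doc.fields.values.any? { |text| text.downcase.include?(term.downcase) }
    in_fields && filters.all? { |name, filter| attribute_matches?(doc.attributes[name], filter) }
  end
end

posts = [
  Document.new(1, { subject: "Sphinx tips" },  { user_id: 5, created_at: 1_209_000_000 }),
  Document.new(2, { subject: "Sphinx setup" }, { user_id: 7, created_at: 1_209_100_000 }),
  Document.new(3, { subject: "Merb notes" },   { user_id: 5, created_at: 1_209_200_000 })
]

# Single value, wrapped in an array.
by_user  = search(posts, "sphinx", user_id: [5]).map(&:id)
# Range filter over Unix timestamps.
by_range = search(posts, "sphinx", created_at: (1_209_050_000..1_209_300_000)).map(&:id)
# Attribute values are never matched by search terms.
by_attribute_text = search(posts, "5").map(&:id)
```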

Relevance

Relevancy is the default sorting order for Sphinx. I’ve no idea exactly how it is calculated, but there are a couple of things you can do easily enough in your queries to influence it. The first is index-level weighting, where you give specific indexes higher rankings than others. The other, similar in nature, but at a lower level, is field weightings. Generally these are set before each query, but it will depend on the library you use.

Keeping Your Indexes Updated

One thing that sets Sphinx apart from Ferret and other search engines is that there is no way to update fields for a specific document in your indexes. The main approach around this is having delta indexes – a small index with all the recent changes (which will be super-fast to index), so Sphinx will include that and the main index for its searches. Of the Rails plugins, both Thinking Sphinx and Ultrasphinx have support for this – I’ve no idea for other languages, mind you.
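
As a very rough picture of the delta idea, here’s a toy version in plain Ruby, with hashes standing in for indexes. The data and the search_indexes method are made up, and I’m assuming that when a document appears in both indexes the delta copy is the one you want; how a real plugin handles stale entries in the main index will differ:

```ruby
# Toy model: each "index" maps a document id to its indexed text.
main_index  = { 1 => "old title", 2 => "sphinx primer" }
delta_index = { 1 => "new sphinx title", 3 => "sphinx and rails" }

def search_indexes(term, main, delta)
  # Assumption: where a document is in both, the delta copy is newer and wins.
  combined = main.merge(delta)
  combined.select { |_, text| text.include?(term) }.keys.sort
end

fresh_results = search_indexes("sphinx", main_index, delta_index)
stale_results = search_indexes("old", main_index, delta_index)
```

The point is that only the small delta hash needs rebuilding when something changes, while searches still see both.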

What’s next?

Next is when we’ll dive into some actual code – we’ll go through some of the common tasks for setting up Sphinx with Rails using Thinking Sphinx.

10 Apr 2008

Thinking Sphinx Reborn

So, over the last month or so I’ve been working hard on rewriting Thinking Sphinx – and it’s now time to release those changes publicly. The site’s now got a brief quickstart page and a detailed usage page beyond the rdoc files, and there will be more added over the coming weeks.

A quick overview of what’s shiny and new:

Better index definition syntax

This part was reworked many times, finally arriving at something I’m pretty happy with:

define_index do
  indexes [first_name, last_name], :as => :name, :sortable => true
  indexes email, location
  indexes [posts.content, posts.subject], :as => :posts
end

Polymorphic association support in indexes

When you’re drilling down into your associations for relevant field data, it’s now safe to use polymorphic associations – Thinking Sphinx is pretty smart about figuring what models to look at. Make sure you put table indexes on your _type columns though.

MVA Support

Multi-Value Attributes now work nice and cleanly – so you can tie an array of integers to any record.

Multi-Model Searching

Just like before, you can search for records of a specific model. This time around though, you can also search across all your models – and the results still use will_paginate if it’s installed.

ThinkingSphinx::Search.search "help"

Better Filter Support

It was kinda in there to start with, but now it’s much smarter – and it all goes into the conditions hash, just like a find call:

User.search :conditions => {:role_id => 5}
Article.search :conditions => {:author_ids => [12, 24, 48]}

Sorting by Fields

As you may have noticed in the first code block of this post, you can mark fields as :sortable – what this does is it uses Sphinx’s string attributes, and creates a matching attribute that acts as a sort-index to the field. When specifying the search options though, you can just use the field’s name – Thinking Sphinx knows what you’re talking about.

User.search "John", :order => :name
User.search "Smith", :order => "name DESC"

Even More

I’m so eager to share this new release that there’s probably a few things that need a bit more documentation – that will appear both on the Thinking Sphinx site and here on the blog. I’m planning on writing some articles that provide a solid overview to Sphinx in general – which will hopefully be some help no matter what plugin you use – and then dive into some regular ‘recipes’ of Thinking Sphinx usage, and some detailed posts of the cool new features as well.

Also in the pipeline is Merb support – just for ActiveRecord initially, but I’d love to get it talking to DataMapper as well.

Update: Jonathan Conway’s got a branch working in Merb and Rails – needless to say, I’ll be updating trunk with his patch as soon as possible.

06 Apr 2008

Link: New in Rails: a request profiler for profiling your app | redemption in a blog

Old news, but I need to remember it's in there

25 Mar 2008

Link: Formtastic Plugin Documentation

"makes it far easier to create beautiful, semantically rich, syntactically awesome, readily stylable and wonderfully accessible HTML forms in your Rails applications."

16 Mar 2008

RailsCamp #3

If you’re a Ruby developer in or near Australia, I highly recommend attending RailsCamp number 3, which has just opened for registration. The first two were simply amazing, so I’m just a little annoyed that I can’t make it to this one (as I’ll be traveling overseas at the time). I’ve no doubt that this one will be just as fantastic – expect an extended weekend of hacking and talking with a bunch of smart, entertaining and passionate developers, and plenty of drinks and games thrown in for good measure.

You don’t need to be a Rails or Ruby genius to attend – just a desire to discuss, learn, teach and (most importantly) have fun.

Go register now.

14 Mar 2008

Sphinx 0.9.8-rc1 Updates

Another small sphinx-related post.

In line with the first release candidate of Sphinx 0.9.8 last week, I’ve updated both my API, Riddle, and my plugin, Thinking Sphinx, to support it. Also, for those inclined, you can now get Riddle as a gem.

I’m slowly making progress on some major changes to Thinking Sphinx, so hopefully I’ll have something cool to show people soon. Oh, but some features that aren’t reflected in the documentation: most of Sphinx’s search options can be passed through when you call Model.search – including :group_by, :group_function, :field_weights, :sort_mode, etc. Consider it an exercise for the reader to figure out the details until I get around to improving the docs.

09 Mar 2008

Migrating Code from Rails to Merb

Here’s a collection of notes made while I was working on migrating this blog from Rails to Merb (no, the Merb version isn’t live yet). These are relevant to version 0.5.2 – but Merb’s moving so fast these days, I wouldn’t be surprised if much of this isn’t relevant any more.

Most of this you can find if you look through the documentation, but if you’ve not yet played with Merb, this will hopefully give you some idea of some of the smaller differences.

Filters

  • Use before and after instead of before_filter and after_filter
  • Instead of :except, use :exclude.
  • To kill the chain of filters and not follow on to the action method, you need to use the throw method:
def confirm_user_session
  throw :halt, Proc.new { |controller|
    controller.redirect url(:new_session)
  } if current_user.nil?
end

Oh, and in case you missed it in the above sample, redirect_to becomes redirect.
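
If you’ve not used throw and catch in Ruby before, here’s a small plain-Ruby sketch of how a filter chain can be halted this way. The run_action method is a hypothetical stand-in, not Merb’s actual code, and for simplicity it returns the thrown value directly instead of calling a Proc with the controller:

```ruby
# Hypothetical stand-in for a framework's filter chain, not Merb's real code.
def run_action(filters, action)
  # catch returns the thrown value if any filter halts,
  # otherwise the result of the action itself.
  catch(:halt) do
    filters.each(&:call)
    action.call
  end
end

logged_in     = false
require_login = lambda { throw :halt, "redirected to login" unless logged_in }
show_page     = lambda { "page content" }

halted = run_action([require_login], show_page)

logged_in = true
allowed = run_action([require_login], show_page)
```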

Routes

Resource routing isn’t actually any different to Rails – but I had trouble finding any documentation for it, so it wasn’t obvious to start with. This code will work in both Merb and Rails (although in Rails r is usually referenced as map):

r.resources :posts do |post|
  post.resources :comments
end

However, using these routes is definitely different. The url_for, route_url and route_path methods aren’t around – you need to use Merb’s magical url method:

url(:post, @post)
url(:new_session)

View Helpers

Helpers are pretty minimal in Merb – you don’t get any of the inflectors, or the number formatters. And no form helpers – not in the core gem, anyway. You can get some of those from the merb_helpers gem, but they don’t match Rails’ syntax and method naming, so it can take a bit of time to get this switched over, depending on the size of your app.

No support for Form Builders in that gem, by the way.

Some of the default helpers that do exist have different names to their Rails counterparts. A few of the ones I came across:

  • content_tag => tag
  • tag => open_tag
  • content_for => throw_content
  • javascript_include_tag => js_include_tag
  • stylesheet_include_tag => css_include_tag

All your favourite plugins

Because Merb plugins are gems, anything you use as a plugin in Rails is pretty unlikely to be ported over. will_paginate was the main one for me, so I ended up pulling the files into my lib directory. Of course, that was only really useful when using ActiveRecord – DataMapper and Sequel users may have to get hacking into any ActiveRecord-focused plugins they want to use.

Partials

No more render :partial – you want something more like the following:

partial "comments/show", :with => @post.comments.active
partial "post", :with => @posts, :as => :post

Again – you’ll find most/all of this in the documentation, but the only potential show-stopper I found was plugins – the rest isn’t that big a difference to Rails.

23 Feb 2008

Talking to MYOB with Ruby

Okay, time for another code post. Prompted by a comment on an earlier post about Merb and MYOB, I thought I’d provide some more detail on how to talk to MYOB using Ruby, for any other people stuck in a similar position.

ODBC Bindings

If you like crafting your own SQL, this is the easiest approach. Firstly, you’ll need to download Christian Werner’s ODBC Bindings for Ruby, and install it using the following commands (or something along these lines) from within the source directory:

ruby extconf.rb
make
sudo make install

Documentation is a bit light on examples – and it’s really just a hook into the C libraries for ODBC, so good luck reading the source. Here’s some of the basic things you’ll need to do. (Don’t forget to require 'odbc' first, of course).

Setting up a Connection

You can either go through the DSN collection, or connect using the string name of the DSN (which in this case we’ll assume is ‘MYOB’):

dsn = ODBC.datasources.detect { |source| source.name == "MYOB" }
database = ODBC::Database.connect dsn

# or

database = ODBC::Database.connect "MYOB"

Note that the ODBC.datasources collection only contains system DSNs, not User-level ones. Not sure why, or how to access the latter.

SELECT Statements

statement = database.prepare "SELECT * FROM Customers"
statement.execute
statement.each_hash do |row|
  # results accessed as row["column_name"]
end
statement.drop

Make sure you drop the statement once you’re done with it, otherwise you’ll get complaints when you stop ruby that the statements weren’t all closed and dropped.

There’s other ways to access results – you can go through results as you see fit using fetch or fetch_hash, depending on whether you want an array or hash of the record.

INSERT Statements

With other SQL statements (although there’s no use of UPDATE or DELETE in MYOB connections), you don’t really care about the results, which makes things a bit simpler: you can just call database.do("INSERT INTO Import_Customer_Cards (...) VALUES (...)"). The do method creates the statement, executes it and then drops it for you. MYOB makes this a little more complex though, as it all must be done in transactions – not that that’s a bad thing.

database.transaction do |env|
  env.do("INSERT INTO Import_Customer_Cards (...) VALUES (...)")
end

This won’t actually do anything though – and it won’t throw up an error or warning. The problem is you need to turn autocommit mode off, because the MYOB ODBC drivers don’t like it (it’s not a problem with Christian’s bindings). So, back when you create your connection:

database = ODBC::Database.connect "MYOB"
database.autocommit = false

ActiveRecord

If you prefer some level of abstraction above the messy SQL, then it might be worth looking at OpenLink’s ActiveRecord ODBCAdapter. I recommend the gem instead of the plugin, as the plugin takes it upon itself to modify your code, which I don’t like.

sudo gem install odbc-rails

Now, I’m not an ODBC expert, but there seem to be different flavours of ODBC connections possible – and MYOB’s is not one supported by this gem. So, if you’re using Rails, put the gem into vendor (and for Merb, into the local gems folder), then modify it with this patch. I make no promises for it being stable or reliable – but I’ve not had any problems yet.

Of course, using ActiveRecord is viable if you’re just reading data out – but if you want to write as well as read, then there’s issues. MYOB has separate write tables (prefixed with ‘Import_’), and they have denormalised schemas compared to the read-equivalents.

For database.yml, you’ll need something like the following:

development:
  adapter: odbc
  dsn: MYOB

Everything Else

In both the situations above, I’ve not put usernames or passwords into the connections – you can, but that is already handled by the ODBC DSN, so I keep my code that bit simpler.

For those who are new to coding for MYOB – it does cost money to get a developer account (several hundred dollars per year), which is the only way to get write access. I think read access is a once off fee of a couple of hundred dollars, but that’s specific to a MYOB file.

And if anyone is wondering – while I was initially using the odbc-rails library, I’ve now switched to constructing the SQL myself and just using the bindings, because of the table issues.

22 Jan 2008

Bring Methods Back From The Dead


Today was the first time I’d come across Ruby’s undef_method – it’s used in a few places in Rails, particularly with ActiveRecord’s associations. While I see the point of it, there were a few methods I wanted back – and I’ve figured out how to do it – you need to grab the method definition from the superclass. Here’s an example:

class AntiString
  # undef_method takes the method name as a symbol
  undef_method :to_s
end

AntiString.new.to_s
  #=> raises NoMethodError (undefined method `to_s')

AntiString.send(:define_method, :to_s,
  AntiString.superclass.instance_method(:to_s))

AntiString.new.to_s #=> "#<AntiString:0x364ef8>"

Now, the obvious caveat – if the method was originally defined in that class, not the superclass, then I think you’re out of luck. Although I’m guessing you’ll rarely be in a position where you need to resurrect a method like this anyway.
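
One possible way around that caveat, if you can get in before the method is removed: grab the method definition from the class itself first, and re-attach it later. This is only a sketch, and Greeter is a made-up class for the sake of the example:

```ruby
class Greeter
  def greet
    "hello"
  end
end

# Keep a handle on the method before it is undefined.
saved = Greeter.instance_method(:greet)

Greeter.send(:undef_method, :greet)
responds_after_undef = Greeter.new.respond_to?(:greet)

# Re-attach the saved UnboundMethod under the same name.
Greeter.send(:define_method, :greet, saved)
restored_result = Greeter.new.greet
```

Of course, this only helps when your code runs before the undef_method call does, which won’t be the case if a library has already done the deed by the time you’re loaded.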

21 Jan 2008

Mixing Merb and MYOB

For one of the contracts I’m working on at the moment, I’ve been using Merb to construct a web service that interacts with MYOB, and can be consumed with ActiveResource.

The connection to MYOB is ugly, using Christian Werner’s ODBC Bindings and the Rails/ActiveRecord ODBC Adapter, the latter of which had to be hacked slightly. However, the Merb side of things was quite clean. I’m really looking forward to seeing how Merb progresses, especially with their plans for merb_core and merb_more.

One of the rare snippets of code from the Merb app that I think is more verbose than the Rails equivalent is how to go about obtaining the query parameters (as opposed to routing parameters) of a request.

The Rails way:

request.query_parameters

The Merb way:

params.except *request.route_params.keys.collect { |key| key.to_s }
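
To show what that line boils down to, here’s the same idea with plain hashes. The data is made up to stand in for params and request.route_params, and I’ve used reject so that the sketch doesn’t rely on an except method being available:

```ruby
# Hypothetical data standing in for a request; in Merb these would come from
# params and request.route_params.
params       = { "controller" => "posts", "action" => "index", "page" => "2", "q" => "sphinx" }
route_params = { :controller => "posts", :action => "index" }

# Route param keys are symbols here, so convert them before comparing.
route_keys   = route_params.keys.collect { |key| key.to_s }
query_params = params.reject { |key, _| route_keys.include?(key) }
```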

Also, in case you’re as stupid as I am and want to generate Merb controllers on the fly, you can’t use Class.new. The only way is by building the class in a string and eval’ing it:

Object.send(:eval, <<-CODE
class ::Object::#{controller_name} < Application
  # actions and such go here
end
CODE
)

It’s not particularly elegant, but at least it works.
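
For anyone who wants to try the approach outside of Merb, here’s a self-contained sketch. The Application class below is just an empty stand-in for the real base controller, and build_controller is a hypothetical helper name:

```ruby
# Stand-in base class; in a Merb app this would be the real Application controller.
class Application; end

# Build a named subclass from a string and return the new class.
# Only ever pass trusted names in, since the string is eval'ed as code.
def build_controller(controller_name)
  Object.class_eval <<-CODE
    class ::#{controller_name} < Application
      def index
        "index of #{controller_name}"
      end
    end
  CODE
  Object.const_get(controller_name)
end

reports = build_controller("Reports")
index_result = reports.new.index
```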

17 Jan 2008

Sphinx 0.9.8r1065

Short post, as befitting the importance of the content: Riddle and Thinking Sphinx have both been updated to support the current version of Sphinx, 0.9.8r1065.

07 Jan 2008

Link: Le-Blog-à-Dam - Page Cache Test - Rails Cache Test Plugin

If I get some spare time, this is something that would be nice to adapt to rspec

02 Jan 2008

Postie - The Gem

An addition to auspostie.com, prompted by Dr Nic’s suggestion in the comments of the earlier blog post – the postie gem. It allows easy parsing of Postie search results, and also provides a command-line tool for searching.

By suburb:

postie Brunswick

By postcode:

postie 3070

To install:

sudo gem install postie

To use within your own ruby code:

require 'postie'

Postie::Locality.find("Melbourne")

Again, extremely simple, but just makes access to the data that little bit easier.

Now, what would be really cool is a Quicksilver plugin that queries the API. Any volunteers to code that up?

01 Jan 2008

2007

Sinfest comic for New Year's Day

I don’t want to bore you all with an extensive recap of 2007, so I’ll keep this footnote of the last year’s highlights relatively brief.

Nullus Anxietas

After a few years planning, we produced the first Australian (and non-UK) Discworld Convention in February – and it was a smashing success. A few hundred attendees, dozens of sessions, a load of fun. We even made a small profit (which is rare for fan conventions) – and we proved the doubters wrong.

Rails and Freelancing

I began the year by switching jobs and finally getting paid to work with Ruby on Rails. Halfway through the year, I started freelancing. I’m really enjoying working from home, on my Mac, using tools and a programming language I enjoy. After far too long wrangling with ASP and ASP.NET, coding is fun again.

RailsCamps

As part of working with Rails, I’ve become involved in the local Ruby communities here in Australia. Through this, there’s been two awesome RailsCamps (and massive props to Ben Askins for leading the way with the first, and helping so generously with the second), and I’ve met a bunch of smart, friendly folk. Networking has become socialising.

Change

The last of my siblings has finished their secondary schooling. My sister’s moving to another state. I’m posting regularly to this blog. Howard’s out – and not much longer to put up with Bush. Climate change is being taken seriously by many governments. There will be a proper apology to the indigenous people of Australia.

What’s next?

For me, 2008 is looking to be a year of travel – to New Zealand for a holiday in a few weeks, and then to Portland for RailsConf, UK and Cambodia to visit friends (and in the case of the former, check out the Edinburgh Fringe Festival), and stops to New York and Istanbul will likely feature in there as well.

Freelancing will continue to be challenging as my first big contract ends and I look for new ones, and there’s already plans for more RailsCamps and a second Discworld Convention. I’m looking forward to all of it.

Endless thanks to my family, friends and peers for their support over the past twelve months – you’re all brilliant.

30 Dec 2007

Postcode API

A couple of weeks ago I quickly coded a basic webservice – using Merb – for Australian Postcode data. Just got hosting for it sorted out last night, so now I can blog about it.

Postie

The emphasis is on simple – you can search by postcode or suburb, and get the data back either as an HTML page, XML, or JSON. That’s pretty much it. The request urls aren’t complicated, either:

Suburb requests use partial matching, so you don’t need the full suburb name. If you want the JSON returned with the MIME type of application/json, use the .json extension.

I’ve no plans at this point on using this, but perhaps it’s useful for someone out there – if so, would love to hear about it.

27 Dec 2007

Updates for Sphinx 0.9.8r985

Another quick Sphinx post – Riddle is updated to support Sphinx’s latest release (0.9.8r985), and Thinking Sphinx now has that new version of Riddle as well.

I’ve not tested any of this with the recently released Ruby 1.9 yet, though (but it’s on my list of things to do).

Also, thank-you to Joost Hietbrink (again) and Jonathan Conway for their patches to Thinking Sphinx – very much appreciated.

13 Dec 2007

Rspec'ing controllers

I’m always trying to find a better way to write specs for my Rails apps with RSpec – it took a while for me to be comfortable with writing model specs, but just recently, I’ve developed a style of controller specs that I feel work well for me.

While it’s not too hard to write methods that automate some of the repetitive side of things, it can be hard to do so in a manner that fits RSpec’s DSL – but I’ve found the key to (my current style of) controller specs is shared behaviours. An example of a few actions from the news controller at ausdwcon.org:

describe NewsController, "index action" do
  before :each do
    @method = :get
    @action = :index
    @params = {}
  end
  
  it_should_behave_like "Public-Access Actions"
  
  it "should paginate the results" do
    @news = []
    News.stub_method(:paginate => @news)
    
    get :index
    
    News.should have_received(:paginate)
    assigns[:news].should == @news
  end
  
  it "should set the title to 'News'" do
    News.stub_method(:paginate => [])
    
    get :index
    
    assigns[:title].should == "News"
  end
end

describe NewsController, "new action" do
  before :each do
    @method = :get
    @action = :new
    @params = {}
  end
  
  it_should_behave_like "Admin-Only Actions"
  
  it "should set the title to 'News'" do
    controller.current_user = User.stub_instance(:admin? => true)
    
    get :new
    
    assigns[:title].should == "News"
  end
end

You can find the full spec for the controller on pastie.caboo.se. The shared behaviors – ‘Public-Access Actions’ and ‘Admin-Only Actions’ – are (at least for the moment) kept in my spec_helper.rb file – a sample of which is below:

describe "Admin-Only Actions", :shared => true do
  it "should not be accessible without authentication" do
    @controller.current_user = nil
    
    send @method, @action, @params
    
    response.should be_redirect
    response.should redirect_to(new_session_url)
  end
  
  it "should not be accessible by a normal user" do
    @controller.current_user = User.stub_instance(:admin? => false)
    
    send @method, @action, @params
    
    response.should be_redirect
    response.should redirect_to(new_session_url)
  end
  
  it "should be accessible by an admin user" do
    @controller.current_user = User.stub_instance(:admin? => true)
    
    send @method, @action, @params
    
    response.should_not redirect_to(new_session_url)
  end
end

Firstly – the not so nice stuff: the use of instance variables to communicate the method, action and parameters of requests to the shared behaviours. It’s not ideal, and apparently there’s plans to add arguments to the it_should_behave_like method, but for the moment it does the job.

I’m using Pete Yandell’s NotAMock for my stubbing – albeit with a few modifications of my own (which may make it back into the plugin itself at some point). I also use my own ActiveMatchers – but that’s more focused on models. It’s also not really feature-complete, but if you like what it offers, feel free to use it.

Oh, and the main caveat? This is my current way of spec’ing controllers – and it’s vastly better than the minimal specs I was writing before this – but it may/will change. I don’t even know if my style is ‘best practice’ – I’m putting them online to get feedback and provoke discussion. So please feel free to critique.

05 Dec 2007

Link: Process title support for Mongrel

"This is a simple module which changes Mongrel's process title to reflect what it's currently doing."

05 Dec 2007

Link: Bamboo Blog - Presenters & Conductors on Rails

"the presenter and conductor; the presenter sitting between the controller and view, and the conductor sitting between the model and controller."

05 Dec 2007

Link: Plain Text Stories: Part III

Examples of the new stories/integration testing in rspec


About Freelancing Gods

Freelancing Gods is written by Pat, who works on the web as a web developer in Melbourne, Australia, specialising in Ruby on Rails.

In case you're wondering what the likely content here will be about (besides code), keep in mind that Pat is passionate about the internet, music, politics, comedy, bringing people together, and making a difference. And pancakes.

His ego isn't as bad as you may think. Honest.

Here's more than you ever wanted to know.

All original content on this site is available through a Creative Commons by-nc-sa licence.