Freelancing Gods 2015

10 Feb 2015

RubyConf AU 2015: Thank You

Last week RubyConf AU 2015 took place in Melbourne. A year prior to that, I’d put my hand up to run it… and over the course of twelve months, had assembled an excellent team, lined up speakers, venues, and a whole bunch of fun.

On Wednesday morning, it became real, as the workshops kicked off. By Saturday evening, it was finished with our after party at the Melbourne Lawn Bowls club in Flagstaff Gardens.

Going by the feedback we’ve received, I think it’s safe to say it was a success – at the very least, I’m thrilled with what we achieved.

But, of course, it would not have been possible without contributions from many, many people. I do want to list them here, even though it’s guaranteed I’ll forget someone and then feel terrible once I realise.

Firstly: to our sponsors, who not only gave us considerable amounts of money (no small thing in itself), but trusted and supported our efforts to grow the Australian Ruby community. Thank you Envato,, Redbubble, reinteractive, Digital Ocean, JobReady, Torii Recruitment, GitHub, Pluralsight, BuildKite, Lookahead Search, EngineYard, Soundcloud and Travis CI.

To our venues: Jasper, Zinc, and Deakin Edge. You provided fantastic spaces for our community to listen, learn, eat and socialise within. A special thank you to the AV team at Deakin Edge: Blake, Wes and Brad, plus our own video recorder Anthony, returning yet again to make sure our talks are captured for future generations.

To our stenographer Rebekah, who provided live captioning of our conference proceedings. She was not only extremely good at her job, but also responded to Keith and Josh’s banter in style.

To the weather gods – Melbourne’s traditionally fickle weather gave us four days of warm sunshine, which was perfect for showing off our fine city.

To the team behind our ticketing system Tito, who helped us with beta features and late night support.

To the Ruby Australia committee, who were super supportive when I first asked about running this conference, and provide essential and appreciated financial and organisational support. You play a massive part in the health and success of our community.

To our event manager Deborah Langley, and her colleague Sam. Engaging Deb to work on our event made our lives a great deal easier, and helped us to achieve great things. Plus, Deb and Sam helped the running of the conference and events purr along smoothly.

To our volunteers, lead by the inestimable Liam Esler and Mel Sherrin, and our stage manager Maxine Sherrin. You took excellent care of our attendees and speakers, kept things running to schedule, and deserve all of the credit for how calmly the conference ran.

To Amanda Neumann and Darcy Laycock, who worked with me to select presenters from our massive selection of proposals. We agonised over which talks made the cut (and there were many excellent choices that missed out), but I think our choices were great ones!

To our local Rubyists: Healesville guide Pete Yandell, and cycling leaders Gareth Townsend & Gus Gollings, who all ensured our attendees from near and far got to experience a different aspect of Melbourne beyond just the conference sessions.

To our fabulous illustrator Dougal MacPherson, who, with his 15 minute drawings hat on, drew a picture of every session (including workshops), which then became lovely gifts for our speakers.

To Tim Lucas, for his tireless work on our slick website, plus the corralling of our beautiful and popular t-shirts – which were designed by Magdalena Ksiezak (for the conference) and Carla Hackett (for the Rails Girls workshops).

To the organisers of the previous RubyConf AU events – Keith Pitty, Martin Stannard_, Michael Koukoullis, Josh Price, Elle Meredith, Jason Crane, Georgina Robilliard, and Steve Gilles. We have only been able to create this event by standing on your shoulders and reaping the rewards of your hard work.

To Ben Askins, who kicked off the bonding of our fantastic Australian Ruby community by organising the very first Rails Camp. That event changed my life.

To the large number of conferences that provided inspiration, including (but certainly not limited to) JSConf US and EU, FutureRuby, NordicRuby, eurucamp, and Web Directions: Code.

To our speakers, workshop presenters, Rails Girls organisers, and our entertaining and excellent MCs Josh Kalderimis and Keith Pitt. We gave you the stage, and you made us so very proud.

To my fellow organisers: Melissa Kaulfuss, Matt Allen, and Sebastian von Conrad. Through our shared vision and skill-set we have crafted a special event, all contributing in different and most definitely valued ways. I really cannot thank you enough.

To our employers: Inspire9, Envato, Lookahead Search and Icelab, who supported us in our endeavour, with time and patience and suggestions.

To our families, who recognised the commitment we had to give to make this real, and looked after us, loved and supported us. You’re the very definition of amazing.

To everyone else who helped in any way – I was inundated with offers of support and assistance over the past year, and while I didn’t have the opportunity to take everyone up on that, the offers themselves are greatly appreciated.

And finally, to everyone who attended the conference, and the broader Ruby community. It feels far more that we’ve done this with you than for you.

Thank you all, so very, very much.

30 Dec 2013

Melbourne Ruby Retrospective for 2013

The Melbourne Ruby community has grown and evolved a fair bit in this past year, and I’m extremely proud of what it has become.

Mind you, I’ve always thought it was pretty special. I first started to attend the meets back when Rails was young and the community in Australia was pretty new, towards the end of 2005. The meets themselves started in January of that year – almost nine years ago! – and have continued regularly since, in many shapes, sizes and venues, under the guiding hands of many wise Rubyists.

Given I’ve been around so long, it’s a little surprising I’d not had a turn convening the meetings on a regular basis (though I’d certainly helped out when other organisers couldn’t be present). After the excellent, recent guidance of Dave Goodlad and Justin French, Mario Visic and Ivan Vanderbyl stepped up – and then Ivan made plans to move to the USA. I was recently inspired by discussions around growing and improving the community at the latest New Zealand Rails Camp, and so I offered to take Ivan’s place. (As it turns out, Ivan’s yet to switch sides of the Pacific Ocean. Soon, though!)

And so, since February, Mario and I have added our own touches to the regular events. Borrowing from both Sydney and Christchurch, we’ve added monthly hack nights – evenings where there’s no presentations, but people of all different experience levels bring along their laptops and get some coding done. If anyone gets stuck, there’s plenty of friendly and experienced developers around to help.

More recently, reInteractive have helped to bring InstallFests from Sydney to Melbourne. They are events to help beginners interested in Ruby and Rails get the tools they need installed on their machines and then go through the process of setting up a basic blog, with mentors on hand to help deal with any teething problems.

For the bulk of Melbourne Ruby community’s life, the meets have been announced through Google groups – first the Melbourne Ruby User Group, then in the broader Ruby or Rails Oceania group. It’d become a little more clear over the past couple of years that this wasn’t obvious to outsiders who were curious about Ruby – which prompted the detailing of meeting schedules on – but there was still room for improvement. reInteractive’s assistance with the InstallFest events was linked to their support with setting up a group on – and almost overnight we’ve had a significant increase in newcomers.

Now, many of us Rubyists are quite opinionated, and I know some find Meetup inelegant and, well, noisy. I certainly don’t think it’s as good as it could be – but it’s the major player in the space, and it’s the site upon which many people go searching for communities like ours. The Google group does okay when it comes to discussions, but highlighting upcoming events (especially if you’re not a regular) is not its forte at all.

We’ve not abandoned the Google group, but now we announce events through both tools – and the change has been so dramatic that, as much as I’m wary of supporting big players in any space, I’d argue that you’d be stupid not to use Meetup. We’ve had so many new faces come along to our events – and while we still have a long way to go for equal gender representation (it’s still predominantly white males), it’s slowly improving.

With the new faces appearing, we held a Newbie Night as one of our presentation evenings (something that’s happened a couple of times before, but certainly not frequently enough). Mario and I were lucky enough to have Jeremy Tennant step up to run this and corral several speakers to provide short, introductory presentations on a variety of topics. (Perhaps this should become a yearly event!)

We’re also blessed to have an excellent array of sponsors – Envato, Inspire9, Zendesk, reInteractive and Lookahead Search have all provided a mixture of money, space and experienced minds. We wouldn’t be where we are now without you, your support is appreciated immensely.

Mario and I have also spent some time thinking a bit deeper about some of the longstanding issues with tech events, and tried to push things in a healthier direction:

At many of the last handful of meetings for this year, instead of pizza, we’ve had finger food from the ASRC Catering service, tacos from The Taco Guy, and a few pancakes as well. In each case we’ve ensured there’s vegetarian, gluten-free and lactose-free options. This trend shall certainly continue!

The drinks fridge at Inspire9 (our wonderful hosts for the past couple of years) now have plenty of soft drinks and sparkling mineral water alongside the alcoholic options – and we’ve been pretty good at making sure jugs of tap water are available too. There’s also tea and coffee, though we need to be better at highlighting this.

We’ve also adopted Ruby Australia’s Code of Conduct for all Melbourne Ruby events. This is to both recognise that our community provides value and opportunity to many, and to make it clear we want it to continue to be a safe and welcoming place, offline and online.

We’re by no means perfect, and I’m keen to help this community grow stronger and smarter over the coming year – but we’ve got some great foundations to build on. The Melbourne Ruby community – and indeed, the broader Australian Ruby community – is growing from strength to strength, and a lot of that is due to the vast array of leaders we have, whose shoulders we are standing on.

Alongside the regular city meets, there are Rails Camps twice a year, RailsGirls events becoming a regular appearance on the calendar, and the second RubyConf Australia is in Sydney this coming February. I’m looking forward to seeing what 2014 brings – thanks to all who’ve been part of the ride thus far!

22 Jul 2013

Rewriting Thinking Sphinx: Introducing Realtime Indices

The one other feature in the rewrite of Thinking Sphinx that I wanted to highlight should most certainly be considered in beta, but it’s gradually getting to the point where it can be used reliably: real-time indices.

Real-time indices are built into Sphinx, and are indices that can be dynamically updated on the fly – and are not backed by a database sources. They do have a defined structure with fields and attributes (so they’re not a NoSQL key/value store), but they remove the need for delta indices, because each record in a real-time index can be updated directly. You also get the benefit, within Thinking Sphinx, to refer to Ruby methods instead of tables and columns.

The recent 3.0.4 release of Thinking Sphinx provides support for this, but the workflow’s a little different from the SQL-backed indices:

Define your indices

Presuming a Product model defined just so:

class Product < ActiveRecord::Base
  has_many :categorisations
  has_many :categories, :through => :categorisations

You can put together an index like this:

ThinkingSphinx::Index.define :product, :with => :real_time do
  indexes name

  has category_ids, :type => :integer, :multi => true

You can see here that it’s very similar to a SQL-backed index, but we’re referring to Ruby methods (such as category_ids, perhaps auto-generated by associations), and we’re specifying the attribute type explicitly – as we can’t be sure what a method returns.

Add Callbacks

Every time a record is updated in your database, you want those changes to be reflected in Sphinx as well. Sometimes you may want associated models to prompt a change – hence, these callbacks aren’t added automatically.

In our example above, we’d want after_save callbacks in our Product model (of course) and also our Categorisation model – as that will impact a product’s category_ids value.

# within product.rb
after_save ThinkingSphinx::RealTime.callback_for(:product)

# within categorisation.rb
after_save ThinkingSphinx::RealTime.callback_for(:product, [:product])

The first argument is the reference to the indices involved – matching the first argument when you define your index. The second argument in the Categorisation example is the method chain required to get to the objects involved in the index.

Generate the configuration

rake ts:configure

We’ve no need for the old ts:index task as that’s preloading index data via the database.

Start Sphinx

rake ts:start

All of our interactions with Sphinx are through the daemon – and so, Sphinx must be running before we can add records into the indices.

Populate the initial data

rake ts:generate

This will go through each index, load each record for the appropriate model and insert (or update, if it exists) the data for that into the real-time indices. If you’ve got a tonne of records or complex index definitions, then this could take a while.

Everything at once

rake ts:regenerate

The regenerate task will stop Sphinx (if it’s running), clear out all Sphinx index files, generate the configuration file again, start Sphinx, and then repopulate all the data.

Essentially, this is the rake task you want to call when you’ve changed the structure of your Sphinx indices.

Handle with care

Once you have everything in place, then searching will work, and as your models are updated, your indices will be too. In theory, it should be pretty smooth sailing indeed!

Of course, there could be glitches, and so if you spot inconsistencies between your database and Sphinx, consider where you may be making changes to your database without firing the after_save callback. You can run the ts:generate task at any point to update your Sphinx dataset.

I don’t yet have Flying Sphinx providing full support for real-time indices – it should work fine, but there’s not yet any automated backup (whereas SQL-backed indices are backed up every time you process the index files). This means if a server fails it’d be up to you to restore your index files. It’s on my list of things to do!

What’s next?

I’m keen to provide hooks to allow the callbacks to fire off background jobs instead of having that Sphinx update part of the main process – though it’s certainly not as bad as the default delta approach (you’re not shelling out to another process, and you’re only updating a single record).

I’m starting to play with this in my own apps, and am keen to see it used in production. It is a different way of using Sphinx, but it’s certainly one worth considering. If you give it a spin, let me know how you go!

11 Jul 2013

Gutentag: Simple Rails Tagging

The last thing the Rails ecosystem needs is another tagging gem. But I went and built one anyway… it’s called Gutentag, and perhaps worth it for the name alone (something I get an inordinate amount of happiness from).

My reasons for building Gutentag are as follows:

A tutorial example

I ran a workshop at RailsConf earlier this year (as a pair to this talk), and wanted a simple example that people could work through to have the experience of building a gem. Almost every Rails app seems to need tags, so I felt this was a great starting point – and a great way to show off how simple it is to write and publish a gem.

You can work through the tutorial yourself if you like – though keep in mind the focus is more on the process of building a gem rather than the implementation of this gem in particular.

A cleaner code example

Many gems aren’t good object-oriented citizens – and this includes most of the ones I’ve written. They’re built with long, complex classes and modules, are structured in ways that Demeter does not like, and aren’t particularly easy to extend cleanly.

I have the beginnings of a talk on how to structure gems (especially those that work with Rails) sensibly – but I’ve not yet had the opportunity to present this at any conferences.

One point that will definitely feature if I ever do get that opportunity: more and more, I like to avoid including modules into ActiveRecord and other parts of Rails – and if you peruse the source you’ll see I’m only adding the absolute minimum to ActiveRecord::Base, plus I’ve pulled out the logic around the tag names collection and resulting persistence into a separate, simple class.

I got a nice little buzz when I had Code Climate scan the source and give it an A rating without me needing to change anything.

Test-driven design

I started with tests, and wrote them in a way that made it clear how I expected the gem to behave – and then wrote the implementation to match. If you’re particularly keen, you can scan through each commit to see how the gem has evolved – I tried to keep them small and focused.

Or, just have a read through of the acceptance test files – there’s only two, so it won’t take you long.


There are a large number of other tagging gems out there – and if you’re using one of those already, there’s no incentive at all to switch. I’ve used acts-as-taggable-on many times without complaints.

But Gutentag certainly works – the README outlines how you can use it – and at least people might smile every time they add it to a Gemfile. But at the end of the day, if it’s just used as an example of a simple gem done well, I’ll consider this a job well done.

09 Jul 2013

Rewriting Thinking Sphinx: Middleware, Glazes and Panes

Time to discuss more changes to Thinking Sphinx with the v3 releases – this time, the much improved extensibility.

There have been a huge number of contributors to Thinking Sphinx over the years, and each of their commits are greatly appreciated. Sometimes, though, the pull requests that come in cover extreme edge cases, or features that are perhaps only useful to the committer. But running your own hacked version of Thinking Sphinx is not cool, and then you’ve got to keep an especially close eye on new commits, and merge them in manually, and… blergh.

So instead, we now have middleware, glazes and panes.


The middleware pattern is pretty well-established in the Ruby community, thanks to Rack – but it’s started to crop up in other libraries too (such as Mike Perham’s excellent Sidekiq).

In Thinking Sphinx, middleware classes are used to process search requests. The default set of middleware are as follows:

  • ThinkingSphinx::Middlewares::StaleIdFilter adding an attribute filter to hide search results that are known to not match any ActiveRecord objects.
  • ThinkingSphinx::Middlewares::SphinxQL generates the SphinxQL query to send to Sphinx.
  • ThinkingSphinx::Middlewares::Geographer modifies the SphinxQL query with geographic co-ordinates if they’re provided via the :geo option.
  • ThinkingSphinx::Middlewares::Inquirer sends the constructed SphinxQL query through to Sphinx itself.
  • ThinkingSphinx::Middlewares::UTF8 ensures all string values returned by Sphinx are encoded as UTF-8.
  • ThinkingSphinx::Middlewares::ActiveRecordTranslator translates Sphinx results into their corresponding ActiveRecord objects.
  • ThinkingSphinx::Middlewares::StaleIdChecker notes any Sphinx results that don’t have corresponding ActiveRecord objects, and retries the search if they exist.
  • ThinkingSphinx::Middlewares::Glazier wraps each search result in a glaze if there’s any panes set for the search (read below for an explanation on this).

Each middleware does its thing, and then passes control through to the next one in the chain. If you want to create your own middleware, your class must respond to two instance methods: initialize(app) and call(contexts).

If you subclass from ThinkingSphinx::Middlewares::Middleware you’ll get the first for free. contexts is an array of search context objects, which provide access to each search object along with the raw search results and other pieces of information to note between middleware objects. Middleware are written to handle multiple search requests, hence why contexts is an array.

If you’re looking for inspiration on how to write your own middleware, have a look through the source – and here’s an extra example I put together when considering approaches to multi-tenancy.

Glazes and Panes

Sometimes it’s useful to have pieces of metadata associated with each search result – and it could be argued the cleanest way to do this is to attach methods directly to each ActiveRecord instance that’s returned by the search.

But inserting methods on objects on the fly is, let’s face it, pretty damn ugly. But that’s precisely what older versions of Thinking Sphinx do. I’ve never liked it, but I’d never spent the time to restructure things to work around that… until now.

There are now a few panes available to provide these helper methods:

  • ThinkingSphinx::Panes::AttributesPane provides a method called sphinx_attributes which is a hash of the raw Sphinx attribute values. This is useful when your Sphinx attributes hold complex values that you don’t want to re-calcuate.
  • ThinkingSphinx::Panes::DistancePane provides the identical distance and geodist methods returning the calculated distance between lat/lng geographical points (and is added automatically if the :geo option is present).
  • ThinkingSphinx::Panes::ExcerptsPane provides access to an excerpts method which you can then chain any call to a method on the search result – and get an excerpted value returned.
  • ThinkingSphinx::Panes::WeightPane provides the weight method, returning Sphinx’s calculated relevance score.

None of these panes are loaded by default – and so the search results you’ll get are the actual ActiveRecord objects. You can add specific panes like so:

# For every search
ThinkingSphinx::Configuration::Defaults::PANES << ThinkingSphinx::Panes::WeightPane

# Or for specific searches:
search ='pancakes')
search.context[:panes] << ThinkingSphinx::Panes::WeightPane

When you do add at least pane into the mix, though, the search result gets wrapped in a glaze object. These glaze objects direct any methods called upon themselves with the following logic:

  • If the search result responds to the given method, send it to that search result.
  • Else if any pane responds to the given method, send it to the pane.
  • Otherwise, send it to the search result anyway.

This means that your ActiveRecord instances take priority – so pane methods don’t overwrite your own code. It also allows for method_missing metaprogramming in your models (and ActiveRecord itself) – but otherwise, you can get access to the useful metadata Sphinx can provide, without monkeypatching objects on the fly.

If you’re writing your own panes, the only requirement is that the initializer must accept three arguments: the search context, the underlying search result object, and a hash of the raw values from Sphinx. Again, the source code for the panes is not overly complex – so have a read through that for inspiration.

I’m always keen to hear about any middleware or panes other people write – so please, if you do make use of either of these approaches, let me know!

24 Jun 2013

Rewriting Thinking Sphinx: Loading Only When Necessary

I’ve obviously been neglecting this blog – even a complete rewrite of Thinking Sphinx hasn’t garnered a mention yet! Time to remedy this…

There’s plenty to focus on with Thinking Sphinx v3 (released just under six months ago), because a ton has changed – but that’s pretty well covered in the documentation. I’m going to cover one thing per post instead.

First up: index definitions are now located in their own files, located in the app/indices directory. Given they can get quite complex, I think they’re warranted to have their own files – and besides, let’s keep our concerns separate, instead of stuffing everything into the models (yes, I’m a firm proponent of skinny everything, not just skinny controllers).

So, instead of this within your model:

class User < ActiveRecord::Base
  # ...

  define_index do
    indexes first_name, last_name, country

  # ...

You now create a file called user_index.rb (or whatever, really, as long it ends with .rb) and place it in app/indices:

ThinkingSphinx::Index.define :user, :with => :active_record do
  indexes first_name, last_name, country

You’ll note the model is now specified with a symbol, and we’re providing an index type via the :with option. At the moment, the latter is always :active_record unless you’re using Sphinx’s real-time indices (which are definitely beta-status in Thinking Sphinx). The model name as a symbol, however, represents one of the biggest gains from this shift.

In previous versions of Thinking Sphinx, to discover all of the index definitions that existed within your app, the gem would load all of your models. Initial versions did this every time your app initialised, though that later changed so they the models and index definitions were loaded only when necessary.

Except, it was necessary if a search was being run, or even just if a model was modified (because updates to Sphinx’s index files could be required) – which is the majority of Rails requests, really. And yes, this information was cached between requests like the rest of Rails, except – like the rest of Rails – in your development environment.

Loading all your models is quite a speed hit – so this could be pretty painful for applications with a large number of models.

There were further workarounds added (such as the indexed_models option in config/sphinx.yml), but it became clear that this approach was far from ideal. And of course, there’s separation of concerns and skinny models and so on.

This gives some hint as to why we don’t provide the model class itself when defining indexes – because we don’t want to load our models until we absolutely have to, but we do get a reference to them. The index definition logic is provided in a block, which means it’ll only be evaluated when necessary as well.

This doesn’t get around the issue of knowing when changes to model instances occur though, so this got dealt with in two ways. Firstly: delta index settings are now an argument at the top of the index, not within the logic block:

  :user, :with => :active_record, :delta => true
) do
  # ...

And attribute updating is no longer part of the default set of features.

This means Thinking Sphinx can now know whether deltas are involved before evaluating the index definition logic block – and thus, the callbacks are lighter and smarter.

The end result is thus:

  • Thinking Sphinx only loads the index definitions when they’re needed;
  • They, in turn, only load the models and definition logic when required;
  • Each index now gets its own file;
  • Your models stay cleaner; and
  • Request times are faster.

Overall, a vast improvement.

19 Oct 2011

A Sustainable Flying Sphinx?

In which I muse about what a sustainable web service could look like – but first, the backstory:

A year ago – almost to the day – I sat in a wine bar in Sydney’s Surry Hills with Steve Hopkins. I’d been thinking about how to get Sphinx working on Heroku, and ran him through the basic idea in my head of how it could work. His first question was “So, what are you working on tomorrow, then?”

By the end of the following day, I had some idea of how it would work. Over the next few months I had a proof of concept working, hit some walls, began again, and finally got to a point where I could launch an alpha release of Flying Sphinx.

In May, Flying Sphinx became available for all Heroku users – and earlier today (five months later), I received my monthly provider payment from Heroku, with the happy news that I’m now earning enough to cover all related ongoing expenses – things like AWS for the servers, Scalarium to manage them, and Tender for support.

Now, I’m not rolling in cash, and I’m certainly not earning enough through Flying Sphinx to pay rent, let alone be in a position to drop all client work and focus on Flying Sphinx full-time. That’s cool, either of those targets would be amazing.

And of course, money isn’t the be all and end all – even though this is a business, and I certainly don’t want to run at a loss. I want Flying Sphinx to be sustainable – in that it covers not only the hosting costs, but my time as well, along with supporting the broader system around it – code, people and beyond.

But what does a sustainable web service look like, particularly beyond the standard (outmoded) financial axis?

Sustainable Time

Firstly (and selfishly), it should cover the time spent maintaining and expanding the service. Flying Sphinx doesn’t use up a huge amount of my time right at the moment, but I’m definitely keen to improve a few things (in particular, offer Sphinx 2.0.1 alongside the existing 1.10-beta installation), and there is the occasional support query to deal with.

This one’s relatively straight-forward, really – I can track all time spent on Flying Sphinx and multiply that by a decent hourly rate. If it turns out I can’t manage all the work myself, then I pay someone else to help.

It certainly doesn’t look like I’m going to need anyone helping in the near future, mind you – nor am I drowning in support requests.

Sustainable Software

Ignoring the time I spend writing code for Flying Sphinx (as that’s covered by the previous section), pretty much every other piece of software involved with the service is open source. Front and centre among these is Sphinx itself.

I certainly don’t expect to be paid for my own open source contributions, but it certainly helps when there’s some funds trickling in to help motivate dealing with support questions, fixing bugs and adding features. It can also provide a stronger base to build a community as well.

With this in mind, I’m considering setting aside a percentage of any profit for Sphinx development – as any improvements to that help make Flying Sphinx a stronger offering.

(I could also cover my time spent on Thinking Sphinx either with a percentage cut – either way it would end up in my pocket though.)

Sustainable Hardware

This is where things get a little trickier – we’re not just dealing with bits and electrons, but also silicon and metals. The human race is pretty bad at weaning itself off of limited (as opposed to renewable) resources, and the hardware industry certainly is going to hit some limits in the future as certain metals become harder to source.

Of course, the servers use a lot of energy, so one thing I will be doing is offsetting the carbon. I’ve not yet figured out the best service to do this, but will start by looking at Brighter Planet.

From a social perspective, there’s also questions about how those resources are sourced. We should be considering the working conditions of where the metals are mined (and by whom), the people who are soldering the logic boards, and those who place the finished products into racks in data centres.

As an example, let’s look at Amazon. Given the recent issues raised with the conditions for staff in their warehouses, I think it’s fair to seek clarification on the situation of their web service colleagues. And what if there were significant ethical issues for using AWS? What then for Flying Sphinx, which runs EC2 instances and is an add-on for Heroku, a business built entirely on top of Amazon’s offerings?

I could at least use servers elsewhere – but that means bandwidth between servers and Heroku apps starts to cost money – and we introduce a step of latency into the service. Neither of those things are ideal. Or I could just say that I don’t want to support Amazon at all, and shut down Flying Sphinx, remove all my Heroku apps, and find some other hosting service to use.

Am I getting a little too carried away? Perhaps, but this is all hypothetical anyway. I’m guessing Amazon’s techs are looked after decently (though I’d love some confirmation on this), and am hoping the situation improves for their warehouse staff as well.

I am still searching for answers for what truly sustainable hardware – and moreso, sustainable web services – financially, socially, environmentally, and technically. What’s your take? What have I forgotten?

24 Sep 2011

Versioning your APIs

As I developed Flying Sphinx, I found myself both writing and consuming several APIs: from Heroku to Flying Sphinx, Flying Sphinx to Heroku, the flying-sphinx gem in apps to Flying Sphinx, Flying Sphinx to Sphinx servers, and Sphinx servers to Flying Sphinx.

None of that was particularly painful – but when Josh Kalderimis was improving the flying-sphinx gem, he noted that the API it interacts with wasn’t that great. Namely, it was inconsistent with what it returned (sometimes text status messages, sometimes JSON), it was sending authentication credentials as GET/POST parameters instead of in a header, and it wasn’t versioned.

I was thinking that given I control pretty much every aspect of the service, it didn’t matter if the APIs had versions or not. However, as Josh and I worked through improvements, it became clear that the apps using older versions of the flying-sphinx gem were going to have one expectation, and newer versions another. Versioning suddenly became a much more attractive idea.

The next point of discussion was how clients should specify which version they are after. Most APIs put this in the path – here’s Twitter’s as an example, specifying version 1:

However, I’d recently been working with Scalarium’s API, and theirs put the version information in a header (again, version 1):

Accept: application/vnd.scalarium-v1+json

Some research turned up a discussion on Hacker News about best practices for APIs – and it’s argued there that using headers keeps the paths focused on just the resource, which is a more RESTful approach. It also makes for cleaner URLs, which I like as well.

How to implement this in a Rails application though? My routing ended up looking something like this:

namespace :api do
  constrants do
    scope :module => :v1 do
      resource :app do
        resources :indices

  constraints do
    scope :module => :v2
      resource :app do
        resources :indices

The ApiVersion class (which I have saved to app/lib/api_version.rb) is where we check the version header and route accordingly:

class ApiVersion
  def initialize(version)
    @version = version

  def matches?(request)
    versioned_accept_header?(request) || version_one?(request)


  def versioned_accept_header?(request)
    accept = request.headers['Accept']
    accept && accept[/application\/vnd\.flying-sphinx-v#{@version}\+json/]

  def unversioned_accept_header?(request)
    accept = request.headers['Accept']
    accept.blank? || accept[/application\/vnd\.flying-sphinx/].nil?

  def version_one?(request)
    @version == 1 && unversioned_accept_header?(request)

You’ll see that I default to version 1 if no header is supplied. This is for the older versions of the flying-sphinx gem – but if I was starting afresh, I may default to the latest version instead.

All of this gives us URLs that look like something like this:

My SSL certificate is locked to – if it was wildcarded, then I’d be using a subdomain ‘api’ instead, and clean those URLs up even further.

The controllers are namespaced according to both the path and the version – so we end up with names like Api::V2::AppsController. It does mean you get a new set of controllers for each version, but I’m okay with that (though would welcome suggestions for other approaches).

Authentication is managed by namespaced application controllers – here’s an example for version 2, where I’m using headers:

class Api::V2::ApplicationController < ApplicationController
  skip_before_filter :verify_authenticity_token
  before_filter :check_api_params

  expose(:app) { App.find_by_identifier identifier }


  def check_api_params
    # ensure the response returns with the same header value
    headers['X-Flying-Sphinx-Token'] = request.headers['X-Flying-Sphinx-Token']
    render_json_with_code 403 unless app && app.api_key == api_key

  def api_token

  def identifier
    api_token && api_token.split(':').first

  def api_key
    api_token && api_token.split(':').last

Authentication, in case it’s not clear, is done by a header named X-Flying-Sphinx-Token with a value of the account’s identifier and api_key concatenated together, separated by a colon.

(If you’re not familiar with the expose method, that’s from the excellent decent_exposure gem.)

So where does that leave us? Well, we have an elegantly namespaced API, and both versions and authentication is managed in headers instead of paths and parameters. I also made sure version 2 responses all return JSON. Josh is happy and all versions of the flying-sphinx gem are happy.

The one caveat with all of this? While it works for me, and it suits Flying Sphinx, it’s not the One True Way for API development. We had a great discussion at the most recent Rails Camp up at Lake Ainsworth about different approaches – at the end of the day, it really comes down to the complexity of your API and who it will be used by.

02 Sep 2011

Combustion - Better Rails Engine Testing

I spent a good part of last month writing my first Rails engine – although it’s not yet released and for a client, so I won’t talk about that too much here.

Very quickly in the development process, I was looking around on how to test Rails engines. It seemed that, beyond some basic unit tests, having a full Rails application within your test or spec directory was the accepted approach for integration testing.

That felt kludgy and bloated to me, so I decided to try something a little different.

The end goal was full stack testing in a clear and manageable fashion – writing specs within my spec directory, not a bundled Rails app’s spec directory. Capybara’s DSL would be nice as well.

This, of course, meant having a Rails application to test through – but it turns out you can get away without the vast majority of files that Rails generates for you. Indeed, the one file a Rails app expects is config/database.yml – and that’s only if you have ActiveRecord in play.

Enter Combustion – my minimal Rails app-as-a-gem for testing engines, with smart defaults for your standard Rails settings.

Setting It Up

A basic setup is as follows:

  • Add the gem to your gemspec or Gemfile.
  • Run the generator in your engine’s directory to get a small Rails app stub created: combust (or bundle exec combust if you’re referencing the git repository instead).
  • Add Combustion.initialize! to your spec/spec_helper.rb (currently only RSpec is supported, but shouldn’t be hard to patch for TestUnit et al).

Here’s a sample spec_helper, mixing in Capybara as well:

require 'rubygems'
require 'bundler'

Bundler.require :default, :development

require 'capybara/rspec'


require 'rspec/rails'
require 'capybara/rails'

RSpec.configure do |config|
  config.use_transactional_fixtures = true

Putting It To Work

Firstly, you’ll want to make sure you’re using your engine within the test Rails application. The generator has likely added the hooks we need for this. If you’re adding routes, then edit spec/internal/config/routes.rb. If you’re dealing with models, make sure you add the tables to spec/internal/db/schema.rb. The README covers this a bit more detail.

And then, get stuck into your specs. Here’s a really simple example:

# spec/controllers/users_controller_spec.rb
require 'spec_helper'

describe UsersController do
  describe '#new' do
    it "runs successfully" do
      get :new

      response.should be_success

Or, using Capybara for integration:

# spec/acceptance/visitors_can_sign_up_spec.rb
require 'spec_helper'

describe 'authentication process' do
  it 'allows a visitor to sign up' do
    visit '/'

    click_link 'Sign Up'
    fill_in 'Name',     :with => 'Pat Allan'
    fill_in 'Email',    :with => ''
    fill_in 'Password', :with => 'chunkybacon'
    click_button 'Sign Up'

    page.should have_content('Sign Out')

And that’s really the core of it. Write the specs you need to test your engine within the context of a full Rails application. If you need models, controllers or views in the internal application to fully test out your engine, then add them to the appropriate location within spec/internal – but only add what’s necessary.

Rack It Up

Oh, and one of my favourite little helpers is this: Combustion’s generator adds a file to your engine, which means you can fire up your test application in the browser – just run rackup and visit http://localhost:9292.


As already mentioned, Combustion is built with RSpec in mind – but I will happily accept patches for TestUnit as well. Same for Cucumber – should work in theory, but I’m yet to try it.

It’s also written for Rails 3.1 – it may work with Rails 3.0 with some patches, but I very much doubt it’ll play nicely with anything before that. Still, feel free to investigate.

And it’s possible that this could be useful for integration testing for libraries that aren’t engines. If you want to try that, I’d love to hear how it goes.

Final Notes

So, where do we stand?

  • You can test your engine within a full Rails stack, without a full Rails app.
  • You only add what you need to your Rails app stub (that lives in spec/internal).
  • Your testing code is DRYer and easier to maintain.
  • You can use standard RSpec and Capybara helpers for integration testing.
  • You can view your test application via Rack.

I’m not the first to come up with this idea – after I had finished Combustion, it was pointed out to me that Kaminari’s test suite does a similar thing (just not extracted out into a separate library). It wouldn’t surprise me if others have done the same – but in my searching, I kept coming across well-known libraries with full Rails apps in their test or spec directories.

If you think Combustion could suit your engine, please give it a spin – I’d love to have others kick the tires and ensure it works in a wider set of situations. Patches and feedback are most definitely welcome.

30 May 2011

Searching with Sphinx on Heroku

Just over two weeks ago, I released Flying Sphinx – which provides Sphinx search capability for Heroku apps. I’ll talk more about how I built it and the challenges faced at some point, but right now I just want to introduce the service and how you may go about using it.

Why Sphinx?

Perhaps you’re not familiar with Sphinx and how it can be useful. For those who are new to Sphinx, it’s a full-text search tool – think of your own personal Google for within your website. It comes with two main moving parts – the indexer tool for interpreting and storing your search data (indices), and the searchd tool, which runs as a daemon accepting search requests, and returns the most appropriate matches for a given search query.

In most situations, Sphinx is very fast at indexing your data, and connects directly to MySQL and PostgreSQL databases – so it’s quite a good fit for a lot of Rails applications.

Using Sphinx in Rails

I’ve written a gem, Thinking Sphinx, which integrates Sphinx neatly with ActiveRecord. It allows you to define indices in your models, and then use rake tasks to handle the processing of these indices, along with managing the searchd daemon.

If you want to install Sphinx, have a read through of this guide from the Thinking Sphinx documentation – in most cases it should be reasonably painless.

Installing Thinking Sphinx in a Rails 3 application is quite simple – just add the gem to your Gemfile:

gem 'thinking-sphinx', '2.0.5'

For older versions of Rails, the Thinking Sphinx docs have more details.

I’m not going to get too caught up in the details of how to structure indices – this is also covered within the Thinking Sphinx documentation – but here’s a quick example, for user account:

class User < ActiveRecord::Base
  # ...
  define_index do
    indexes name, :sortable => true
    indexes location
    has admin, created_at
  # ...

The indexes method defines fields – which are the textual data that people can search for. In this case, we’ve got the user names and locations covered. The has method is for attributes – which are used for filtering and sorting (fields can’t be used for sorting by default). The distinction of fields and attributes is quite important – make sure you understand the difference.

Now that we have our index defined, we can have Sphinx grab the required data from our database, which is done via a rake task:

rake ts:index

What Sphinx does here is grab all the required data from the database, inteprets it and stores it in a custom format. This allows Sphinx to be smarter about ranking search results and matching words within your fields.

Once that’s done, we next start up the Sphinx daemon:

rake ts:start

And now we can search! Either in script/console or in an appropriate action, just use the search method on your model: 'pat'

This returns the first page of users that match your search query. Sphinx always paginates results – though you can set the page size to be quite large if you wish – and Thinking Sphinx search results can be used by both WillPaginate and Kaminari pagination view helpers.

Instead of sorting by the most relevant matches, here’s examples where we sort by name and created_at: 'pat', :order => :name 'pat', :order => :created_at

And if we only want admin users returned in our search, we can filter on the admin attribute: 'pat', :with => {:admin => true}

There’s many more options for search calls – the documentation (yet again) covers most of them quite well.

One more thing to remember – if you change your index structures, or add/remove index defintions, then you should restart and reindex Sphinx. This can be done in a single rake task:

rake ts:rebuild

If you just want the latest data to be processed into your indices, there’s no need to restart Sphinx – a normal ts:index call is fine.

Using Thinking Sphinx with Heroku

Now that we’ve got a basic search setup working quite nicely, let’s get it sorted out on Heroku as well. Firstly, let’s add the flying-sphinx gem to our Gemfile (below our thinking-sphinx reference):

gem 'flying-sphinx', '0.5.0'

Get that change (along with your indexed model setup) deployed to Heroku, then inform Heroku you’d like to use the Flying Sphinx add-on (the entry level plan costs $12 USD per month):

heroku addons:add flying_sphinx:wooden

And finally, let’s get our data on the site indexed and the daemon running:

heroku rake fs:index
heroku rake fs:start

Note the fs prefix instead of the ts prefix in those rake calls – the normal Thinking Sphinx tasks are only useful on your local machine (or on servers that aren’t Heroku).

When you run those rake tasks, you will probably see the following output:

Sphinx cannot be found on your system. You may need to configure the
following settings in your config/sphinx.yml file:
  * bin_path
  * searchd_binary_name
  * indexer_binary_name

For more information, read the documentation:

This is because Thinking Sphinx doesn’t have access to Sphinx locally, and isn’t sure which version of Sphinx is available. To have these warnings silenced, you should add a config/sphinx.yml file to your project, with the version set for the production environment:

  version: 1.10-beta

Push that change up to Heroku, and you won’t see the warnings again.

For the more curious of you: the Sphinx daemon is located on a Flying Sphinx server, also located within the Amazon cloud (just like Heroku) to keep things fast and cheap. This is all managed by the flying-sphinx gem, though – you don’t need to worry about IP addresses or port numbers.

Also: the same rules apply with Flying Sphinx for modifying index structures or adding/removing index definitions – make sure you restart Sphinx so it’s aware of the changes:

heroku rake fs:rebuild

The final thing to note is that you’ll want the data in your Sphinx indices updated regularly – perhaps every day or every hour. This is best done on Heroku via their Cron add-on – since that’s just a rake task as well.

If you don’t have a cron task already, the following (perhaps in lib/tasks/cron.rake) will do the job:

desc 'Have cron index the Sphinx search indices'
task :cron => 'fs:index'

Otherwise, maybe something more like the following suits:

desc 'Have cron index the Sphinx search indices'
task :cron => 'fs:index' do
  # Other things to do when Cron comes calling

If you’d like your search data to have your latest changes, then I recommend you read up on delta indexing – both for Thinking Sphinx and for Flying Sphinx.

Further Sources

Keep in mind this is just an introduction – the documentation for Thinking Sphinx is pretty good, and Flying Sphinx is improving regularly. There’s also the Thinking Sphinx google group and the Flying Sphinx support site if you have questions about either, along with numerous blog posts (though the older they are, the more likely they’ll be out of date). And finally – I’m always happy to answer questions about this, so don’t hesitate to get in touch.

12 Mar 2010

Using Thinking Sphinx with Cucumber

While I highly recommend you stub out your search requests in controller unit tests/specs, I also recommend you give your full stack a work-out when running search scenarios in Cucumber.

This has gotten a whole lot easier with the ThinkingSphinx::Test class and the integrated Cucumber support, but it’s still not perfect, mainly because generally everyone (correctly) keeps their database changes within a transaction. Sphinx talks to your database outside Rails’ context, and so can’t see anything, unless you turn these transactions off.

It’s not hard to turn transactions off in your features/support/env.rb file:

Cucumber::Rails::World.use_transactional_fixtures = false

But this makes Cucumber tests far more fragile, because either each scenario can’t conflict with each other, or the database needs to be cleaned before and after each scenario is run.

Pretty soon after I added the inital documentation for this, a few expert Cucumber users pointed out that you can flag certain feature files to be run without transactional fixtures, and the rest use the default:

Feature: Searching
  In order to find things as easily as possible
  As a user
  I want to search across all data on the site

This is a good step in the right direction, but it’s not perfect – you’ll still need to clean up the database. Writing steps to do that is easy enough:

Given /^a clean slate$/ do
  Object.subclasses_of(ActiveRecord::Base).each do |model|
    next unless model.table_exists?
    model.connection.execute "TRUNCATE TABLE `#{model.table_name}`"

(You can also use Database Cleaner, as noted by Thilo in the comments).

But adding that to the start and end of every single scenario isn’t particularly DRY.

Thankfully, there’s Before and After hooks in Cucumber, and they can be limited to scenarios marked with certain tags. Now we’re getting somewhere!

Before('@no-txn') do
  Given 'a clean slate'

After('@no-txn') do
  Given 'a clean slate'

And here’s a bonus step, to make indexing data a little easier:

Given /^the (\w+) indexes are processed$/ do |model|
  model = model.titleize.gsub(/\s/, '').constantize
  ThinkingSphinx::Test.index *model.sphinx_index_names

So, how do things look now? Well, you can write your features normally – just flag them with no-txn, and your database will be cleaned up both before and after each scenario.

My current preferred approach is adding a file named features/support/sphinx.rb, containing this code:

require 'cucumber/thinking_sphinx/external_world'

Before('@no-txn') do
  Given 'a clean slate'

After('@no-txn') do
  Given 'a clean slate'

And I put the step definitions in either features/step_definitions/common_steps.rb or features/step_definitions/search_steps.rb.

So, now you have no excuse to not use Thinking Sphinx with your Cucumber suite. Get testing!

03 Jan 2010

A Month in the Life of Thinking Sphinx

It’s just over two months since I asked for – and received – support from the Ruby community to work on Thinking Sphinx for a month. A review of this would be a good idea, hey?

I’m going to write a separate blog post about how it all worked out, but here’s a long overview of the new features.

Internal Cucumber Cleanup

This one’s purely internal, but it’s worth knowing about.

Thinking Sphinx has a growing set of Cucumber features to test behaviour with a live Sphinx daemon. This has made the code far more reliable, but there was a lot of hackery to get it all working. I’ve cleaned this up considerably, and it is now re-usable for other gems that extend Thinking Sphinx.

External Delta Gems

Of course, it was my own re-use that was driving that need: I wanted to use it in gems for the delayed job and datetime delta approaches.

There was a clear need for removing these two pieces of functionality from Thinking Sphinx: to keep the main library as slim as possible, and to make better use of gem dependencies, allowing people to use whichever version of delayed job they like.

So, if you’ve not upgraded in a while, it’s worth re-reading the delta page of the documentation, which covers the new setup pretty well.

Testing Helpers

Internal testing is all very well, but what’s much more useful for everyone using Thinking Sphinx is the new testing class. This provides a clean, simple interface for processing indexes and starting the Sphinx daemon.

There’s also a Cucumber world that simplifies things even further – automatically starting and stopping Sphinx when your features are run. I’ve been using this myself in a project over the last few days, and I’m figuring out a neat workflow. More details soon, but in the meantime, have a read through the documentation.

No Vendored Code for Gems

One of the uglier parts of Thinking Sphinx is the fact that it vendors Riddle and AfterCommit (and for a while, Delayed Job), two essential libraries. This is not ideal at all, particularly when gem dependencies can manage this for you.

So, Thinking Sphinx no longer vendors these libraries if you install it as a gem – instead, the riddle and after_commit gems will get brought along for the ride.

The one catch is that they’re still vendored for plugin installations. I recommend people use Thinking Sphinx as a gem, but there are valid reasons for going down the plugin path.

Default Sphinx Scopes

Thanks to some hard work by Joost Hietbrink of the Netherlands, Thinking Sphinx now supports default sphinx scopes. All I had to do was merge this in – Joost was the first contributor to Thinking Sphinx (and there’s now over 100!), so he knows the code pretty well.

In lieu of any real documentation, here’s a quick sample – define a scope normally, and then set it as the default:

class Article < ActiveRecord::Base
  # ...
  sphinx_scope(:by_date) {
    {:order => :created_at_}
  default_sphinx_scope :by_date
  # ...

Thread Safety

I’ve made some changes to improve the thread safety of Thinking Sphinx. It’s not perfect, but I think all critical areas are covered. Most of the dynamic behaviour occurs when the environment is initialised anyway.

That said, I’m anything but an expert in this area, so consider this a tentative feature.

Sphinx Select Option

Another community-sourced patch – this time from Andrei Bocan in Romania: if you’re using Sphinx 0.9.9, you can make use of its custom select statements: 'pancakes',
  :sphinx_select => '*, @weight + karma AS superkarma'

This is much like the :select option in ActiveRecord – but make sure you use :sphinx_select (as the former gets passed through to ActiveRecord’s find calls).

Multiple Index Support

You can now have more than one index in a model. I don’t see this as being a widely needed feature, but there’s definitely times when it comes in handy (such as having one index with stemming, and one without). The one thing to note is that all indexes after the first one need explicit names:

define_index 'stemmed' do
  # ...

You can then specify explicit indexes when searching: 'pancakes',
  :index => 'stemmed_core' 'pancakes',
  :index => 'article_core,stemmed_core'

Don’t forget that the default index name is the model’s name in lowercase and underscores. All indexes are prefixed with _core, and if you’ve enabled deltas, then a matching index with the _delta suffix exists as well.

Building on from this, you can also now have indexes on STI subclasses when superclasses are already indexed.

While the commits to this feature are mine, I was reading code from a patch by Jonas von Andrian – so he’s the person to thank, not me.

Lazy Initialisation

Thinking Sphinx needs to know which models have indexes for searching and indexing – and so it would load every single model when the environment is initialised, just to figure this out. While this was necessary, it also is slow for applications with more than a handful of models… and in development mode, this hit happens on every single page load.

Now, though, Thinking Sphinx only runs this load request when you’re searching or indexing. While this doesn’t make a difference in production environments, it should make life on your workstations a little happier.

Lazy Index Definition

In a similar vein, anything within the define_index block is now evaluated when it’s needed. This means you can have it anywhere in your model files, whereas before, it had to appear after association definitions, else Thinking Sphinx would complain that they didn’t exist.

This feature actually introduced a fair few bugs, but (thanks to some patience from early adopters), it now runs smoothly. And if it doesn’t, you know where to find me.

Sphinx Auto-Version detection

Over the course of the month, Thinking Sphinx and Riddle went through some changes as to how they’d be required (depending on your version of Sphinx). First, there was separate gems for 0.9.8 and 0.9.9, and then single gems with different require statements. Neither of these approaches were ideal, which Ben Schwarz clarified for me.

So I spent a day or two working on a solution, and now Thinking Sphinx will automatically detect which version you have installed. You don’t need any version numbers in your require statements.

The one catch with this is that you currently need Sphinx installed on every machine that needs to know about it, including web servers that talk to Sphinx on a separate server. There’s an issue logged for this, and I’ll be figuring out a solution soon.

Sphinx 0.9.9

This isn’t quite a Thinking Sphinx feature, but it’s worth noting that Sphinx 0.9.9 final release is now available. If you’re upgrading (which should be painless), the one thing to note is that the default port for Sphinx has changed from 3312 to 9312.


If you want to grab the latest and greatest Thinking Sphinx, then version 1.3.14 is what to install. And read the documentation on upgrading!

28 Oct 2009

Funding Thinking Sphinx

Update: I’ve now hit my target. If you want to donate more, I won’t turn you away, but perhaps you should send those funds to other worthy open source projects, or a local charity. A massive thank you to all who have pitched in to the pledgie, your generosity and support is amazing.

Over the past two years, Thinking Sphinx has grown massively – in lines of code, in the numbers of users, in complexity, in time required to support it. I’m regularly amazed and touched by the recommendations I see on Twitter, and the feedback I get in conversations. The fact that there’s been almost one hundred contributors is staggering.

It’s not all fun and games, though… there’s still plenty of features that can be added, and bugs to be fixed, and documentation to write. So, what I’d really like to do is spend November working close to full-time on just Thinking Sphinx. I have a long task list. All I need is a bit of financial help to cover living expenses.

I have an existing pledgie tied to the GitHub project, currently sitting on $600. If I can get another $2000, then I won’t have to worry at all about how I’m going to pay bills or rent for November. Even $1400 will make it viable for me, albeit maybe with some help from my savings.

If you or your workplace can make a donation, that would be very much appreciated. I’m happy to provide weekly updates on where things are at if people request it – but of course, watching the GitHub projects for Thinking Sphinx itself and the documentation site is the most reliable way to keep an eye on my progress.

I’m hoping to get Thinking Sphinx to a point where the documentation is by far the best place for support, and it’s only the really tricky problems (and bug reports) that end up in my inbox.

I want it to be a model Ruby library that doesn’t get in your way, is as fast as possible, and plays nicely with other libraries.

I want the testing suite to be rock-solid. I’ve been much better at writing tests first over the last six months, and using Cucumber has made the test suite so much more reliable, but there’s still some way to go.

This is not a rewrite – it’s polishing.

I’ve been toying with this idea for a while, and it’s time to have a stab at it. Hopefully you can provide some assistance to do this.

27 Sep 2009


This morning I decided to get Nginx and Passenger set up in my local dev environment. I needed an easier way to test of Thinking Sphinx in such environments, but also, I find Nginx configuration syntax so much easier than Apache.

And of course, if I’ve got these components there, it would be great to use them to serve my development versions of rails applications, much like script/server. So I’ve got a script/nginx file that manages that as well. Sit tight, and let’s run through how to make this happen on your machine.

Be Prepared to Think

Firstly, a couple of notes on my development machine – I’m running Snow Leopard, and I compile libraries by source. No MacPorts, no custom versions of Ruby (yet). So, you may need to tweak the instructions to fit your own setup.

Installing Passenger

Before we get to Nginx, you’ll want the Passenger gem installed first.

sudo gem install passenger

You’ll also need to compile Passenger’s nginx module (keep an eye on the file path below – yours may be different):

cd /Library/Ruby/Gems/1.8/gems/passenger-2.2.5/ext/nginx
sudo rake nginx

Installing Nginx

Nginx requires the PCRE library, so that adds an extra step, but it’s nothing too complex. Jump into Terminal or your shell application of choice, create a directory to hold all the source files, and step through the following commands (initially sourced from instructions by Wincent Colaiuta):

curl -O \
tar xjvf pcre-7.9.tar.bz2
cd pcre-7.9
make check
sudo make install

That should be PCRE taken care of – I didn’t have any issues on my machine, hopefully it’s the same for you. Next up: Nginx itself. Grab the source:

curl -O \
tar zxvf nginx-0.7.62.tar.gz
cd nginx-0.7.62

Let’s pause for a second before we configure things.

Even though the focus is having Nginx working in a local user setting, not system-wide, I wanted the default file locations to be something approaching Unix/OS X standards, so I’ve gone a bit crazy with configuration flags. You may want to alter them to your own personal tastes:

./configure \
  --prefix=/usr/local/nginx \
  --add-module=/Library/Ruby/Gems/1.8/gems/passenger-2.2.5/ext/nginx \
  --with-http_ssl_module \
  --with-pcre \
  --sbin-path=/usr/sbin/nginx \
  --conf-path=/etc/nginx/nginx.conf \
  --pid-path=/var/nginx/ \
  --lock-path=/var/nginx/nginx.lock \
  --error-log-path=/var/nginx/error.log \

And with that slightly painful step out of the way, let’s compile and install:

sudo make install

And just to test that Nginx is happy, run the following command:

nginx -v

Do you see the version details? Great! (If you don’t, then review the last couple of steps – did anything go wrong? Do you have the passenger module path correct?)

Configuring for a Rails App

The penultimate section – let’s create a simple configuration file for Rails applications, which can be used by our script/nginx file. I store mine at /etc/nginx/rails.conf, but you can put yours wherever you like.

daemon off;

events {
  worker_connections  1024;

http {
  include /etc/nginx/mime.types;
  # Assuming path has been set to a Rails application
  access_log            log/nginx.access.log;
  client_body_temp_path tmp/nginx.client_body_temp;
  fastcgi_temp_path     tmp/nginx.client_body_temp;
  proxy_temp_path       tmp/nginx.proxy_temp;
  passenger_root /Library/Ruby/Gems/1.8/gems/passenger-2.2.5;
  passenger_ruby /usr/bin/ruby;
  server {
    listen      3000;
    server_name localhost;
    root              public;
    passenger_enabled on;
    rails_env         development;


The final piece of the puzzle – the script/nginx file, for the Rails app of your choice:

#!/usr/bin/env bash
nginx -p `pwd`/ -c /etc/nginx/rails.conf \
  -g "error_log `pwd`/log/nginx.error.log; pid `pwd`/log/;";

Don’t forget to make it executable:

chmod +x script/nginx

If you run the script right now, you’ll see a warning that Nginx can’t write to the global error log, but that’s okay. Even with that message, it uses a local error log. I’ve granted full access to the global log just to avoid the message, but if you know a Better Way, I’d love to hear it.

sudo chmod 666 /var/nginx/error.log

Head on over to localhost:3000 – and, after Passenger’s warmed up, your Rails app should load. Success!

Known Limitations

  • The environment is hard-coded to development. If this is annoying, the easiest way around it is to create multiple versions of rails.conf, one per environment, and then use the appropriate one in your script/nginx file.
  • You can’t specify a custom port either. Patches welcome.
  • You won’t see the log output. Either tail log/development.log when necessary, or suggest a patch for script/nginx. I’d prefer the latter.

Beyond that, it should work smoothly. If I’m wrong, that’s what the comments form is for.

Also, you can find all of my config files, as well as other details of how I’ve set up my machine since installing Snow Leopard, on

14 Jul 2009

Rails Camps - Coming to a Country Near You

This weekend, there’s going to be a Rails Camp. In October, there’s going to be a Rails Camp. Then in November, there’s going to be a Rails Camp. That in itself is pretty freaking cool. What’s even cooler is that they’re in Maine, England and Australia respectively.


If you’re not quite sure what Rails Camps are – they’re unconference style events, held away from cities, generally without internet, on a weekend from Friday to Monday. The venues are usually scout halls or similar, so the name is slightly inaccurate – most people don’t bring tents, but sleep in dorm rooms instead.

Getting Down to Business

Also, they are events for Rubyists of all level of experience – and not just focused on Rails either. Anything related to Ruby and development in general is a welcome topic for discussion.

Communal Hacking

The weekends are made up of plenty of hacking, socialising, talks, and partying. Alcohol and guitar hero usually feature. A ton of fun ensues.

Making Pizzas

Rails Camp New England

A quick rundown in chronological order: first up, from the 17th to 20th of July, is Rails Camp New England. This will (as far as I know) be the first Rails Camp in North America. We’ll be up in the middle of Maine, at the MountainView House (a bit different from most Rails Camp venues) in Bryant Pond.

Unfortunately, if you want to come to this camp, we’re all sold out. Let me know anyway, just in case someone drops out (although it is late notice).

Rails Camp UK 2

Building on the success of last year’s first UK Rails Camp, a second one has been put together by Tom Crinson out in Margate, Kent.


If you’re anywhere in the UK, or even Europe, you really should be keeping the weekend of the 16th to 19th of October free. In fact, go book your spot right now.

Rails Camp Australia 6

Last on this list is the original Rails Camp, that started back in June 2007, run by the inimitable Ben Askins. We’re returning to Melbourne (the host of the second camp, in November 2007), but this time we’re down by the beach in Somers.

John showing us how it's done

November 20th to 23rd are the dates for this, and going by the names of confirmed attendees, alongside what looks to be an fantastic venue, it’s going to rock just as much as the last five (and quite possibly even more). Feel like booking your place?

For all of these events, you should beg, borrow or steal to get your hands on a ticket. The energy, intelligence and passion of past camps has been amazing (which is why I do my best to spread the word), and they are a breath of fresh air compared to the staid and structured setup of RailsConf and most other technical conferences.

Thanks to John Barton, Max Muermann, and Jason Crane for the photos above.

22 Apr 2009

A Visit to Vegas

In case you’ve not booked your ticket yet, but were considering coming and needed one more reason: I’ll be giving a tutorial on Sphinx at RailsConf in Las Vegas next month. To get 15% discount, use the code ‘RC09FOS’. (O’Reilly seem to hand out codes everywhere. If you paid full price this time, keep that in mind for next year.)

If you will be there, I’m more than happy to have a chat – about Sphinx or anything else – so let me know if you want to meet up.

Crowdsourced Research

For those of you who know Sphinx and Thinking Sphinx already, I’d love to hear about the things you found a bit difficult when learning. What are the topics I should make sure I cover in my tutorial, that are maybe lacking in documentation?

12 Mar 2009

Link: How-To Setup a Linux Server for Ruby on Rails - with Phusion Passenger and GitHub - Hack'd

Can skip some of this thanks to Sprinkle, but it's a useful reference nonetheless.

09 Jan 2009

Link: Thinking Sphinx in Arabic/Unicode | ExpressionLab

"here is what to do to support Arabic (Unicode) search."

06 Jan 2009

Thinking Sphinx Delta Changes

There’s been a bit of changes under the hood with Thinking Sphinx lately, and some of the more recent commits are pretty useful.

Small Stuff

First off, something neat but minor – you can now use decimal, date and timestamp columns as attributes – the plugin automatically maps those to float and datetime types as needed.

There’s also now a cucumber-driven set of feature tests, which can run on MySQL and PostgreSQL. While that’s not important to most users, it makes it much less likely that I’ll break things. It’s also useful for the numerous contributors – just over 50 people as of this week! You all rock!

New Delta Possibilities

The major changes are around delta indexing, though. As well as the default delta column approach, there’s now two other methods of getting your changes into Sphinx. The first, requested by some Ultrasphinx users, and heavily influenced by a fork by Ed Hickey, is datetime-driven deltas. You can use a datetime column (the default is updated_at), and then run the thinking_sphinx:index:delta rake task on a regular basis to load recent changes into Sphinx.

Your define_index block would look something like the following:

define_index do
  # ... field and attribute definitions
  set_property :delta => :datetime, :threshold =>

If you want to use a column other than updated_at, set it with the :delta_column option.

The above situation is if you’re running the rake task once a day. The more often you run it, the lower you can set your threshold. This is a bit different to the normal delta approach, as changes will not appear in search results straight away – only whenever the rake task is run.

Delayed Reaction

One of the biggest complaints with the default delta structure is that it didn’t scale. Your delta index got larger and larger every time records were updated, and that meant each change got slower and slower, because the indexing time increased. When running multiple servers, you could get a few indexer processes running at once. That ain’t good.

So now, we have delayed deltas, using the delayed_job plugin. You’ll need to have the job queue being processed (via the thinking_sphinx:delayed_delta rake task), but everything is pushed off into that, instead of overloading your web server. It means the changes take slightly longer to get into Sphinx, but that’s almost certainly not going to be a problem.

Firstly, you’ll need to create the delayed_jobs table (see the delayed_job readme for example code), and then change your define_index block so it looks something like this:

define_index do
  # ... field and attribute definitions
  set_property :delta => :delayed

Riddle Update

As part of the restructuring over the last couple of months, I’ve also added some additional code to Riddle, my Ruby API for Sphinx. It now has objects to represent all of the configuration elements of Sphinx (ie: settings for sources, indexes, indexer and searchd), and can generate the configuration file for you. This means you don’t need to worry about doing text manipulation, just do everything in neat, clean Ruby.

Documentation on this is non-existent, mind you, but the source shouldn’t be too hard to grok. I also need to update Thinking Sphinx’s documentation to cover the delta changes – for now, this blog post will have to do. If you get stuck, check out the Google Group.

Sphinx 0.9.9

One more thing: Thinking Sphinx and Riddle now both have Sphinx 0.9.9 branches – not merged into master, as most people are still using Sphinx 0.9.8, but you can find both code sets on GitHub.

30 Dec 2008

Freelancing Tips via Rails Camp 4


The fourth Australian Rails Camp happened back in the middle of November – and it was unsurprisingly and extremely enjoyably awesome, just like the previous four. Ryan and Anthony did a sterling job with putting it all together.

I probably talked a bit too much – I certainly felt I had more than my fair share of peoples’ focus – and while I rabbited on about Sphinx and Ginger, the topic I really enjoyed ranting about was freelancing, because it became far less about me, and far more about sharing the wealth of everybody’s experiences. I provided a few starting points, and then wise RORO minds added their own thoughts and opinions.

I can’t reproduce all that here, though. I wouldn’t do it justice. What I can do is go over the same notes I had then, and you can add your 2 cents (or five dollars) in the comments.

Freelancing Maths

One of the first things you need to be aware of, when you start freelancing, is how much to charge. I didn’t have a clue, but some more business-minded friends put me on the right track, so I’m sharing their advice here – don’t give me any credit for it.

So, let’s assume you want to start freelancing, and you have a target of earning $80,000 over the year (yes, some of you may say that’s too low, but others will say it’s too high – it’s just an example, okay?). You can use this as a basis for figuring out an hourly rate. There’s 52 weeks in a year, 5 days in a week, and 8 hours in a day…

 52 weeks
x 5  days
x 8 hours
x ?  rate

But wait a second – are you really going to work all of those 52 weeks? I doubt it. You’ll need time off for annual leave, sick leave and public holidays – the times when an employer would still pay you when you’re not slaving away. Australian annual leave is four weeks, sick leave is usually two, let’s add in another one for public holidays, and that brings us down to 45.

 45 weeks
x 5  days
x 8 hours
x ?  rate

What are the odds you’re going to have work all the time though, and are you really going to have eight billable hours each day? Unless you’re some sort of machine, the answer’s no, trust me. So lets drop eight down to six.

     45 weeks
x     5  days
x     6 hours
x 59.25  rate

One thing we’ve missed in our calculations is superannuation. Again, using Australia as the example (because it’s all I can reliably comment on), you’re supposed to be putting away 9% of your income into your super account. Let’s factor that in:

     45 weeks
x     5  days
x     6 hours
x 64.59  rate

Okay, so we can get an hourly rate of about $65 from that maths. And that could be fine… but maybe you’ve been eyeing off RailsConf or RubyConf or other such events. They’re not cheap – and hopefully employers would normally fork out the cash to get you there. You’re the employer now, so how are you going to afford it? Add an allowance into your calculations.

Again, due to the remoteness of Australia, it’s extra expensive to get to any of the major Ruby conferences. If we assume you’ll get to two of them (again, could be extravagant for some of you, but this is all hypothetical), then I’m adding a touch over $12,000 – flights, hotels, insurance, the conference tickets – to bring us to a nice round $100,000 target.

Also, I’ve dropped the number of weeks down another two – it’s not like you’ll be getting anything done for your clients as you jet around.

     43 weeks
x     5  days
x     6 hours
x 77.52  rate

Okay, our final hourly rate is about $77.50.

I know a lot of the more experienced developers are looking at that value and thinking it’s pretty low – and going by market rates (for Ruby developers), it’s definitely below average. Some say a good ballpark figure for a decent Rails developer is $100/hour – USD or AUD (remember when the two currencies were almost on par?). This doesn’t mean you should charge that much (or that little) – but it should factor into your thinking.

All that said, you need to be comfortable with what you’re billing your time at, but don’t be afraid to charge what you’re worth. If the idea of having more cash than you expect scares you, there’s plenty of charities who would like to be your friend. Or, you could just work less, and spend the extra time on cool things (and they don’t even have to involve code!)

Freelancing Profile

Knowing what to charge is useful, but it’s not going to bring in the clients. Being known will help that problem, though – and there’s a few things you can do to help that.


Interpret how you will – a normal blog, twitter, tumblelog, even gists and pasties – sharing your ideas and knowledge is a great way to get your name known to others. It also helps build some human connections, via comments, emails or directed tweets. If it is valuable, they will find you (and if you think they need help, use a site like RubyFlow or RubyCorner to bring in some eyeballs).


If there’s a neat bit of code you’ve found, library you’ve come across (or written), or knowledge you think is valuable to others, offer to talk about it! It can be at your local Ruby group, or at something like a Rails Camp or BarCamp, or if you’re really comfortable up on stage, think about applying for a RailsConf or RubyConf slot.

I’m not a natural public speaker – but my confidence has grown in leaps and bounds from giving talks to fellow developers. Granted, I need to build up a bigger repertoire of topics, but I’m a bit less nervous about standing up and announcing my thoughts and opinions to others. It all started with an email from Tim Lucas asking what I was going to talk about at the first Rails Camp – and now Rails Camp folks are probably sick of hearing my voice.

They know who I am, though, and they know what code I’ve written. And that’s led to a referral or two for Rails work (usually Sphinx-related).


Networking is a dirty word – and I can see how building connections with others for the purpose of connections, instead of meeting cool people, is a bit dirty. The much more fun alternative is to socialise – go out to social events, find those drinks happening in the evenings of conferences, have a conversation with a person you’ve not met before at your local Ruby meet.

Down the track, you will find these people may throw work your way – or maybe you’ll just learn cool new ways to code, or share some of your own knowledge, or make a good friend. All chalked up as wins in my book.

Release Code

Releasing your own code – from snippets to plug ins to full-blown applications – is a great way to show peers that you know what you’re talking about. It also shows potential clients that too, and reaffirms that you’re worth the rate you’re charging, and that you can be creative.

In my own case, I’ve done the occasional bit of Sphinx consulting due to my work on Thinking Sphinx.

Coincidentally, doing all these things are rewarding in and of themselves. I don’t do them to bring in work, I do them because they’re fun and I meet awesome people, which is (I think) the best approach. The opportunities they lead to are just an added bonus.

Your Turn

So, what’s your advice to a budding freelancer? Is there anything here that’s a bit Ruby or developer-centric? Any more general suggestions to keep in mind?

Also, please keep in mind I’m not an expert. I think the above advice is useful, but it is just advice. There’s no hard and fast rules that should be followed.

And the name of this blog has nothing to do with my work lifestyle, but the idea of deities who freelance for each other. Don’t take it as an indication of my ego. Honest.

RssSubscribe to the RSS feed

About Freelancing Gods

Freelancing Gods is written by , who works on the web as a web developer in Melbourne, Australia, specialising in Ruby on Rails.

In case you're wondering what the likely content here will be about (besides code), keep in mind that Pat is passionate about the internet, music, politics, comedy, bringing people together, and making a difference. And pancakes.

His ego isn't as bad as you may think. Honest.

Here's more than you ever wanted to know.

Ruby on Rails Projects

Other Sites

Creative Commons Logo All original content on this site is available through a Creative Commons by-nc-sa licence.