SSL Client Authentication with Python and pycurl

Posted Thursday, April 14th at 7:51a.m.

If you are using Python to make requests over SSL, then you have most likely run into the limitations of urllib2 when talking to a web service that requires client authentication. I found some recipes that "kind of" solved the problem but didn't bother to check server certificates, which rather defeats the point if you are using two-way SSL to prevent man-in-the-middle attacks.

It was frustrating since most simple things are relatively simple to do in Python. In cURL, two-way SSL verification can be as easy as:

curl --cacert CA.crt -E client.pem https://myservice.com:443/secure/page

Obviously here we are doing verification with a Certificate Authority we created ourselves. When I vented to a colleague about the difficulty of implementing similar functionality in Python, he wondered why I hadn't bothered using pycurl. pycurl is a thin binding over the libcurl C API, its documentation is not the greatest, and the examples don't explain how to do plain one-way SSL verification, never mind two-way. But after digging around the source, here is one recipe that works:

import pycurl

class Response(object):
    """ utility class to collect the response """
    def __init__(self):
        self.chunks = []
    def callback(self, chunk):
        self.chunks.append(chunk)
    def content(self):
        return ''.join(self.chunks)

res = Response()

curl = pycurl.Curl()
curl.setopt(curl.URL, "https://yourservice/path")
curl.setopt(curl.WRITEFUNCTION, res.callback)
# verify the server's certificate against our own CA
curl.setopt(curl.CAINFO, "/path/to/CA.crt")
# present our client certificate (with its key) for two-way SSL
curl.setopt(curl.SSLCERT, "/path/to/client.pem")
curl.perform()

print res.content()
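As an aside, if you ever run this recipe on Python 3, pycurl hands the WRITEFUNCTION callback bytes rather than strings, so the collector needs to join and decode bytes. Here is a sketch of a bytes-safe version of the same Response class (the utf-8 default is my assumption; use whatever charset your service actually returns):

```python
class Response(object):
    """Collect response body chunks; on Python 3 pycurl delivers bytes."""
    def __init__(self):
        self.chunks = []

    def callback(self, chunk):
        # pycurl invokes this once per chunk of the response body
        self.chunks.append(chunk)

    def content(self, encoding="utf-8"):
        # join the raw bytes, then decode once at the end
        return b"".join(self.chunks).decode(encoding)

# exercising the collector without any network traffic
r = Response()
r.callback(b"hello, ")
r.callback(b"world")
```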

And that's it. One other option you might use when testing your SSL implementation with curl is the -k flag, which skips verification of the server's certificate altogether. To do the same in pycurl, disable both peer and host verification:

curl.setopt(curl.SSL_VERIFYPEER, 0)
curl.setopt(curl.SSL_VERIFYHOST, 0)

Hopefully this saved you digging around the pycurl mailing list and source code like I had to.

Filed under Work, Python - comment

Looking forward to Berlin Buzzwords

Posted Thursday, May 13th at 9:14a.m.

Just a quick post to point out that a really exciting high-scalability conference is coming to Berlin in June. Berlin Buzzwords is due to take place on June 7th and 8th and features a variety of talks focusing on free open source software solutions to scalability problems.

It is particularly interesting for me because for the last few months I've been experiencing first hand just how many open source technologies have stepped up to the plate in the last few years to make the task of creating distributed and scalable systems accessible, even to mere mortals like myself. 

Some of the technologies that will be covered are:

1) Apache Hadoop - an open source implementation of Google's MapReduce technology. There is almost an entire day of talks dedicated to showcasing the kinds of problems Hadoop is a perfect fit for.

2) Apache Lucene and Solr - Lucene and Solr are open source search engines built for enterprise-level performance. Solr in particular gives you a whole bunch of distribution and scalability features out of the box and also lets you play with Lucene without needing to get your hands dirty with Java. In the schedule there are talks on doing geosearch and how Lucene aims to make fuzzy searching scalable in the future.

3) NoSQL technologies - there are talks on a variety of non-relational stores such as Apache Cassandra, CouchDB and MongoDB, and on how to use these technologies for real problems rather than simply drinking the kool-aid.

The full schedule is online, so if you're in the Berlin area or are just interested in building scalable systems, check out the Berlin Buzzwords ticket page. I will be attending the conference with some of my colleagues at Nokia so if you're coming through for the conference and fancy having a coffee or beer, feel free to drop me a message at david<at>dmclaughlin.com or just send me a message on twitter.

Filed under Work, Scalability, Conferences - 2 comments

Chained accessors in Moose

Posted Friday, May 15th at 4:02a.m.

I've created my second repository on Github - MooseX::ChainedAccessors. The code is extremely simple, but very powerful: it provides a Moose attribute trait that allows you to method chain via write operations on your accessor methods. If you already know what this means and just want to grab the code, you can do so from the github download page (I'm still waiting on my CPAN account being set up before I can upload it there).

Method chaining

Method chaining is a simple yet powerful way of creating concise APIs for objects which have lots of small, individual operations where we don't need to know about the return value. One of the most popular examples of method chaining in action is in the jQuery API:

var $mydiv = $('#my-div');
$mydiv.html('hello, world!');
$mydiv.slideUp();

// becomes...

$('#my-div').html('hello, world!').slideUp();

To achieve this, a new jQuery object is created and returned by the selector query and then each subsequent method call returns that same object. For example, a crude version of html might look like:

function html(html_as_str)
{
    this.innerHTML = html_as_str;
    return this;
}

And that one line - return this - is all there is to it. Regardless of the language, OO method chaining is achieved in the same way: in PHP it would be return $this, in Python return self, in Perl return $self, etc.
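To make that concrete, here is the same pattern in Python; the Element class and its methods are invented purely for illustration:

```python
class Element(object):
    """A toy DOM-like element demonstrating method chaining."""
    def __init__(self):
        self.inner_html = ""
        self.visible = True

    def html(self, html_as_str):
        self.inner_html = html_as_str
        return self  # returning self is the whole trick

    def slide_up(self):
        self.visible = False
        return self

# jQuery-style chained calls
el = Element().html("hello, world!").slide_up()
```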

Accessor chaining

If method chaining is as simple as that, then adding chaining to accessors (methods which abstract read/write operations on class attributes) will be as simple as adding return $self to the end of write accessors. This is easy if you're manually writing your accessors, but Moose handles accessor creation for you using the has sugar subroutine. If you want method chaining on accessors, you'll need to write those accessors yourself - or extend Moose.
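For comparison, a chained write accessor hand-rolled in Python might look like the sketch below; a sentinel distinguishes reads from writes, and all names here are mine, not Moose's:

```python
_UNSET = object()  # sentinel: lets us tell "read" apart from "write None"

class Debuggable(object):
    def __init__(self):
        self._debug = False

    def debug(self, value=_UNSET):
        if value is _UNSET:
            return self._debug  # read accessor: return the value
        self._debug = bool(value)
        return self             # write accessor: return self to keep chaining

obj = Debuggable()
chained = obj.debug(True)  # a write returns the object itself
```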

Why go to all the trouble? Well, the driving case for me was that I had created a Moose role which allowed me to display debug info on a per-instance basis. A simplified version of this Role is:

package MyApp::Roles::Debug;
use Moose::Role;

has 'debug' => (
    is      => 'rw',
    isa     => 'Bool',
    default => sub { return 0; },
);

sub debug_message
{
    my ($self, $message) = @_;
    print $message . "\n" if $self->debug;
}

1;

This allowed me to attach my role to any complex class which required debugging when tests failed or strange things happened and I wanted an overview of what was happening. Typical usage was:

package MyApp::Model;
use Moose;

with 'MyApp::Roles::Debug';

sub complex_method
{
    my $self = shift;

    # complex stuff in here.. lots of potential to go wrong
    $self->debug_message("Here is some info about the current state");
    # more complex stuff
    return;
}


my $model = MyApp::Model->new(debug => 1);
$model->complex_method();

This worked fine until I started using the role with some of the APIs I had designed to take full advantage of method chaining and composition. In a previous post I showed some code from the Sugar::ORM project I'm currently working on; its query interface (heavily based on the Django ORM) looks like this:

my @model_objs = Model->query->filter(name__like => 'Example%');

The call to Model->query returns a Sugar::ORM::ResultSet instance which has a bunch of methods like filter, exclude, order_by and load_related which return $self unless called in list context. This allows us to build up complex filters based on criteria (like a query string from a GET request) or apply sorting and pagination all in one nice line:

my @bands = Band->query->filter(genre__name => 'Rock')->order_by(name => 'ASC');

# OR

my $query = Band->query->filter(events__date__gte => $now);
if(my @genres = CGI->param('genre'))
{
    $query->filter(genre__id__in => [@genres]);
}
if(my $location = CGI->param('location'))
{
    $query->filter(location__name => $location);
}
if(my $order_by = CGI->param('sort'))
{
    $query->order_by($order_by => 'ASC');
}
my @bands = $query->limit(10);
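The same chainable result-set idea translates directly to Python; below is a minimal in-memory sketch (Sugar::ORM itself is Perl, and every name in this snippet is invented for illustration):

```python
class ResultSet(object):
    """A toy chainable query over a list of dicts."""
    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **criteria):
        # keep only rows matching every criterion, then return self
        self.rows = [r for r in self.rows
                     if all(r.get(k) == v for k, v in criteria.items())]
        return self

    def order_by(self, key, direction="ASC"):
        self.rows.sort(key=lambda r: r[key], reverse=(direction == "DESC"))
        return self

    def limit(self, n):
        # terminal call: break the chain and hand back plain results
        return self.rows[:n]

bands = ResultSet([
    {"name": "Mesa Verde", "genre": "Rock"},
    {"name": "Aphex Twin", "genre": "Electronic"},
])
rock = bands.filter(genre="Rock").order_by("name").limit(10)
```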

In practice this has worked really well, but an ORM is a fairly complex code base, so there have been many edge cases that have led to subtle bugs. That has meant a lot of debugging deep down in the ORM engine classes. I can add my debugging role to my ResultSet class, but turning debugging on has meant splitting up the chained filters in test scripts:

my @results = Band->query->filter(genre__name => 'Rock')->order_by(name => 'ASC');

# becomes...

my $query = Band->query->filter(genre__name => 'Rock');
$query->debug(1);
my @results = $query->order_by(name => 'ASC');

This quickly became tedious; what I really wanted was to set my attribute without breaking the chain.

MooseX::ChainedAccessors

Now, there are a couple of other ways I could have solved this problem: debug could have been passed into the call to Model->query, or the debug attribute in my Role could have been changed to a method which set up chaining. But both solutions would have to be repeated any time I wanted chained accessors in the future. In the spirit of DRY, I wanted a more elegant and easier-to-implement solution. After a quick glance at the Moose docs and a long look at how has works in Moose, the Chained trait was born. The debug attribute in my Role now needed one extra line:

package MyApp::Roles::Debug;
use Moose::Role;

has 'debug' =>
(
    traits  => ['Chained'],
    is      => 'rw',
    isa     => 'Bool',
    default => sub { 0; },
);

# .. etc.

And I could now debug my ORM queries with one simple change:

Band->query->load_related->get(name => 'Mesa Verde');

# becomes ...

Band->query->debug(1)->load_related->get(name => 'Mesa Verde');

And that's all there is to it.

Installing

The package isn't on CPAN yet, so in the meantime you can install it manually by downloading from github, unpacking the tar into a temporary directory and running these commands:

perl Makefile.PL
make test
make install

That's it. The only dependency is, of course, Moose. Oh, and Module::Install (thanks Tim).

Filed under Perl, Moose, Code - 7 comments

Progressive enhancement with PerlTemplates

Posted Sunday, April 26th at 1:29p.m.

I've just created my first repository on github: PerlTemplates. PerlTemplates is a JavaScript template engine which uses the same syntax as the Perl template engine HTML::Template. Why would you need something like this?

Well, hopefully you are aware of the importance of progressive enhancement; and if, like me, you work on web applications where you just can't ignore users who have disabled JavaScript in their browsers, then you'll also know the tedium of creating both JavaScript and non-JavaScript versions of your slick AJAX interfaces.

Some of the common workarounds I've seen from other developers are to return HTML (as opposed to XML or JSON) and inject it directly into the DOM, or to create the DOM elements in the JavaScript code itself. Both methods have their drawbacks: as soon as HTML creation lives in the JavaScript, you are maintaining markup in two separate locations. Returning HTML avoids that problem, but in complex situations you rarely want just presentation back - you need data.

Using PerlTemplates has allowed us to follow the principles of DRY, and it has made progressive enhancement quick and simple. To illustrate, the 'output' of a search results script (which is a prime candidate for AJAXification) might look like this:

my $template = HTML::Template->new('/path/to/search/results.tmpl');
$template->param(%values); # where values contains the template data structure
print CGI->header, $template->output();

Well, to make it PerlTemplates-ready, we simply add a condition which returns JSON in the event of an AJAX request (which can be detected in various different ways):

if($ajax_request)
{
    # JSON.pm's encode expects a reference, not a flat hash
    print CGI->header, JSON->new->encode(\%values);
}
else
{
    my $template = HTML::Template->new('/path/to/search/results.tmpl');
    $template->param(%values);
    print CGI->header, $template->output();
}
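On the detection point: jQuery (like most JavaScript libraries) adds an X-Requested-With: XMLHttpRequest header to its AJAX calls, so one simple check is to look for that header. A Python sketch of the idea (the helper name is mine):

```python
def is_ajax_request(headers):
    """Return True if the request headers look like a jQuery-style XHR."""
    value = headers.get("X-Requested-With", "")
    return value.lower() == "xmlhttprequest"

# a plain browser request simply omits the header
```

In the CGI environment the Perl code above runs in, the same header shows up as $ENV{HTTP_X_REQUESTED_WITH}.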

Then in our Javascript, we have code that looks like this (using jQuery):

$.getJSON('/search/results', function(json_data) {
    var tmpl = new PerlTemplates({url: 'results.tmpl', data: json_data, target: 'search-results'});
    tmpl.render();
});

That's all there is to it. The one drawback to this approach is that your templates must be made available over HTTP, which might be tricky if your templates are stored outside of your root web directory (or you're a fan of security by obscurity).

Download

PerlTemplates has no dependencies and the minified version comes in at just over 5kb. You can see it in action here and download the package from here. PerlTemplates has been tested in all major browsers, and is currently in use on sites serving millions of pages per month.

Filed under Perl, Code, Javascript - comment

Ugly Perl: A lesson in the importance of API design

Posted Sunday, April 19th at 6:33p.m.

One of the most challenging aspects of my role as a Software Architect has been trying to standardise our development practices in a language as flexible as Perl. Powered by the culture of There is More Than One Way to Do It and a previous lack of technical leadership, we have ended up in a situation where almost every conceivable style of code has been utilised across all of our different mod_perl web applications. It's a maintenance nightmare.

Last year I started to turn that around by advocating object oriented programming and model view controller architecture using modern Perl techniques powered by Moose. Something I noticed, though, when we sat down and evaluated the state of our code, was that some of the easiest and most reliable systems we had were the procedural ones, and by far the most painful systems to work with on a daily basis were those where the lead developer had gone on an architectural expedition and built a tightly-coupled OOP monstrosity.

What we took away from that evaluation process is that, more than choice of programming language, more than choice of development paradigm and more than choice of development methodology, good API design is the single most important factor in how developers perceive the quality of code. Introducing the code review process to our team only confirmed this: almost every review I've taken part in has revolved around naming conventions and style rather than performance or implementation.

And it's no coincidence that by focussing less on concepts like OOP and MVC and more on creating good APIs, we've really started to turn the corner in the land of maintenance programming.

Read more...
Filed under Perl - 25 comments

How I learnt to love Perl

Posted Sunday, December 7th at 5:15p.m.

I love the programming reddit. This week was an especially good one in terms of relevant content for me, with discussions on the death of Perl 5, the difficulties of hiring Perl programmers and how stupid it is to write your own framework. By strange coincidence, I just finished the alpha of a new in-house Perl 5 framework for my development team at work. And we're hiring!

The two Perl articles were interesting because I have worked with the language for three years, and up until earlier this year I loathed it. This was mostly because of its awful support for Object Oriented Programming, but the fact that everyone I've worked with who called themselves a Perl programmer turned out to be a total amateur hack played a part too. After seeing no fewer than four guys come and go from our team this year after being 'found out' on their first big project, this quote from use.perl.org particularly resonated with me:

We had no trouble finding Perl programmers. They were a dime a dozen. People write a couple of admin scripts in Perl and they put Perl on their CV. Now don't get me wrong, I have Java on my CV, but I would never dream of sitting down for a Java interview without their knowing up front that my knowledge is pre 1.5 and I don't know any of the modern tools. It's only there because I want employers to know that I have some exposure to it, not that I think I know it.

Now, maybe I'm just being naive, but somehow people with Perl on their CV think they can do this job, but they can't. Not even close.

My disdain for Perl and the quality of code being written was so great that I wrote a proposal for us to move our entire code base over to PHP. The scary thing is that the technical management accepted it, and I even got as far as completing an alpha PHP port of our (basic) in-house framework before the whole project was abruptly cancelled by upper management for reasons unknown. When the suits put the kibosh on my PHP framework, I decided to go covert and just start doing things right without them knowing. What I've never really discussed is what, exactly, this involved.

Read more...
Filed under Work, Perl, Moose - 29 comments

Obama's maintenance nightmare

Posted Wednesday, November 5th at 8:13p.m.

On the 1st of this month, my promotion at work became official and I am now an official "Software Architect." It follows three full years of advocating the upgrading and modernising of the way we write code on our "corporate" development team. In my first post in this blog back in July, I wrote about how I was trying to save my career through a process called guerilla refactoring -- basically going rogue and fixing code and making it right without telling anyone. In the four months since that post, I became so confident that my work spoke for itself that I adopted a much simpler position - back me completely on the changes I want to make or I'm leaving.

Thankfully, my employers chose to get on board the change train.

For me I saw it as a personal victory. I don't like to quit something just because it's tough and my decision to stick it out and fight for change has been justified. Now I have to repay their faith by making the right changes and being patient through the long road ahead. I have enough experience in trying to implement small changes to know that trying to revamp the fundamentals of the way people work will be by far the most difficult thing I will ever embark on, but I wouldn't have demanded the promotion if I didn't believe I was man enough for the job.

We have several hundred thousand lines of legacy code, mostly procedural Perl, powering all of our critical products, which bring in millions of pounds a month. These systems started small but grew in complexity to become bloated, monolithic, bug-ridden products with severe dependencies that limited what could be done to improve them. But I believe I can change that and bring us into the 21st century. This means Object Oriented Programming. Code reviews. Unit Testing. Refactoring. Model View Controller and all manner of proven techniques for writing maintainable code.

Over the next few months, I'll try to post regularly about the changes I attempt to make and the experiences of my most notable successes and failures. I'm sure there will be many of both.

With all this talk of change, I'd just like to spare a thought as well to this American guy that I know who also won a promotion recently, although I doubt he had to fight as hard to get it as I did mine. Still, on the 20th of January 2009 Barack Obama will become the 44th President of the United States of America, and he's taking up probably the worst maintenance job in history. The last guy in charge made a complete mess of the existing system and it's going to be this guy's job to go in there and clean it up. Personally I wouldn't touch the job for love nor money, but he seems pretty confident that he can turn things around.

And I wish him all the luck in the world.

Filed under Work - 2 comments

The Django hype - it's not just its features

Posted Wednesday, November 5th at 10:20a.m.

I just finished relaunching my band website. In the spirit of most of my personal projects recently, I have been determined to make a conscious effort to practice programming. To that end, my main goal was to use an MVC framework for the first time. As it turns out, I ended up going a lot further than that. Here is a list of the technologies or projects I've gained experience with through this pretty small project:

  • Django Framework
  • Python
  • Blueprint CSS framework
  • Setting up a server stack on a Virtual Private Server
  • General UNIX systems administration
  • nginx lightweight server platform

All in, some good experience and a great advert for making a conscious effort to pick up new things. By far the most exciting 'new thing' for me was the Django Framework, and most of the UNIX and server experience came as a result of needing to create a suitable deployment environment for the django applications I wrote.

Now, I had attempted to use many other MVC frameworks (Zend Framework, CakePHP, Catalyst and Ruby on Rails, to be precise) before I tried django, but ultimately found them overly complicated or frustrating. Why then did django hook me to the point where I went from a cheap shared hosting solution to a fresh Ubuntu install costing more than double in monthly costs?

Well, there's the obvious killer feature - the automatic admin module. You really have to see this in action to find out how unbelievable it is - from the automatic validation on the forms to the slick JavaScript interfaces for handling Many to Many relationships. They really thought of everything. The model and ORM system is fantastic too, as is template inheritance and the comment/RSS frameworks. All of these features (and more) are fantastic, but in my opinion the widespread adoption of django owes a lot to two of the other killer features of django - the documentation and the development server.

Documentation

Having to write documentation is the worst part of creating reusable code that you want other people to use, but it's also one of the most important factors in whether people adopt a framework. When I tried to do a small project in Ruby on Rails, I went ahead and downloaded all the bits I needed from links on the official site. Then when I went to the "quick start" tutorials on that same site, they were all for RoR 1.3 rather than the backwards-incompatible 2.0 release I had downloaded. I eventually found a tutorial for 2.0 on a Brazilian website that helped me create my small app, but the whole experience was extremely disappointing.

With Zend Framework 1.5 the situation was similar - the documentation was there, but the sheer wealth of it was overwhelming. There was documentation for each component in the framework, and constant stressing that everything was decoupled if you needed it to be - but no quick tutorial on what to do if you just wanted to use the framework as a whole to see what it offered. In their defence, as of 1.6 this has been fixed and a quick start link on the framework homepage now leads to a quick start tutorial.

Django's documentation, by contrast, is by far the best I've ever used. Not only does the framework have regularly updated standard documentation covering the pre-1.0 releases, but you also have the free django book if that's how you like to learn. With the wealth of features django offers, it would be extremely easy for the documentation to become bloated and overwhelming, but in reality it does a great job of presenting the information you need, when you need it. The examples used to show off features almost double up as an FAQ: every time I wanted to do something a bit different, I typed "django" and the feature name into Google and a page from this documentation came up. For an absolute beginner, the introductory tutorial is right there, with plenty of supplementary information about related technologies and anticipated problems, and it takes you through the basics gradually - introducing complex parts only when there is a good reason.

Everyone involved in writing and editing this information should pat themselves on the back; it is superbly written and organised, and sets a new benchmark that puts the competition to shame.

Development Server

The built-in development server was by far the most useful feature in getting me on board with django. Even when I tried to set up a "mock live" environment on my Windows Vista machine, after doing the brunt of my learning on the development server, I had great difficulty with environment paths and the general leap in difficulty of deploying django over Apache compared to one-click PHP installers like XAMPP and Wampserver. And this was after I was sold on the framework!

Whoever came up with the idea of the development server clearly understood that when you download a framework you just want to learn the framework, not how to set up a server stack that runs Python web applications. The latter should become a problem only once you're sold on the framework and have created something you want to deploy. For Windows users with zero to limited UNIX experience, this really will reel in a whole bunch of people who would otherwise have had to wait until a Wampserver-for-django came along.

Conclusion

The Django Project is a fantastic example of how the little things in a project can make all the difference. It would have been so easy for the team to target only open source enthusiasts who are already extremely familiar with UNIX, for whom setting up a django server platform would have been pretty straightforward; but putting the development server right in there completely eliminates the major barrier to entry for a whole other demographic of native Windows/shared-hosting users.

Likewise, the documentation for django is ridiculously good. For such a young project (that only recently hit 1.0) it would also have been so easy for them to neglect the mundane part of writing about an ever-evolving code base and just concentrate on getting things right on that side of things. But the documentation from the very early versions has always been great and it paid off big.

When you see the effort they've put in to getting the details right, you can only imagine the quality of the code underneath.

Filed under Django, Python - comment

Complete guide to deploying django on Ubuntu with nginx, FastCGI and MySQL

Posted Monday, November 3rd at 5:38p.m.

I recently deployed my first django project and it was an eye-opening experience to say the least. Coming from Windows, PHP, Wampserver and the cPanel shared hosting culture, starting with a fresh Ubuntu install and having to build the stack myself was something of a culture shock. There were of course a wealth of tutorials out there to help me get it up and running, but the reason I'm writing this is because the information I needed to get from brand new VPS to fully functional django app was split across many different tutorials, some of which were better than others. For my own reference (and for when I inevitably break something and need to start from scratch!), here is how I did it in ten simple steps:

Read more...
Filed under Django, Ubuntu, Nginx, Python - 11 comments

I Got 99 Problems But a Manager Ain't One

Posted Sunday, July 13th at 3:08p.m.

There's a few predictable ways people deal with unhappiness in life. They can dwell on it and make everyone around them miserable. They can suck it up and say "that's life." Or they can actually do something about it.

Me? I tend to go through a cycle of all three. One thing I realised from going on a few job interviews and looking at my list of complaints about my current work environment was that the problems I had were a veritable checklist of every working environment out there. Legacy code? Which established software company doesn't have that? A standard set of tools and practices for common problems? I'd hope so! Changing something that isn't broke?

There's nothing worse than reinventing the wheel for no reason.

Except when you're being asked to make major changes to a bunch of spaghetti procedural code full of inconsistent architectural decisions and quick-fix hacks for the numerous bugs that appeared since launch. Oh and could you improve performance whilst you're at it? And by the way we're already behind schedule. No manager wants to hear "complete rewrite" when they're adding something that took the guy before you two or three weeks to hack in there.

So I hacked it in, just that one time. And the next. And again. And in no time at all, that horrible legacy code I used to blame on the other guy is now 50% mine. Oops! And those archaic practices and standards now dominate my CV.

Not my manager's CV. Or their manager's CV. My CV.

Read more...
Filed under Refactoring, Work - 2 comments