Caffeinated Simpleton

Archive for the ‘Development’ Category

Cobra is Ready to Release

I did a little work on Cobra today and pushed the code around a little bit. Things that changed:

  • Reorganized into a single closure. This helps keep internal details internal.
  • Minimize with Closure Compiler
  • JSLint now passes
  • Works in node.js (and most likely any server-side JavaScript implementation)
  • Works in Internet Explorer.
    • I had never really tested this, but after fixing the things that JSLint found, the tests just passed.
    • Still need to test in IE 6. I don’t have a machine with IE 6 unfortunately.
  • Added to documentation to clarify a few things.

At this point, I think the library’s safe for others to use. I tagged a 0.5 version, and I’ll be calling it 1.0 after it gets into a few more projects. Feel free to try it out!

You can find Cobra on github, or you can download version 0.5 now (tgz).

An Introduction to JavaScript’s “this”

JavaScript is an amazing little language, but it’s got some quirks that turn a lot of people off. One of those quirks is this, and how it’s not necessarily what you expect it to be. this isn’t that complicated, but there are very few explanations of how it works on the internet. I find myself constantly re-explaining the concept to those who are new to JavaScript development. This article is an attempt to explain how this works and how to use it properly.

What is this?!

this is the current object. People that are used to object oriented languages use objects constantly, and this is how you access your object in JavaScript. This differs from Java and C++. This is best demonstrated by example, so let’s take a look.

class HotDog {
public:
     string getCondiments();
private:
     string condiments;
}
 
string HotDog::getCondiments() {
    return condiments; // The compiler knows to find "condiments" in the current HotDog instance
}

Ruby and Python don’t work this way. Instead, you must say “look for this variable in the current instance”

class HotDog:
    condiments
    def getCondiments(self):
         return self.condiments # "self" is a reference to the current instance of HotDog.
  class HotDog
         @condiments
         def getCondiments
              return @condiments # @condiments is an instance variable of an instance of HotDog
         end
    end

JavaScript has more in common with Ruby and Python than with Java and C++. The same thing in JavaScript is:

function HotDog() {
    this.condiments = "mustard, ketchup";
    this.getCondiments = function() {
         return this.condiments; //this is expected to be a reference to the current instance
    }
}

That’s what this is expected to be, anyway. It’s expected to be a reference to the current instance of whatever object it’s defined within.

this Gets Tricky

However, let’s say we wanted to find out what condiments were on our hotdog in 30 seconds. Assuming the HotDog class from above, that code might look like this:

var myHotDog = new HotDog();
// Call the getCondiments function in 3 seconds
setTimeout(myHotDog.getCondiments, 3000);

Many JavaScript beginners are surprised to learn that this code won’t work. It’ll give an error saying that this doesn’t have a member called condiments, even though it clearly does. What happened?!

As it turns out, this in the getCondiments function is set to the global object, window. This is because there is no binding of functions to instances in JavaScript. Whenever the instance isn’t explicitly stated, then this becomes window (at least in the browser). Writing this as myHotDog.getCondiments() indicates that you want this to be myHotDog, so it works correctly. The setTimeout function, however, just has a reference to that function. When it calls it, it’s not aware of myHotDog, so JavaScript sets this to window

“Fixing” this

There are several ways of forcing this to be what you want, and many of them touch on some of the more interesting features available in JavaScript. The first is apply. apply is a member of every function. It says “call this function with this object as this and these arguments. An example might help.

// These two are equivalent
myHotDog.getCondiments();
myHotDog.getCondiments.apply(myHotDog);
 
// You could also do this
var yourHotDog = new HotDog();
myHotDog.getCondiments.apply(yourHotDog);
 
// The above line does the same thing as:
yourHotDog.getCondiments();
 
// The above two calls will return the condiments off of your hot dog, not mine.

That doesn’t solve our setTimeout problem, however. apply actually calls the function, and we just want a reference to the function that uses the correct this. Luckily this is easily done, but let’s talk about some background first.

Lexical Scoping

In many languages (all the good ones, if you ask me), lexical scoping is supported. Lexical scoping basically allows you to access local variables in a parent function. If a function is defined within another function, it can access its own local variables as well as those of the function it was defined within. Time for another example.

function drinkBeer() {
    var beer = "Big Daddy IPA";
    function pour() {
         var glass = "Pint Glass";
         return "Poured " + beer + " into a " + glass;
   }
   function drink() {
         beer = null;
         return "Beer is all gone";
   }
   pour();
   drink();
}
drinkBeer();

This works just fine. drinkBeer cannot access glass, but pour and drink can access beer.

Lexical scoping is extremely powerful. If this small example doesn’t explain it, I encourage you to look around other examples on the internet until it’s more clear. While writing JavaScript, look out for ways this can make your life easier. Once you’re looking, they’re all over the place.

Solving our setTimeout Problem

With lexical scoping, we can easily solve our setTimeout problem.

var myHotDog = new HotDog();
setTimeout(function() {
    myHotDog.getCondiments();
}, 3000);

See what we did there? We created a new function that we passed to setTimeout. Our new function can access myHotDog, so it just applies it to the getCondiments function. Pretty slick.

Defending against this abuse

As somebody writing the HotDog class, it might be upsetting to you that people constantly need to keep this in mind when accessing your class. It would be nicer if there was a way to make your class work all the time. Fortunately, there is!

function HotDog() {
    var my = this; // my references the current this, which is correct.
    my.condiments = "mustard, ketchup";
    my.getCondiments = function() {
         return my.condiments; //my is guaranteed to be a reference to the original "this"
    }
}

HotDog is a constructor. You instantiate a new instance of HotDog by writing “new HotDog()“. In constructors, this is always your instance. So we created a new variable, my, that references the HotDog instance. This allows you to always refer to the HotDog instance, no matter how the getCondiments function is called.

Bind

Another method of “fixing” this is adding a bind attribute to every function. This allows you to pass in an object that will be your this. Many JavaScript libraries, such as Prototype, do this.

var boundFunction = myHotDog.getCondiments.bind(myHotDog);

Now boundFunction will always call getCondiments with this set to myHotDog. If you’ve been paying attention, it should be fairly obvious how bind is written.

function bind(thisObject) {
    var my = this; // this is the function here
    return function() {
        my.apply(thisObject);
    }
}

Cobra

My own class library, cobra, solves the this problem in a different way. It passes a reference to the original this to every function, which allows you to use some fancy features like prototypal inheritance while still not worrying about binding and the like. If you’re interested, you can find the code on bitbucket.

Whew

That’s pretty much all you need to know about this.

  • this will be window whenever your function is indirectly called.
  • Use apply to force this to be what you want
  • You can use lexical scoping to make sure this behaves in a predictable manner

I hope that’s clear, comment if it’s not!

Meetup.com webOS Client Part 1: Services

Palm’s Mojo SDK has just been released to the public this past week. I thought I would take the opportunity to show off some of the awesome things the SDK can do. The Mojo SDK is exceptionally good at tying your life together with the web services you use every day, so my first series will be on building a simple Meetup.com client. At first, we’ll just sync down meetups using the api key that’s associated with every individual account. Another day, we’ll add OAuth authentication to make it generally useful for anybody.

This article assumes that you have some knowledge of how to create an app, new scenes, and how to debug. If you aren’t somewhat comfortable with the SDK yet, check out the Hello World tutorial.

Setup

At first, we’ll just create a big button that puts all our meetups into the calendar. Nothing fancy. To create the scene, run:

$ palm-generate -t new_scene -p "name=sync"

Then in the stage-assistant.js file, put:

this.controller.pushScene("sync");

Creating a Calendar

The first thing that we will do is create a new calender for Meetup.com. This calendar will appear in the calendar app right next to your Google Calendar or Exchange calendar using the Palm Synergy APIs. This is great because it allows you to deliver new data to your users without having to write yet another way of presenting it. All contacts and calendars can be plugged straight into the core webOS applications.

To create a calendar, you first need to make an account:

self.accountServiceId = "palm://com.palm.accounts/crud/";
 
 /* Retrieves account if it exists, otherwise creates it */
setupAccount: function(self, k) {
    self.controller.serviceRequest(self.accountServiceId, {
        method: 'listAccounts',
        parameters: {},
        onSuccess: function(list) {
            Mojo.Log.info("Got account list: %j", list);
            if (list.list && list.list.length > 0) {
                self.account = list.list[0];
                k();
            }
            else {
                self.account = {
                    username: "justin",
                    domain: "meetup.com",
                    displayName: "Meetup.com",
                    dataTypes: ["CALENDAR"],
                    isDataReadOnly: true,
                    icons: {largeIcon: '', smallIcon: ''}
                };
                self.controller.serviceRequest(self.accountServiceId, {
                    method: 'createAccount',
                    parameters: self.account,
                    onSuccess: function(response) {
                        Mojo.Log.info("Got %j for %j", response, self.account);
                        self.account.accountId = response.accountId;
                        k();
                    }
                });
            }
        },
        onFailure: function() {
            Mojo.Controller.errorDialog("Failed to create account");
        },
        onError: function(error) {
            Mojo.Controller.errorDialog("Error creating account");
        }
    })
},

You’ll notice a few different things about this code, the odd things which are just my style. Using self instead of this is an idiom from Cobra, a JavaScript class library I wrote. k is a continuation, or the function that should be called after the account is created. k I believe is an idiom for Continuation Passing Style in Scheme.

Beyond that, this is very standard Mojo code. Services are identified as “palm://com.palm.” and all methods support onSuccess, onFailure, and onError callbacks. This code checks to see if there is already an account associated with this appId, and if there is, calls its continuation. If there is not, it creates the account and calls its continuation.

After the account is created, we can create the calendar.

self.calendarServiceId = "palm://com.palm.calendar/crud/";
 
/* Retrieves calendar if it exists, otherwise creates it */
setupCalendar: function(self, k) {
    self.controller.serviceRequest(self.calendarServiceId, {
        method: 'listCalendars',
        parameters: {
            accountId: self.account.accountId
        },
        onSuccess: function(calList) {
            Mojo.Log.info("Got calendar list");
            if (calList.calendars.length > 0) {
                self.calendar = calList.calendars[0];
                k();
            }
            else {
                self.calendar = {
                    name: "Meetup.com"
                }
                self.controller.serviceRequest(self.calendarServiceId, {
                    method: 'createCalendar',
                    parameters: {
                        accountId: self.account.accountId,
                        calendar: self.calendar
                    },
                    onSuccess: function(response) {
                        self.calendar.calendarId = response.calendarId
                        k();
                    },
                    onFailure: function(error) {
                        Mojo.Log.error("Creating calendar failed: %j", error);
                        Mojo.Controller.errorDialog("Failed to create calendar");
                    },
                    onError: function(error) {
                        Mojo.Log.error("Creating calendar failed: %j", error);
                        Mojo.Controller.errorDialog("Error creating calendar");
                    }
                });
            }
        },
        onFailure: function() {
            Mojo.Controller.errorDialog("Failed to create calendar");
        },
        onError: function(error) {
            Mojo.Controller.errorDialog("Error creating calendar");
        }
    });
},

This is almost identical to the account creation code, except it’s creating a calendar.

Now calling these functions is easy:

self.setupAccount(function() {
    self.setupCalendar(function() {
        self.buttonModel.disabled = false;
        self.controller.modelChanged(self.buttonModel)
    });
});

Pulling down events

To get the events that are associated with the account, I am using the Meetup.com JavaScript client. The client requires jQuery, but since it only requires jQuery to do JSONP, we replace that line with a Mojo call. There is no reason that we couldn’t use jQuery, but jQuery is overkill for just doing JSONP.

jQuery.getJSON(urlprefix + call_type + url, params, function(json){callback(json)})

Becomes:

var query = $H(params).toQueryString();
url = urlprefix + call_type + url + query;
Mojo.loadScriptWithCallback(url, Mojo.doNothing);

After we have the Meetup client library, getting the events is easy, but a little indirect. It takes three API calls, one to get the member id, one to get every group associated with the member id, and one to get every event in all those groups. You can see the details on Meetup’s API page.

self.client = new MeetupApiClient(Meet.Auth.apiKey);
syncCalendar: function(self) {
    // Gets my member id
    Mojo.Log.info("Syncing calendar");
    self.client.get_members({
        relation: "self"
    }, self._getGroups);
},
 
_getGroups: function(self, members) {
    Mojo.Log.info("Got members");
    var memberId = members.results[0].id;
    self.client.get_groups({
        member_id: memberId
    }, self._getEvents);
},
 
_getEvents: function(self, groups) {
    Mojo.Log.info("Got events");
    groups = groups.results;
    var groupString = groups[0].id;
    var today = new Date();
    for (var i = 1; i < groups.length; i++) {
        groupString += "," + groups[i].id;
    }
    self.client.get_events({
        group_id: groupString,
        after: today.getMonth() + today.getDay() + today.getFullYear()
    }, self._saveEvents);
},

Now self._saveEvents will receive a list of events that are coming up after today. All we need to do is loop over the list, format them as Palm calendar events, and pass them to the calendar service.

_saveEvents: function(self, events) {
    Mojo.Log.info("Saving events");
 
    self.numEventsProcessed = 0;
    self.events = events;
 
    events.results.each(function(meetupEvent) {
        var time = new Date(meetupEvent.time).getTime();
        if (meetupEvent.myrsvp != "no") {
            self.controller.serviceRequest(self.calendarServiceId, {
                method: 'createEvent',
                parameters: {
                    calendarId: self.calendar.calendarId,
                    event: {
                        eventId: meetupEvent.id,
                        subject: meetupEvent.name,
                        startTimestamp: time,
                        endTimestamp: time + 3600000, // 1 hour in ms
                        allDay: false,
                        note: self._formatNote(meetupEvent),
                        location: meetupEvent.lat + ", " + meetupEvent.lon,
                        alarm: 'none',
                    }
                },
                onSuccess: self._createdEvent,
                onError: self._errorCreatingEvent,
                onFailure: self._failureCreatingEvent
            });
        }
    });
},
 
_createdEvent: function(self, response) {
    self._checkIfSyncFinished();
},
 
_errorCreatingEvent: function(self, response) {
    Mojo.Log.error("Could not create event: %j", response);
    self._checkIfSyncFinished();
},
 
_failureCreatingEvent: function(self, response) {
    Mojo.Log.error("Failed to create event: %d, %j", self._eventsReturned, response);
    self._checkIfSyncFinished();
},
 
_checkIfSyncFinished: function(self) {
    self.numEventsProcessed++;
    if (self.numEventsProcessed == self.events.meta.count) {
 
        if (self.events.meta.next) {
            Mojo.Log.info("Fetching the next page of results...");
            self.client.nextPage(self._saveEvents);
        }
        else {
            Mojo.Log.info("Fetched all the events");
            self.buttonModel.disabled = false;
            self.controller.modelChanged(self.buttonModel);
            self.controller.get("sync-button").mojo.deactivate();
        }
    }
},

And that’s basically it! With the code above, all your Meetup.com events can be inserted into your webOS calendar. All you need is an event handler for your button.

self.buttonModel = {
    buttonLabel: 'Sync',
    disabled: true
};
 
self.controller.setupWidget('sync-button', {
    type: Mojo.Widget.activityButton
}, self.buttonModel);
 
Mojo.Event.listen($('sync-button'), Mojo.Event.tap, function() {
    self.buttonModel.disabled = true;
    self.controller.modelChanged(self.buttonModel)
    self.controller.get("sync-button").mojo.activate();
    self.syncCalendar();
});

This works assuming a view containing:

<div x-mojo-element="Button" id="sync-button"></div>

Debugging Tips

Creating a file called framework_config.json allows you to change the logging level. That will permit JavaScript messages to be output into /var/log/messages on the device. This is especially valuable if you’re working on a 64-bit linux machine where the inspector is currently not supported.

{
    "logLevel": 99 // 0 means no logging, 99 will max it out
}

Removing /var/luna/data/dbdata/PalmDatabase.db3 removes all data you may have inserted. This allows you to start over fresh, but I wouldn’t recommend doing this on your actual device. You’ll lose all your data!

Todo

This program isn’t really complete, but it’s a good start. A couple of things we still have to do are:

  • Add Authentication with OAuth
  • Keep track of already inserted appointments so we don’t insert duplicates
  • Add automatic synchronization in the background
  • Add push updates (this is impossible without server side support)
  • Make things look nice

Stay tuned for articles on some or all of these exciting new features! Until then, the code is available on github.

Java, please stop ruining my fun.

I don’t like Java. I haven’t learned Java well because I don’t enjoy using it. I don’t enjoy using it because it’s verbose, for one, but mostly because it’s constantly making things hard for me to do. I know there are ways to do what I want, after all, millions of people use Java successfully every day, but I don’t know what they are. Furthermore, finding out what they are is excruciatingly painful.

I recently did a series of articles on a project I was doing to learn Clojure. It kind of petered out for a number of reasons, but one constant annoyance in learning Clojure was dealing with the Java-isms. Java has given Clojure a vast library of high quality software essentially for free, but it’s also brought on a lot of the pain, much of which I think needs to be fixed before Clojure can have the nice feel of my favorite dynamic languages.

Installing Clojure

The first thing one has to do is install Clojure. It’s not a package in Ubuntu yet, but it’s young, so that’s ok cause we’re veterans and don’t need no stinking packages. To compile, we just download the source and type “ant”.

And that’s it. There’s no install process that makes a nice pretty “clojure” command that takes us to the REPL or executes scripts that are passed to it. To run clojure, you need to run it using Java:

$ java -cp clojure.jar clojure.lang.Repl

That is a lot to type just to get a Repl, and getting a usable command line is even harder. After installing JLine ConsoleRunner, you need to get the library into your classpath (a rant on which is upcoming) and then run

$ java -cp jline-0.9.91.jar:clojure.jar jline.ConsoleRunner clojure.lang.Repl

Not exactly intuitive, but whatever. We put it in a bash script, put it in our path, and head off to the races. After a while, we have a few lines of a quality script we would like to save and run. How do we do that?

Obviously, it’s:

$ java -cp clojure.jar clojure.lang.Script my-script.clj

This assumes that clojure.jar is in the same directory as the script you want to run. If you don’t have clojure.jar there, you must provide a specific path to the jar file. There is no idea of a default directory where Java will look for jar files. You must provide every single jar file to Java at runtime.

Contrast this with the Python install process:

$ sudo apt-get install python
$ python
... Have fun in the interpreter
... Write a script
$ python my_script.py

Simple.

The Classpath

First of all, I’m no expert on the classpath, but it seems like an unholy abomination thrust upon us by invisible powers that must be extinguished at all costs. It would appear, and again, I am no expert, but it would appear that every single dependency of a program must be explicitly passed to Java at the time you run your program. I wrote a bash script to automate the process, but viewing the command line for running my simple Compojure-based webapp is apalling:

java -Djava.library.path=/usr/local/lib -cp :/mnt/data/Users/justin/bin/compojure/compojure.jar:/mnt/data/Users/justin/bin/compojure/deps/clojure-contrib.jar:/mnt/data/Users/justin/bin/compojure/deps/clojure.jar:/mnt/data/Users/justin/bin/compojure/deps/fact.jar:/mnt/data/Users/justin/bin/compojure/deps/jetty-6.1.14.jar:/mnt/data/Users/justin/bin/compojure/deps/jetty-util-6.1.14.jar:/mnt/data/Users/justin/bin/compojure/deps/re-rand.jar:/mnt/data/Users/justin/bin/compojure/deps/servlet-api-2.5-6.1.14.jar:/mnt/data/Users/justin/lib/clj-http-client.jar:/mnt/data/Users/justin/lib/clojure-contrib.jar:/mnt/data/Users/justin/lib/clojure.jar:/mnt/data/Users/justin/lib/commons-codec-1.3.jar:/mnt/data/Users/justin/lib/commons-httpclient-3.1.jar:/mnt/data/Users/justin/lib/commons-io-1.4-javadoc.jar:/mnt/data/Users/justin/lib/commons-io-1.4-sources.jar:/mnt/data/Users/justin/lib/commons-io-1.4.jar:/mnt/data/Users/justin/lib/commons-logging-1.1.1-javadoc.jar:/mnt/data/Users/justin/lib/commons-logging-1.1.1-sources.jar:/mnt/data/Users/justin/lib/commons-logging-1.1.1.jar:/mnt/data/Users/justin/lib/commons-logging-adapters-1.1.1.jar:/mnt/data/Users/justin/lib/commons-logging-api-1.1.1.jar:/mnt/data/Users/justin/lib/commons-logging-tests.jar:/mnt/data/Users/justin/lib/compojure.jar:/mnt/data/Users/justin/lib/jline-0.9.94.jar:/mnt/data/Users/justin/lib/tokyo-cabinet-clj.jar:/mnt/data/Users/justin/lib/tokyo-cabinet.jar:/mnt/data/Users/justin/lib/tokyocabinet.jar:/mnt/data/Users/justin/lib/tokyotyrant-0.6.jar clojure.lang.Script index.clj

That is bad. That is not correct, that is not how software should be designed, I object. Every other language I can think of off the top of my head (except JavaScript) has some structured way of finding its dependencies, and most have a way of adding additional rules to that search should the defaults not be adequate. While this can lead to “DLL hell”, I do not see how the Java situation is any better when everybody just ends up with scripts to automate the process and then those scripts pick up the wrong things and you can’t figure out why.

The classpath makes me very upset. If Clojure can find a way to mask it, I would appreciate it very much.

Maven

First of all, what the hell is Maven? A quick trip to their site reveals a huge chunk of text with hundreds of links and an initial sentence that describes it as:

Maven, a Yiddish word meaning accumulator of knowledge, was originally started as an attempt to simplify the build processes in the Jakarta Turbine project.

I went to the site with some hope that it would provide some relief to my dependency issues (All I want is “pip install”, or “gem install”), and I get greeted with a dense paragraph of history combined with some mumbo-jumbo about “best practices”.

After reading a bit I find that Maven downloads and builds dependencies and installs them in a local repository, along with the library you are trying to compile. Perfect! Sounds like exactly what I want. However, it doesn’t mention anything about the classpath. Am I still responsible for dealing with all that muck, even though it’s tucking my libraries in a hidden directory (implying that it’s responsible for managing them)?

To answer that question I need to wade through dozens of other pages that alternately describe how to accomplish basic tasks and lecture me on software engineering. Finally I come to the conclusion that while Maven does indeed find dependencies for you, it does not actually help you execute programs with those dependencies in place. This means you either need a script that automatically passes your entire maven local repository to Java, or you need to know the dependencies that Maven was conveniently supposed to hide from you. To top it off, it doesn’t play well with Clojure. Completely useless.

(For the record, there is a Maven extension that does exactly this.)

The Last Word

Dependency management is a hard problem that all languages must learn to deal with. Higher level languages have an even harder time in that they must not only deal with whatever dependencies they have written in their own language, but also with extensions written in other languages. Clojure, which is still very young, suffers tremendously from the godawful environment that Java has ensconsed itself in. I am largely a veteran of the *nix world, which seems quite different from the world Java developers have built around themselves. They have their own tools, their own build systems, their own set of “best practices”, and the Apache foundation. What I have seen in my brief saunter over the wall has appalled me. It has appalled me far more than similar saunters into the somewhat exciting world of Microsoft and .NET. It strikes me very much as a world in need of fixing, and I hope that Clojure (or Scala) can do it. Heck, I may even do my part to help.

But probably I’ll just run back to Python.

Exploring memcmp

Yesterday I had a problem. I had a chunk of binary data saved in a file that I had extracted from another file. I needed to find where this chunk of data had come from. I wanted to know if it appeared in the original file more than once, and if so, where. I also wanted to see if it appeared in a number of other files.

There is no command in Linux (that I know of) that does this, so I set out to write my own. I used the obvious and naive implementation. I memory mapped the file containing the chunk I was looking for (the needle) and the file containing all the data I was looking through (the haystack). Then I walked through the haystack byte by byte and ran memcmp to see if it was the same.

int
compare(char *needle_pa, int needle_size, char *haystack_pa, int haystack_size) {
    int i;
    int found = 0;
    for (i = 0; i < haystack_size-needle_size; i++) {
        if (memcmp(needle_pa, haystack_pa + i, needle_size) == 0) {
            printf("Found needle at byte offset %x\n", i);
            found++;
        }
    }
    return found;
}

This was fast enough for my purposes, but intellectual curiosity lead me to try and optimize this function. There are a number of ways to speed this up, but the one that seemed most fun was to compare more bits at once. This lead me to explore in internals of memcmp.

glibc’s memcmp

Glibc comes with a memcmp implementation, of course. You can view the source here. It’s quite simple. If the passed buffer is not aligned on a word boundary, iterate byte by byte doing comparison via subtraction until you get to a word boundary. Then loop through word by word comparing with the equality operator (==). Once you get to the end of all the full words, compare the final few bytes by subtraction again.

This implementation is pretty good. It’s fast on most platforms as it maximizes aligned comparisons. It’s very portable as it does not take advantage of any processor specific features. However, the fact that it does not take advantage of processor features means that it could be faster.

SIMD (Single Instruction Multiple Data)

Many platforms have SIMD instructions. SIMD instructions are very specific vector instructions that allow you to manipulate large chunks of data with a single instruction. A common operation found in SIMD instruction sets is string comparison, which is what essentially what memcmp does. For instance, in the x86 architecture, there are instructions to compare strings 16 bytes at a time. That’s 4x the throughput of the glibc implementation.

Taking advantage of SIMD instructions is a bit tricky. They’re not supported on all platforms, even in the same processor family. Intel, for instance, has SSE1, SSE2, and SSE3 instruction sets, which all add new instructions not found in the previous iteration. Generally there’s no reason to go through the pain of using these instructions unless you absolutely know you need the speed boost. Today, we’re just doing it for fun.

GCC built-in functions

GCC comes with a number of SIMD optimized replacements for common libc functions. Memcmp is one of those functions that GCC optimizes. However, turning on this optimization is a bit tricky. Again, since every platform supports different instructions, you have to be very specific about what features are supported in order to get the maximum optimization. GCC will always create the most compatible binary it can unless specifically told otherwise, which is the right thing to do.

To compile with GCC’s optimized memcmp, we run the following:

gcc -Wall findneedle.c -O3 -march=opteron -o findneedle

This tells gcc to optimize as much as it can (-O3) and that it can expect to be running on a processor with the same features as the Opteron (x86-64, SSE1-3). I’m running this on an Intel Core Duo, which supports the same feature set. This gives us a version of the findneedle program that does not use the standard memcmp implementation, but instead uses GCC’s SIMD optimized version.

The Results

My test was simple: I timed how long it took to a 16K needle in a 328MB haystack.

At first I was doing this on a virtual machine. The most interesting result of any of this is that the program running in a vmware image slowed down significantly. It went from an average of ~4 seconds to an average of ~10 seconds. Thinking that the virtual machine might be emulating the SIMD instructions, I compiled on my mac itself. In that situation I saw the expected speedup. The average search time went from ~4.4 seconds to ~4.1 seconds. That’s a 7% speedup!

Clearly, these results show that optimizing your program to take advantage of processor features does not always help very much. However, if you’re compiling your own programs (I’m looking at you, gentoo fanatics), it does help to make sure that your processor options are set correctly. Just switching a few flags can cause core functionality to be improved, and that makes the whole experience that much better.

Also, assume your virtual machine is a 486. It’s way faster that way.

Update:
There was a good debate about optimizing the algorithm itself over on hacker news. jws was nice enough to do a rundown on a number of different algorithms. Not surprisingly, the one demonstrated above is indeed the most naive.