Caffeinated Simpleton

Archive for September, 2008

Moving to San Francisco

My brief stint between Champaign and life has ended, and tomorrow I will be leaving from Stow, OH with my trusty Honda Civic packed to the brim. I’ll stop by my grandparents’ for a night and then head to Champaign to watch my dear Fighting Illini take on the fearsome Louisiana-Lafayette Ragin’ Cajuns.

After what promises to be a fun-filled weekend, I’ve got several days to wander through the west in search of San Francisco. Stop back throughout the next week for updates on my progress and, of course, pictures of the beautiful American west.

Fun with Electoral Maps!

At Real Clear Politics, you can play with electoral maps based on current poll numbers and what you think is going to happen. Try it now, it’s lots of fun!

So what do I think is going to happen? I think Obama is going to win. Naive, perhaps, but for the past four years, I’ve been surrounded by university liberals in Illinois, no less. How does he do it? Well, it’s ridiculously close. I think Indiana and Florida are going to go McCain. In that case, Obama needs to grab both New Mexico and Colorado. Assuming he can do that, Virginia, Pennsylvania, Ohio, and Michigan are going to be big battlegrounds. If Obama can win two of those, he’ll win the election. I think he’s going to do it.

A Threading Model Overview

I noticed in a story on Hacker News that many people do not understand the differences in threading implementations between programming languages. In the single processor days, understanding the threading model that you were working with was not that important. With more than one core, it is a good thing to know. This is an overview.

The Beginning (C and Native Threads)

The first threading model we will look at is the standard OS level thread. Every modern OS has support for this, though the APIs change from OS to OS. Basically, a thread is a unit of execution that can run on its own processor, is scheduled by the OS scheduler, and can block. It acts much like its own process except that it shares resources with every other thread in the same process. This mainly means that memory and file descriptors are shared between all threads in a process. This is what people mean by “native threading”. From C on Linux, you can use these threads by linking with the pthread library. BSDs generally support pthreads as well, and Windows does its own thing that is very similar.
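To make the shared-memory part concrete, here is a minimal sketch. It is in Python rather than C, since Python's threading module sits on top of native threads. The names `counter`, `worker`, and `run` are made up for illustration. Every thread updates the same object, so a lock is needed around the update.

```python
import threading

# Shared state: every thread in the process sees this same object.
counter = {"value": 0}
lock = threading.Lock()


def worker(n_increments):
    """Increment the shared counter n_increments times."""
    for _ in range(n_increments):
        # The lock guards the read-modify-write on the shared counter.
        with lock:
            counter["value"] += 1


def run(n_threads=4, n_increments=1000):
    """Start n_threads native threads and return the final counter value."""
    counter["value"] = 0
    threads = [
        threading.Thread(target=worker, args=(n_increments,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run())
```

Without the lock, the increments from different threads could interleave and some updates could be lost, which is exactly the kind of headache shared memory brings with it.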

Java and Green Threads

When Java came out, it introduced a different type of threading model to the world called green threads. Green threads are essentially simulated threads. The Java virtual machine would take care of switching between different green threads, but the virtual machine itself would only run in one OS thread. This generally has some advantages. OS threads have almost as much overhead as a process on most POSIX systems. It is also usually slower to switch between native threads than it is between green threads.

This can mean that in some situations, green threads are much preferable to native threads. A system can usually support a much higher number of green threads than OS threads. For instance, it would be practical to spawn a new green thread for every new connection on a web server, but it is not generally practical to spawn a new native thread for every incoming HTTP connection.

There are disadvantages, however. The biggest is that you cannot have two threads running at the same time. There is only one native thread, so it is the only thread that gets scheduled. Even if there are multiple CPUs and multiple green threads, only one CPU will be running any given green thread at any given time. This is because it all looks like one thread to the OS scheduler.
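Here is a toy sketch of the idea, assuming nothing about how the JVM actually did it. Python generators stand in for green threads and a small round-robin loop stands in for the virtual machine's scheduler. The names `green_thread`, `run_scheduler`, and `demo` are hypothetical. Everything runs inside one OS thread, so only one task makes progress at any moment.

```python
from collections import deque


def green_thread(name, steps, log):
    """A simulated thread: do one unit of work, then yield control."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run_scheduler(tasks):
    """Round-robin over the tasks until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task until its next yield
        except StopIteration:
            continue  # task finished, do not reschedule it
        ready.append(task)


def demo():
    log = []
    run_scheduler([green_thread("a", 2, log), green_thread("b", 3, log)])
    return log


if __name__ == "__main__":
    print(demo())
```

Switching between tasks here is just a function call, which is why this style of thread is so cheap. It is also why a second CPU does not help: the OS only ever sees the one thread running the scheduler loop.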

Java has supported native threading since version 1.2, and it has been the default for some time now.

Python

Python is one of my favorite scripting languages, and was one of the first scripting languages to offer threading. Python exposes a threading module that manipulates native threads. This means that Python can benefit from all the advantages of true native threading, except for one catch.

Python has a global interpreter lock (GIL). This lock is necessary to keep Python threads from corrupting the global state of the interpreter. This means that no two Python instructions can be running simultaneously. The GIL gets released every 100 Python instructions or so and another Python thread is free to acquire the lock and begin executing.

On the face of it, this seems like a major flaw. However, in practice it is not that big of a deal. Any thread that blocks will generally release the GIL. C extensions can also release the GIL whenever they are not interacting with the Python/C API, so CPU intensive operations can be carried out in C without blocking the executing Python threads. The only situation in which the GIL proves problematic is when you have more than one CPU bound thread written in Python on a multi-core machine.
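A quick way to see that blocking threads release the GIL is to time a few threads that do nothing but sleep. This is only a sketch, and `blocking_task` and `timed_run` are names I made up. If the GIL were held while sleeping, four threads sleeping 0.3 seconds each would take about 1.2 seconds in total. Since `time.sleep` releases it, they overlap and finish in roughly the time of a single sleep.

```python
import threading
import time


def blocking_task(seconds):
    """Stand-in for blocking I/O; time.sleep releases the GIL."""
    time.sleep(seconds)


def timed_run(n_threads=4, seconds=0.3):
    """Run n_threads blocking tasks concurrently and return elapsed time."""
    threads = [
        threading.Thread(target=blocking_task, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("elapsed: %.2f seconds" % timed_run())
```

Replace the sleep with a pure Python loop that burns CPU and the picture changes: the threads take turns holding the GIL, and the total time looks much more like the serial case.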

Stackless Python is an implementation of Python that brings “tasklets” (essentially green threads) to Python. The greenlet module is derived from that work and is compatible with the standard CPython implementation.

Ruby

Ruby’s threading model is and always has been in a state of flux. Ruby’s original implementation only supported cooperative green threads. These work fine in many situations, but they do not take advantage of multiple processors.

JRuby mapped Ruby’s threads straight to Java’s threads, which are generally OS native threads. This doesn’t work. Since Ruby’s threads are cooperative, there is no need to synchronize between the threads. Every thread can be assured that no other thread is accessing a resource while it is accessing it. This breaks down in JRuby, since native threads are generally preemptive, meaning any thread could be accessing any shared data at any time.

Because of the mismatches and the desire for native threading from the C Ruby folks, it was decided that Ruby would move to native threading in Ruby 2.0. In Ruby 1.9, a different interpreter was swapped into the standard Ruby distribution. 1.9 adds a threading model it calls Fibers, which as far as I know are a more efficient implementation of green threads.

In short, Ruby’s threading model is a poorly documented mess.

Perl

Perl has an interesting threading model, one which Mozilla borrowed for SpiderMonkey if I’m not mistaken. Instead of having a global interpreter lock like Python, Perl makes all global state thread local and spawns off a new interpreter with each new thread. This allows for true native threading. There are two catches though.

First, you must explicitly make variables available to threads outside your own. This is the nature of everything being thread local. The values must then be kept up to date across threads.

The second catch is that every new thread is very expensive to create. The interpreter is not small, and duplicating it with every thread makes for a lot of overhead.
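I can't show Perl's model directly in Python, but `threading.local` gives a rough analogy for the "thread local unless explicitly shared" idea. This is an analogy only, not how Perl implements it, and `local_state`, `shared_results`, and `demo` are hypothetical names. Each thread gets its own copy of `local_state.value`, while anything meant to be seen by other threads has to go through the explicitly shared dictionary.

```python
import threading

local_state = threading.local()  # each thread sees its own attributes
shared_results = {}              # explicitly shared between threads
shared_lock = threading.Lock()


def worker(name):
    # This assignment is private to the current thread.
    local_state.value = name.upper()
    # Publishing a value to other threads is a deliberate, separate step.
    with shared_lock:
        shared_results[name] = local_state.value


def demo():
    shared_results.clear()
    local_state.value = "main"
    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The calling thread's local value is untouched by the workers.
    return local_state.value, dict(shared_results)


if __name__ == "__main__":
    print(demo())
```

The difference in Perl is that this isolation is the default for everything, which is what makes sharing explicit and thread creation expensive.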

Erlang, JavaScript, C# and so on

There are a lot of other models out there that people play with from time to time. Erlang, for instance, has a shared nothing architecture that forces you to use lightweight, user-land processes over threading. This is actually an outstanding architecture for parallel programming since it takes out all of the headaches involved with synchronizing memory, and the processes are so lightweight you can generally just spawn as many of them as you want.
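The message-passing discipline can be sketched in Python with queues, with the loud caveat that Python threads are nothing like Erlang processes: they are far heavier and share memory by default. The names `doubler` and `demo` are made up. The worker only touches what arrives in its inbox and only communicates by putting results in its outbox.

```python
import queue
import threading


def doubler(inbox, outbox):
    """Receive messages, reply with each value doubled, stop on None."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: shut down
            break
        outbox.put(msg * 2)


def demo(values):
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=doubler, args=(inbox, outbox))
    worker.start()
    for v in values:
        inbox.put(v)
    inbox.put(None)
    worker.join()
    # One reply was produced per input value, in the same order.
    return [outbox.get() for _ in values]


if __name__ == "__main__":
    print(demo([1, 2, 3]))
```

Because the two sides never touch each other's data directly, there is nothing to lock. Erlang enforces this at the language level instead of leaving it to convention.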

JavaScript is usually not thought of as a language that supports threading, but it needs to support it for a browser implemented largely in JavaScript like Mozilla. Its threading model is very similar to Perl’s.

C# uses native threads.

Well, I hope that makes the whole threading picture a little bit clearer. Please let me know if anything is confusing or if I messed anything up. I don’t know everything, after all.

Social Values are a Poor Substitute for Policy

I am a Republican. I may or may not vote McCain in this election, but I consider myself to be a Republican. That being said, I really wish the Republicans would shut up about social values.

The party has been transformed under President Bush. Republicans have always been about family values, but since Bush took office, it’s become a central plank in the platform. With McCain I thought we had found a candidate who was going to return to talking about size of government, fiscal responsibility, and corruption. Then he picked Sarah Palin as his running mate and suddenly the media is buzzing about abortion and gay rights again.

I do not care about those issues.

The Bible does not say anything about when a fetus becomes a human. It says that we should not kill, but does not go so far as to define the exact moment we should regard a fetus as a human being. People can argue the issue however they please, but in the end it’s always a matter of personal belief. I am personally against abortion in the most broad sense, but if my neighbor’s daughter gets pregnant at 16 and does not think of abortion as taking a life but instead saving her own, then she is free to believe that and act on that as she deems fit. It is not my place to impose my beliefs.

Unfortunately, abortion has become a huge issue in national politics. There are few pro-life candidates who have gone as far as to say they will only nominate judges who will overturn Roe vs. Wade. This is because saying that would be reckless and irresponsible. Judges are supposed to be non-partisan, and to the extent that they are not, they are still supposed to hold up the precedents of the previous courts unless new evidence comes to light. A judge who is hell bent on overturning Roe vs. Wade would probably not make a good supreme court judge. The decision is not going to be overturned anywhere in the near future. Instead, the topic has just become a litmus test to see if candidates believe what you believe. If a candidate does not specifically say he or she is pro-life, then they are immediately ostracized by the evangelical right even though it will most likely make absolutely no difference in their governance.

A similar issue is gay marriage. Homosexuals sometimes make me uncomfortable when displaying affection, and the bible speaks out against them in a number of places. It never, however, says that it is the place of society to judge them. Instead, it says quite the opposite in “judge not lest you be judged yourself”. It says that it is God’s duty to judge, and I hope in this case he is kind. What homosexuals do in their own home is of absolutely no consequence to me, and I could not care less whether they get a tax break or not. If they regard themselves as married, who is the government to say that they are not? Does the government question the legitimacy of a shotgun marriage that is the result of an accidental pregnancy?

The point of this post is not to say that these issues are not important. They are important. However, they are also personal. The government has very little business in dealing with them, and the fact that they enter our elections as much as the state of the economy or foreign affairs bothers me to no end. Why oh why can’t we just let this go?

The Web OS

There has been some speculation about Chrome being the central piece of Google’s web OS. There has also been some entertaining, if not entirely constructive, disagreement. It got me to thinking about how the web as a platform does work and how it should work before it can be a viable desktop replacement.

The web as a platform has some tremendous advantages. Software is always up to date, it’s connected to your friends and family, it’s consistent across machines, and there’s a low barrier to entry. Users don’t need to install anything, designers don’t need to learn how to program, and programmers can work in fun, high level languages. Awesome.

There are a lot of problems, however. The first is the obvious one. The web was never intended to be anything besides a publishing platform, and so web applications have to push browsers to their limits to achieve the kind of functionality users are accustomed to. Even with the amazing techniques and hacks that have been developed, the experience still does not compare. Web applications cannot easily access local storage. They cannot do heavy calculations. They cannot offer the vast array of interactive components that we expect in feature laden desktop apps.

This has caused a number of good things to come about. There is an emphasis on simple applications that do one thing very well. These applications (such as those made by 37signals, Gmail, and Meebo) all use a bit of JavaScript to improve the experience, but keep their core functionality to a minimum. This keeps the interface responsive, and makes it easy for users to learn. There have also been a number of great JavaScript frameworks developed, many of which try to take out the painful browser discrepancies while keeping the inherent flexibility of the web platform. JavaScript interpreters have gained a lot in the past year, and stand to improve a lot more with innovations like V8 and the sophisticated interpreters being developed by the WebKit and Mozilla teams.

This is all great, and bodes well for the browser having the capabilities to deliver ever more sophisticated applications over the web. However, there are some basic limitations that make it exceptionally difficult to move all applications to the web. They are mostly centered around consistency and accessibility.

Consistency is vitally important to usability on a real platform. On my Mac, every program I go into has menus in the same spot. All the buttons look the same. All the windows look the same. All the fonts are the same. All the shortcuts are the same. On my Windows box, all these things are different from my Mac, but they are generally similar and they are consistent across Windows applications. This makes these applications usable as a platform.

Web applications do not offer this. Every menu is custom designed and can be placed anywhere. Buttons do not look the same. There is a mismatch on what should be a link and what should be a button. Keyboard shortcuts exist on some sites, but they are so drastically different from the ones that I’m used to and other sites that they are essentially worthless. The customized, expressive power of the web is at once its greatest strength and its greatest weakness.

Until there is a standard, web applications will be less usable than their desktop counterparts. When every application is an entirely new experience, the user cannot begin at some basic building blocks for learning a program. Without this initial state, there is a limit to how complex a program can be. It is very difficult for a web program to be more complex than what you can learn from scratch in 5 minutes, because if it is, nobody will use it. This puts a ceiling on the capabilities of web applications that will not be solved by the fastest JavaScript interpreter.

Accessibility is another issue. By accessibility, I don’t mean optimizing a site for the disabled, though that’s an important limitation too. I mean having a set of applications that everybody can and does use. When a person buys a Windows PC, they can boot it up and use Microsoft Mail with whatever email address they have. They kind of know how to use it since it is the same as every other email client they have ever used. There’s a program for pictures, there is a browser, there is a calendar, there is a chat client, there is a music player. There is probably at least a basic word processor. All of these things are great because the user can just use whatever came with their machine.

What happens when you take away the machine defaults and give them a browser instead? First off, many users will not know where to go unless some defaults are offered. Assuming there should be defaults, how do we choose them? There is no web site that I know of that lets you work with any email address. The computer will need to figure out which site should be your email program as part of its setup process. Which word processor should be used? Which search service? Which music site? Which calendar? If the user used Flickr before switching and the browser’s default photo application is PhotoBucket, will they understand that they can use Flickr? That they can change the “Photos” shortcut to point to Flickr?

The answer to the accessibility problem is easier than the answer to the consistency problem. The defaults will be chosen by whoever ships the browser, and for the sake of example we will assume that’s Google. That makes Google the same as Microsoft, and that’s terrifying in many ways. However glum that might be, it’s not an interesting problem. We either have a dominant player or we don’t.

Solving the consistency issue is interesting. Some standards have started to arise, but they vary so much they’re hardly recognizable. Generally, the user name associated with the current user is in the top right of the screen, along with account settings. The top bar is usually branded and contains some broad navigation elements. Program functions are listed on the left or right side. However, any convention that has been established is loose and is only specific to a very small number of very common elements. Convention, if it’s working at all, is going at such a slow pace I would call it irrelevant.

If we cannot wait for convention to solve the consistency problem, then what can we do? The answer I see is to draw up a standard. The standard would be very specific; it should specify where menus should be placed, what keyboard shortcuts should be used to do what, how dialog boxes should behave, how forms should be presented, what should be a button, etc, etc. It would be possible to create tags to enforce the specification, but I think it would be useful now if it was just the established way of writing web applications. By updating JavaScript libraries and templating systems to support the specification, it would be as easy for the developer to support as if there were new tags made available. A developer would not need to be intimately familiar with the specification if the libraries they used to work with their web UI were.

Agreeing on a specification is another issue, and perhaps a pipe dream. There should probably be user interface experts involved in creating it, and even with that, nobody will agree. Perhaps the best solution is just for somebody (I hope not me) to draw something up and discuss it until it stabilizes a bit. In the end, a set of specific guidelines for how to design user interfaces in web pages will never be useful until the big producers of web applications and the major JS and templating libraries support it.

Will that ever happen? We’ll see. However, until there’s some consistency, there can never be a web OS.