The fighting's been fun and all, but it's time to shut up and get along
February 10th, 2010, at 11:37 a.m.
About once a week, I get an email in my mailbox that reads like this:
Hey, Kiln looks neat, but Git is totally the bee’s knees, so why the fuck are you using Mercurial?
Note that these emails are rarely (if ever) actually interested in why Kiln chose Mercurial; what they’re instead interested in is trying to piss me off enough that I get into a flamewar about why Mercurial is going to bring about Nirvana while Git causes people to eat babies using nothing but A1 sauce and a spork.
This is stupid.
Mercurial and Git are both DAG-based DVCSes. They use the same patch format. They both handle directories implicitly. They both can autodetect file deletions and renames. They both run on just about every platform I can think of. They both do a bunch of “cool” stuff, like rebasing, editing history, signing changesets, and serving as the inspiration for Ferris Bueller. They both have nearly identical performance characteristics, they both have social code sharing sites, they both are used by really big projects, they both have really good documentation, and they both got compared to James Bond or MacGyver or something like that in an analogy I didn’t really follow.
So why is there so much hating? I think what’s going on is that people are coming to these tools from Subversion or CVS, have their massive epiphany on how totally awesome DVCSes are, and then assume that only their tool can have this level of awesome, so they begin evangelizing. The problem, of course, is that the other guy feels the same way, and is also evangelizing, so the Git and the Mercurial guy end up in a locker-room-style temper-tantrum over whose tool has the best performance or whatnot, instead of how much more awesome their tools are than the competition.
This has to stop.
Mercurial’s enemy is not Git. Git’s enemy is not Mercurial.
Their enemy is Subversion.
For example, take a look at this tripe. Anyone who has seriously used Mercurial or Git for any length of time is going to spend most of the video alternating between laughing, peeing their pants, and trying to explain that they merely spilled a bunch of water on their crotch, honest. While there’s some truth to the video—a lot of people do ditch Subversion for off-line commits or for shelving or the like—the video also totally misses why people love and stay with DVCSes, which is that they make branching and merging actually work like they’re supposed to, and make source control so fast and seamless that you find you’re suddenly using it for everything.
But if you show that video to a Subversion user, they’re going to nod. And there’s really no reason for them not to: if you don’t grok what the DAG gives you, if you’ve never kicked an entire branch back-and-forth across a LAN, if you’ve never used a site like Bitbucket or GitHub, then the only tangible benefits you see from DVCSes are…well, partial commits, and the fact that their “checkouts” aren’t littered with .svn directories all over the place. And these are problems that Subversion’s upcoming versions actually might solve.
So it’s time to focus on the real enemy: the holdouts still using centralized systems. The way I see it, there are three parts to this:
- Git and Mercurial need to do a better job handling one thing that Subversion is still better at: binary files. I know there are Git projects that are working on improving the transfer and storage of binary files within Git’s existing repository format, such as git-bigfiles. Mercurial is taking a slightly different tack through projects such as bfiles, which aim to deliberately move large binaries out of the store. We need to improve these workflows so that “DVCSes are totally better, unless” becomes simply “DVCSes are totally better.” (For our part, Fog Creek is helping fund development of bfiles via UCOSP, a semester-long student project.)
- Git and Mercurial advocates need to remember that they need to be converting Subversion, CVS, and Perforce users, not each other. The fundamental evil here is centralized, branchless version control systems that effectively encourage all development to occur in trunk, and which make propagating bug fixes and features properly borderline impossible for all but the most disciplined shops.
- Git and Mercurial advocates need to remember that anyone going to either system is a win for both communities. It’s easy, in the yin/yang of Hacker News and proggit, to forget that most developers are not even aware of what DVCSes are or what they do. Yeah. Sounds crazy, I know, but trust me on this. The goal right now, if you honestly believe that the DVCS workflow is better—and I do—should be to get the mindset out there, to make more people aware of what DVCSes have to offer and why they should be using them. I for one definitely do not care whether you end up deciding that Mercurial is better than Git for you or not (well, I kind of do, because I want you to use Kiln, but otherwise…), but I get a warm fuzzy feeling knowing that, if I ever have to work with your code, I’ll be able to use a sane version control tool.
So let’s do it. No more fighting. Git, I hereby acknowledge you rock. And Mercurial, you rock so much I helped build an entire product around you. You’re both awesome. So you two shake hands…very nice. Now, see that other dude over there? The one with no real tagging or branching support?
Let’s get him.
Kiln's Evolution, Part 2: From Prototype to Beta
February 10th, 2010, at 8:12 a.m.
This article is a continuation of Kiln’s Evolution, Part 1: DVCS as Code Review.
In the fall of 2008, Joel was getting increasingly adamant that FogBugz needed source control integration, and most people in the company seemed to think Subversion would probably be the best SCM to make that happen. Tyler and I disagreed, believing strongly that we should use a DVCS instead, and that our code review tool gave a really compelling example of why DVCS was better that any software shop would instantly “get.” But to convince the rest of the company, we’d have to show them a version of our tool that was more polished and usable than what we’d submitted to Django Dash.
And so we began a skunkworks project.
Unable to use time at work on much besides Copilot, I instead used my week of Thanksgiving vacation cleaning up the prototype’s user interface and functionality, named the result Kiln, and gave it its first logo. Tyler spent evenings in December making Kiln a proper, pluggable Django application, made the UI actually usable, and fixed a pile of bugs that would have blown up in our face if we’d tried showing Kiln to anyone else. By January, 2009, Kiln was ready to demo.
Kiln’s interface, directly after the winter skunkworks changes.
After lunch on a cold winter day, Tyler and I dragged everyone out into the kitchen and demoed the current state of Kiln. We showed repository management and the FogBugz-inspired code review workflow, and then made the case that this, or something very similar, should be FogBugz’ source control system.
And an amazing thing happened: somehow, everybody basically agreed. Sure, some people thought Kiln should be in C# or Wasabi instead of Django, no one could agree on whether Kiln should be a direct part of FogBugz or be an independent product, and Tyler and I argued strongly for Kiln as a hosted-only solution to a bunch of people who knew their bread-and-butter came from licensed applications, but everyone agreed that the basic idea of a Mercurial-powered SCM with DVCS-backed code review made for a compelling product. And so Kiln was born.
For Kiln to develop into an adult, though, we had to assemble a Kiln team. Tyler and I were still working on Copilot, and the newly appointed team lead, Ben Kamens, was busy with the FogBugz 7 release. Even if all three of us started work immediately, we couldn’t possibly turn the project from prototype to beta by our target date of August 2009, and starting immediately seemed…well, optimistic, at best.
But we work at Fog Creek, and if there’s one thing Fog Creek knows how to do, it’s how to help interns churn out awesome products over the course of a single summer. After all, Tyler and I started at Fog Creek by developing all of Copilot in the summer of 2005; why not go for broke and try for a repeat? What we therefore decided to do was to bet the farm and put all of our summer interns on Kiln. The three of us would try to wrap up the work we had to do on our current projects as quickly as possible, and, as we transitioned off, we’d focus purely on building up enough Kiln infrastructure that the interns could immediately be productive when they arrived. Meanwhile, until the three of us could start work on Kiln, we’d have our project managers figure out the details of the user experience so that, once we finally could work on Kiln, we’d be able to focus as much as possible on coding instead of decision-making meetings.
As we moved ever closer to June, we made several key decisions about the design of the product:
- Kiln could launch hosted-only, but we’d need to ensure that its design was amenable to on-site installation.
- Kiln would depend on FogBugz for user management and bug-tracking integration, but would otherwise be its own code base.
- Kiln’s website would be written in C# and ASP.NET MVC, completely freeing it from the FogBugz legacy code base.
- The part of Kiln that needed to talk directly to the DVCS would be a separate component so that we could target different (or even multiple) SCMs without changing the website.
- Code reviews on branches would be eliminated in favor of arbitrary discussions on files and changesets.
By the time the first interns arrived, we had a beautiful set of specs with lovely Balsamiq Mockups put together by Jason and Dan, and we’d managed to cobble together a basic framework that supported repository hosting and FogBugz integration, and that learned as much as possible from our best example of a known-good ASP.NET MVC code base, StackOverflow.
Our plan paid off ridiculously quickly. A week into the internship, the interns had already managed to get key pieces of Kiln limping along. By the end of the second week, they had enough ownership they were starting to challenge us when they felt the user specs or the engineering didn’t make sense. Their strong focus on core Kiln freed Tyler, Ben and me to focus on performance, billing, On Demand integration, and all the other things that absolutely must get done for a real product, but that no one would otherwise ever do.
Just over a month into the summer, Kiln might not win any speed or beauty awards, but nearly all of its features were working in one way or another, and it was usable for its intended purpose. In other words, we’d hit pre-alpha. With great fanfare, we decided that Kiln was ready for dogfooding, and Kiln development moved to Kiln itself.
There’s a slightly unfortunate thing about dogfooding, though: features that looked great on paper, and even worked perfectly in the prototype, end up not being what you want in the real product. Some interfaces end up not scaling the way you want. Some end up too complicated, or end up solving one particular problem at the expense of all others. It’s a testament to our PMs that we had comparatively few of these occur, but sometimes the difference between the prototype and what we ended up shipping was massive. For example, compare the Balsamiq mockup of the code review system, which was actually used by the pre-alpha:

with the version that we ended up actually shipping:

Or take a look at the original specification for the Kiln Dashboard:

compared to the shipping equivalent, the Activity Feed:

(I apologize about using the mockups, rather than screenshots, for the earlier versions; trying to get Kiln circa June 2009 running at this point proved a royal pain in the butt, and I don’t honestly think that it makes a big difference.)
What’s not obvious in these two screenshots is that the change from the pre-alpha interface to the shipping interface frequently happened over the course of just a couple of weeks. Everyone was very vocal about what they liked and didn’t like, and the interns were happy to go through several iterations rapid-fire to find one that everyone liked. In that way, the weakest parts of Kiln ended up getting the most attention, and rapidly matured into some of its strongest features.
In a massive code sprint at the end of the summer, Kiln matured into something resembling a fully grown product. One of our interns made a beautiful JavaScript renderer for Kiln’s DAG, the novel repository management our PMs designed fully matured into beautiful JavaScripty goodness, our FogBugz/Kiln workflow became increasingly seamless, and we further loosened review requirements so that you could review arbitrary discontiguous changesets. We transitioned Copilot and FogBugz to be hosted on Kiln as well, got mostly positive feedback from the rest of the team, and worked on swiftly addressing their complaints. Despite these feature additions, Kiln’s performance went from tolerable, to better, to fast. We knew we had a winner on our hands. We prepared the FogBugz On Demand environment to become Kiln On Demand, made our first deployment, and turned the switch for our first batch of beta users.
And while you might expect this part of the story to be about everything going haywire and all hell breaking loose, what actually happened is that, against all odds, everything basically worked. The beta was quite boring: while there were a lot of bugs to fix at first, and some of them were extremely tough (for example, supporting very large repositories, or making history views faster, or legitimately supporting Internet Explorer) or really ticked off our customers (Kiln at one point let you rename and move repositories without breaking URLs, which sounded like an absolutely great idea when I helped hammer it through the design committee, but which completely blew up in one of our clients’ faces a few weeks later), everything basically worked. Our beta testers seemed increasingly excited about the product. All of our gambles seemed to have paid off as we readied Kiln 1.0 for its November launch date.
Except, of course, that Kiln did not ship in November, 2009. That’s because, just a week or two before it was supposed to ship, we went to the Business of Software conference in San Francisco.
You see, that was where we realized we were doing it all wrong.
To be continued…
Firing Up Kiln
January 13th, 2010, at 8:48 a.m.
As Kiln draws ever closer to release, I realized that we have long since passed the point where I should move all of my personal projects to it.
So as of today, I have.
If you’re interested in grabbing the most recent version of FogBugz Middleware, my fork of Kiln Backup, or any of my other public projects, they’re now all at https://bqb.kilnhg.com. Log in with the user guest and the password anonymous, and you’ll have full read-only access to the entire site.
Even if you’re not that interested in the stuff I’ve written, you still may be interested in checking out the site: by enabling read-only guest logins on my FogBugz account, I’ve made it very easy for you to take a look around a real, active Kiln and FogBugz install, without setting things up or filling out any annoying forms. If you just wanted to get a quick feel for what Kiln looked and felt like, now’s your chance.
So go ahead and check it out. Feel free to post questions or comments here, or (if appropriate) over at the Kiln StackExchange, and I’ll be happy to answer them.
On Being Good
January 12th, 2010, at 9:42 p.m.
Google’s motto is, “Don’t be evil.”
I’ve always found that motto disturbing for two reasons. First, a company that can differentiate itself—successfully, no less—from its competitors merely by promising not to be evil implies that the average company is ridiculously corrupt. A person who announced, “My motto is, ‘don’t shoot people’” would be notable because no one thinks you should shoot people, making the promise weird and redundant—not because the promise represented some great sacrifice. Yet Google’s promise to do no evil somehow hits people, especially those in the tech industry with fresh memories of Microsoft in the 90s and the specter of Oracle in the 2000s, as a breath of fresh air. Great for Google, but pathetic for our industry.
But the second reason, and the more important one for me, is that “Don’t be evil” is not the same as “Do the right thing.” A person who watches idly while a bully beats someone up isn’t being evil, but they are being a coward, and they are not doing the right thing. Their interference could save a poor victim a world of pain and suffering, probably at minimal risk. Instead, they simply watch the bully, knowing that they themselves would not do the same thing. This may not be doing evil, but it’s also not the moral high ground. Knowing you would never beat someone up is not the same as protecting those weaker than you.
Google chose its motto carefully; if its motto were instead, “Do the right thing,” then it would have no presence in China. For all the corruption that people accuse our government of perpetrating, our government does not censor the Internet, does not shoot and incarcerate those who disagree with it, does not deny its citizens the right to vote, and does not persecute religious minorities as a matter of state policy. China does. And until today, while Google may not have been evil in China, they certainly enabled evil to go about its business by running a censored search engine there. They were unequivocally better than Yahoo, who handed over the names and email addresses of dissidents, but they weren’t doing the right thing, either. They weren’t standing up to an autocratic, dictatorial regime.
As of today, that has changed. Google has announced that they will no longer censor their Chinese search results. While you could argue that Google’s doing this out of anger that their resources have been hacked, rather than out of a genuine desire to protect its users, the result of their actions is beyond dispute: they are taking the moral high ground. And potentially at great cost: while China has certainly failed to materialize as the unstoppable threat to the West that pundits were claiming it would become two decades ago, it’s nevertheless home to well over a billion people, and shows no sign of stopping its economic growth in the near future. For Google to make a move that will almost certainly sacrifice any chance they have of winning the Chinese market is an economically painful move.
But it’s the right move.
So, at least for today, at least this once, look at Google as a company which is not merely avoiding perpetrating evil. Google is doing the right thing, at great cost. And they deserve to be lauded for that.
Microsoft and Yahoo: this is your turn to follow in Google’s footsteps. Do the right thing. It won’t make you money. In fact, it’ll cost you. But it’s the right thing to do.
Google: where it’s not don’t be evil. It’s, “Do the right thing.”
Congratulations.
The Amazing Spammable Marketplace
January 7th, 2010, at 4:05 p.m.
Whenever I browse the Android Marketplace, I’m utterly amazed by how many “app reviews” are nothing but spam. The problem is so pandemic that I have to conclude that Google has thus far done absolutely nothing to combat the problem. Shopping in the Marketplace ends up feeling like going through a dirty bazaar, surrounded by panhandlers and con artists looking to make a cheap buck. I don’t care how good the deals may be; if shopping ends up being an annoying experience that makes me feel dirty, I’m unlikely to bother going in the first place. In the Marketplace, that translates to finding nothing but crap reviews, making shopping for any given application basically a crap shoot. Plus, the apps themselves end up looking like schlock you’d find on a spyware site.
If Google’s serious about trying to compete with the iPhone App Store, they need to get off their butts and fix this problem right now.
Droid Update Makes Droid Not Suck
December 11th, 2009, at 11:07 a.m.
Well. At least it makes it suck less.
I bought a Droid the day it came out. While it was a tremendous improvement over my BlackBerry, I’ve been disappointed with the phone overall. The battery cover comes off constantly ([2], [3]), the phone’s proximity sensor was extraordinarily finicky (usually resulting in me hitting the “mute” button with my cheek in the middle of a call), the camera was all but useless, and, for reasons I did not really understand, my Android developer phone running Android 1.6 provided a much smoother user experience than the vastly-more-powerful-on-paper Droid. In other words, the Droid was a solid upgrade from what I had, but still disappointing. I have to agree with Dave Winer’s now-famous rant on why the Droid sucks.
Last night, Motorola and Google unleashed Android 2.0.1 as an over-the-air update. While the update does little about the battery cover, it seems, at least so far, to resolve nearly all of the software issues. The proximity sensor’s logic seems improved, though not perfect; many operations are visually smoother (although oddly still not universally as smooth as the G1); the camera’s usable for taking pictures, rather than mocking the incompetence of Motorola’s engineers; and there have even been some very nice visual refinements to fonts and color schemes. Best of all, and unusual for the first update to a new device, nothing broke: sudoku, SpacePhysics, Twidroid, TripIt, and other applications seem to still be working just fine.
So if you were previously hesitant about buying a Droid for software reasons, and don’t really have a problem using $2 double-sided tape to compensate for Motorola’s QA team having the same skill as an inebriated eight-year-old, I think you’ll be much happier with your purchase now than you’d have been a month ago. Otherwise, you might want to wait for the next Android-powered phone on Verizon and see if it works better. It’s certainly unlikely to be worse.
Kiln's Evolution, Part 1: DVCS as Code Review
November 10th, 2009, at 8:41 a.m.
One of the things that really sucks about doing online code reviews is that, in all the systems I know, your code reviews do not integrate with your source control. If the code reviews are versioned at all—and they’re frequently not—then they’re in an entirely different system than your real VCS. For larger reviews, where you’re talking about a major piece of functionality, that means that your source control system will end up lacking the history of how a feature came to be. In other words, the more you use code reviews, the less actual history you have in your VCS.
That’s totally broken. You’re being punished for doing the right thing.
A little over a year ago, Tyler came to me and asked me to join him in the Django Dash, a weekend code sprint. Tyler and I had been talking about the code review problem, and had been thinking of writing our own tool that lacked these issues. Django Dash seemed like the perfect time to try to actually do that.
That opened up a question: how do you actually achieve better code review? If you accept that a huge part of the problem is that the code review history is out-of-stream with your VCS, then it follows that you have to somehow store the in-process code in the VCS.
In most systems, that means using branches. But branches in almost any system suck. Everyone has a horror story about trying to do a merge in Subversion, or CVS, or Perforce—and these are usually not meaningfully large merges; just small feature branches. Trying to use branches for long-running code reviews in these systems simply isn’t viable.
But DVCSes are great at handling this kind of problem. To be distributed, they have to have extremely robust branching and merging systems. Because their systems are so good, it’s very common in DVCSes to do quick experiments and features in their own branch, then merge when complete.
Tyler and I were both big fans of Mercurial (in fact, we convinced all of Fog Creek to switch to Mercurial from Subversion), so using Mercurial as our DVCS base seemed like the best bet. After some discussion of the technical details for making the system work, we got a good night’s sleep, woke up early, threw a nice breakfast on the table, and started coding.
Forty-eight hours later, we had our first prototype. When users wanted to contribute code to a repository, they would fork the repository, push all of their changes to the fork, and then request a review on the fork. Users would see the exact diff of what they would be approving to the repository; no more, no less. Code could not be approved unless it had already merged in the trunk, ensuring that the user who wrote the code had taken care of the merge. When the review was approved, it’d be seamlessly merged into trunk (guaranteed seamlessly, due to the previous rule), with full history.
The design was inflexible and unintuitive, and would have had serious issues in a shipping project, but we achieved what we set out to do:
- Approving a code review was the same as pushing it
- Which meant that we could fully separate the concepts of code author and code approver
- And which meant that the full history of reviewed code was completely preserved
A screenshot of the prototype that would later become Kiln
Even in its nascent form, the tool was already impressive enough, and unique enough, that we won the Django Dash.
Tyler and I talked of making our code review tool into a real product, but we were knee-deep in Copilot work, so it had to wait. But we had proved, if only to ourselves, that using a DVCS, even in a centralized model, provided some very unique capabilities that simply were not possible in other systems.
When Joel announced a few months later that FogBugz needed a source control system, we’d be ready.
A Typographer's Captcha
October 15th, 2009, at 6:07 p.m.
I’m not entirely sure that this qualifies as a reasonable captcha for mere mortals.

Perhaps on a typographer’s site…
The Launch of a Secret Product
October 14th, 2009, at 9:12 a.m.
For the past year, an odd thing has happened, if you’ve followed my doings. My work on Fog Creek Copilot seemed to dwindle, I became tight-lipped about what I was working on, and I started getting really excited about an upcoming product release. Also around this time, my knowledge of Mercurial, Python, C#, and ASP.NET MVC all seemed to dramatically increase, even though my free-time code output shrank to nothing. What was going on?
Oh, the usual. I was working on a top-secret brand-new project. And now, it’s released to closed beta.

I’d like to introduce to you Kiln, a brand-new source code hosting and code review tool from Fog Creek. Kiln introduces what I believe is a truly novel take on code reviews that integrates the strengths of Mercurial and FogBugz to provide rejectable code review after commit. We also have some unique takes on Mercurial features, such as the ability to preview what will go into a merge beforehand, really awesome branch management, the most beautiful DAG view I’ve seen in any DVCS product, and lots more.
Over the next few days, I’ll be providing a couple of blog posts detailing how Kiln was developed. In the meantime, go check out Kiln and sign up for the beta. We’re approving new people for the beta on a regular basis, so if you don’t get an invite immediately, don’t worry; we’ll get to you sooner rather than later.
hg log -R tips_and_tricks
October 9th, 2009, at 7:54 a.m.
I was delighted to find that Steve Losh has begun making a website called hg tip—a site updated on a regular basis with Mercurial tips for both beginner and expert users. The site’s beautifully designed and a pleasure to read. If you use Mercurial, do yourself a favor and go take a look.
(My favorite tip, incidentally, is definitely the tutorial on making a command called nudge, which allows you to push only the current head by default, rather than all of them. I’ve been using a variant of that, combined with bookmarks, to have Git-style lightweight branching in my Mercurial work when appropriate.)
Finally, a Phone I Can Code For
October 6th, 2009, at 3:53 p.m.
Finally, a phone whose dev program doesn’t make me want to vomit. Now if only Palm would get the Pre out on Verizon faster, I might actually do so…
local_settings.py Considered Harmful
October 2nd, 2009, at 4:31 p.m.
One piece of increasingly conventional wisdom when developing Django applications is that your settings.py file ought to conclude with some variant of
try:
    from local_settings import *
except ImportError:
    pass
the idea being that you can put a local_settings.py file on each server with the appropriate information for that site.
I would argue that this approach is wrong. By definition, you cannot put local_settings.py in version control, which means that a critical part of your infrastructure—how to deploy the thing—is not getting versioned. At the same time, using the same settings for your development box and your deployment box is unreasonable, and usually stupid.
After being unhappy with several solutions, including the one Simon Willison advocated in his excellent Django Heresies talk (which has a pile of other great tips as well), I stole a bit from everything I’d seen to come up with what I believe is my own design.
Rather than importing from local_settings, I decided to make settings be a full-blown module and use some Python magic. First, kill settings.py and make a settings directory. Then, add the following to settings/__init__.py:
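import importlib
import socket
# Turn "web1.example.com" into the module name "web1_example_com".
hostname = socket.gethostname().replace('.', '_').lower()
try:
    # Look for settings/<hostname>.py inside this package.
    custom_settings = importlib.import_module('.' + hostname, __name__)
except ImportError:
    # No file for this machine, so fall back to settings/development.py.
    from .development import *
else:
    # Django only reads upper-case names, so copy those into this module.
    globals().update((k, v) for k, v in vars(custom_settings).items() if k.isupper())
While this is definitely somewhat magical, it’s not too hard to follow: grab the hostname, convert it into a more Python-like form, and attempt to import it from the settings package. If that succeeds, copy its upper-case names (the only ones Django reads as settings) into the package’s own namespace. Otherwise, default to the development import.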
With this change, we can now have a file for each machine we intend to deploy on, named after the machine, put into the settings directory, and it’ll be used automatically. Now, you’ve got the best of both worlds: site-specific configuration, managed by source control.
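If you want to poke at the lookup outside a Django project, here is a minimal sketch of the same idea as a pair of plain functions. The names settings_module_name and load_host_settings, the fallback parameter, and the example hostname are all made up for illustration; none of them come from Django or from the snippet above.

```python
import importlib
import socket


def settings_module_name(hostname):
    """Map a hostname such as 'Web1.Example.com' to 'web1_example_com'."""
    return hostname.replace('.', '_').lower()


def load_host_settings(package, hostname=None, fallback='development'):
    """Import <package>.<hostname module>, or <package>.<fallback> if missing.

    Returns a dict of the module's upper-case names, which are the ones
    Django treats as settings.
    """
    name = settings_module_name(hostname or socket.gethostname())
    try:
        module = importlib.import_module('.' + name, package)
    except ImportError:
        # No per-machine file exists, so use the fallback module instead.
        module = importlib.import_module('.' + fallback, package)
    return {k: v for k, v in vars(module).items() if k.isupper()}


print(settings_module_name('Web1.Example.com'))  # web1_example_com
```

One thing this makes easy to see: a per-machine file that exists but fails on one of its own imports also raises ImportError, so the fallback would quietly kick in. Whether that is acceptable depends on how much you trust your deployment checks.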
There’s one little caveat here that may or may not bother you: one piece of data that generally goes into local_settings.py is your database password, which you probably do not want in version control. Thankfully, that’s trivial to work around: on the server, create a settings/passwords.py file that is not under version control and that contains only the relevant passwords for that system. Then, at the tail end of that site’s settings file, add:
from .passwords import *
It’s the same trick we used to use for local_settings.py, except that now, the only thing that’s not version-controlled is your password. Everything else—memcached settings, custom middleware and template directories, ADMINS lists, and so on—will be safely in your VCS of choice.
The One in Which I Call Out Hacker News
July 1st, 2009, at 8:26 a.m.
“Implementing caching would take thirty hours. Do you have thirty extra hours? No, you don’t. I actually have no idea how long it would take. Maybe it would take five minutes. Do you have five minutes? No. Why? Because I’m lying. It would take much longer than five minutes. That’s the eternal optimism of programmers.”
— Professor Owen Astrachan during 23 Feb 2004 lecture for CPS 108
Accusing open-source software of being a royal pain to use is not a new argument; it’s been said before, by those much more eloquent than I, and even by some who are highly sympathetic to the open-source movement. Why go over it again?
On Hacker News on Monday, I was amused to read some people saying that writing StackOverflow was hilariously easy—and proceeding to back up their claim by promising to clone it over July 4th weekend. Others chimed in, pointing to existing clones as a good starting point.
Let’s assume, for the sake of argument, that you decide it’s okay to write your StackOverflow clone in ASP.NET MVC, and that I, after being hypnotized with a pocket watch and a small club to the head, have decided to hand you the StackOverflow source code, page by page, so you can retype it verbatim. We’ll also assume you type like me, at a cool 100 WPM (a smidge over eight characters per second), and unlike me, you make zero mistakes. StackOverflow’s *.cs, *.sql, *.css, *.js, and *.aspx files come to 2.3 MB. So merely typing the source code back into the computer will take you about eighty hours.
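For anyone who wants to check that figure, here is the back-of-the-envelope arithmetic as a small script. It assumes the usual typing-test convention of five characters per word and one byte per character, neither of which is stated above.

```python
# Back-of-the-envelope check of the retyping estimate.
WORDS_PER_MINUTE = 100
CHARS_PER_WORD = 5                # assumption: standard typing-test convention
SOURCE_BYTES = 2.3 * 1024 ** 2    # 2.3 MB, assuming one byte per character

chars_per_second = WORDS_PER_MINUTE * CHARS_PER_WORD / 60
hours_to_retype = SOURCE_BYTES / chars_per_second / 3600

print(f"{chars_per_second:.2f} characters per second")  # 8.33
print(f"{hours_to_retype:.0f} hours to retype")          # 80
```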
Except, of course, you’re not doing that; you’re going to implement StackOverflow from scratch. So even assuming that it took you a mere ten times longer to design, type out, and debug your own implementation than it would take you to copy the real one, that already has you coding for about 800 hours, or nearly five weeks around the clock. And I don’t know about you, but I am okay admitting I write new code considerably less than one tenth as fast as I copy existing code.
Well, okay, I hear you relent. So not the whole thing. But I can do most of it.
Okay, so what’s “most”? There’s simply asking and responding to questions—that part’s easy. Well, except you have to implement voting questions and answers up and down, and the questioner should be able to accept a single answer for each question. And you can’t let people upvote or accept their own answers, so you need to block that. And you need to make sure that users don’t upvote or downvote another user too many times in a certain amount of time, to prevent spambots. Probably going to have to implement a spam filter, too, come to think of it, even in the basic design, and you also need to support user icons, and you’re going to have to find a sanitizing HTML library you really trust and that interfaces well with Markdown (provided you do want to reuse that awesome editor StackOverflow has, of course). You’ll also need to purchase, design, or find widgets for all the controls, plus you need at least a basic administration interface so that moderators can moderate, and you’ll need to implement that scaling karma thing so that you give users steadily increasing power to do things as they go.
But if you do all that, you will be done.
Except…except, of course, for the full-text search, especially its appearance in the search-as-you-ask feature, which is kind of indispensable. And user bios, and having comments on answers, and having a main page that shows you important questions but that bubbles down steadily à la reddit. Plus you’ll totally need to implement bounties, and support multiple OpenID logins per user, and send out email notifications for pertinent events, and add a tagging system, and allow administrators to configure badges by a nice GUI. And you’ll need to show users’ karma history, upvotes, and downvotes. And the whole thing has to scale really well, since it could be slashdotted/reddited/StackOverflown at any moment.
But then! Then you’re done!
…right after you implement upgrades, internationalization, karma caps, a CSS design that makes your site not look like ass, AJAX versions of most of the above, and G-d knows what else that’s lurking just beneath the surface that you currently take for granted, but that will come to bite you when you start to do a real clone.
Tell me: which of those features do you feel you can cut and still have a compelling offering? Which ones go under “most” of the site, and which can you punt?
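To make just one item from that list concrete, here is a rough sketch of the "don't vote on yourself, and don't hammer one user with votes" rule. Everything in it is an assumption for illustration: the class name, the three-votes-per-day limit, and the in-memory storage are invented, and none of it reflects how StackOverflow actually does it.

```python
import time
from collections import defaultdict, deque

# Hypothetical limits; the real site's thresholds are not stated in the text.
MAX_VOTES_PER_TARGET = 3
WINDOW_SECONDS = 24 * 60 * 60


class VoteRules:
    """Tracks recent votes so self-votes and repeated targeting can be rejected."""

    def __init__(self):
        # (voter, author) -> timestamps of recent votes by voter on author's posts
        self._recent = defaultdict(deque)

    def can_vote(self, voter, author, now=None):
        now = time.time() if now is None else now
        if voter == author:
            return False  # no voting on your own posts
        stamps = self._recent[(voter, author)]
        # Drop votes that have aged out of the window.
        while stamps and now - stamps[0] >= WINDOW_SECONDS:
            stamps.popleft()
        return len(stamps) < MAX_VOTES_PER_TARGET

    def record_vote(self, voter, author, now=None):
        now = time.time() if now is None else now
        if not self.can_vote(voter, author, now):
            raise ValueError("vote rejected")
        self._recent[(voter, author)].append(now)
```

That is a few dozen lines for a single bullet point, before persistence, vote reversal, or any of the interface around it.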
Developers think cloning a site like StackOverflow is easy for the same reason that open-source software remains such a horrible pain in the ass to use. When you put a developer in front of StackOverflow, they don’t really see StackOverflow. What they actually see is this:
create table QUESTION (
    ID identity primary key,
    TITLE varchar(255), -- why do I know you thought 255?
    BODY text,
    UPVOTES integer not null default 0,
    DOWNVOTES integer not null default 0,
    USER integer references USER(ID));
create table RESPONSE (
    ID identity primary key,
    BODY text,
    UPVOTES integer not null default 0,
    DOWNVOTES integer not null default 0,
    QUESTION integer references QUESTION(ID));
If you then tell a developer to replicate StackOverflow, what goes into his head are the above two SQL tables and enough HTML to display them without formatting, and that really is completely doable in a weekend. The smarter ones will realize that they need to implement login and logout, and comments, and that the votes need to be tied to a user, but that’s still totally doable in a weekend; it’s just a couple more tables in a SQL back-end, and the HTML to show their contents. Use a framework like Django, and you even get basic users and comments for free.
But that’s not what StackOverflow is about. Regardless of what your feelings may be on StackOverflow in general, most visitors seem to agree that the user experience is smooth, from start to finish. They feel that they’re interacting with a polished product. Even if I didn’t know better, I would guess that very little of what actually makes StackOverflow a continuing success has to do with the database schema—and having had a chance to read through StackOverflow’s source code, I know how little really does. There is a tremendous amount of spit and polish that goes into making a major website highly usable. A developer, asked how hard something will be to clone, simply does not think about the polish, because the polish is incidental to the implementation.
That is why an open-source clone of StackOverflow will fail. Even if someone were to manage to implement most of StackOverflow “to spec,” there are some key areas that would trip them up. Badges, for example, if you’re targeting end-users, either need a GUI to configure rules, or smart developers to determine which badges are generic enough to go on all installs. What will actually happen is that the developers will bitch and moan about how you can’t implement a really comprehensive GUI for something like badges, and then bikeshed any proposals for standard badges so far into the ground that they’ll hit escape velocity coming out the other side. They’ll ultimately come up with the same solution that bug trackers like Roundup use for their workflow: the developers implement a generic mechanism by which anyone, truly anyone at all, who feels totally comfortable working with the system API in Python or PHP or whatever, can easily add their own customizations. And when PHP and Python are so easy to learn and so much more flexible than a GUI could ever be, why bother with anything else?
Likewise, the moderation and administration interfaces can be punted. If you’re an admin, you have access to the SQL server, so you can do anything really genuinely administrative-like that way. Moderators can get by with whatever django-admin and similar systems afford you, since, after all, few users are mods, and mods should understand how the sites work, dammit. And, certainly, none of StackOverflow’s interface failings will be rectified. Even if StackOverflow’s stupid requirement that you have to have and know how to use an OpenID (its worst failing) eventually gets fixed, I’m sure any open-source clones will rabidly follow it—just as GNOME and KDE for years slavishly copied off Windows, instead of trying to fix its most obvious flaws.
Developers may not care about these parts of the application, but end-users do, and take them into consideration when trying to decide which application to use. Much as a good software company wants to minimize its support costs by ensuring that its products are top-notch before shipping, so, too, savvy consumers want to ensure products are good before they purchase them so that they won’t have to call support. Open-source products fail hard here. Proprietary solutions, as a rule, do better.
That’s not to say that open-source doesn’t have its place. This blog runs on Apache, Django, PostgreSQL, and Linux. But let me tell you, configuring that stack is not for the faint of heart. PostgreSQL needs vacuuming configured on older versions, and, as of recent versions of Ubuntu and FreeBSD, still requires the user to set up the first database cluster. MS SQL requires neither of those things. Apache…dear heavens, don’t even get me started on trying to explain to a novice user how to get virtual hosting, MovableType, a couple of Django apps, and WordPress all running comfortably under a single install. Hell, just trying to explain the forking vs. threading variants of Apache to a technically astute non-developer can be a nightmare. IIS 7 and Apache with OS X Server’s very much closed-source GUI manager make setting up those same stacks vastly simpler. Django’s a great product, but it’s nothing but infrastructure, which is exactly the thing that I happen to think open-source does do well, precisely because of the motivations that drive developers to contribute.
The next time you see an application you like, think very long and hard about all the user-oriented details that went into making it a pleasure to use, before decrying how you could trivially reimplement the entire damn thing in a weekend. Nine times out of ten, when you think an application was ridiculously easy to implement, you’re completely missing the user side of the story.
The One in Which I Say That Open-Source Software Sucks
June 30th, 2009, at 8:12 a.m.
These days, arguing that open-source software is crap seems dumb. How many websites are powered by a combination of MySQL, PHP, and Apache? How many IT applications, written in Eclipse, run on Java, using SWT widgets? How many design studios rely heavily on The GIMP and Inkscape for their everyday photo-retouching and page layout needs?
Er, wait. That last one. Doesn’t quite ring true. In fact, as good as most people seem to insist that Inkscape and The GIMP are, I’ve yet to see a major shop that ran on anything other than Adobe or Quark.
Okay, but, at least I can point to the many offices that run OpenOffice or KOffice! Or that have ditched FileMaker and Access for Kexi! Or that proudly rely on OpenGroupware for their scheduling needs!
…well. Except that I can’t.
I mean, yeah, there are some real businesses, here and there, that actually use those products, and some of them are successful. But, by and large, despite the success of open-source on the backend, open-source end-user applications have failed. In fact, when it comes to end-user applications that people other than open-source developers actually use, you’re pretty much limited to a single application: the web browser. (And even there, if you’re on Safari, then only your engine is open-source—and if you’re on IE or Opera, not even that.) And although I’m sure someone’s gonna say that open-source end-user apps are going to take over any day now, they’ve been claiming that since the height of the dot-com bubble, ten years ago. I wait with bated breath.
Why does open-source fail to reach critical mass anywhere but the server closet?
Easy: because open-source software is, incontrovertibly, a total usability clusterfuck.
Programmers are superb optimizers. We’ve been accused of being lazy in the best way possible: we try to ensure we do not solve problems that do not need to be solved, and that we solve those that we must deal with completely, so that they never bother us again.
On open-source projects, where anyone can quickly nail that one little bug that was ticking them off, that means that your software is gonna be lean (why implement what you won’t use?) and operate to spec (you don’t wanna keep dealing with that one annoying bug every day). So far, so good.
But what about using software? You only gotta learn software once—and, for those actually contributing on an open-source project, you probably learned from the bottom up, so that’s how you view the thing. Interface elements that expose quirks of the underlying implementation seem totally natural; what a user perceives as a bug strikes a developer as little more than the reflection of the underlying system.
Developers could fix this problem. They just completely lack motivation. Being “lazy” at this point means leaving the software as-is. If users find the software frustrating and unintuitive, or can’t get the thing installed, they should spend the time to learn the underlying, beautiful implementation, at which point they will discover a world of awesome and inspiring flexibility far greater than what closed-source offerings could possibly provide. And, until then, go bug the mailing list so that we can all call you an idiot.
What the developer-as-lazy argument misses is that companies, too, are lazy—and, if competently run, lazy in a good way. Employee time is money, and therefore a smart company will attempt to reduce how much time its employees must spend on a problem. If companies only employed developers, the end result would be the same as the open-source model. But they don’t. Companies also have support staff, and the amount of time support staff must spend fixing problems is directly proportional to how well-written and intuitive the software is.
Note that second part: any time that a user gets confused, and has to phone support to get through a problem, it costs the company money. The company is highly motivated to produce easy-to-understand and easy-to-use software to keep its support costs down. After all, at the end of the day, a user does not care how robust, or how elegant, or even how beautiful, the implementation of their software is; all they care about is whether they can use your video software to make glitter fly out their plastered boss’s butt in the Christmas party video.
So, if forced to choose between spending time on the elegance of the implementation or the intuitiveness of the interface, companies will optimize for intuitiveness; open-source projects, for elegance of implementation.
In the general case, the open-source emphasis is wrong, at least if you want your software to actually get used. All of the oddball exceptions in open-source usability (Firefox, Firefox 2, Firefox 3, and I guess GNOME, if you twist my arm) have massive corporations backing them up, thinking about both development and support costs, and spending the time to make sure that users have a positive first-time experience with the software. Without exception, these products are only open-source because they, as a product, don’t actually confer much value to the parent company. Mozilla ultimately cares far less about whether you actually use Firefox than whether your Google queries list Mozilla as the referrer; Sun, at least in its drunk camel of a business plan, cared more whether you were running on Sun hardware than whether your desktop happened to run GNOME over KDE. Open-source, without corporate guidance, cares more about making sure you can tweak your cluster size to match the optimal expression of K-trees on BlenderFS on inverted Xeon cache vertebrates without affecting the MIPS port, than about ensuring that a new user has the faintest idea how to do anything with that package he just downloaded.
Maybe, someday, human altruism will make possible the Grand Dream of Open-Source, where projects are open-source, and well thought-out, and easy-to-use, and easy-to-install, and highly efficient, and bug free. Until then, open-source software is going to run great, but be painful to use, and closed-source software will be easy, but less efficient. Pick which you want for your own purposes. Just don’t forget to take a look around at how much Apple hardware you see these days to figure out which users actually value.
Zombie Operating Systems and ASP.NET MVC
June 12th, 2009, at 3:48 p.m.
In 1973, an operating system called CP/M was born. CP/M had no directories, and filenames were limited to 8.3 format. To support input and output from user programs, the pseudofiles COM1, COM2, COM3, COM4, LPT1, LPT2, CON, AUX, PRN, and NUL were provided.
In 1980, Seattle Computer Products decided to make a cheap, approximate clone of CP/M, called 86-DOS. 86-DOS therefore had no directories, supported 8.3 file names, and included the pseudofiles COM1, COM2, COM3, COM4, LPT1, LPT2, CON, AUX, PRN, and NUL. Further, because many programs always saved their files with a specific extension, any file with these names and an extension was treated as identical to the filename without the extension.
In 1981, Microsoft Corporation purchased the rights to use 86-DOS, renamed it MS-DOS, and shipped it to customers.
In 1983, Microsoft Corporation released MS-DOS 2.0. MS-DOS 2.0 supported hierarchical directories. To maintain backwards compatibility with applications designed for MS-DOS 1.0, which had no concept of directories, Microsoft placed the pseudofiles COM1, COM2, COM3, COM4, LPT1, LPT2, CON, AUX, PRN, and NUL, with all possible extensions, in all directories.
In 1988, Microsoft began the development of a modern, preemptively multitasked, memory-protected, multiuser system loosely based on VMS, called Windows NT. Windows NT supported a completely new file system design, called NTFS, modeled on OS/2’s HPFS, which allowed for arbitrary file names in Unicode UCS-2. To maintain backwards compatibility with DOS applications running under the new operating system, some of which were written before DOS had hierarchical directories, Windows NT placed the pseudofiles COM1-9, LPT1-9, CON, AUX, PRN, and NUL, with all possible extensions, in all directories. Windows NT shipped to customers in 1993.
In 1998, Microsoft released the Option Pack to the two-year-old Windows NT 4.0, containing a new technology called Active Server Pages, or ASP. ASP allowed the creation of dynamic websites via COM scripting. Because ASP was file-based, URLs by definition could not include /com1-9.asp, /lpt1-9.asp, /con.asp, /aux.asp, /prn.asp, or /nul.asp.
In 2002, Microsoft announced the imminent release of the .NET framework. One component of .NET was ASP.NET, a vast improvement over ASP. Although ASP.NET provided vastly superior ways to write web pages than those provided by ASP, the mechanism was still completely based on per-file web pages—and although this mechanism could be overridden by various means, for backwards compatibility reasons, ASP.NET always checked for the existence of a file first before attempting to execute a custom handler. Thus, web pages could not contain /com[1-9](\..*)?, /lpt[1-9](\..*)?, /con(\..*)?, /aux(\..*)?, /prn(\..*)?, or /nul(\..*)?.
In 2009, Microsoft released ASP.NET MVC, a thoroughly modern, orthogonal web framework supporting the most up-to-date understanding of how to architect well-factored, scalable web applications. ASP.NET MVC broke free of the per-file emphasis of previous frameworks, instead emphasizing regex-based URL dispatching, similar to popular scripting frameworks such as Django, Rails, and Catalyst. These URLs did not map to files; they instead mapped to objects that were designed to handle the request. Thus, URLs could be built entirely on what made logical sense, rather than on the confines of what the file system dictated.
But ASP.NET MVC was based on ASP.NET. Which checks for the existence of a file before running any scripts. Which means it will check directories that have COM1-9, LPT1-9, CON, AUX, PRN, and NUL in them, with any extension.
And that is why, in 2009, when developing in Microsoft .NET 3.5 for ASP.NET MVC 1.0 on a Windows 7 system, you cannot include /com\d(\..*)?, /lpt\d(\..*)?, /con(\..*)?, /aux(\..*)?, /prn(\..*)?, or /nul(\..*)? in any of your routes.
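If you want to catch these collisions before they bite, a check along the following lines could be run over your route list. It is a sketch in Python rather than anything ASP.NET provides: the function name is made up, it uses the stricter [1-9] form of the pattern from earlier in this post, and it assumes the names are matched case-insensitively and can collide in any path segment.

```python
import re

# Reserved DOS device names as listed in the post: COM1-9, LPT1-9, CON, AUX,
# PRN, and NUL, optionally followed by a dot and any extension.
RESERVED_SEGMENT = re.compile(
    r"^(com[1-9]|lpt[1-9]|con|aux|prn|nul)(\..*)?$", re.IGNORECASE
)


def reserved_segments(url_path):
    """Return the path segments that collide with a reserved device name."""
    return [seg for seg in url_path.split("/") if RESERVED_SEGMENT.match(seg)]


print(reserved_segments("/users/con/profile"))  # ['con']
print(reserved_segments("/files/NUL.txt"))      # ['NUL.txt']
print(reserved_segments("/contact/console"))    # []
```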
