2008-01-14

Ancient history, Part 2: using open source.

The first time it dawned on me how oddly mistaken most people were
about what sort of tools and technologies we used to develop Web
search systems at FAST back in the early days was when Doc Searls
told me that he thought we were running everything on Windows NT (I
think this must have been back in late 1999 or early 2000).

I have to admit I was a bit shocked.

To me, it was odd that someone would even assume that we ran
everything off Windows NT, so I told him a bit about the sort of
systems and tools we used: that we used FreeBSD for our servers, and
that most of us at the time were running Linux or FreeBSD as a desktop
operating system. A few ran Windows, and I think one guy used a Mac.
But most developers ran Linux or FreeBSD.

Our entire code-base was built using the GNU tool-chain and various
open source languages and systems.

In all fairness, eventually some commercial tools and libraries were
used. The first commercial development tool I can remember us having
was a system for finding memory leaks. I also remember that we had
some libraries to handle parsing text in some languages. But by and
large, everything was built using open source tools.

The first thing people think about when they hear "Free Software" is
"free" as in "doesn't cost anything". Of course, our management was
very happy that we didn't need to spend money licensing operating
systems, compilers, editors and whatnot -- but that wasn't really why
we used Open Source systems. We used them because they were much
better suited to solve the sort of problems we needed to solve.

First of all, we were intimately familiar with these systems. Most of
us had been using them for years and when you are going to do
something that is a bit hard, you tend to stick to the tools you know
well and trust.

Second, we didn't really have the time to deal with closed source
vendors. If something broke, we needed to understand the problem and
come up with a solution. If you have access to the source and you
have competent developers on staff, fixing show-stopper issues is a
lot easier than when you have to rely on a third party to make time
for you and then perhaps come up with a solution in a week or two.

We had competent people on staff and we did fix our own problems. When
we ran into problems in the OS we had at least two or three people who
could look into it. Several of them used to contribute fixes to
FreeBSD, Linux etc., so whatever problems we came across and fixed,
the rest of the world would benefit from as well. It was the same for
innumerable other systems and tools.

Some people also took part more directly in open source projects. For
instance, my long-time colleague Stig took part in the PHP project as
a member of the core team for several years. Since we made heavy use
of PHP for a lot of front-end work, having him working on PHP and
being part of the PHP project was of great benefit to us.

This active use of open source tools and participation in open source
was by no means unique to FAST. There were many companies that had a
very intimate relationship with open source, and today their numbers
are even greater. In fact, all companies I have worked for since (with
one notable exception) have been big believers in open source. Look
at any of the great Internet brands and chances are they are using
and/or contributing to open source.

I am not sure how people come to believe that open source tools are
somehow not suited for developing large scale, cutting edge systems.
My experience of this was very different. The sort of systems we
built were not Enterprise-scale, they were Web-scale.

Web-scale systems are very different from Enterprise-scale in that
they start off having to handle traffic and data a few orders of
magnitude larger than enterprise systems, and then have to handle
exponential growth from there on. If you can't design a feature in a
way that'll scale for that sort of growth, you can just forget about
it. If the feature is really important, you have to realize it in a
way that allows it to scale. If that means developing some custom
technology for dealing with it: so be it.

I often see people complain about how some huge web service is missing
some feature and that implementing it would be "trivial". Often this
simply isn't true, but it is an easy mistake to make when you're not
familiar with systems that need to scale really well and where every
millisecond counts in response times.

In enterprise-scale systems you can afford greater feature richness.
You can afford to use traditional systems like databases for data
storage, and really heavy, feature rich frameworks. Your users are
numbered in the tens of thousands -- not in tens of millions or, in
some extreme cases, hundreds of millions. The growth rates of your data
are more predictable.

This is why some systems designed for web scale won't necessarily work
for the enterprise, and vice versa.

For most people "enterprise" systems means "really large" -- but the
enterprise concept of "really large" is different from that of "web
scale". As are the expectations for features offered. This doesn't
mean that one is harder than the other, but it does mean that the ways
you set about working on such problems are often fundamentally
different.


So let's summarize some main points here.

Open Source plays a big role in the Internet industry. Serious people
create serious systems using Open Source.

It is not only possible to create large, cutting edge systems with
nothing but open source tools, but it has been done many times and it
continues to be done today. That isn't going to change any time soon.

Open source gives you more control. When problems arise, you can
address them. This is far more valuable than any support contract you
can get for a closed source system. It also insulates you from the
misfortunes and follies of other companies -- if the vendor for a
critical closed source component you rely on goes out of business, you
are toast. The worst that can happen to an open source project is
that people lose interest in it and stop contributing to it. You'll
still have the source and you'll still have people willing to work on
it for money.

The most important resource you need in order to innovate and create
is good people. If you want to create the next big thing on the net,
you start with a small team of good developers. People who are not
afraid to dive all the way down and build things from the ground up if
need be.
