A (New?) Development Methodology
Aug 11, 1998 – v1.00
Microsoft Confidential
Table of Contents
Executive Summary
Open Source Software
Software Licensing Taxonomy
Open Source Software is Significant to Microsoft
History
OSS Development Coordination
Parallel Development
Parallel Debugging
Conflict resolution
Motivation
Code Forking
Long-term credibility
Parallel Debugging
Parallel Development
OSS = ‘perfect’ API evangelization / documentation
Release rate
Process Issues
Organizational Credibility
Loss Leader -- Market Entry
Commoditizing Downstream Suppliers
First Mover – Build Now, $$ Later
Linux is a real, credible OS + Development process
Linux is a short/medium-term threat in servers
Linux is unlikely to be a threat on the desktop
Beating Linux
Strengths
Weaknesses
Predictions
Organization
Strengths
Weaknesses
IBM & Apache
Microsoft Response
Capturing OSS benefits -- Developer Mindshare
Capturing OSS benefits – Microsoft Internal Processes
Extending OSS benefits -- Service Infrastructure
Blunting OSS attacks
Acknowledgments
Revision History
Open Source Software
A (New?) Development Methodology
Consequently, OSS poses a direct, short-term revenue and platform threat to Microsoft – particularly in server space. Additionally, the intrinsic parallelism and free idea exchange in OSS has benefits that are not replicable with our current licensing model and therefore present a long term developer mindshare threat.
However, other OSS process weaknesses
provide an avenue for Microsoft to garner advantage in key feature areas
such as architectural improvements (e.g. storage+), integration (e.g. schemas),
ease-of-use, and organizational support.
Software Type / License Feature | Zero Price Avenue | Redistributable | Unlimited Usage | Source Code Available | Source Code Modifiable | Public “Check-ins” to core codebase | All derivatives must be free |
Commercial | | | | | | | |
Trial Software | X (Non-full featured) | X | | | | | |
Non-Commercial Use | X (Usage dependent) | X | | | | | |
Shareware | X | X | | | | | |
Royalty-free binaries (“Freeware”) | X | X | X | | | | |
Royalty-free libraries | X | X | X | X | | | |
Open Source (BSD-Style) | X | X | X | X | X | | |
Open Source (Apache Style) | X | X | X | X | X | X | |
Open Source (Linux/GNU style) | X | X | X | X | X | X | X |
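One way to read the taxonomy is as a staircase: each software type grants some number of the license features, taken in the order the feature labels are listed. The Python below is a minimal sketch of that reading, not part of the original memo. The names `FEATURES`, `FEATURE_COUNT`, `features_of`, and `has_feature` are hypothetical, and the per-type counts are an assumption read off the table layout (in particular, that only the Linux/GNU style requires all derivatives to be free).

```python
# License features, in the order the taxonomy lists them.
FEATURES = [
    "Zero Price Avenue",
    "Redistributable",
    "Unlimited Usage",
    "Source Code Available",
    "Source Code Modifiable",
    'Public "Check-ins" to core codebase',
    "All derivatives must be free",
]

# Assumed number of leading features granted by each software type.
# These counts are read from the staircase layout of the table and are
# an illustration, not an authoritative legal classification.
FEATURE_COUNT = {
    "Commercial": 0,
    "Trial Software": 2,        # zero-price avenue is non-full-featured
    "Non-Commercial Use": 2,    # zero-price avenue is usage dependent
    "Shareware": 2,
    'Royalty-free binaries ("Freeware")': 3,
    "Royalty-free libraries": 4,
    "Open Source (BSD-Style)": 5,
    "Open Source (Apache Style)": 6,
    "Open Source (Linux/GNU style)": 7,
}


def features_of(software_type):
    """Return the list of license features granted by a software type."""
    return FEATURES[:FEATURE_COUNT[software_type]]


def has_feature(software_type, feature):
    """True if the given software type grants the given license feature."""
    return feature in features_of(software_type)
```

For example, `has_feature("Open Source (BSD-Style)", "All derivatives must be free")` is false under these assumptions, while the same query for the Linux/GNU style is true, which is the distinction the copyleft discussion later relies on.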
Open Source Software is Significant to Microsoft
A key barrier to entry for OSS
in many customer environments has been its perceived lack of quality. OSS
advocates contend that the greater code inspection & debugging in OSS
software results in higher quality code than commercial software.
Recent case studies (the Internet) provide very dramatic evidence in customers’ eyes that commercial quality can be achieved / exceeded by OSS projects. At this time, however, there is no strong evidence of OSS code quality aside from the anecdotal.
Another barrier to entry that has
been tackled by OSS is project complexity. OSS teams are undertaking projects
whose size & complexity had heretofore been the exclusive domain of
commercial, economically-organized/motivated development teams. Examples
include the Linux Operating System and Xfree86 GUI.
OSS process vitality is directly tied to the Internet to provide distributed development resources on a mammoth scale. Some examples of OSS project size:
Project | Lines of Code |
Linux Kernel (x86 only) | 500,000 |
Apache Web Server | 80,000 |
SendMail | 57,000 |
Xfree86 X-windows server | 1.5 Million |
“K” desktop environment | 90,000 |
Full Linux distribution | ~10 Million |
Internet Software
The largest case study of OSS is the Internet. Most of the earliest code on the Internet was, and still is, based on OSS, as described in an interview with Tim O’Reilly (http://www.techweb.com/internet/profile/toreilly/interview ):
Credit for the first instance of modern, organized OSS is generally given to Richard Stallman of MIT. In late 1983, Stallman created the Free Software Foundation (FSF) – http://www.gnu.ai.mit.edu/fsf/fsf.html -- with the goal of creating a free version of the UNIX operating system. The FSF released a series of sources and binaries under the GNU moniker (which recursively stands for “Gnu’s Not Unix”).
The original FSF / GNU initiatives fell short of their original goal of creating a completely OSS Unix. They did, however, contribute several famous and widely disseminated applications and programming tools used today including:
FSF/GNU software introduced the “copyleft” licensing scheme that not only made it illegal to hide source code from GNU software but also made it illegal to hide the source from work derived from GNU software. The document that described this license is known as the General Public License (GPL).
Wired magazine has the following summary of this scheme & its intent (http://www.wired.com/wired/5.08/linux.html):
In other words, to understand how to compete against OSS, we must target a process rather than a company.
Coordination of an OSS team is extremely dependent on Internet-native forms of collaboration. Typical methods employed run the full gamut of the Internet’s collaborative technologies:
Common Direction
In addition to the communications medium, another set of factors implicitly coordinate the direction of the team.
Common Goals
Common goals are the equivalent of vision statements which permeate the distributed decision making for the entire development team. A single, clear directive (e.g. “recreate UNIX”) is far more efficiently communicated and acted upon by a group than multiple, intangible ones (e.g. “make a good operating system”).
Common Precedents
Precedent is potentially the most important factor in explaining the rapid and cohesive growth of massive OSS projects such as the Linux Operating System. Because the entire Linux community has years of shared experience dealing with many other forms of UNIX, they are easily able to discern – in a non-confrontational manner – what worked and what didn’t.
There weren’t arguments about the
command syntax to use in the text editor – everyone already used “vi”
and the developers simply parcelled out chunks of the command namespace
to develop.
Having historical, 20:20 hindsight provides a strong, implicit structure. In more forward-looking organizations, this structure is provided by strong, visionary leadership.
Common Skillsets
NatBro points out the need for a commonly accepted skillset as a prerequisite for OSS development. This point is closely related to the common precedents phenomenon. From his email:
The Cathedral and the Bazaar
A very influential paper by an open source software advocate – Eric Raymond – was first published in May 1997 (http://www.redhat.com/redhat/cathedral-bazaar/). Raymond’s paper was expressly cited by (then) Netscape CTO Eric Hahn as a motivation for their decision to release browser source code.
Raymond dissected his OSS project in order to derive rules-of-thumb which could be exploited by other OSS projects in the future. Some of Raymond’s rules include:
Every good work of software starts by scratching a developer's personal itch
Widely available open source reduces search costs for finding a particular code snippet.
Code documentation is cited as an area which commercial developers typically neglect which would be a fatal mistake in OSS.
This is a classic
play out of the Microsoft handbook. OSS advocates will note, however,
that their release-feedback cycle is potentially an order of magnitude
faster than commercial software’s.
Because the developers are typically hobbyists, the ability to ‘fund’ multiple, competing efforts is not an issue and the OSS process benefits from the ability to pick the best potential implementation out of the many produced.
Note that this is very dependent on:
Raymond includes Linus Torvalds’s description of the Linux debugging process:
“Impulse Debugging”
An extension to parallel debugging that I’ll add to Raymond’s hypothesis is “impulsive debugging”. In the case of the Linux OS, implicit to the act of installing the OS is the act of installing the debugging/development environment. Consequently, it’s highly likely that if a particular user/developer comes across a bug in another individual’s component – and especially if that bug is “shallow” – that user can very quickly patch the code and, via internet collaboration technologies, propagate that patch very quickly back to the code maintainer.
Put another way, OSS processes have a very low entry barrier to the debugging process due to the common development/debugging methodology derived from the GNU tools.
In the case of Linux, Linus Torvalds is the undisputed ‘leader’ of the project. He’s delegated large components (e.g. networking, device drivers, etc.) to several of his trusted “lieutenants” who further de-facto delegate to a handful of “area” owners (e.g. LAN drivers).
Other organizations are described by Eric Raymond (http://earthspace.net/~esr/writings/homesteading/homesteading-15.html):
Solving the Problem at Hand
This is basically a rephrasing of Raymond’s first rule of thumb – “Every good work of software starts by scratching a developer’s personal itch”.
Many OSS projects – such as Apache -- started as a small team of developers setting out to solve an immediate problem at hand. Subsequent improvements of the code often stem from individuals applying the code to their own scenarios (e.g. discovering that there is no device driver for a particular NIC, etc.)
Education
The Linux kernel grew out of an educational project at the University of Helsinki. Similarly, many of the components of Linux / GNU system (X windows GUI, shell utilities, clustering, networking, etc.) were extended by individuals at educational institutions.
The most ethereal, and perhaps most profound motivation presented by the OSS development community is pure ego gratification.
In “The Cathedral and the Bazaar”, Eric S. Raymond cites:
Homesteading on the Noosphere
A second paper published by Raymond – “Homesteading on the Noosphere” (http://sagan.earthspace.net/~esr/writings/homesteading/), discusses the difference between economically motivated exchange (e.g. commercial software development for money) and “gift exchange” (e.g. OSS for glory).
“Homesteading” is acquiring property by being the first to ‘discover’ it or by being the most recent to make a significant contribution to it. The “Noosphere” is loosely defined as the “space of all work”. Therefore, Raymond posits, the OSS hacker motivation is to lay a claim to the largest area in the body of work. In other words, take credit for the biggest piece of the prize.
From “Homesteading on the Noosphere”:
…
For examined in this way, it is quite clear that the society of open-source hackers is in fact a gift culture. Within it, there is no serious shortage of the `survival necessities' -- disk space, network bandwidth, computing power. Software is freely shared. This abundance creates a situation in which the only available measure of competitive success is reputation among one's peers.
This is a controversial motivation and I’m inclined to believe that at some level, Altruism ‘degenerates’ into a form of the Ego Gratification argument advanced by Raymond.
One smaller motivation which, in part, stems from altruism is Microsoft-bashing.
Code forking occurs when, over the normal push-and-pull of a development project, multiple, inconsistent versions of the project’s code base evolve.
In the commercial world, for example, the strong, singular management of the Windows NT codebase is considered to be one of its greatest advantages over the ‘forked’ codebase found in commercial UNIX implementations (SCO, Solaris, IRIX, HP-UX, etc.).
Forking in OSS – BSD Unix
Within OSS space, BSD Unix is the best example of forked code. The original BSD UNIX was an attempt by U-Cal Berkeley to create a royalty-free version of the UNIX operating system for teaching purposes. However, Berkeley put severe restrictions on non-academic uses of the codebase.
In order to create a fully free version of BSD UNIX, an ad hoc (but closed) team of developers created FreeBSD. Other developers at odds with the FreeBSD team for one reason or another splintered the OS to create other variations (OpenBSD, NetBSD, BSDI).
There are two dominant factors which led to the forking of the BSD tree:
(Lack of) Forking in Linux
In contrast to the BSD example, the Linux kernel code base hasn’t forked. Some of the reasons why the integrity of the Linux codebase has been maintained include:
Linus is considered by the development team to be a fair, well-reasoned code manager and his reputation within the Linux community is quite strong. However, Linus doesn’t get involved in every decision. Often, sub groups resolve their – often large – differences amongst themselves and prevent code forking.
Indirectly this presents a further disincentive to code forking. There is almost no credible mechanism by which the forked, minority code base will be able to maintain the rate of innovation of the primary Linux codebase.
Put another way, the growth of the Internet will make existing OSS projects bigger and will make OSS projects in “smaller” software categories become viable.
One of the most interesting implications of viable OSS ecosystems is long-term credibility.
Long-Term Credibility Defined
Long term credibility exists if there is no way you can be driven out of business in the near term. This forces change in how competitors deal with you.
For example, Airbus Industrie garnered initial long term credibility from explicit government support. Consequently, when bidding for an airline contract, Boeing would be more likely to accept short-term, non-economic returns when bidding against Lockheed than when bidding against Airbus.
Loosely applied to the vernacular of the software industry, a product/process is long-term credible if FUD tactics cannot be used to combat it.
OSS is Long-Term Credible
OSS systems are considered credible
because the source code is available from potentially millions of places
and individuals.
The likelihood that Apache will cease to exist is orders of magnitude lower than the likelihood that WordPerfect, for example, will disappear. The disappearance of Apache is not tied to the disappearance of binaries (which are affected by purchasing shifts, etc.) but rather to the disappearance of source code and the knowledge base.
Inversely stated, customers know that Apache will be around 5 years from now -- provided there exists some minimal sustained interest from its user/development community.
One Apache customer, in discussing his rationale for running his e-commerce site on OSS stated, “because it’s open source, I can assign one or two developers to it and maintain it myself indefinitely.”
Lack of Code-Forking Compounds Long-Term Credibility
The GPL and its aversion to code forking
reassures customers that they aren’t riding an evolutionary ‘dead-end’
by subscribing to a particular commercial version of Linux.
The “evolutionary
dead-end” is the core of the software FUD argument.
In particular, larger, more savvy organizations that rely on OSS for business operations (e.g. ISPs) are comforted by the fact that they can potentially fix a work-stopping bug independent of a commercial provider’s schedule (for example, UUNET was able to obtain, compile, and apply the teardrop attack patch to their deployed Linux boxes within 24 hours of the first public attack).
For example, the Linux TCP/IP stack was probably rewritten 3 times. Assembly code components in particular have been continuously hand tuned and refined.
NatBro and Ckindel point out a split in developer capabilities here. Whereas the “enthusiast developer” is comforted by OSS evangelization, novice/intermediate developers – the bulk of the development community – prefer the trust model + organizational credibility (e.g. “Microsoft says API X looks this way”).
Starting an OSS project is difficult
From Eric Raymond:
Bazaar Credibility
Obviously, there are far more fragments of source code on the Internet than there are OSS communities. What separates “dead source code” from a thriving bazaar?
One article (http://www.mibsoftware.com/bazdev/0003.htm) provides the following credibility criteria:
What both projects did have was a handful of enthusiasts and a plausible promise. The promise was partly technical (this code will be wonderful with a little effort) and sociological (if you join our gang, you'll have as much fun as we're having). So what's necessary for a bazaar to develop is that it be credible that the full-blown bazaar will exist!"
When describing this problem to JimAll,
he provided the perfect analogy of “chasing tail lights”. The easiest
way to get coordinated behavior from a large, semi-organized mob is to
point them at a known target. Having the taillights provides concreteness
to a fuzzy vision. In such situations, having a taillight to follow is
a proxy for having strong central leadership.
Of course, once this implicit organizing principle is no longer available (once a project has achieved “parity” with the state-of-the-art), the level of management necessary to push towards new frontiers becomes massive.
This is possibly the single most interesting
hurdle to face the Linux community now that they’ve achieved parity with
the state of the art in UNIX in many respects.
Un-sexy work
Another interesting thing to observe in the near future of OSS is how well the team is able to tackle the “unsexy” work necessary to bring a commercial grade product to life.
In the operating systems space, this includes small, essential functions such as power management, suspend/resume, management infrastructure, UI niceties, deep Unicode support, etc.
For Apache, this may mean novice-administrator functionality such as wizards.
Integrative/Architectural work
Integrative work across modules is the biggest cost encountered by OSS teams. An email memo from Nathan Myhrvold in 5/98 points out that, of all the aspects of software development, integration work is most subject to Brooks’ laws.
Up till now, Linux has greatly benefited from the integration / componentization model pushed by previous UNIXes. Additionally, the organization of Apache was simplified by the relatively simple, fault tolerant specifications of the HTTP protocol and UNIX server application design.
Future innovations which require changes to the core architecture / integration model are going to be incredibly hard for the OSS team to absorb because they simultaneously devalue the team’s precedents and skillsets.
Iterative Cost
One of the keys to the OSS process is having many more iterations than commercial software (Linux was known to rev its kernel more than once a day!). However, commercial customers tell us they want fewer revs, not more.
“Non-expert” Feedback
The Linux OS is not developed for end users but rather for other hackers. Similarly, the Apache web server is implicitly targeted at the largest, most savvy site operators, not the departmental intranet server.
The key thread here is that because
OSS doesn’t have an explicit marketing / customer feedback component,
wishlists – and consequently feature development -- are dominated by the
most technically savvy users.
One thing that development groups at MSFT have learned time and time again is that ease of use, UI intuitiveness, etc. must be built from the ground up into a product and cannot be pasted on at a later time.
The interesting trend to observe here will be the effect that commercial OSS providers (such as RedHat in Linux space, C2Net in Apache space) will have on the feedback cycle.
Support Model
Product support is typically the first
issue prospective consumers of OSS packages worry about and is the primary
feature that commercial redistributors tout.
However, the vast majority of OSS projects are supported by the developers of the respective components. Scaling this support infrastructure to the level expected in commercial products will be a significant challenge. The ratio of users to developers differs by many orders of magnitude between IIS and Apache.
For the short-medium run, this factor alone will relegate OSS products to the top tiers of the user community.
A very subtle problem which will affect full scale consumer adoption of OSS projects is the lack of strategic direction in the OSS development cycle. While incremental improvement of the current bag of features in an OSS product is very credible, future features have no organizational commitment to guarantee their development.
What does it mean for the Linux community to “sign up” to help build the Corporate Digital Nervous System? How can Linux guarantee backward compatibility with apps written to previous APIs? Who do you sue if the next version of Linux breaks some commitment? How does Linux make a strategic alliance with some other entity?
In many cases, the answers to these questions are similar to “why should I submit my protocol/app/API to a standards body?”
Linux distributors, such as RedHat, Caldera, and others, are expressly willing to fund full time developers who release all their work to the OSS community. By simultaneously funding these efforts, RedHat and Caldera are implicitly colluding and believe they’ll make more short term revenue by growing the Linux market rather than directly competing with each other.
An indirect example is O’Reilly & Associates’ employment of Larry Wall – “leader” and full time developer of PERL. The #1 publisher of PERL reference books, of course, is O’Reilly & Associates.
For the short run, especially as the OSS project is at the steepest part of its growth curve, such investments generate positive ROI. Longer term, ROI motivations may steer these developers towards making proprietary extensions rather than releasing OSS.
The best examples of this currently are the thin server vendors such as Whistle Communications and Cobalt Micro, who are actively funding developers in SAMBA and Linux respectively.
Both Whistle and Cobalt generate their revenue on hardware volume. Consequently, funding OSS enables them to avoid today’s PC market where a “tax” must be paid to the OS vendor (NT Server retail price is $800 whereas Cobalt’s target MSRP is around $1000).
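The size of the “tax” in that comparison can be worked out directly from the two quoted figures. The snippet below is only an illustrative sketch: the function name `os_tax_share` is hypothetical, it uses the retail NT Server price quoted above rather than any OEM price, and the MSRP is the approximate target stated in the text.

```python
# Figures quoted in the memo; the MSRP is described as approximate.
NT_SERVER_RETAIL_USD = 800
COBALT_TARGET_MSRP_USD = 1000


def os_tax_share(os_price, device_price):
    """Fraction of the device price that a per-unit OS license would consume."""
    return os_price / device_price


share = os_tax_share(NT_SERVER_RETAIL_USD, COBALT_TARGET_MSRP_USD)
print(f"OS license as share of target MSRP: {share:.0%}")
```

At these figures a retail OS license alone would take up roughly four fifths of the device's target price, which is why a hardware-volume vendor has a strong incentive to fund a free OS instead.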
The earliest Apache developers were employed by cash-strapped ISPs and ICPs.
Another, more recent example is IBM’s deal with Apache. By declaring the HTTP server a commodity, IBM hopes to concentrate returns in the more technically arcane application services it bundles with its Apache distribution (as well as hoping to reach Apache’s tremendous market share).
In addition, the developer scale, iteration rate, and reliability advantages of the OSS process are a blessing to small startups who typically can’t afford a large in-house development staff.
Examples of startups in this space include SendMail.com (making a commercially supported version of the sendmail mail transfer agent) and C2Net (making commercial and encrypted Apache).
Notice that no case of a successful startup originating an OSS project has been observed. In both of these cases, the OSS project existed before the startup was formed.
Sun Microsystems has recently announced that its “JINI” project will be provided via a form of OSS and may represent an application of the pre-emption doctrine.
A second memo titled “Linux OS Competitive Analysis” provides an in-depth review of the Linux OS. Here, I provide a top-level summary of my findings on Linux.
Top-Level Features:
Linux is a short/medium-term threat in servers
Linux’s future strength against NT server (and other UNIXes) is fed by several key factors:
Linux’s homebase is currently commodity network and server infrastructure. By folding extended functionality (e.g. Storage+ in file systems, DAV/POD for networking) into today’s commodity services, we raise the bar & change the rules of the game.
Relative to other OSS projects, Mozilla is considered to be one of the most direct, near-term attacks on the Microsoft establishment. This factor alone is probably a key galvanizing factor in motivating developers towards the Mozilla codebase.
New credibility
The availability of Mozilla source code has renewed Netscape’s credibility in the browser space to a small degree. As BharatS points out in http://ie/specs/Mozilla/default.htm:
The browser is widely used / disseminated. Consequently, the pool of people who may be willing to solve “an immediate problem at hand” and/or fix a bug may be quite high.
Mozilla is already at close to parity with IE4/5. Consequently, there is no strong example to chase to help implicitly coordinate the development team.
Netscape has assigned some of their top developers towards the full time task of managing the Mozilla codebase and it will be interesting to see how this helps (if at all) the ability of Mozilla to push on new ground.
Small Noosphere
An interesting weakness is the size of the remaining “Noosphere” for the OSS browser.
There are no longer any large, high-profile segments of the stand-alone browser which must be developed. In other words, Netscape has already solved the interesting 80% of the problem. There is little / no ego gratification in debugging / fixing the remaining 20% of Netscape’s code.
Potentially the single biggest detriment to the Mozilla effort is the level of integration that customers expect from features in a browser. As stated earlier, integration development / testing is NOT a parallelizable activity and therefore is hurt by the OSS process.
In particular, much of the new work for IE5+ is not just integrating components within the browser but continuing integration within the OS. This will be exceptionally painful to compete against.
Mozilla Mailing List | | | |
Feature Wishlist | | | |
UI Development | | | |
General Discussion | | | |
During May-June 1995, a new server architecture (code-named Shambhala) was developed which included a modular structure and API for better extensibility, pool-based memory allocation, and an adaptive pre-forking process model. The group switched to this new server base in July and added the features from 0.7.x, resulting in Apache 0.8.8 (and its brethren) in August.
Less than a year after the group was
formed, the Apache server passed NCSA's httpd as the #1 server on the Internet.
A description of the code management and dispute resolution procedures followed by the Apache team is found on http://www.apache.org:
Leadership:
Apache far and away has #1 web site share on the Internet today. Possession of the lion’s share of the market provides extremely powerful control over the market’s evolution.
In particular, Apache’s market share in web server space presents the following competitive hurdles:
The number of tools / modules / plug-ins available for Apache has been growing at an increasing rate.
In the short run, IIS soundly beats Apache on SPECweb. Moving further, as IIS moves into the kernel and takes advantage of deeper integration with NT, this lead is expected to increase further.
Apache, by contrast, is saddled with the requirement to create portable code for all of its OS environments.
HTTP Protocol Complexity & Application services
Part of the reason that Apache was able to get a foothold and take off was because the HTTP protocol is so simple. As more and more features become layered on top of the humble web server (e.g. multi-server transaction support, POD, etc.) it will be interesting to see how the Apache team will be able to keep up.
ASP support, for example, is a key driver for IIS in corporate intranets.
Server vs. Client
The server is more vulnerable to OSS products than the client. Reasons for this include:
How can Microsoft capture some of the rabid developer mindshare being focused on OSS products?
Some initial ideas include:
The OSS community’s “MSDN” equivalent, of course, is a loose confederation of web sites with API docs of varying quality. MS has an opportunity to really exploit the web for developer evangelization.
De-commoditize protocols & applications
OSS projects have been able to gain a foothold in many server applications because of the wide utility of highly commoditized, simple protocols. By extending these protocols and developing new protocols, we can deny OSS projects entry into the market.
David Stutz makes a very good
point: in competing with Microsoft’s level of desktop integration, “commodity
protocols actually become the means of integration” for OSS
projects. There is a large amount of IQ being expended
in various IETF working groups which are quickly creating the architectural
model for integration for these OSS projects.
Some examples of Microsoft initiatives which are extending commodity protocols include:
The rise of specialty servers is a particularly potent and dire long term threat that directly affects our revenue streams. One of the keys to combating this threat is to create integrative scenarios that are valuable on the server platform. David Stutz points out:
Jim Allchin
Charlie Kindel
Ben Slivka
Josh Cohen
George Spix
David Stutz
Stephanie Ferguson
Jackie Erickson
Michael Nelson
Dwight Krossa
David D’Souza
David Treadwell
David Gunter
Oshoma Momoh
Alex Hopman
Jeffrey Robertson
Sankar Koundinya
Alex Sutton
Bernard Aboba
Date | Revision | Comments |
8/03/98 | 0.95 | |
8/10/98 | 0.97 | Started revision table; folded in comments from JoshCo |
8/11/98 | 1.00 | More fixes, printed copies for PaulMa review |