Thursday, February 22, 2007

Open-Source Software Development


Open source is software developed by uncoordinated but loosely collaborating programmers, using freely distributed source code and the communications infrastructure of the Internet. Open source has a long history rooted in the Hacker Ethic. The term open source was adopted in large part because of the ambiguous nature of the expression free software. Various categories of free and non-free software are commonly referenced, some with interchangeable meanings. Several licensing agreements have therefore been developed to formalize distribution terms. The Cathedral and the Bazaar is the most frequently cited description of the open-source development methodology. However, although the paper identifies many mechanisms of successful open-source development, it does not expose the dynamics. There are literally hundreds, if not thousands, of open-source projects currently in existence.

1. Introduction

Open source has generated a considerable amount of interest over the past year. The concept itself is based on the philosophy of free software, which advocates freely available source code as a fundamental right. However, open source extends this ideology slightly to present a more commercial approach that includes both a business model and development methodology.

Open Source Software, or OSS, refers to software for which the source code is distributed without charge or limitations on modifications. Open source sells this approach as a business model, emphasizing faster development and lower overhead, as well as a closer customer relationship and exposure to a broader market.

Open source also encompasses a software development methodology. Open source development is described as a rapid evolutionary process, which leverages large-scale peer review. The basic premise is that allowing source code to be freely modified and redistributed encourages collaborative development. The software can be incrementally improved and more easily tested, resulting in a highly reliable product.

1.1 Background

Much of the Internet infrastructure is open-source software. For example, Sendmail is the dominant mail transfer system on the Internet. BIND is the most widely used implementation of the Internet Domain Name System, and InterNetNews is the most popular Usenet news server. (O’Reilly, 1998a) It is therefore no surprise that the momentum associated with open source has coincided with the rapid growth of the Internet. The Web has made collaboration between programmers easier and possible on a larger scale than before, and projects such as Linux and Apache have become immensely successful. The projected size of various open-source communities is shown in Table 1 (O'Reilly, 1998b).

Table 1. Projected size of open-source communities.

Project     Estimated size of user community
Linux       7,000,000
Perl        1,000,000
BSD         960,000
Apache      400,000
Sendmail    350,000
Python      325,000
Tcl/Tk      300,000
Samba       160,000

The response from the software industry has been varied, but open source has made some notable inroads in a relatively short time. IBM has adopted the Apache Web server as the cornerstone of its WebSphere Internet-commerce application server (IBM, 1998). IBM has also released the source code for an XML parser and Jikes, a Java compiler (Gonsalves, 1998). Netscape has released the source code for the next generation of its popular Communicator product, restructuring ongoing development as an open-source project (Charles, 1998). Apple has taken a similar approach, releasing portions of its next generation operating system, MacOS X, as open source (Apple, 1999a). Microsoft acknowledged open source as a potential business threat in an internal memo that was subsequently leaked to the press (Valloppillil, 1998), and has recently indicated that it may consider releasing some code.

These developments demonstrate a sustained interest in open source, and it is quickly becoming a viable alternative to conventional methods of software development, as companies attempt to leverage the Internet in reducing time to market.

2. History

2.1 The Hacker Ethic

Open source is firmly rooted in the Hacker Ethic. In the late 1950’s, MIT’s computer culture originated the term hacker, defined today as "a person who enjoys exploring the details of programmable systems …" (Raymond, 1996). Various members of the Tech Model Railroad Club, or TMRC, formed the nucleus of MIT’s Artificial Intelligence Laboratory. These individuals were obsessed with the way systems worked. The word hack had long been used to describe elaborate college pranks devised by MIT students; however, TMRC members used the word to describe a task "imbued with innovation, style, and technical virtuosity" (Levy, 1984). A project undertaken not solely to fulfill some constructive goal, but with some intense creative interest, was called a hack.

Projects encompassed everything electronic, including constant improvements to the elaborate switching system controlling the TMRC’s model railroad. Increasingly though, attentions were directed toward writing computer programs, initially for an IBM 704 and later on the TX-0, one of the first transistor-run computers in the world. Early hackers would spend days working on programs intended to explore the limits of these machines.

In 1961, MIT acquired a PDP-1, the first minicomputer, designed not for huge number-crunching tasks but for scientific inquiry, mathematical formulations, and of course hacking. Manufactured by Digital Equipment Corporation, the PDP series of computers pioneered commercial interactive computing and time-sharing operating systems. MIT hackers developed software that was freely distributed by DEC to other PDP owners. Programming at MIT became a rigorous application of the Hacker Ethic, a belief that "access to computers – and anything which might teach you something about the way the world works – should be unlimited and total" (Levy, 1984).

2.2 ARPAnet

MIT was soon joined by Stanford University’s Artificial Intelligence Laboratory and later Carnegie-Mellon University. All were thriving centres of software development able to communicate with each other through the ARPAnet, the first transcontinental, high-speed data network. Built by the Defense Department in the late 1960’s, it was originally designed as an experiment in digital communication. However, the ARPAnet quickly grew to link hundreds of universities, defense contractors, and research laboratories. This allowed for the free exchange of information, particularly software, with unprecedented speed and flexibility.

Programmers began to actively contribute to various shared projects. These early collaborative efforts led to informal principles and guidelines for distributed software development stemming from the Hacker Ethic. The most widely known of these projects was UNIX, which contributed to the ongoing growth of what would eventually become the Internet.

2.3 Unix and BSD

Unix was originally developed at AT&T Bell Labs, and was not strictly speaking a freely available product. However, it was licensed to universities for a nominal sum, which resulted in an explosion of creativity as programmers built on each other’s work.

Traditionally, operating systems had been written in assembler to maximize hardware efficiency, but by the early 1970’s hardware and compiler technology had become good enough that an entire operating system could be written in a higher level language. UNIX was written in C, and this provided unheard-of portability between hardware platforms, allowing programmers to write software that could be more easily shared and dispersed.

The most significant source of Unix development outside of Bell Labs was the University of California at Berkeley. UC Berkeley’s Computer Systems Research Group folded their own changes and other contributions into a succession of releases. Berkeley Unix came to be known as BSD, or Berkeley Software Distribution, and included a rewritten file system, networking capabilities, virtual memory support, and a variety of utilities (Ritchie, 1979).

A few of the BSD contributors founded Sun Microsystems, marketing Unix on 68000-based hardware. Rivalry ensued between supporters of Berkeley Unix and AT&T versions. This intensified in 1984, when AT&T divested and Unix was sold as a commercial product for the first time through Unix System Laboratories.

2.4 The GNU Project

The commercialization of Unix not only fractured the developer community, but it resulted in a confusing mass of competing standards that made it increasingly difficult to develop portable software. Other companies had entered the marketplace, selling various proprietary versions of Unix. Development largely stagnated, and Unix System Laboratories was sold to Novell after efforts to create a canonical commercial version failed. The GNU project was conceived in 1983 to rekindle the cooperative spirit that had previously dominated software development.

GNU, which stands for GNU’s Not Unix, was initiated under the direction of Richard S. Stallman, who had been a later participant in MIT’s Artificial Intelligence Lab and believed strongly in the Hacker Ethic. The GNU project had the ambitious goal of developing a freely available Unix-like operating system that would include command processors, assemblers, compilers, interpreters, debuggers, text editors, mailers, and much more. (FSF, 1998a)

Stallman created the Free Software Foundation, an organization that promotes the development and use of free software, particularly the GNU operating system (FSF, 1998c). Hundreds of programmers created new, freely available versions of all major Unix utility programs. Many of these utilities were so powerful that they became the de facto standard on all Unix systems. However, a project to create a replacement for the Unix kernel faltered.

By the early 1990’s, the proliferation of low-cost, high-performance personal computers along with the rapid growth of the World Wide Web had reduced entry barriers to participation in collaborative projects. Free software development extended to reach a much larger community of potential contributors, and projects such as Linux and Apache became immensely successful, prompting a further formalization of hacker best practices.

2.5 The Cathedral and the Bazaar

The Cathedral and the Bazaar (Raymond, 1998a), a position paper advocating the Linux development model, was first presented at Linux Kongress 97 and made widely available on the Web shortly thereafter. The paper presents two contrasting approaches to software development. The Cathedral represents conventional commercial practices, where developers work using a relatively closed, centralized methodology. In contrast, the Bazaar embodies the Hacker Ethic, in which software development is an openly cooperative effort.

The paper essentially ignored contemporary techniques in software engineering, using the Cathedral as a synonym for the waterfall lifecycle of the 1970s (Royce, 1970). Nevertheless, it served to attract widespread attention. A grassroots movement quickly developed, culminating in a January 1998 announcement that Netscape Communications would release the source code for its Web browser. This was the first time that a Fortune 500 company had transformed an enormously popular commercial product into free software.

The term Open Source was coined shortly afterward out of a growing realization that free software development could be marketed to commercial companies as a viable alternative.

3. Definition

The term open source was adopted in large part because of the ambiguous nature of the expression free software. The notion of free software does not mean free in the financial sense, but instead refers to the users' freedom to run, copy, distribute, study, change and improve software. Confusion over the meaning can be traced to the problem that, in English, free can mean no cost as well as freedom. In most other languages, free and freedom do not share the same root; gratuit and libre, for instance. "To understand the concept, you should think of free speech, not free beer," writes Richard Stallman (FSF, 1999a).

3.1 Categories of Free and Non-Free Software

Due to the inherent ambiguity of the terminology, various wordings are used interchangeably. This is misleading, as software may be interpreted as something it is not. Even closely related terms such as free software and open source have developed subtle distinctions. (FSF, 1998b)

3.1.1 Public Domain

Free software is often confused with public domain software. If software is in the public domain, then it is not subject to ownership and there are no restrictions on its use or distribution. More specifically, public domain software is not copyrighted. If a developer places software in the public domain, then he or she has relinquished control over it. Someone else can take the software, modify it, and restrict the source code.

3.1.2 Freeware

Freeware is commonly used to describe software that can be redistributed but not modified. The source code is not available, and consequently freeware should not be used to refer to free software.

3.1.3 Shareware

Shareware is distributed freely, like freeware. Users can redistribute shareware; however, anyone who continues to use a copy is required to pay a modest license fee. Shareware is seldom accompanied by the source code, and is not free software.

3.1.4 Open Source

Open source is used to mean more or less the same thing as free software. Free software is "software that comes with permission for anyone to use, copy, and distribute, either verbatim or with modifications, either gratis or for a fee." (FSF, 1999a) In particular, this means that source code must be available.

Free software is often used in a political context, whereas open source is a more commercially oriented term. The Free Software Foundation advocates free software as a right, emphasizing the ethical obligations associated with software distribution (Stallman, 1999). Open source is commonly used to describe the business case for free software, focusing more on the development process rather than any underlying moral requirements.

3.2 Licensing

Various free software licenses have been developed. The licenses each disclaim all warranties. The intent is to protect the author from any liability associated with the software. Since the software is provided free of charge, this would seem to be a reasonable request.

Table 2 provides a comparison of several common licensing practices (Perens, 1999).

Table 2. Comparison of licensing practices.

License | Can be mixed with non-free software | Modifications can be taken private and not returned to you | Can be re-licensed by anyone | Contains special privileges for the original copyright holder over your modifications
Public Domain | X | X | X | -

3.2.1 Copyleft and the GNU General Public License

Copyleft is a concept originated by Richard Stallman to address problems associated with placing software in the public domain. As mentioned previously, public domain software is not copyrighted. Someone can make changes to the software, many or few, and distribute the result as a proprietary product. People who receive the modified product may not have the same freedoms that the original author provided. Copyleft says that "anyone who redistributes the software, with or without changes, must pass along the freedom to further copy and change it." (FSF, 1999b)

To copyleft a program, first it is copyrighted and then specific distribution terms are added. These terms are a legal instrument that provides rights to "use, modify, and redistribute the program's code or any program derived from it but only if the distribution terms are unchanged." (FSF, 1999b)

In the GNU project, copyleft distribution terms are contained in the GNU General Public License, or GPL. The GPL does not allow private modifications. Any changes must also be distributed under the GPL. This not only protects the original author, but it also encourages collaboration, as any improvements are made freely available.

Additionally, the GPL does not allow the incorporation of licensed programs into proprietary software. Any software that does not grant as many rights as the GPL is defined as proprietary. However, the GPL contains certain loopholes that allow it to be used with software that is not entirely free. Software libraries that are normally distributed with the compiler or operating system may be linked with programs licensed under the GPL. The result is a partially-free program. The copyright holder has the right to violate the license, but this right does not extend to any third parties who redistribute the program. Subsequent distributions must follow all of the terms of the license, even those that the copyright holder violates.

An alternate form of the GPL, the GNU Library General Public License or LGPL, allows the linking of free software libraries into proprietary executables under certain conditions. In this way, commercial development can also benefit from free software. A program covered by the LGPL can be converted to the GPL at any time, but that program, or anything derived from it, cannot be converted back to the LGPL.
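The relicensing rules described so far can be summarized in a small lookup table. The following Python sketch is only an illustration of the statements in this section: it encodes that public domain software carries no restrictions, that GPL-covered work must stay under the GPL, and that LGPL-covered work may move to the GPL but not back. The names `ALLOWED_DERIVATIVE_LICENSES` and `may_distribute_derivative` are hypothetical, and the model deliberately ignores the many conditions found in the actual license texts.

```python
# Simplified model of the relicensing rules described in the text.
# Illustration only: it is not legal advice and omits the real licenses' conditions.

# Maps the license of an original work to the licenses a derived work may carry.
# None means "no restriction" (public domain software is not copyrighted).
ALLOWED_DERIVATIVE_LICENSES = {
    "Public Domain": None,
    "GPL": frozenset({"GPL"}),            # copyleft: terms must stay unchanged
    "LGPL": frozenset({"LGPL", "GPL"}),   # one-way conversion to the GPL
}


def may_distribute_derivative(original: str, proposed: str) -> bool:
    """Return True if a work derived from `original` may be distributed
    under `proposed`, according to the simplified table above."""
    if original not in ALLOWED_DERIVATIVE_LICENSES:
        raise ValueError(f"unknown license: {original!r}")
    allowed = ALLOWED_DERIVATIVE_LICENSES[original]
    return allowed is None or proposed in allowed


if __name__ == "__main__":
    for original, proposed in [
        ("Public Domain", "Proprietary"),
        ("GPL", "Proprietary"),
        ("LGPL", "GPL"),
        ("GPL", "LGPL"),
    ]:
        verdict = may_distribute_derivative(original, proposed)
        print(f"{original} -> {proposed}: {verdict}")
```

The table makes the asymmetry visible: the only path out of the LGPL leads to the GPL, and there is no path out of the GPL at all.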

The GPL is a political manifesto as well as a software license, and much of the text is concerned with explaining the rationale behind the license. Unfortunately this political dialogue has alienated some developers. For example, Larry Wall, creator of Perl and the Artistic license, says "the FSF [Free Software Foundation] has religious aspects that I don’t care for" (Lash, 1998). As a result, some free software advocates have created more liberal licensing terms, avoiding the political rhetoric associated with the GPL.

3.2.2 The X, BSD, and Apache Licenses

The X license and the related BSD and Apache licenses are very different from the GPL and LGPL. The software originally covered by the X and BSD licenses was funded by monetary grants from the US government. In this sense, the public owned the software, and the X and BSD licenses therefore grant relatively broad permissions.

The most important difference is that X-licensed modifications can be made private. An X-licensed program can be modified and redistributed without including the source or applying the X license to the modifications. Other developers have adopted the X license and its variants, including the developers of BSD and the Apache web server.

3.2.3 The Artistic License

The Artistic license was originally developed for Perl; however, it has since been used for other software. The terms are more loosely defined in comparison with other licensing agreements, and the license is more commercially oriented. For instance, under certain conditions modifications can be made private. Furthermore, although sale of the software is prohibited, the software can be bundled with other programs, which may or may not be commercial, and sold.

3.2.4 The Netscape Public License and the Mozilla Public License

The Netscape Public License, or NPL, was originally developed by Netscape. The NPL contains special privileges that apply only to Netscape. Specifically, it allows Netscape to re-license code covered by the NPL to third parties under different terms. This provision was necessary to satisfy proprietary contracts between Netscape and other companies. The NPL also allows Netscape to use code covered by the NPL in other Netscape products without those products falling under the NPL.

Not surprisingly, the free software community was somewhat critical of the NPL. Netscape subsequently released the MPL, or Mozilla Public License. The MPL is similar to the NPL, but it does not contain exemptions. Both the NPL and the MPL allow private modifications.

3.2.5 The Open Source Definition

The Open Source Definition is not a software license. Instead it is a specification of what is permissible in a software license for that software to be considered open source. The Open Source Definition is based on the Debian free software guidelines or social contract, which provides a framework for evaluating other free software licenses.

The Open Source Definition includes several criteria, which can be paraphrased as follows (OSI, 1999):

  1. Free Redistribution – Copies of the software can be made at no cost.
  2. Source Code – The source code must be distributed with the original work, as well as all derived works.
  3. Derived Works – Modifications are allowed; however, it is not required that the derived work be subject to the same license terms as the original work.
  4. Integrity of the Author’s Source Code – Modifications to the original work may be restricted only if the distribution of patches is allowed. Derived works may be required to carry a different name or version number from the original software.
  5. No Discrimination Against Persons or Groups – Discrimination against any person or group of persons is not allowed.
  6. No Discrimination Against Fields of Endeavor – Restrictions preventing use of the software by a certain business or area of research are not allowed.
  7. Distribution of License – Any terms should apply automatically without written authorization.
  8. License Must Not Be Specific to a Product – Rights attached to a program must not depend on that program being part of a specific software distribution.
  9. License Must Not Contaminate Other Software – Restrictions on other software distributed with the licensed software are not allowed.

The GNU GPL, BSD, X Consortium, MPL, and Artistic licenses are all examples of licenses that conform to the Open Source Definition.
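Because the Open Source Definition is a specification rather than a license, evaluating a license amounts to walking through the criteria one by one. The sketch below shows one way such a checklist could be recorded in Python. The criterion names are taken from the paraphrased list above; the `LicenseReview` class and the "Example Research-Only License" are hypothetical and exist only for illustration. Deciding whether a real license satisfies a criterion still requires reading its text.

```python
from dataclasses import dataclass, field

# Criterion names as paraphrased in the list above (OSI, 1999).
OSD_CRITERIA = (
    "Free Redistribution",
    "Source Code",
    "Derived Works",
    "Integrity of the Author's Source Code",
    "No Discrimination Against Persons or Groups",
    "No Discrimination Against Fields of Endeavor",
    "Distribution of License",
    "License Must Not Be Specific to a Product",
    "License Must Not Contaminate Other Software",
)


@dataclass
class LicenseReview:
    """Records which criteria a reviewer judged a license to satisfy."""
    name: str
    satisfied: frozenset = field(default_factory=frozenset)

    def __post_init__(self):
        self.satisfied = frozenset(self.satisfied)
        unknown = self.satisfied - set(OSD_CRITERIA)
        if unknown:
            raise ValueError(f"unknown criteria: {sorted(unknown)}")

    def unmet(self):
        """Criteria not yet satisfied, in the order of the definition."""
        return [c for c in OSD_CRITERIA if c not in self.satisfied]

    def conforms(self):
        """A license conforms only if every criterion is satisfied."""
        return not self.unmet()


if __name__ == "__main__":
    # Hypothetical license that restricts use to research, for illustration.
    review = LicenseReview(
        "Example Research-Only License",
        set(OSD_CRITERIA) - {"No Discrimination Against Fields of Endeavor"},
    )
    print(review.name, "conforms:", review.conforms())
    for criterion in review.unmet():
        print("  unmet:", criterion)
```

Treating conformance as all-or-nothing mirrors the definition itself: a license that fails a single criterion is not open source, however closely it approximates the rest.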

The evaluation of a proposed license elicits considerable debate in the free software community. With the growing popularity of open source, many companies are developing licenses intended to capitalize on this interest. Some of these licenses conform to the Open Source Definition; others do not. For example, the Sun Community Source License approximates some open source concepts, but it does not conform to the Open Source Definition. The Apple Public Source License, or APSL (Apple, 1999b), has been alternately endorsed and rejected by members of the open-source community.

4. Methodology

The Cathedral and the Bazaar is the most frequently cited description of the open-source development methodology. Eric Raymond’s discussion of the Linux development model as applied to a small project is a useful commentary. However, it should be noted that although the paper identifies many mechanisms of successful open-source development, it does not expose the dynamics. In this sense, the description is inherently weak.

4.1 Plausible Promise

Raymond remarks that it would be difficult to originate a project in bazaar mode. To build a community, a program must first demonstrate plausible promise. The implementation can be crude or incomplete, but it must convince others of its potential. This is given as a necessary precondition of the bazaar, or open-source, style.

Interestingly, many commercial software companies use this approach to ship software products. Microsoft, for example, consistently ships early versions of products that are notoriously bug ridden. However, as long as a product can demonstrate plausible promise, either by setting a standard or uniquely satisfying a potential need, it is not necessary for early versions to be particularly strong.

Critics suggest that the effective utilization of bazaar principles by closed source developers implies ambiguity: specifically, that the Cathedral and the Bazaar does not sufficiently describe certain aspects of the open-source development process (Eunice, 1998).

4.2 Release Early, Release Often

Early and frequent releases are critical to open-source development. Improvements in functionality are incremental, allowing for rapid evolution, and developers are "rewarded by the sight of constant improvement in their work." (Raymond, 1998a)

Product evolution and incremental development are not new. Mills initially proposed that any software system should be grown by incremental development (Mills, 1971). Brooks would later elaborate on this concept, suggesting that developers should grow rather than build software, adding more functions to systems as they are run, used, and tested (Brooks, 1986). Basili suggested the concept of iterative enhancement in large-scale software development (Basili and Turner, 1975), and Boehm proposed the spiral model, an evolutionary prototyping approach incorporating risk management (Boehm, 1986).

Open source relies on the Internet to noticeably shorten the iterative cycle. Raymond notes that "it wasn’t unknown for [Linus] to release a new kernel more than once a day." (Raymond, 1998a) Mechanisms for efficient distribution and rapid feedback make this practice effective.

However, successful application of an evolutionary approach is highly dependent on a modular architecture. Weak modularity widens the impact of each change and limits the effectiveness of individual contributors. In this respect, projects that do not encourage a modular architecture may not be suitable for open-source development. This contradicts Raymond’s underlying assertion that open source is a universally better approach.

4.3 Debugging is Parallelizable

Raymond emphasizes large-scale peer review as the fundamental difference underlying the cathedral and bazaar styles. The bazaar style assumes that "given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone." Debugging requires less coordination relative to development, and thus is not subject "to the same quadratic complexity and management costs that make adding developers problematic." (Raymond, 1998a)
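The "quadratic complexity" in this argument is commonly illustrated by counting communication channels. The Python sketch below makes the contrast concrete under two simplifying assumptions that are not taken from Raymond's paper: every developer must coordinate with every other developer, giving n(n-1)/2 channels, while each debugger reports only to a single maintainer, giving n channels. The function names are illustrative.

```python
def developer_channels(n: int) -> int:
    """Pairwise communication channels among n fully connected developers."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


def debugger_channels(n: int) -> int:
    """Channels when each of n debuggers reports only to one maintainer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n


if __name__ == "__main__":
    # Compare how the two counts grow as the group gets larger.
    for n in (5, 50, 500):
        print(n, developer_channels(n), debugger_channels(n))
```

Under these assumptions, a hundredfold increase in participants from 5 to 500 raises developer coordination from 10 channels to 124,750, while debugger reporting rises only from 5 to 500.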

The basic premise is that more debuggers will contribute to a shorter test cycle without significant additional cost. In other words, "more users find more bugs because adding more users adds more ways of stressing the program." (Raymond, 1998a) However, open source is not a prerequisite for peer review. For instance, various forms of peer review are commonly employed in software engineering. The question might then become one of scale, but Microsoft practices beta-testing on a scale matched only by larger open-source projects.

Raymond continues, suggesting that debugging is even more efficient when users are co-developers, as is most often the case in open-source projects. This is also subject to debate. Raymond notes that each tester "approaches the task of bug characterization with a slightly different perceptual set and analytical toolkit, a different angle on the problem." (Raymond, 1998a) This is characterized by the fact that developers and end-users evaluate products in very different ways. It therefore seems likely that peer review under the bazaar model would be constrained by a disproportionate number of co-developers.

5. Project Profiles

There are literally hundreds, if not thousands, of open-source projects currently in existence. These projects include operating systems, programming languages, utilities, Internet applications and many more. The following projects are notable for their influence, size, and success.

5.1 Linux

Linux is a Unix-like operating system that runs on several platforms, including Intel processors, Motorola MC68K, and DEC Alphas (SSC, 1998). It is a superset of the POSIX specification, with SYS V and BSD extensions. Linux began as a hobby project of Linus Torvalds, a graduate student at the University of Helsinki. The project was inspired by his interest in Minix, a small Unix system developed primarily as an educational tool by Andy Tanenbaum. Linus set out to create, in his own words, "a better Minix than Minix." In October 1991, Linus announced the first official release of Linux, version 0.02. Since then, hundreds of programmers have contributed to the ongoing improvement of the system.

Linux kernel development is largely coordinated through the linux-kernel mailing list. The list is high volume, and currently includes over 200 active developers as well as many other debuggers and testers. With the growth of the project, Linus has relinquished control over certain areas of the kernel, such as file systems and networking, to other "trusted lieutenants." However, Linus remains the final authority on decisions related to kernel development. The kernel is under the GPL, and official versions are made available via ftp.

Arguably the best-known open-source project, Linux has quietly gained popularity in academia as well as among scientific researchers and Internet service providers. Recently, it has made commercial advances, and is currently marketed as the only viable alternative to Microsoft Windows NT. A study by International Data Corporation reported that Linux accounted for 17.2% of server operating system shipments in 1998, an increase of 212% over the previous year (Shankland, 1998). The Linux kernel is typically packaged with the various other programs that comprise a Unix operating system. Several commercial companies currently sell these packages as Linux distributions.

5.2 Apache

Apache originated in early 1995 as a series of enhancements to the then-popular public domain HTTP daemon developed by Rob McCool at the National Center for Supercomputing Applications, or NCSA. Rob McCool had left NCSA in mid 1994, and many Webmasters had become frustrated with a lack of further development. Some proceeded to develop their own fixes and improvements. A small group coordinated these changes in the form of patches and made the first official release of the Apache server in April 1995, hence the name A PAtCHy server. (Laurie, 1999)

The Apache Group is currently a core group of about 20 project contributors, who now focus more on business issues and security problems. The larger user community manages mainstream development. Apache operates as a meritocracy, in a format similar to most open-source projects. Responsibility is based on contribution, or "the more work you have done, the more work you are allowed to do." (The Apache Group, 1999) Development is coordinated through the new-httpd mailing list, and a voting process exists for conflict resolution.

Apache has consistently ranked as the most popular Web server on the Internet (Netcraft, 1999). Currently, Apache dominates the market and is more widely used than all other Web servers combined. Industry leaders such as DEC, UUNet, and Yahoo use Apache. Several companies, including C2Net, distribute commercial versions of Apache, earning money for support services and added utilities.

5.3 Mozilla

Mozilla is an open-source deployment of Netscape’s popular Web browsing suite, Netscape Communicator. Netscape’s decision was strongly influenced by a whitepaper written by employee Frank Hecker (Hecker, 1998), which referenced the Cathedral and the Bazaar. In January 1998, Netscape announced that the source code for the next generation of Communicator would be made freely available. The first developer release of the source code was made in late March 1998.

Mozilla.org exists as a group within Netscape responsible for coordinating development. Mozilla has established an extensive web site, which includes problem reporting and version management tools. Discussion forums are available through various newsgroups and mailing lists. The project is highly modular and consists of about 60 groups, each responsible for a particular subsystem. All code issued in March was released under the NPL. New code can be released under the MPL or any compatible license. Changes to the original code are considered modifications and are covered by the NPL.

Although it has benefited from widespread media exposure, Mozilla has yet to result in a production release. It is therefore difficult to evaluate the commercial success of the project. The recent merger of AOL and Netscape has introduced additional uncertainty, but many continue to feel confident that the project will produce a next generation browser.

5.4 Perl and Python

Perl and Python are mature scripting languages that have achieved considerable market success. First released in 1987 by Larry Wall, Perl has become the language of choice for system and network administration, as well as CGI programming. Large commercial Web sites such as Yahoo and Amazon make extensive use of Perl to provide interactive services.

Perl, which stands for Practical Extraction and Report Language, is maintained by a core group of programmers via the perl5-porters mailing list. Larry Wall retains artistic control of the language, but a well-defined extension mechanism allows for the development of add-on modules by independent programmers (Wall et al., 1996).

Python was developed by Guido van Rossum at Centrum voor Wiskunde en Informatica, or CWI, in Amsterdam. It is an interactive, object-oriented language and includes interfaces to various system calls and libraries, as well as to several popular windowing systems. The Python implementation is portable and runs on most common platforms (Lutz, 1996).

5.5 KDE and GNOME

KDE and GNOME are X11-based desktop environments. KDE also includes an application development framework and desktop office suite. The application framework is based on KOM/OpenParts technology, and leverages open industry standards such as the object request broker CORBA 2.0. The office suite, KOffice, consists of a spreadsheet, a presentation tool, an organizer, and an email and news client.

GNOME, or the GNU Network Object Model Environment, is similar in many ways to KDE. However, GNOME uses the GTK+ toolkit, which is also open source, whereas KDE uses Qt, a foundation library from Troll Tech that was commercially licensed until recently.

KDE and GNOME are interesting because they represent the varying commitments in the open-source community to commercial markets and the free software philosophy. The KDE group and Troll Tech initially tried to incorporate Qt, a proprietary product, into the Linux infrastructure. This met with mixed reactions. The prospect of a graphical desktop for Linux was so attractive that some were willing to overlook the contradictory nature of the project. However, others rejected KDE and instead supported GNOME, which was initiated as a fully open-source competitor. Eventually, Troll Tech realized Qt would not be successful in the Linux market without a change in license, and a new agreement was released, defusing the conflict. GNOME continues, aiming to best KDE in terms of functionality rather than philosophy (Perens, 1999).

5.6 Other Projects

Other lesser-known, but equally interesting, projects include GIMP, FreeBuilder, Samba, and Kaffe. Each of these projects follows the open-source methodology, originating under the direction of an individual or small group and rapidly extending to a larger development community.

GIMP, or the GNU Image Manipulation Program, can be used for tasks such as photo retouching, image composition, and image authoring. GIMP was written by Peter Mattis and Spencer Kimball, and released under the GPL. FreeBuilder is a visual programming environment based on Java. It includes an integrated text editor, debugger, and compiler. Samba allows Unix systems to act as file and print servers on Microsoft Windows networks. Development is headed by Andrew Tridgell. Kaffe is a clean-room implementation of the Java virtual machine and class libraries.

6. Summary and Conclusions

Open source is software developed by uncoordinated but loosely collaborating programmers, using freely distributed source code and the communications infrastructure of the Internet. Open source is based on the philosophy of free software. However, open source extends this ideology slightly to present a more commercial approach that includes both a business model and development methodology. Various categories of free and non-free software are commonly, and incorrectly, referenced, including public domain, freeware, and shareware. Licensing agreements such as the GPL have been developed to formalize distribution terms. The Open Source Definition provides a framework for evaluating these licenses.

The Cathedral and the Bazaar is the most frequently cited description of the open-source development methodology. However, although the paper identifies many mechanisms of successful open-source development, it does not expose the dynamics. Critics note that certain aspects remain ambiguous, which suggests that the paper does not sufficiently describe the open-source development process.

There are hundreds, if not thousands, of open-source projects currently in existence. These projects face growing challenges in terms of scalability and inherently weak tool support. However, open source is a pragmatic example of software development over the Internet. It provides interesting lessons with regard to large-scale collaboration and distributed coordination. If the open-source approach is adequately studied and evaluated, the methodology might be applied in a broader context.