Is Biopharma ready for open source software?

In the future, 2008 may be looked at as the year when open source software finally became viable in the biopharmaceutical industry. To be clear, I don't mean software that can be used across industries, only software designed for operations specifically and uniquely performed in our industry.

A second caveat is that the term "viable" may only apply to the software products themselves (i.e. the vendors) but not necessarily the companies that would use or benefit from them. In other words, the technology is ready while the industry may not be!

We'll dive into the relevance of and issues surrounding open source software by answering a few questions.

What is open source software?

According to Wikipedia, open source software is "computer software for which the human-readable source code is made available under a copyright license (or arrangement such as the public domain) that meets the Open Source Definition. This permits users to use, change, and improve the software, and to redistribute it in modified or unmodified form. It is often developed in a public, collaborative manner."

The key term from this description is "source code." Source code is all of the content (i.e. programs and related files) written by programmers that is compiled and then installed on computers so that you can accomplish something. If you prefer, you can call the finished product an "application." For example, I write and publish this blog using a software package called "TypePad."

As a user, I don't get to see the "source code" but only the user interface served up to me by the TypePad application. Typically, this protects the software company from those who would either want to steal the code or hack into it so that it does not do what it was intended to do. Of course, such security protection is also a way for the publisher to protect their revenue stream.

How is open source software different than proprietary licensed software?

When you buy a proprietary software package like Microsoft Office, you are given permission by the copyright holder to simply use the software for a certain or unlimited amount of time. With MS Word, for example, you can use the version you buy as long as you like, but you do have to pay an additional fee if you want to upgrade to a newer version. Using the software, however, does not give you the right to play around with the source code itself. If you tried to do so, you would be violating your agreement with Microsoft.

When you license an open source software package like OpenOffice (a direct MS Office competitor), you can use it in two ways. First, you can simply take advantage of it as a user and do word processing, create a spreadsheet, design a presentation, etc. Second, you can modify the program so that it does something new or modifies an existing behavior. In other words, you can program it yourself instead of hoping that the vendor will listen to your requests.

An oft debated issue about open source software is price. For this article, it will be enough to say that open source software is:

  • not necessarily free
  • not always cheaper than proprietary software, but
  • most likely less expensive than its equivalent proprietary competitor

What's more important is that open source software lets you modify the source code.

Are there good examples of open source software in our industry?

YES. Perhaps the best example is the EDC software package called PhOSCo, now managed by a company called Penguin Trials.

Several years ago, the PhOSCo open source software was licensed by Novartis and implemented as one of the most successful EDC systems to date. Even if you did not know that the Novartis EDC system was powered by PhOSCo, you probably heard or read that it allowed Novartis to leave behind the old-fashioned paper CRF data collection mode and show some pretty impressive error reduction and faster execution statistics. What you haven't read is that Novartis made significant modifications to the PhOSCo source code over several years, to the extent that some staffers began to ask whether it was still appropriate to call the software by its original name (i.e. PhOSCo). What we can learn from this success story is that open source software can work in our industry, provided that the willingness to make it work is there.

So, what is the key barrier to wider adoption of open-source software?

The typical barrier to adoption is the level of comfort of the buyer with the software and the company providing it. Comfort is measured in many ways, both objective and subjective. The buyer will typically ask the following types of questions:

  • Does the software deliver the capabilities we require today and in the future?
  • How widely is this software used in my industry?
  • How solid are the financials of the software firm?
  • Are sales of this software increasing, staying flat or decreasing?
  • How do I know if this software company will be around in a few years?
  • Can I get support from the software firm in the geographies where we operate?
  • Do they use software and hardware that are compatible with our corporate standards?
  • If I recommend this software, will I be held accountable if things don't go well?

Interestingly, these questions are the same whether one is evaluating proprietary or open-source software. The problem is thus not with the questions but with the reality that most biopharma companies remain relatively ignorant about the nature of open source software. In other words, our industry has so far failed to give room at the table to companies selling open source software.

Why should open source software firms be invited to the table?

The simple answer is that many of these companies have equal or better solutions than proprietary software vendors. They should not be left out of the running simply because they sell open source software. And, if they can answer the questions posed above in a satisfactory or superior manner, perhaps they deserve to win.

OK, but I'm nervous about this whole idea of software that can be changed by anybody. Why shouldn't I be worried about that?

Perhaps it would be best if I answer this question by describing how one open source software company is handling this issue. That company is Alfresco Software, which specializes in Enterprise Content Management and thus competes with products like EMC Documentum and Microsoft SharePoint.

Note: I will write more about Alfresco's products in another article. For now, I want to focus on open source software in general.

As an open source software company, Alfresco allows its source code to be modified by anyone who is willing to sign its software licensing agreement. In fact, the company actively encourages software developers to "play" with its code and contribute their changes back for evaluation. At the same time, Alfresco has its own development team and has a rigorous software development methodology in place. What's different about Alfresco is that two versions of the software are released at regular intervals: Alfresco Community and Alfresco Enterprise.

As you may be able to guess, Alfresco Community is the version available to external developers while Alfresco Enterprise is the locked-down code sold to companies for commercial use. The key differences between these versions are explained here. In short, however, the Enterprise version is sold under terms roughly equivalent to those you would expect from a proprietary software vendor.

Now imagine the tremendous advantage that such a two-pronged product strategy can bring to Alfresco and to its clients. Rather than have only its captive developers trying new things or enhancing the software, the entire open source community is able to help out. Indeed, there is nothing stopping your own software developers from looking at the latest Alfresco Community source code, modifying it directly to show a new or enhanced capability, or making suggestions to management or Alfresco based on their understanding of the source code.

This approach essentially lets you have your cake and eat it too. On the one hand, companies are assured that the software they buy is fully supported and regularly upgraded. On the other, they can directly influence the direction the software package takes or leverage new ideas and methods brought to the table by a much larger software development community.

OK, but isn't the list of open source software packages for our industry really short?

Yes, that's true at the moment. Until our industry takes the time to fully understand what open source is all about, the list of vendors will continue to stay small. However, there are some viable vendors for you to evaluate now:

  • Alfresco for content management and collaboration;
  • Akaza Research for EDC;
  • NCI Firebird for investigator registry;
  • CRIX for industry, academia and regulatory agency collaboration



Another platform we use is Jumper Networks for managing distributed data.

EDC Consultant

For an open source product to be successful, it requires specific attributes. In particular, it must be modular and extensible. If you look at Linux, much of the reason for the product's success was the potential to expand and extend how the OS worked. If the open source product does not meet requirements, there is no 'vendor' as such to customize it. The PhOSCo product was based on the original IBM ClinWare RDE system from the late 90s. Although it made use of TCP/IP for data communication, it wasn't architected to make full use of the internet or browsers. Also, the product worked well for study-by-study implementations, but was not geared up for enterprise scaling or volume hosting. Novartis extended and built around the system to fill the gaps, but it required a lot of manual effort. OpenClinica has applied some modern principles, but it will still mainly be used by academics or companies with no budget for software or services. Good EDC requires Software, Processes and Services.

For further discussions around EDC system topics, see

- Former PhOSCo Developer

Gene Xiao

I didn't see OpenClinica mentioned. It seems to me like it's probably the most widely used open source system for EDC and clinical data management.
