Tuesday 5 January 2010

The life sciences cloud

When a group of life science companies gathered in Cambridge recently to discuss cloud computing, a question on many people’s lips was: “What is it?”

The session, ‘Cloud computing in life sciences’, hosted by ERBI, the industry body for international biotech and healthcare companies, shed some light on this question.

Below I summarise the discussion and highlight some examples of cloud in life sciences.

Defining cloud
According to Ian McKendrick (@IanMcKendrick), a life science IT strategist, cloud is, put simply, about outsourcing computing. But as Microsoft’s Andy Davies emphasised, cloud solutions have two important features: they are scalable and elastic, which allows computing resources to be acquired and released very quickly.

Low cost computing
The ‘pay-per-use’ model is another defining feature of cloud. It removes the worry of over-provisioning (wasting investment) or under-provisioning (resulting in poor performance).

Overall, the total cost of ownership of a cloud-based application should be lower than that of building in-house resources. Drawing on his experience as IT operations manager at the Wellcome Trust, Richard Gough (@chopsm) projected that the three-year cost of running their databases in the cloud would be 75% less than running them in-house.

Discovery on demand
The ability to access computing resources on demand is seen as one of the most powerful benefits of cloud. Eli Lilly & Co. has capitalised on this to cope with the ‘spikiness’ of demand. Using Amazon’s Elastic Compute Cloud (EC2), Lilly was able to launch a 64-machine cluster, complete a bioinformatics analysis and shut the cluster down again within 20 minutes. The run cost just a few dollars and replaced a process that would have taken 12 weeks internally.
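
For readers curious what that launch-analyse-shut-down pattern looks like in code, here is a minimal sketch using Python and the boto3 AWS SDK (a present-day library chosen purely for illustration, not what Lilly actually used); the machine image ID, instance type and cluster size are placeholders rather than Lilly’s configuration.

import boto3

ec2 = boto3.resource("ec2")

# Spin up a small on-demand cluster (placeholder image ID and instance type).
instances = ec2.create_instances(
    ImageId="ami-00000000",   # hypothetical machine image with the analysis tools installed
    InstanceType="c5.large",
    MinCount=4,
    MaxCount=4,
)

# Wait for the nodes to boot before dispatching work to them.
for instance in instances:
    instance.wait_until_running()

# ... run the bioinformatics analysis on the cluster here ...

# Release the machines as soon as the analysis is done, so the metered billing stops.
for instance in instances:
    instance.terminate()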

Other pharmaceutical companies have also used the cloud to speed up their computational drug design processes. Also using Amazon Web Services (AWS), protein engineers and informaticians at Pfizer’s Biotherapeutics and Bioinnovation Centre can now carry out overnight antibody docking modelling that previously took two to three months.
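
The speed-up comes from the fact that each docking run is independent, so many of them can execute side by side instead of one after another. The sketch below illustrates that fan-out pattern in Python with a hypothetical dock() function and illustrative candidate names; it shows the general idea, not Pfizer’s actual pipeline.

from concurrent.futures import ProcessPoolExecutor

def dock(candidate):
    """Hypothetical stand-in for one antibody docking calculation."""
    # ... run the docking engine for this candidate and return a score ...
    return candidate, 0.0

candidates = ["ab_001", "ab_002", "ab_003"]  # illustrative antibody identifiers

if __name__ == "__main__":
    # Each job is independent, so the work spreads across however many
    # cores (or cloud nodes) are available: more machines, shorter wall-clock time.
    with ProcessPoolExecutor() as pool:
        for name, score in pool.map(dock, candidates):
            print(name, score)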

Enquiring into the mind
Understanding how the brain works is a major scientific challenge. It requires knowledge of how information is encoded, accessed, analysed, archived and decoded by networks of neurons. Globally, over 100,000 neuroscientists are working on this problem, and solving it could revolutionise biology, medicine and computer science. The CARMEN e-science project is designing a cloud system that allows these neuroscientists to share, integrate and analyse their data.

Processing power on tap
Proteomics, the large-scale study of protein structures and functions, is key to understanding disease processes and designing interventions. One of the main obstacles to starting a proteomics programme is setting up the computational resources needed to cope with its heavy data processing requirements. However, a team at the Medical College of Wisconsin used Amazon’s cloud services for low-cost, scalable proteomics analysis. This approach gives users large-scale computational resources on tap at a very low cost per run.
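
One common way to keep that power ‘on tap’ is to push each analysis run onto a message queue and let however many cloud workers happen to be running pull tasks from it. The sketch below shows the pattern with Python, boto3 and Amazon SQS; the queue name and file names are hypothetical, and this illustrates the general approach rather than the Wisconsin team’s actual setup.

import boto3

sqs = boto3.resource("sqs")

# A shared queue of analysis tasks; workers can be added or removed at will,
# which is what makes the capacity feel 'on tap'. (Hypothetical queue name.)
queue = sqs.create_queue(QueueName="proteomics-jobs")

# Submit one message per mass-spectrometry run to be searched.
for run in ["run01.mzML", "run02.mzML", "run03.mzML"]:  # illustrative file names
    queue.send_message(MessageBody=run)

# Each worker repeatedly pulls a task, processes it and deletes it from the queue.
for message in queue.receive_messages(MaxNumberOfMessages=3):
    print("processing", message.body)
    # ... run the protein database search for this file ...
    message.delete()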

Indiana University is also exploring the use of cloud computing for analysing next-generation sequencing data, which is expected to be generated in volumes "one to two orders of magnitude larger" than current computational capabilities can handle.

Trusting the cloud
No discussion of cloud would be complete without raising the question of security. Given that running infrastructure is the cloud providers’ core business, security is high on their agenda and should be class-leading. There is every chance that their capability in this respect is ahead of in-house resources; it’s just that when things do go wrong, they tend to get high-profile coverage.

Richard Gough also noted that using the cloud does not remove the need for rigorous management. In the acquisition phase, strong procurement processes are vital, and security people should be involved from day one of the procurement cycle.

Evolving models
Over the last year Microsoft’s Simon Davies has been working with early adopters of Windows Azure, its cloud-based platform. He says that whilst the cloud enables users to achieve many things they couldn’t before, it is “not a panacea for everything.” This is not to say that Microsoft is not committed to the cloud; rather, it believes that ‘software plus services’ is a better and more flexible approach.

Moving forward, businesses will have a number of models to choose from to suit their requirements. This is an evolving ecosystem that ranges from the raw computing power provided by the likes of Amazon, through platforms on which others can build (Windows Azure, Google App Engine), to web-based services (Science Warehouse, Salesforce.com) and the wealth of consumer-oriented online offerings such as Facebook and Twitter.

At whatever level organisations engage with the cloud, the future is about being able to do more for less.

Links to additional information quoted in this article:

Eli Lilly bioinformatics
Pfizer antibody docking
CARMEN brain research project
Wisconsin proteomics
Indiana sequencing
Economist magazine debate on cloud computing (pitting Salesforce.com's Marc Benioff against Microsoft's Stephen Elop)
A great series of articles on Cloud Computing in Life Sciences published by BioITWorld is also available here.

This post is an edited and updated version of my post on the Science Warehouse site: http://bit.ly/8h0RQT