In recent years, artificial intelligence has been hailed as the next big thing in drug discovery. But to leverage the full potential of this technology, the life sciences industry first needs to pay attention to experimental design and data collection.
We embrace the latest digital technologies to enhance almost everything in our daily lives, including how we communicate, learn, travel, and entertain ourselves. But in the laboratory, we more often defer to the tools we’re familiar with or rely on manual practices.
The benefits of adopting digital tools in the life sciences industry are plentiful: new functionality, better insights, efficiency, and automation of error-prone processes, just to name a few. The common theme in recent years is that the endpoint of fully contextualized, reproducible, and unbiased scientific data should be the focus of digitization efforts. Yet, such high-quality data is rarely achieved. The industry is still at the start of a journey of reimagining how to generate high-quality and complete scientific data.
Is the life sciences industry ready to embrace digital technology?
Pharma and biotech industry leaders are delivering ever more drug products to the market. The industry boasted 53 FDA drug approvals in 2020, continued revenue growth, healthy R&D investments, and M&A deals to secure future pipelines. However, I believe the issue is not what the current quality of R&D data is able to deliver today, but rather what will not be delivered tomorrow as a direct result of incomplete, insufficiently contextualized data.
The Covid-19 pandemic has increased the adoption of digital tools across the pharma industry. However, uptake has been fragmented, and organizations are at vastly different stages of the digital journey. Life science leaders and scientists are increasingly looking to artificial intelligence (AI) and machine learning (ML) to extract more value out of their R&D data. But the reality is that these exciting and powerful technologies often struggle to find purpose in life sciences, which in large part is due to how we generate and present data.
The challenge is that most canonical laboratory environments continue to fall short of adopting the technology stack that unlocks the potential of AI and ML. This is contributing to a ‘winner-takes-all’ situation for pharmaceuticals; companies that invest in early adoption of AI concurrently develop an organizational culture and skillsets that embrace advanced data science, establishing a competitive gulf. We can already see some winners, such as Exscientia and Recursion, emerging in the race to shift from drug discovery to drug design.
So how can life science companies avoid falling behind? To generate the data required to validate AI and ML tools for drug design and other research applications, organizations need more efficient cloud services and automation.
Very often, where automation exists, scalable software-based data collection and query tools are inadequate or non-existent. As a result, information is being lost that would otherwise help contextualize the experimental data for other users and serve as input to AI and ML tools. While some organizations are reaching AI and ML superiority, the industry as a whole is not yet mature enough for that technology to be deployed into laboratories with minimal friction in a matter of days or weeks.
Experimental design is the sharpest tool in the scientist’s toolbox
Producing new and better datasets using automation is essential for AI and ML. But simultaneously, experimental design rarely gets the attention it deserves. Statistical analysis is typically bolted onto the end of an experimental campaign rather than implemented in its conception. In these instances, the experiment’s real statistical and insight-rich potential cannot be reached.
Fundamentally, the scientific community’s data should be generated with better design in mind. Statisticians need to be brought in upstream, at the experimental design stage, not at the end of a flawed experimental campaign.
Using the right experimental design gets us closer to the answer, accounts for bias, can identify higher-order interactions, and generates insightful landscapes that advise us where to go next. The data generated has greater statistical power and can be retrieved and reused in the future. From here, ML algorithms could be applied to interrogate complete data sets in order to search for new drug candidates and to de-risk portfolios, protocols, development processes, and more.
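To make the point about interactions concrete, here is a minimal sketch of a two-level full factorial design in Python. Everything in it is illustrative: the factor names (temperature, ph), the simulated_yield function, and its coefficients are assumptions invented for this example, not data from any real experiment. It shows that running every combination of factor levels lets us estimate an interaction effect, which a one-factor-at-a-time campaign cannot separate out.

```python
from itertools import product

def full_factorial(names):
    """Return every combination of coded levels (-1, +1) for the named factors."""
    return [dict(zip(names, levels))
            for levels in product((-1, 1), repeat=len(names))]

def effect(runs, responses, terms):
    """Estimate a main effect (one term) or an interaction (several terms).

    The contrast sign for each run is the product of its coded levels for
    `terms`; the effect is mean(response at +1) minus mean(response at -1).
    """
    high, low = [], []
    for run, y in zip(runs, responses):
        sign = 1
        for term in terms:
            sign *= run[term]
        (high if sign > 0 else low).append(y)
    return sum(high) / len(high) - sum(low) / len(low)

def simulated_yield(run):
    """Hypothetical response with a built-in temperature x pH interaction."""
    t, p = run["temperature"], run["ph"]
    return 10 + 2 * t + 1 * p + 3 * t * p

# Four runs cover all combinations of two factors at two levels.
runs = full_factorial(["temperature", "ph"])
responses = [simulated_yield(run) for run in runs]

temperature_effect = effect(runs, responses, ["temperature"])    # 4.0
ph_effect = effect(runs, responses, ["ph"])                      # 2.0
interaction_effect = effect(runs, responses, ["temperature", "ph"])  # 6.0
```

In coded units each effect is twice the corresponding coefficient of the simulated model, so the interaction (6.0) turns out to be larger than either main effect. That is the kind of insight that goes unnoticed when factors are varied one at a time.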
In summary, better experimental design produces more valuable data.
Better data starts with automation software
Fundamentally, progress along the scientific automation journey tends to generate better data. Cloud-based software can enable rapid, no-code building of automation protocols, easy sharing and modification, and automatic structuring of the data generated, capturing every step of the experiment for future use. With better automation, the flexibility-craving R&D scientist is empowered to run more statistically powerful experiments, and protocols are subsequently captured within electronic notebooks that are more complete and reproducible.
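As a rough illustration of what "capturing every step" could mean in practice, the sketch below stores a measurement together with the protocol steps and factor settings that produced it, then serializes the record to JSON. The class names, field names, and values are hypothetical and do not reflect the data model of any particular platform.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Step:
    """One executed protocol step and the parameters it ran with."""
    action: str
    parameters: dict

@dataclass
class ExperimentRecord:
    """A result kept together with the context needed to reuse it later."""
    protocol_id: str
    factors: dict
    steps: list = field(default_factory=list)
    measurements: dict = field(default_factory=dict)

    def log_step(self, action, **parameters):
        # Append the step in execution order so the run can be replayed.
        self.steps.append(Step(action, parameters))

    def to_json(self):
        # asdict() recursively converts nested dataclasses to plain dicts.
        return json.dumps(asdict(self), sort_keys=True)

# Illustrative values only.
record = ExperimentRecord("demo-001", {"temperature_c": 37, "ph": 7.4})
record.log_step("dispense", reagent="buffer", volume_ul=50)
record.log_step("incubate", minutes=30)
record.measurements["od600"] = 0.42

restored = json.loads(record.to_json())
```

Because the steps and factor settings travel with the measurement, another scientist or a downstream ML pipeline can interpret the value without going back to a paper notebook.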
Nowadays, more intuitive and less intimidating user interfaces are available. Scientists are able to envision experiments they wouldn’t otherwise design, such as those requiring sophisticated multifactorial techniques. And they can generate data that is more contextualized and interoperable with other analysis tools.
However, without a way of sharing, adapting, and democratizing the scientific protocols used to generate the data, these solutions remain bespoke and the scientific community doesn’t benefit from the protocols and data created. For those seeking to transform the industry and create relevant networks, sharing scientific protocols is likely to be a key agenda item.
Business models will welcome the cloud
With better experimental design and improved automation software, opportunities abound to generate high-quality data and inform the next generation of therapeutics. Central to enabling this change are new business models where life science companies look externally to build out their technology stack. Software-as-a-service platforms hosted on the cloud, such as Synthace and Benchling, are able to augment and eventually replace existing on-premises solutions.
Two of the most attractive advantages for R&D teams to adopt cloud platforms are the rapid and frequent upgrades in software capabilities and the endless possibilities for integration with other cutting-edge technologies, such as closed-loop experimental design and quantum computing, which would otherwise require huge internal software engineering investments.
The quality, origin, and completeness of R&D data are the most important outcomes of superior experimental design and the adoption of automation. Together, these generate more statistically powerful and historically comparable data sets, allowing the use of more sophisticated analysis tools such as AI and ML.
To best leverage the leaps and bounds by which modern computing is progressing, the life sciences industry must reflect on scientific data strategies and implement better practices for generating contextualized data with automation, curating and sharing reproducible data sets, and plumbing robust data pipelines to feed AI and ML tools. By doing so, our data will be more readily available for the next generation of innovators.
It’s a momentous time for life sciences, with an unprecedented opportunity to create more complete insights, share richer science, and finally build more consistently on the shoulders of giants. This new wave of cloud technology for life sciences R&D will transform our understanding of key scientific questions and will ultimately transform our way of life.
Dr. Vishal Sanchania, Head of Biology at Synthace, is an expert implementer of sophisticated experimental design with a keen interest in the digital transformation of R&D. He leads the lab team of Synthace, which demonstrates scientific applications and innovations and helps steer key product features. Vishal has a PhD in Biophysics and Biotechnology and a BSc in Biochemistry from University College London.