More genomic data is produced today than ever before, as it’s fundamental to the development of new medicines. But although there is a lot of data out there, much of it is inaccessible. With a background in computer science and bioinformatics, Fiona Nielsen, co-founder and CEO of Repositive, decided to tackle this bottleneck and open up genomic data on cancer models to the industry.
Nielsen first encountered the challenges of accessing data when she started working at Illumina in 2011. During her time there, she would analyze cancer samples from both academic and clinical collaborators. One day, she came across a fusion gene that had not been published anywhere before. Was it a potential new biomarker for that particular type of cancer?
“The only way to find out whether the fusion gene had potential was to gather a lot of data from that cancer as well as from other types of cancer,” Nielsen recalled. “But I never had access to the data I needed because the data is fragmented in silos around the world. Not only that, I only had a couple of months working on that project before I was on to the next project. Maybe there was a discovery to be made. I don’t know.”
This experience and others she’d had in academia frustrated Nielsen because she felt that available data was not being used efficiently for research. She left Illumina in 2013 and founded the charity DNAdigest to make data access and data sharing easier within the life science industry.
After recognizing that the business wasn’t sustainable as a charity, Nielsen co-founded Repositive together with former Illumina colleague Adrian Alexa in August 2014. Based in Cambridge, UK, Repositive develops software tools to make accessing and sharing genomic data easier and more efficient.
Repositive mainly focuses on preclinical cancer models, such as those based on cells, animals, and human tissue, including patient-derived xenografts. The company works closely with 21 global contract research organizations (CROs): it collects the genomic data on their preclinical cancer models, processes it, and standardizes it within its database.
“A researcher who is currently testing a cancer drug and is looking for a cancer model with very specific genetic characteristics can easily use our online platform instead of going from provider to provider,” Nielsen explained. “It’s similar to how you would search for a flight. You input some criteria: I want to go from A to B, on this date, below this price. Similarly, the researchers have criteria about the cancer model: it has to have a certain mutation, it has to express this gene, it should be a liver or breast cancer model, it should be a patient-derived xenograft model.”
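The kind of criteria-based search Nielsen describes can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the `CancerModel` record, its field names, the `find_models` function, and the toy catalogue entries are all hypothetical and do not reflect Repositive’s actual data schema or platform.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CancerModel:
    """Hypothetical catalogue record; field names are illustrative only."""
    model_id: str
    model_type: str    # e.g. "PDX", "cell line", "organoid"
    cancer_type: str   # e.g. "liver", "breast"
    mutated_genes: frozenset = field(default_factory=frozenset)
    expressed_genes: frozenset = field(default_factory=frozenset)


def find_models(models, *, cancer_types=None, model_type=None,
                mutated_gene=None, expressed_gene=None):
    """Return the models that satisfy every criterion that was supplied.

    A criterion left as None is ignored, much like leaving a field
    blank in a flight search.
    """
    matches = []
    for model in models:
        if cancer_types is not None and model.cancer_type not in cancer_types:
            continue
        if model_type is not None and model.model_type != model_type:
            continue
        if mutated_gene is not None and mutated_gene not in model.mutated_genes:
            continue
        if expressed_gene is not None and expressed_gene not in model.expressed_genes:
            continue
        matches.append(model)
    return matches


# Toy catalogue with made-up entries, for illustration only.
CATALOGUE = [
    CancerModel("M1", "PDX", "liver", frozenset({"TP53"}), frozenset({"EGFR"})),
    CancerModel("M2", "cell line", "breast", frozenset({"TP53"}), frozenset({"EGFR"})),
    CancerModel("M3", "PDX", "breast", frozenset({"KRAS"}), frozenset({"EGFR"})),
    CancerModel("M4", "PDX", "lung", frozenset({"TP53"}), frozenset({"EGFR"})),
]

# "A liver or breast cancer PDX model with a TP53 mutation that expresses EGFR."
hits = find_models(
    CATALOGUE,
    cancer_types={"liver", "breast"},
    model_type="PDX",
    mutated_gene="TP53",
    expressed_gene="EGFR",
)
print([model.model_id for model in hits])
```

Each criterion narrows the candidate list independently, which is why such a search only works once the providers’ data has been standardized into comparable fields.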
Researchers can use Repositive’s online platform to find a matching cancer model, such as an organoid or cell or mouse model. If they can’t find what they are looking for online, a specialized team can do a more in-depth search of the database. After that, Repositive runs its Cancer Model Scout service: the team reaches out to its expansive network and searches for the cancer model that best matches the researchers’ criteria.
“In the case of preclinical cancer research, we found that enabling access to the data of cancer models solves a huge problem for researchers using these cancer models as well as for organizations that provide these cancer models,” Nielsen said. “It’s a two-sided search that is solved by making the genomic data available. Repositive acts as the intermediary that facilitates organizing the data, standardizing the data, and sorting the data.”
However, when she first founded Repositive, Nielsen encountered a lack of incentive to share data, especially in academia but also in industry. Because sharing research data requires a lot of extra work, academics often don’t feel the need to do it unless they have an incentive. The industry, on the other hand, stringently protects its data.
“The companies that own cancer models consider any data they generate their IP, so they guard it very carefully and are careful about who they give it away to,” Nielsen explained. “At the same time, the cancer models have commercial value and they need to make the data available if they want to find potential customers.”
In academia, researchers gain credit for the number of papers they publish, not for the amount of data they generate. Moreover, researchers only make data available if the public has asked them to, Nielsen pointed out.
“The challenge comes down to how academics work. They generate some data, they investigate that data, they get results, and then they publish those results. Only then do they maybe make the data available. So today, data generation and data sharing don’t give researchers as much credit as they get from writing papers, and that’s one of the biggest misalignments in academia.”
To counteract this, academics need to receive incentives from funding or government agencies to share data, Nielsen explained. Although some funding agencies and publishers have already started this process, it is still far from perfect. For example, some funding agencies now require researchers to include a data management plan in their grant applications. However, it is still difficult to check whether the data has actually been made publicly available, Nielsen added.
“There’s a lot of goodwill and many researchers that will do a lot of work to do the right thing. But the incentives are just too small for enough people to do the right thing, which is sharing their research data.”
With Repositive, Nielsen is working on making data more accessible for preclinical research. In an ideal world, she believes, data would be available to anyone. Results would be robust, reliable, and reproducible. Preclinical drug discovery would speed up and enable a faster readout of potential biomarkers and drug targets.
In a move to boost preclinical cancer research, Nielsen and her team at Repositive made their Cancer Models Platform publicly accessible in July 2019. This means that researchers can freely browse the metadata of 5,300 preclinical models from Repositive’s CRO partners. Since November 2019, this data has also been available on Scientist.com, enabling Repositive to connect with an even larger community of biotech and pharma researchers worldwide.
“For just about any scientific question you ask, there’s a limit to how much you can conclude if you have a limited amount of data available,” Nielsen said. “My driver and my belief are that a lot of the bottleneck in advancing precision medicine is really in data access. Data is not just a ‘nice to have’, it’s a necessity to advance medicine. Imagine what we could do if we all had access to as much data as possible.”
As Repositive scales up its worldwide offering and works to make genomic data for preclinical cancer research more available, it remains to be seen how data sharing and data accessibility will evolve within the biotech and pharma industry. But as funding bodies and publishers give more incentives to researchers, we can be positive that easier data access is on the horizon.