Pharmaceutical companies generate a wealth of data and, with most investing in technologies to enable high-throughput experimentation and automation, the speed at which new data is generated is rapidly increasing. However, most organizations are unable to gain actionable insights from this valuable resource and realize its true potential. In fact, it is estimated that over 50% of the cumulative knowledge generated by pre-clinical research is not reproducible¹. Considering the US alone, this translates into tens of billions of dollars of R&D investment that cannot be exploited effectively.
To try to extract more value from their data, most companies have invested in their data science capabilities, but the predominant role of the typical data scientist is still that of a ‘data janitor’, with the vast majority of their work focused on collecting, cleansing, formatting and linking data and less than 20% of their time spent actually analyzing data².
The root cause of both of these issues is that R&D data is typically spread across multiple disconnected silos that do not store data in a consistent manner, making it challenging to find data and to make connections between related data. This fragmentation also limits the ability of an organization to make use of advanced computational techniques, such as artificial intelligence and machine learning (AI/ML).
In response to these issues, most pharmaceutical companies have launched initiatives aimed at improving the management and stewardship of data, often inspired by the work of the wider R&D community, such as the F.A.I.R. data movement³. The appeal of F.A.I.R. is clear. After all, who wouldn’t want their data to be Findable, Accessible, Interoperable and Reusable? More recently, many industry leaders have also been highlighting the problems of not having F.A.I.R. data. For example, a recent report estimates that data that is not findable or understandable, or that has incomplete metadata, introduces inefficiencies and negatively impacts research quality, ultimately costing the European economy in excess of €26 billion per year⁴.
In this white paper, we explore what the F.A.I.R. principles mean in practical terms for your R&D data management strategy before describing how IDBS enables organizations to make their scientific data Findable, Accessible, Interoperable and Reusable and mitigate the costs of not being F.A.I.R.