Embracing big data and data transparency in healthcare research and practice

24 January 2019

Clinical and biomedical research and healthcare practice are typically data-intensive: they involve the generation, collection, storage, and use of large volumes of data at various stages and from diverse sources. Collectively, this information is termed “big data.” In the biomedical and healthcare context, big data could include information from different stages of clinical trials, treatment procedures and outcomes, specific interventions, conditions in randomized controlled trials, detailed patient records, demographic information (especially for large-population cohorts), behavioral data about patients or users of online communities, medical claims, unpublished trials, and so on.

In recent years, there has been an increased focus on optimizing this big data to bring about improvements in healthcare research and practice. A larger cultural shift is also underway: from sharing this data only within traditional journal publications toward making it more widely available and accessible, with the aim of promoting an “information-based culture.”1 In this post, we look at the role of big data and the emerging focus on increased transparency in healthcare research, publication, and practice.

What do data transparency and big data bring to the table?

The data sharing and transparency trends embraced by the scholarly publishing industry are also finding their way into the pharma industry. This is largely because of the realization that, although traditional journal publications share critical findings, the bulk of the data at the core of pharma R&D exists in silos and is not easily accessible.2,3 The underlying idea is that making biomedical and clinical data open will dispel the opacity around research data, increase trust in pharma, and open avenues for innovation and collaboration.3 According to Simon Goudie, Senior Journal Publishing Manager at Wiley, “In a world where truth can be hard to determine, the transparency of open data ensures that published results can be interrogated and verified.”

Big data presents opportunities to identify new patterns and undertake predictive analyses for specific problems or conditions. For example, access to terabytes of relevant data could make it far easier to replicate research, enabling researchers to explore newer treatment procedures4 by building upon previously published data. Big data could widen the evidence base available to practitioners, furthering the evolution of precision medicine and elevating the quality of treatment and care they can provide. Pharma companies can use this data to improve their understanding of the needs and prescription patterns of healthcare professionals (HCPs), introduce efficiencies in the drug development and distribution process, and devise more effective promotion strategies. Thus, big data offers opportunities across healthcare research, publishing, and practice.

Caveats in data transparency

Despite these benefits, sharing biomedical and clinical data comes with caveats that need to be addressed. These include the potential misrepresentation or misinterpretation of scientific findings, which could lead to the dissemination of fake news in mainstream media; issues that could arise from sharing data that has not been peer reviewed; and the need to build safeguards against violations of patients’ privacy or copyright conditions.

Data sharing initiatives

While these concerns are relevant, the benefits of open data outweigh the downsides, and several initiatives have been launched to fast-track innovation and place data in a relevant context. For example, the European Medicines Agency’s policy mandates that clinical reports be made available for new medicines authorized by the European Union,5 while the AllTrials campaign6 advocates that all clinical trials be listed in a relevant registry, all results reported, and all data shared openly.

Further, several major publishers, such as Wiley, are recognizing the need for data transparency. Wiley has a robust suite of policies that support researchers’ efforts to share data, link it with peer-reviewed articles, and cite it. Wiley’s policies, in the context of policies from other major publishers and journals, are reviewed in this blog post by David Mellor of the Center for Open Science.

“Wiley is committed to open science and the role of data in improving the openness, transparency and reproducibility of research,” says Goudie. “Authors in Wiley journals are encouraged, and in many cases required, to make the data that underpins the research presented in their articles accessible….Data sharing is quickly becoming expected practice, and should form a part of every researcher’s study design and management plan. The source of many concerns, i.e., sensitive data, such as patient information or ecological material, can be managed and shared through careful collection and curation or by instead making metadata available that describes the information collected. With more expectations to make data accessible, there is quickly becoming no excuse for researchers to not prepare accordingly – or to ignore the possibilities that become available to them of sharing their own data and using data that has been shared by others.”

Big data has the potential to transform biomedical and clinical research and healthcare practice and outcomes. Using big data effectively and working on measures to increase data transparency and accessibility would open up new opportunities for innovation and collaboration. In our next post, we will talk about data management practices to help ensure the best and most ethical use of big data.


1. Margolis, R.; Derr, L.; Dunn, M.; Huerta, M.; Larkin, J.; Sheehan, J.; Guyer, M.; Green, E. D., 2014, The National Institutes of Health’s Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data, J Am Med Inform Assoc. Nov; 21(6): 957–958.

2. The Journal of mHealth, Data sharing: How pharma can benefit, http://thejournalofmhealth.com/data-sharing-how-pharma-can-benefit/

3. Ward, P., 2016, EMA gives open access to clinical data on new medicines, http://www.pharmexec.com/ema-gives-open-access-clinical-data-new-medicines-0

4. Schneeweiss, S., 2014, Learning from big health care data, N Engl J Med. 370:2161-2163, DOI: 10.1056/NEJMp1401111

5. European Medicines Agency, 2016, Opening up clinical data on new medicines, http://www.ema.europa.eu/ema/index.jsp?curl=pages/news_and_events/news/2016/10/news_detail_002624.jsp

6. AllTrials, http://www.alltrials.net/