Challenges to using and sharing big data in dementia research
Our interviewees revealed a range of deep-seated challenges to harnessing big data for dementia research as illustrated by the iceberg analogy. Hurdles in relation to technology are evident, but can largely be overcome. Below the surface are issues around how data collection, analysis and sharing are managed, as well as underlying people-related challenges.
Above the surface is the technical challenge in relation to mechanisms for sharing data securely and the need for common standards to pool data more easily. Establishing robust yet flexible core data standards can make data more sharable by design and save researchers time and effort. Consent needs to be set up so that it is understood by individuals and protects them against data misuse through effective enforcement mechanisms, but without unduly hampering the potential of research and routine data from a variety of sources or the ability of scientists to collaborate beyond borders and across time.
Below the surface are challenges of process and organisation, including the need to ensure a favourable ecosystem for research with stable and beneficial legal frameworks, and links to pharmaceutical companies and other private organisations for exchange of data and expertise, and connecting research findings to future prevention strategies and treatments. Similarly important is sustainable funding for data infrastructures, while funders at the same time can also have considerable influence on how research data, in particular, are made available.
The most fundamental level relates to the people involved in dementia research: the scientists, but also the policymakers, regulators, private partners, patients and research participants. We need more people with appropriate skills to manage big data, apply their imagination, and employ novel analytical approaches. These people must be trained, be connected across disciplines, and be given incentives: Currently, there are relatively few incentives or career rewards that accrue to data creators and curators. Rather than a one-size-fits-all model of research and publication, additional ways to recognise the value of shared data must be built into the system. Finally, everyone involved must shift their thinking to adopt a mindset towards responsible data sharing, collaborative effort, and long-term commitment to building the two-way connections between basic science, clinical care and the increasingly fluid boundaries of healthcare in everyday life.
Advancing dementia research requires addressing “all of the iceberg”: only by tackling the challenges at all levels jointly will change happen, without being held back by the weakest link in the chain. Some competing interests may need to be balanced with each other: While privacy concerns about digital, highly sensitive data are important and should not be treated as subordinate to advancing dementia research, they can be balanced with the required openness by releasing data in a protected environment, allowing people to voluntarily “donate data” about themselves more easily, and establishing governance mechanisms that safeguard appropriate data use for a wide range of purposes, especially in instances in which the significance of data changes with its context of use.
Next steps for unlocking the value of big data for dementia research
There is no lack of data for dementia research, but we need to exploit it more effectively: Sharing data globally across research teams and tapping into new data sources is a prerequisite for resources to be exploited more fully. At the same time, collaboration between dementia researchers and those disciplines researching factors “below the neck” (such as cardiovascular or metabolic diseases) is important – while collaboration with engineers, physicists or innovative private sector organisations may prove fruitful for tapping into new sources and skill sets.
It is worth highlighting that no one nation has it all, but complementarities exist. Global data sharing and collaboration can help to exploit data more fully and allow more researchers to spend time analysing existing data rather than collecting new data. Dementia is a disease that concerns all nations in the developed and developing world. Just as diseases do not respect national boundaries, research into dementia and the funding of data infrastructures should not be seen as purely a national or regional priority; this is crucial to enabling global collaboration.
Dealing with big data is not entirely new, but requires some new operational procedures with larger and more complex types of data. As data are combined from different research teams, institutions and nations, funded by a variety of organisations, and even combined with data from outside the medical realm, new access models will have to be developed that make data widely available while protecting privacy as well as the personal, professional, and business interests of the data originator.
To fully capture its potential, big data requires thinking outside of the box. It requires imagination to consider what data sources to use, and how to link in big data being generated routinely across all facets of everyday life, ranging from mobile phone data, to customer data, to tracking data, to government data. All of these have potential for understanding the behaviour and environment of dementia patients not only after diagnosis, but for prevention, early identification and diagnosis, or even to retrospectively analyse the years leading up to diagnosis. For this, we need to develop a culture that promotes trust between people who form part of the data, and those capturing and using the data.
At the same time, big data also offers new forms of potential involvement for individuals. Actively involving people in contributing to research by donating their data, participating in consumer-led research, and engaging as citizen scientists can capture valuable user-generated data and yield unexpected benefits. People have a strongly vested interest in their health and the health of their loved ones, and empowering them to be active contributors to science is a way to alleviate the helplessness that many may feel, while also improving the future for themselves, their families, and others who will be touched by dementia.
Finally, we need an ongoing dialogue about new ethical questions that big data raises. We will need to discuss the direct and indirect benefits to participants engaged in research, when it is appropriate for data collected for one purpose to be put to novel uses, and to what extent individuals can make decisions especially on genetic data, which may have more far-reaching consequences. The scientific need to use longitudinal data to understand diseases may also need to be balanced with the fundamental right to privacy and the “right to be forgotten”.
Recommendations: How public policy can help
Policymakers and the international community have an integral leadership role to play in informing and driving the public debate on responsibly using and sharing data, alongside researchers, funders and other stakeholders. More directly, public policy can help to stimulate more innovative uses and sharing of data through a variety of supporting initiatives, detailed below.
First of all, funding needs to support dementia and research infrastructures for using and sharing data. If data sharing and more widespread use of routine data are to become the norm, we must fund data sharing and the infrastructures for doing so, and recognise this as one of the essential costs of good science.
Policy should stimulate collaboration between public and private actors. Public-private partnerships, in-kind donations of data and expertise, government tax incentives for contributions to science, and other innovative mechanisms can help make data for dementia research available, not only from pharmaceutical companies but also from supermarket chains, mobile phone companies or start-ups.
There needs to be investment in future health-/bioinformatics talent and increased collaboration with data experts outside dementia research. Universities will need to offer opportunities and funding for education in data science and related areas, create multi-disciplinary centres of excellence, and focus on interdisciplinary, multi-institution and multi-country research.
On an international level, guidelines for consent and for Institutional Review Boards (IRBs) or Ethics Review Committees (ERCs) need to be agreed on. Reducing uncertainty about whether consents obtained for medical research allow data to be shared beyond an institution, collaboration, or nation can lower the barriers to sharing while still protecting research participants. Going forward, we need to obtain routinely and purposively collected data in a future-proof way, with governance mechanisms to give confidence that uses of the data remain consistent with an ethical framework that reflects the spirit in which an individual agreed to his or her data being used.
A stable and beneficial legal framework must also be ensured beyond national boundaries. We need policies that protect citizens against undue exploitation of their data, but that also balance data protection and privacy rights with making medical advances in the interest of everyone. Legislation also needs to account for the growing global research communities in terms of funding and making best use of human and data resources.
It is worth emphasising that these recommendations do not apply only to dementia research, but are especially pertinent in this case due to the potential of data from outside the medical realm for advancing dementia research, the current state of research in relation to the aim of having a disease-modifying therapy by 2025, and the high personal, societal and economic importance of improving prevention, diagnosis, treatment and cure across the globe.