How Open Science and Open Data can help in the fight against COVID-19
With the global novel coronavirus pandemic filling headlines, TV news and social media, it can seem as if we are drowning in information and data about the virus. With so much data being shared and pushed at us, it can be hard for the general public to know what is correct, what is useful and (unfortunately) what is dangerous.
Thankfully, scientists around the world working on COVID-19 are able to collaborate and share data and findings, and hopefully make a difference to the containment and treatment of COVID-19, and eventually to vaccines against it.
Open Science and Open Data
While cities are locked down and borders are closed in response to the coronavirus outbreak, science is becoming more open. This openness is already making a difference to scientists’ response to the virus and has the potential to change the world.
Open science can come in a variety of forms, including open data, open publications and open educational resources.
1. Open data
Genome sequencing is of great importance to developing specific diagnostic kits around the world. Yong-Zhen Zhang and his colleagues from Fudan University in Shanghai were the first to sequence the genome of the novel coronavirus, which is an RNA virus. They placed the genome sequence in GenBank, an open-access data repository. Researchers around the world immediately started analyzing it to develop diagnostics.
As of February 19, 2020, 81 different coronavirus gene sequences had been shared openly via GenBank and 189 via the China National Genomics Data Centre. They provide the data that will allow scientists to decode the mystery of the virus and hopefully find a treatment or vaccine.
The WHO and national organizations like the Chinese Center for Disease Control and Prevention also publish open statistical data, such as the number of patients. This can help researchers to map the spread of the virus and offer the public up-to-date and transparent information.
2. Open publications
Science publications are costly. One of the most expensive Elsevier journals, Tetrahedron Letters, costs £16,382 for an institutional annual subscription and £673 for a personal one. Even Harvard University cannot afford to subscribe to all journals. This means not all researchers have access to all subscription-based publications.
Authors can make their articles free to access, which often means they need to pay the publishers an average of £2,000 in article processing charges. In 2018, only 36.2 percent of science publications were open-access.
As of February 18, 2020, there were 500 scientific articles about the novel coronavirus in the comprehensive scholarly database Dimensions. Only 160 (32 percent) of them were in open-access publications. This includes preprint servers such as bioRxiv and arXiv, which are widely used open-access archives to publish research before it goes through scientific peer review.
Normally, you would need to pay subscription fees to read any of the other 340 articles. However, articles published by the 100 companies that have signed the Wellcome Trust’s statement on sharing coronavirus research have been made freely accessible.
Major publishers including Elsevier, Springer Nature, Wiley Online Library, Emerald, Oxford University Press and Wanfang have also set up featured open-access resource pages. The Chinese database CQVIP has offered free access to all of its 14,000 journals during the coronavirus outbreak.
Free access to articles on the coronavirus can also accelerate global research on this subject.
3. Open educational resources
Due to the outbreak, universities in China have postponed their new semesters and switched to online learning. Alongside the 24,000 online courses open to students, universities (including the elite Peking University, Tsinghua University and Xi’an Jiaotong University) are offering free online courses to the public about the coronavirus. Such courses can offer the public reliable information grounded in academic research, helping them better understand the virus and protect themselves against it.
Sharing Research Data
The Wellcome Trust calls on researchers, journals and funders to ensure that research findings and data relevant to this outbreak are shared rapidly and openly to inform the public health response and help save lives.
"We affirm the commitment to the principles set out in the 2016 Statement on data sharing in public health emergencies, and will seek to ensure that the World Health Organization (WHO) has rapid access to emerging findings that could aid the global response.
"Specifically, we commit to work together to help ensure:
- all peer-reviewed research publications relevant to the outbreak are made immediately open access, or freely available at least for the duration of the outbreak
- research findings relevant to the outbreak are shared immediately with the WHO upon journal submission, by the journal and with author knowledge
- research findings are made available via preprint servers before journal publication, or via platforms that make papers openly accessible before peer review, with clear statements regarding the availability of underlying data
- researchers share interim and final research data relating to the outbreak, together with protocols and standards used to collect the data, as rapidly and widely as possible - including with public health and research communities and the WHO
- authors are clear that data or preprints shared ahead of submission will not pre-empt its publication in these journals
"We intend to apply the principles of this statement to similar outbreaks in the future where there is a significant public health benefit to ensuring data is shared widely and rapidly.
"We urge others to make the same commitments. If your organisation is committed to supporting these principles, please contact us (firstname.lastname@example.org) and we will add your organisation to the list of signatories."
A completely new culture of doing science
On 22 January, Dave O’Connor and Tom Friedrich invited several dozen colleagues around the United States to join a new workspace on the instant messaging platform Slack. The scientists, both at the Wisconsin National Primate Research Center, had seen news about a new disease emerging in China and realized researchers would need a primate model if they were going to answer some important questions about its biology. “We put out a call to a bunch of investigators and basically said: ‘Hey, let’s talk,’” O’Connor says. The idea is to coordinate research and make sure results are comparable, Friedrich adds. (They named the Slack workspace the Wu-han Clan, a play on the hip-hop group Wu-Tang Clan.)
The Wu-han Clan is just one example of how the COVID-19 outbreak is transforming how scientists communicate about fast-moving health crises. A torrent of data is being released daily by preprint servers that didn’t even exist a decade ago, then dissected on platforms such as Slack and Twitter, and in the media, before formal peer review begins. Journal staffers are working overtime to get manuscripts reviewed, edited, and published at record speeds. The venerable New England Journal of Medicine (NEJM) posted one COVID-19 paper within 48 hours of submission. Viral genomes posted on a platform named GISAID, more than 200 so far, are analyzed instantaneously by a phalanx of evolutionary biologists who share their phylogenetic trees in preprints and on social media.
“This is a very different experience from any outbreak that I’ve been a part of,” says epidemiologist Marc Lipsitch of the Harvard T.H. Chan School of Public Health. The intense communication has catalyzed an unusual level of collaboration among scientists that, combined with scientific advances, has enabled research to move faster than during any previous outbreak. “An unprecedented amount of knowledge has been generated in 6 weeks,” says Jeremy Farrar, head of the Wellcome Trust.
Unraveling the Chinese coronavirus
Just 10 days after a pneumonia-like illness was first reported among people who visited a seafood market in Wuhan, China, scientists released the genetic sequence of the coronavirus that sickened them. That precious bit of data, freely available to any researcher who wanted to study it, unleashed a massive collaborative effort to understand the mysterious new pathogen that has been rapidly spreading in China and beyond.
At unprecedented speed, scientists are starting experiments, sharing data and revealing the secrets of the pathogen — a race that is made possible by new scientific tools and cultural norms in the face of a public health emergency.
“The pace is unmatched,” said Karla Satchell, a professor of microbiology-immunology at Northwestern University Feinberg School of Medicine. “This is really new. Lots of people [in science] still try to hide what they’re doing, don’t want to talk about what they’re doing, and everybody out there is like: This is the case where we don’t worry about egos, we don’t worry about who’s first, we just care about solving the problem. The information flow has been really fast.”
From "Scientists are unraveling the Chinese coronavirus with unprecedented speed and openness" - The Washington Post
Responsible open science
While all these developments are positive, it is important to remember that open science doesn’t mean science without limits. It must be used responsibly by researchers and the public.
To start, researchers need to have mutual respect for the integrity of each other’s work. For example, there have reportedly already been disagreements over whether scientists need to request consent to reuse pre-publication data from shared coronavirus gene sequencing.
Even assuming researchers act in good faith and not simply to further their own careers, it is still important for them to clarify the conditions under which they make their research available, and to carefully check and follow such conditions when using other people’s data. Responsible use of pre-publication data is vital to fostering “a scientific culture that encourages transparent and explicit cooperation”.