Search the COVIDminer database for genes / proteins, chemicals and biological processes using a (partial) name or gene symbol.
The COVIDminer project provides access to a database of interactions between genes / proteins, chemicals and biological processes related to SARS-CoV-2, the virus that causes COVID-19.
The interactions have been automatically extracted using text mining. The information shown on this site is thus not curated and is intended to be used as an aid to manual literature curation efforts.
The interactions have been extracted from several sources: the CORD-19 dataset; full-text manuscripts from the bioRxiv collection "COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv"; abstracts from a broad COVID-related PubMed search; full-text manuscripts of those PubMed articles that are available via the PubMed Central Open Access Subset; and a small collection of hand-selected manuscripts curated as part of the COVID-19 Disease Map project.
Natural language processing (text mining) is performed using the REACH (https://github.com/clulab/reach) reader together with the INDRA (http://www.indra.bio/) toolbox.
This project is run by Rupert Overall (https://rupertoverall.net/, https://twitter.com/rupertoverall).
Database last updated: 2024-11-11
To learn about the functionality of COVIDminer, have a look at the tutorial page.
To report bugs or request features, please use the issue tracker at the GitHub repository.
The COVIDminer tool is designed to allow rapid assessment of a large body of literature; it does not aim to provide detailed and accurate interaction information. To better serve this purpose, closely related entities are collapsed as far as possible. Specifically, all mentions of genes or gene products (RNA, proteins) are mapped to the corresponding gene identifier. In addition, genes and proteins from other organisms (or where the organism is unclear) are mapped to the homologous human gene. This means that potential interactions discovered in closely related species will be visible when searching for interaction partners of human protein identifiers. The resulting links lead to the underlying literature, and it is up to the curator to decide if and how that information fits into their curation scheme.
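The collapsing described above can be illustrated with a minimal Python sketch. It is not the COVIDminer implementation: the lookup tables, the entity names in them and the function name collapse_to_human_gene are all hypothetical, chosen only to show the two steps (gene product to gene, then gene to human homolog).

```python
# Hypothetical lookup tables for illustration only; the real mapping
# resources used by COVIDminer are not described in this text.

# Step 1: a mention of a gene product (RNA or protein) -> (organism, gene).
PRODUCT_TO_GENE = {
    ("protein", "ACE2_HUMAN"): ("human", "ACE2"),
    ("rna", "ACE2 mRNA"): ("human", "ACE2"),
    ("protein", "Ace2_MOUSE"): ("mouse", "Ace2"),
}

# Step 2: a non-human gene -> its homologous human gene.
HUMAN_HOMOLOG = {
    ("mouse", "Ace2"): "ACE2",
}


def collapse_to_human_gene(kind, name):
    """Return the human gene symbol for a mention, or None if unmapped."""
    gene = PRODUCT_TO_GENE.get((kind, name))
    if gene is None:
        return None  # no mapping known for this mention
    organism, symbol = gene
    if organism == "human":
        return symbol
    # Mentions from other organisms are mapped to the human homolog.
    return HUMAN_HOMOLOG.get((organism, symbol))
```

Under these assumptions, a human protein, a human transcript and a mouse protein all end up under the same human gene identifier, which is why a search for a human gene can surface interactions reported in a closely related species.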
Likewise, all viral genes/proteins will be mapped to SARS-CoV-2 identifiers. There is currently no uniform mapping scheme available for SARS-CoV-2, but we are actively working on improving this and updated interaction data will be continuously added.
Finally, there are many concepts (particularly biological processes) for which there are no good mappings. Where the natural language processing software has detected an entity that cannot be mapped, it will be presented in the network as plain text. Although there is no detailed metadata for such entities, they can often be informative enough to warrant a closer look at the source manuscript. Often, many separate entities with similar names will be present where they should really be collapsed into one. We are working to improve the mapping resources to remedy this problem.
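As an illustration of the similar-names problem, the sketch below shows one simple way a curator might group unmapped plain-text entities when reviewing results. It is an assumption made for illustration, not how COVIDminer treats such entities; the function names normalize and group_similar and the example strings are hypothetical.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lower-case a name and reduce punctuation/whitespace runs to one space."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return cleaned.strip()


def group_similar(names):
    """Group plain-text entity names that share the same normalized form."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return dict(groups)


# Hypothetical unmapped entities as they might appear in a network.
entities = ["Cytokine storm", "cytokine-storm", "cytokine  storm ", "viral entry"]
grouped = group_similar(entities)
```

Here the three spellings of "cytokine storm" fall into one group while "viral entry" stays separate. Simple normalization of this kind only catches differences in case, punctuation and spacing; true synonyms still require a proper mapping resource.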