1243969
doi
10.5281/zenodo.1243969
oai:zenodo.org:1243969
Altman, Russ B.
Stanford University
A global network of biomedical relationships derived from text
Percha, Bethany
Icahn School of Medicine at Mount Sinai
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
natural language processing
Medline
text mining
relation extraction
unsupervised learning
<p>This repository contains labeled, weighted networks of chemical-gene, gene-gene, gene-disease, and chemical-disease relationships based on single sentences in PubMed abstracts. All raw dependency paths are provided in addition to the labeled relationships.</p>
<p>PART I: Connects dependency paths to labels, or "themes". Each record contains a dependency path, followed by its score for each theme and, for each theme, an indicator of whether the path belongs to that theme's flagship path set (meaning that it was manually reviewed and determined to reflect that theme). The themes are listed below and described in our paper (see References).</p>
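<p>As a rough illustration of reading a Part I file, the sketch below parses records into per-theme scores and flagship indicators. The column layout is an assumption, not something this description specifies: tab-separated text, a header row, the path in the first column, and then alternating score and indicator columns (named here "T", "T.ind", and so on). The sample path and numbers are made up. For the real files, open them with <code>gzip.open(filename, "rt", encoding="utf-8")</code> and check the header before relying on this.</p>

```python
import csv
import io


def read_path_themes(handle):
    """Yield (path, scores, flagship) for each Part I record.

    ASSUMED layout (not confirmed by the dataset description): tab-separated,
    header row first, path in column 0, then alternating columns holding a
    theme score and that theme's flagship indicator.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    themes = header[1::2]  # score columns sit at positions 1, 3, 5, ...
    for row in reader:
        scores = {t: float(v) for t, v in zip(themes, row[1::2])}
        flagship = {t: float(v) != 0.0 for t, v in zip(themes, row[2::2])}
        yield row[0], scores, flagship


# Made-up sample with two chemical-disease themes (T and C).
sample = io.StringIO(
    "path\tT\tT.ind\tC\tC.ind\n"
    "treats|nsubj|START_ENTITY treats|dobj|END_ENTITY\t12.0\t1\t0.5\t0\n"
)
records = list(read_path_themes(sample))
path, scores, flagship = records[0]

# The theme with the highest score for this path.
best_theme = max(scores, key=scores.get)
```

<p>Taking the highest-scoring theme is only one way to label a path. The flagship indicators mark the paths that were manually reviewed.</p>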
<p>PART II: Connects sentences to dependency paths. It consists of sentences and associated metadata, entity pairs found in the sentences, and dependency paths connecting those entity pairs. Each record contains the following information:</p>
<ul>
<li>PubMed ID</li>
<li>Sentence number (0 = title)</li>
<li>First entity name, formatted</li>
<li>First entity name, location (characters from start of abstract)</li>
<li>Second entity name, formatted</li>
<li>Second entity name, location</li>
<li>First entity name, raw string</li>
<li>Second entity name, raw string</li>
<li>First entity name, database ID(s)</li>
<li>Second entity name, database ID(s)</li>
<li>First entity type (Chemical, Gene, Disease)</li>
<li>Second entity type (Chemical, Gene, Disease)</li>
<li>Dependency path</li>
<li>Sentence, tokenized</li>
</ul>
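<p>The fields above can be mapped onto a small record type. This is a minimal sketch that assumes one record per line, tab-separated fields in the order listed, and no header row. None of this is confirmed by the description, and the field names are ours. Locations are kept as strings because their exact format is not described here. The sample line is invented and uses placeholder IDs.</p>

```python
import gzip
from typing import Iterator, NamedTuple


class PathRecord(NamedTuple):
    """One Part II record; field order follows the list in the description."""
    pmid: str
    sentence_number: int  # 0 means the title
    entity1_name: str
    entity1_location: str
    entity2_name: str
    entity2_location: str
    entity1_raw: str
    entity2_raw: str
    entity1_ids: str
    entity2_ids: str
    entity1_type: str  # Chemical, Gene, or Disease
    entity2_type: str
    dependency_path: str
    sentence: str


def parse_record(line: str) -> PathRecord:
    """Parse one line, ASSUMING tab-separated fields in the listed order."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(PathRecord._fields):
        raise ValueError(f"expected 14 fields, got {len(fields)}")
    return PathRecord(fields[0], int(fields[1]), *fields[2:])


def read_records(filename: str) -> Iterator[PathRecord]:
    """Stream records from a gzipped Part II file without loading it whole."""
    with gzip.open(filename, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield parse_record(line)


# Invented example line; IDs and offsets are placeholders, not real data.
example = "\t".join([
    "12345", "0", "aspirin", "0,7", "headache", "15,23",
    "Aspirin", "headache", "ID1", "ID2", "Chemical", "Disease",
    "treats|nsubj|START_ENTITY treats|dobj|END_ENTITY",
    "Aspirin treats headache .",
])
rec = parse_record(example)
is_title = rec.sentence_number == 0
```

<p>Streaming line by line matters here because the uncompressed Part II files are large.</p>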
<p>The "with-themes.txt" files only contain dependency paths with corresponding theme assignments from Part I. The plain ".txt" files contain all dependency paths.</p>
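<p>The relationship between the two kinds of Part II files can be pictured as a filter: keep only the records whose dependency path appears in Part I. The sketch below is an illustration of that idea, not the procedure used to build the files. It assumes the dependency path is the 13th tab-separated field and compares paths case-insensitively, which is our assumption. All data shown is made up.</p>

```python
PATH_FIELD = 12  # ASSUMED zero-based position of the dependency path


def keep_themed(records, themed_paths):
    """Return the records whose dependency path has a Part I theme entry.

    Paths are compared case-insensitively; this is an assumption about how
    the two parts line up, so verify it against the real files.
    """
    wanted = {p.lower() for p in themed_paths}
    return [r for r in records if r[PATH_FIELD].lower() in wanted]


# Made-up Part I paths and Part II records (already split into fields).
themed_paths = ["treats|nsubj|start_entity treats|dobj|end_entity"]
records = [
    ["0"] * 12 + ["treats|nsubj|START_ENTITY treats|dobj|END_ENTITY", "sentence a"],
    ["0"] * 12 + ["causes|nsubj|START_ENTITY causes|dobj|END_ENTITY", "sentence b"],
]
kept = keep_themed(records, themed_paths)
```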
<p>This release contains the annotated network for the <strong>April 22, 2018 version of PubTator</strong>. The version discussed in our paper (see References below) is older, from April 30, 2016; that network can be found in Version 1 of this repository. We will release updated networks periodically, as the PubTator community releases new versions of its named entity annotations for Medline roughly once a month.</p>
<p>------------------------------------------------------------------------------------<br>
REFERENCES</p>
<p>Percha B, Altman RB (2017) A global network of biomedical relationships derived from text. (In press at <em>Bioinformatics</em>.)<br>
Percha B, Altman RB (2015) Learning the structure of biomedical relationships from unstructured text. <em>PLoS Computational Biology</em>, 11(7): e1004216.</p>
<p>This project depends on named entity annotations from the PubTator project:<br>
https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator/</p>
<p>Reference:<br>
Wei CH et al., PubTator: a Web-based text mining tool for assisting biocuration, Nucleic Acids Research, 2013, 41 (W1): W518-W522. doi: 10.1093/nar/gkt44</p>
<p>Dependency parsing was provided by the Stanford CoreNLP toolkit:<br>
https://stanfordnlp.github.io/CoreNLP/index.html</p>
<p>Reference:<br>
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.</p>
<p>------------------------------------------------------------------------------------<br>
THEMES</p>
<p><strong>chemical-gene</strong><br>
(A+) agonism, activation<br>
(A-) antagonism, blocking<br>
(B) binding, ligand (esp. receptors)<br>
(E+) increases expression/production<br>
(E-) decreases expression/production<br>
(E) affects expression/production (neutral)<br>
(N) inhibits</p>
<p><strong>gene-chemical</strong><br>
(O) transport, channels<br>
(K) metabolism, pharmacokinetics<br>
(Z) enzyme activity</p>
<p><strong>chemical-disease</strong><br>
(T) treatment/therapy (including investigatory)<br>
(C) inhibits cell growth (esp. cancers)<br>
(Sa) side effect/adverse event<br>
(Pr) prevents, suppresses<br>
(Pa) alleviates, reduces<br>
(J) role in disease pathogenesis</p>
<p><strong>disease-chemical</strong><br>
(Mp) biomarkers (of disease progression)</p>
<p><strong>gene-disease</strong><br>
(U) causal mutations<br>
(Ud) mutations affecting disease course<br>
(D) drug targets<br>
(J) role in pathogenesis<br>
(Te) possible therapeutic effect<br>
(Y) polymorphisms alter risk<br>
(G) promotes progression</p>
<p><strong>disease-gene</strong><br>
(Md) biomarkers (diagnostic)<br>
(X) overexpression in disease<br>
(L) improper regulation linked to disease</p>
<p><strong>gene-gene</strong><br>
(B) binding, ligand (esp. receptors)<br>
(W) enhances response<br>
(V+) activates, stimulates<br>
(E+) increases expression/production<br>
(E) affects expression/production (neutral)<br>
(I) signaling pathway<br>
(H) same protein or complex<br>
(Rg) regulation<br>
(Q) production by cell population</p>
Zenodo
2018-05-11
info:eu-repo/semantics/other
1035252
1579893914.980771
1427076854
md5:b947d498a7c54b051aab2e75b5cb173f
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-chemical-disease-sorted.txt.gz
395887502
md5:02ebd618f619fca5cadcaa527ad1549c
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-chemical-disease-sorted-with-themes.txt.gz
63732849
md5:66949134a3fde6752e8e36d21bc8c2b7
https://zenodo.org/records/1243969/files/part-i-gene-disease-path-theme-distributions.txt.gz
70016566
md5:a36e2e9516ce1d3ebcfa921b5c10467a
https://zenodo.org/records/1243969/files/part-i-chemical-disease-path-theme-distributions.txt.gz
24784283
md5:4f02054eec8b92fafada5cb7098ac7dc
https://zenodo.org/records/1243969/files/part-i-chemical-gene-path-theme-distributions.txt.gz
847352230
md5:51437d31c4f139e4778c15b452c08bbd
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-chemical-gene-sorted.txt.gz
149936059
md5:b4a43f02f9ecb6ea1737b025e9e445c8
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-chemical-gene-sorted-with-themes.txt.gz
1080242193
md5:d210bb11a0dc39ab8d7500a457a147a7
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-gene-disease-sorted.txt.gz
382071196
md5:614f7becd38ad4c4543aec5b2e8781b6
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-gene-gene-sorted-with-themes.txt.gz
2489250707
md5:bdfd25cc6a272483d0c490f138bebeae
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-gene-gene-sorted.txt.gz
312221666
md5:4c949f854eb4a60c641a6e147dc81b8d
https://zenodo.org/records/1243969/files/part-ii-dependency-paths-gene-disease-sorted-with-themes.txt.gz
53028496
md5:9be6e7eac7ea044ebaca06a6fa83c34c
https://zenodo.org/records/1243969/files/part-i-gene-gene-path-theme-distributions.txt.gz
public
10.5281/zenodo.1035252
isVersionOf
doi