Other Open Access

Scalable Knowledge-Graph Analytics at 136 Petaflops/s – Data Readme

Kannan, Ramakrishnan; Sao, Piyush; Lu, Hao; Herrmannova, Drahomira; Patton, Robert; Potok, Thomas; Thakkar, Vijay; Vuduc, Richard


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Kannan, Ramakrishnan</dc:creator>
  <dc:creator>Sao, Piyush</dc:creator>
  <dc:creator>Lu, Hao</dc:creator>
  <dc:creator>Herrmannova, Drahomira</dc:creator>
  <dc:creator>Patton, Robert</dc:creator>
  <dc:creator>Potok, Thomas</dc:creator>
  <dc:creator>Thakkar, Vijay</dc:creator>
  <dc:creator>Vuduc, Richard</dc:creator>
  <dc:date>2020-08-07</dc:date>
  <dc:description>This dataset contains the data presented and analyzed in our paper "Scalable Knowledge-Graph Analytics at 136 Petaflop/s" [1]. It is based on the COVID-19 Open Research Dataset (CORD-19) [2], a collection of scientific publications on SARS-CoV-2, COVID-19, and other coronaviruses. This readme describes how we used the CORD-19 data to create the knowledge graph that served as input to our Distributed Semiring All-Pairs Shortest Path (DSNAPSHOT) algorithm [1], and it describes the output of the algorithm. Both the input graph and the output all-pairs shortest-path information are shared in our dataset (https://dx.doi.org/10.13139/OLCF/1646608). The dataset is 155 GB to download and 518 GB unpacked.

[1] Ramakrishnan Kannan, Piyush Sao, Hao Lu, Drahomira Herrmannova, Robert Patton, Thomas Potok, Vijay Thakkar, and Richard Vuduc. Scalable Knowledge-Graph Analytics at 136 Petaflop/s. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (IEEE Supercomputing), 2020.

[2] Lu Wang L, Lo K, Chandrasekhar Y, Reas R, Yang J, Eide D, Funk K, Kinney R, Liu Z, Merrill W, Mooney P, Murdick D, Rishi D, Sheehan J, Shen Z, Stilson B, Wade AD, Wang K, Wilhelm C, Xie B, Raymond D, Weld DS, Etzioni O, Kohlmeier S. CORD-19: The Covid-19 Open Research Dataset. ArXiv [Preprint]. 2020 Apr 22:arXiv:2004.10706v2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7251955/</dc:description>
  <dc:identifier>https://zenodo.org/record/3980252</dc:identifier>
  <dc:identifier>10.13139/OLCF/1646608</dc:identifier>
  <dc:identifier>oai:zenodo.org:3980252</dc:identifier>
  <dc:relation>info:eu-repo/semantics/altIdentifier/url/https://doi.ccs.ornl.gov/ui/doi/94</dc:relation>
  <dc:relation>url:https://arxiv.org/abs/2004.10706</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/covid-19</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/zenodo</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:title>Scalable Knowledge-Graph Analytics at 136 Petaflops/s – Data Readme</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>publication-other</dc:type>
</oai_dc:dc>
