Illinois Data Bank
Illinois Data Bank - Dataset

Version: 1
DOI: 10.13012/B2IDB-7362697_V1
Comment: (none)
Publication Date: 2024-07-29

7.54 GB File
613 MB File

Contact the Research Data Service for help interpreting this log.

Dataset update: "description" changed 2025-11-13T15:15:53Z

Previous description:

This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15.

The dataset comprises two compressed (.xz) files.

1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors)
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).

2) filename: citation_table.tsv.xz
This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.

Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541

The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/

New description:

This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15.

The dataset comprises two compressed (.xz) files.

1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors)
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).

2) filename: citation_table.tsv.xz
This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.

Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541 [corrected to 2,148,788,148 edges Nov 13, 2025]
• Count of unique nodes in edgelist 111,453,719 [updated Nov 13, 2025]
Note: Nov 13, 2025. An improved curation process will be applied to a future version of this dataset

Note: Nov 13, 2025.

The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/
RelatedMaterial create: {"material_type"=>"Preprint", "availability"=>nil, "link"=>"https://doi.org/10.48550/arXiv.2509.02590", "uri"=>"10.48550/arXiv.2509.02590", "uri_type"=>"DOI", "citation"=>"Dindoost, M., Rodriguez, O.A., Bryg, B., Park, M., Chacko, G., Warnow, T., & Bader, D.A. (2025). On the Optimization of Methods for Establishing Well-Connected Communities. https://doi.org/10.48550/arXiv.2509.02590", "dataset_id"=>2744, "selected_type"=>"Other", "datacite_list"=>"IsSupplementTo", "note"=>"", "feature"=>nil} 2025-09-08T19:13:16Z
Dataset update: {"publisher"=>["University of Illinois at Urbana-Champaign", "University of Illinois Urbana-Champaign"], "subject"=>["", "Technology and Engineering"], "external_files_link"=>[nil, ""], "external_files_note"=>[nil, ""]} 2025-07-16T20:53:15Z
RelatedMaterial destroy: {"material_type"=>"Code", "availability"=>nil, "link"=>"https://github.com/illinois-or-research-analytics/lorran_openalex/", "uri"=>"", "uri_type"=>"URL", "citation"=>"OR_Research_Analytics OpenAlex GitHub Repository ", "dataset_id"=>2744, "selected_type"=>"Code", "datacite_list"=>"IsSupplementTo", "note"=>"", "feature"=>nil} 2025-01-08T23:48:39Z
RelatedMaterial destroy: {"material_type"=>"Dataset", "availability"=>nil, "link"=>"https://openalex.org/", "uri"=>"", "uri_type"=>"URL", "citation"=>"Priem, J.; Piwowar, H.; Orr, R. OpenAlex: A Fully-Open Index of Scholarly Works, Authors, Venues, Institutions, and Concepts. arXiv June 16, 2022.", "dataset_id"=>2744, "selected_type"=>"Dataset", "datacite_list"=>"IsSupplementedBy", "note"=>"", "feature"=>nil} 2025-01-08T23:48:39Z
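The dataset description in the log above says that citation_table.tsv.xz is a headerless, two-column edgelist of citing and cited integer_id values. A minimal sketch of how such a file might be streamed and summarized with the Python standard library follows. It assumes the columns are tab-delimited (suggested by the .tsv extension but not stated in the log). The function names `iter_edges` and `summarize` are hypothetical and are not taken from the authors' repository.

```python
import lzma
from typing import Iterator, Tuple


def iter_edges(path: str) -> Iterator[Tuple[int, int]]:
    """Yield (citing, cited) integer_id pairs from an xz-compressed edgelist."""
    # lzma.open in text mode decompresses on the fly, so the file does not
    # have to be expanded on disk first.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. at the end of the file
            # Assumption: exactly two tab-separated integer columns, no header.
            citing, cited = line.split("\t")
            yield int(citing), int(cited)


def summarize(path: str) -> Tuple[int, int]:
    """Return (edge_count, unique_node_count) for the edgelist at `path`."""
    edge_count = 0
    nodes = set()
    for citing, cited in iter_edges(path):
        edge_count += 1
        nodes.add(citing)
        nodes.add(cited)
    return edge_count, len(nodes)
```

Calling `summarize("citation_table.tsv.xz")` on a local copy of the file would give an edge count and a unique-node count that can be compared with the Summary Features figures. With roughly two billion edges and more than a hundred million distinct nodes, holding every id in an in-memory Python set is likely to need many gigabytes of RAM, so a disk-backed or columnar approach may suit the full file better.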