Illinois Data Bank
Displaying 201 - 225 of 851 in total
Subject Area
Life Sciences (484)
Social Sciences (148)
Physical Sciences (135)
Technology and Engineering (79)
Uncategorized
Arts and Humanities (2)
Funder
Other (257)
U.S. National Science Foundation (NSF) (230)
U.S. Department of Energy (DOE) (122)
U.S. National Institutes of Health (NIH) (79)
U.S. Department of Agriculture (USDA) (56)
Illinois Department of Natural Resources (IDNR) (25)
U.S. Geological Survey (USGS) (8)
U.S. National Aeronautics and Space Administration (NASA) (6)
Illinois Department of Transportation (IDOT) (4)
U.S. Army (3)
Publication Year
2025 (155)
2021 (108)
2024 (107)
2022 (106)
2020 (96)
2023 (75)
2019 (72)
2018 (61)
2017 (36)
2016 (30)
2009 (1)
2011 (1)
2012 (1)
2014 (1)
2015 (1)
License
CC0 (449)
CC BY (378)
custom (24)
Illinois Data Bank Dataset Search Results
published: 2023-07-01
Tonks, Adam; Hwang, Jeongwoo (2023): Data for the paper "Assessment of spatiotemporal flood risk due to compound precipitation extremes across the contiguous United States". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6626437_V1
This is the data used in the paper "Assessment of spatiotemporal flood risk due to compound precipitation extremes across the contiguous United States". Code from the GitHub repository https://github.com/adtonks/precip_extremes can be used with the data here to reproduce the paper's results. v1.0.0 of the code is also archived at https://doi.org/10.5281/zenodo.8104252. This dataset is derived from NOAA-CIRES-DOE 20th Century Reanalysis V3. The NOAA-CIRES-DOE Twentieth Century Reanalysis Project version 3 used resources of the National Energy Research Scientific Computing Center managed by Lawrence Berkeley National Laboratory which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 and used resources of NOAA's Remotely Deployed High Performance Computing Systems.
keywords:
spatiotemporal; CONUS; United States; precipitation; extremes; flooding
published: 2023-07-05
Njuguna, Joyce; Clark, Lindsay; Lipka, Alexander; Anzoua, Kossonou; Bagmet, Larisa; Chebukin, Pavel; Dwiyanti, Maria; Dzyubenko, Elena; Dzyubenko, Nicolay; Ghimire, Bimal; Jin, Xiaoli; Johnson, Douglas; Kjeldsen, Jens; Nagano, Hironori; Oliveira, Ivone; Peng, Junhua; Petersen, Karen; Sabitov, Andrey; Seong, Eun; Yamada, Toshihiko; Yoo, Ji; Yu, Chang; Zhao, Hu; Munoz, Patricio; Long, Stephen; Sacks, Erik (2023): Impact of genotype-calling methodologies on genome-wide association and genomic prediction in polyploids. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4829913_V2
This dataset contains all data used in the paper "Impact of genotype-calling methodologies on genome-wide association and genomic prediction in polyploids". The dataset includes genotypes and phenotypic data from two autotetraploid species, Miscanthus sacchariflorus and Vaccinium corymbosum, that were used for genome-wide association studies and genomic prediction, and the scripts used in the analysis. In this V2, two files with the raw data are added: "Miscanthus_sacchariflorus_RADSeq.vcf" is the VCF file with the raw SNP calls of the Miscanthus sacchariflorus data used for genotype calling using the 6 genotype calling methods. "Blueberry_data_read_depths.RData" is an RData file with the read depth data that was used for genotype calling in the Blueberry dataset.
keywords:
Polyploid; allelic dosage; Bayesian genotype-calling; Genome-wide association; Genomic prediction
published: 2023-07-11
Parulian, Nikolaus (2023): Data for A Conceptual Model for Transparent, Reusable, and Collaborative Data Cleaning. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6827044_V1
The dissertation_demo.zip contains the base code and demonstrations for the dissertation: A Conceptual Model for Transparent, Reusable, and Collaborative Data Cleaning. Each chapter has a demo folder for demonstrating provenance queries or tools. The Airbnb dataset used for demonstration and simulation is not included in this demo but can be accessed directly from the reference website. Any updates to the demonstrations and examples can be found online at: https://github.com/nikolausn/dissertation_demo
published: 2023-10-22
Davidson, Ruth; Vachaspati, Pranjal; Mirarab, Siavash; Warnow, Tandy (2023): Data from: Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6670066_V1
HGT+ILS datasets from Davidson, R., Vachaspati, P., Mirarab, S., & Warnow, T. (2015). Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC genomics, 16(10), 1-12. Contains model species trees, true and estimated gene trees, and simulated alignments.
keywords:
evolution; computational biology; bioinformatics; phylogenetics
published: 2024-11-19
Salami, Malik Oyewale; McCumber, Corinne (2024): Dataset for Reassessment of the agreement in retraction indexing across 4 multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-8457537_V1
This project investigates retraction indexing agreement among data sources: Crossref, Retraction Watch, Scopus, and Web of Science. As of July 2024, this reassesses the April 2023 union list of Schneider et al. (2023): https://doi.org/10.55835/6441e5cae04dbe5586d06a5f. As of April 2023, over 1 in 5 DOIs had discrepancies in retraction indexing among the 49,924 DOIs indexed as retracted in at least one of Crossref, Retraction Watch, Scopus, and Web of Science (Schneider et al., 2023). Here, we determine what changed in 15 months. Pipeline code to get the results files can be found in the GitHub repository https://github.com/infoqualitylab/retraction-indexing-agreement in the iPython notebook 'MET-STI2024_Reassessment_of_retraction_indexing_agreement.ipynb' Some files have been redacted to remove proprietary data, as noted in README.txt. Among our sources, data is openly available only for Crossref and Retraction Watch. FILE FORMATS: 1) unionlist_completed_2023-09-03-crws-ressess.csv - UTF-8 CSV file 2) unionlist_completed-ria_2024-07-09-crws-ressess.csv - UTF-8 CSV file 3) unionlist-15months-period_sankey.png - Portable Network Graphics (PNG) file 4) unionlist_ria_proportion_comparison.png - Portable Network Graphics (PNG) file 5) README.txt - text file FILE DESCRIPTION: Description of the files can be found in README.txt
keywords:
retraction status; data quality; indexing; retraction indexing; metadata; meta-science; RISRS
published: 2023-01-05
Tonks, Adam (2023): Data for the paper "Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3628170_V1
This is the data used in the paper "Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data". A preprint may be found at https://doi.org/10.48550/arXiv.2212.11367. Code from the GitHub repository https://github.com/adtonks/mosquito_GNN can be used with the data here to reproduce the paper's results. v1.0.0 of the code is also archived at https://doi.org/10.5281/zenodo.7897830
keywords:
west nile virus; machine learning; gnn; mosquito; trap; graph neural network; illinois; geospatial
published: 2023-04-12
Towns, John; Hart, David (2023): XSEDE: Allocations Awards and Usage for the NSF Cyberfrastructure Portfolio, 2004-2022. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3731847_V1
The XSEDE program manages the database of allocation awards for the portfolio of advanced research computing resources funded by the National Science Foundation (NSF). The database holds data for allocation awards dating to the start of the TeraGrid program in 2004 through the XSEDE operational period, which ended August 31, 2022. The project data include lead researcher and affiliation, title and abstract, field of science, and the start and end dates. Along with the project information, the data set includes resource allocation and usage data for each award associated with the project. The data show the transition of resources over a fifteen year span along with the evolution of researchers, fields of science, and institutional representation. Because the XSEDE program has ended, the allocation_award_history file includes all allocations activity initiated via XSEDE processes through August 31, 2022. The Resource Providers and successor program to XSEDE agreed to honor all project allocations made during XSEDE. Thus, allocation awards that extend beyond the end of XSEDE may not reflect all activity that may ultimately be part of the project award. Similarly, allocation usage data only reflects usage reported through August 31, 2022, and may not reflect all activity that may ultimately be conducted by projects that were active beyond XSEDE.
keywords:
allocations; cyberinfrastructure; XSEDE
published: 2023-05-30
Clem, C. Scott; Hart, Lily V.; McElrath, Thomas C. (2023): Primary Occurrence Data for "Clem, Hart, & McElrath. 2023. A century of Illinois hover flies (Diptera: Syrphidae): Museum and citizen science data reveal recent range expansions, contractions, and species of potential conservation significance". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1613645_V1
Primary occurrence data for Clem, Hart, & McElrath. 2023. A century of Illinois hover flies (Diptera: Syrphidae): Museum and citizen science data reveal recent range expansions, contractions, and species of potential conservation significance. Included are a license.txt file, the cleaned occurrences from each of the six merged datasets, and a cleaned, merged dataset containing all occurrence records in one spreadsheet, formatted according to Darwin Core standards, with a few extra fields such as GBIF identifiers that were included in some of the original downloads.
keywords:
csv; occurrences; syrphidae; hover flies; flies; biodiversity; darwin core; darwin-core; GBIF; citizen science; iNaturalist
published: 2024-02-08
Martinez, Carlos; Pena, Gisselle; Wells, Kaylee K. (2024): "Prairie Directory of North America" (2013) Entries for the Tallgrass, Mixed Grass, and Shortgrass Prairie Regions of the United States. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0421892_V1
This dataset contains transcribed entries from the "Prairie Directory of North America" (Adelman and Schwartz 2013) for the Tallgrass, Mixed Grass, and Shortgrass prairie regions of the United States. We identified the historical spatial extent of the Tallgrass, Mixed Grass, and Shortgrass prairie regions using Ricketts et al. (1999), Olson et al. (2001), and Dixon et al. (2014) and selected the counties entirely or partially within these boundaries from the USDA Forest Service (2022) file. The resulting lists of counties are included as separate files. The dataset contains information on publicly accessible grasslands and prairies in these regions including acreage and amenities like hunting access, restrooms, parking, and trails.
keywords:
grasslands; prairies; prairie directory of north america; site amenities; site attributes
published: 2018-05-21
Karigerasi, Manohar H.; Wagner, Lucas K.; Shoemaker, Daniel P. (2018): Geometric analysis of magnetic dimensionality. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3897093_V1
This dataset contains bonding networks and tolerance ranges for geometric magnetic dimensionality. The data can be searched in the html frontend above, code obtained at the GitHub repository, or the raw data can be downloaded as csv below. The csv data contains the results of 42520 compounds (unique icsd_code) from ICSD FindIt v3.5.0. The csv is semicolon-delimited since some fields contain multiple comma-separated values.
keywords:
materials science; physics; magnetism; crystallography
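The description above notes that the csv is semicolon-delimited because some fields hold several comma-separated values. A minimal sketch of reading such a file in Python follows. Only the column name icsd_code comes from the description; the `elements` column, the sample rows, and the function name are hypothetical stand-ins used for illustration.

```python
import csv
import io

# Hypothetical sample data: only the column name "icsd_code" appears in the
# dataset description; the "elements" column and all values are made up.
SAMPLE = "icsd_code;elements\n12345;Fe,O\n67890;Mn,Cr,S\n"


def read_semicolon_csv(text, multi_value_columns=()):
    """Parse semicolon-delimited text and split the named columns on commas."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for row in reader:
        for col in multi_value_columns:
            # Turn "Mn,Cr,S" into ["Mn", "Cr", "S"], dropping empty pieces.
            row[col] = [v.strip() for v in row[col].split(",") if v.strip()]
        rows.append(row)
    return rows


rows = read_semicolon_csv(SAMPLE, multi_value_columns=("elements",))
for row in rows:
    print(row["icsd_code"], row["elements"])
```

For the real file, the same reader could be pointed at an open file handle instead of `io.StringIO`, with the multi-value column names taken from the dataset's own documentation.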
published: 2018-07-25
Scannapieco, Frank; Hoang, Linh; Schneider, Jodi (2018): Expert assessment of RobotReviewer data extraction performance on 10 articles. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8274875_V1
The PDF describes the process and data used for the heuristic user evaluation described in the related article “Evaluating an automatic data extraction tool based on the theory of diffusion of innovation” by Linh Hoang, Frank Scannapieco, Linh Cao, Yingjun Guan, Yi-Yun Cheng, and Jodi Schneider (under submission). Frank Scannapieco assessed RobotReviewer data extraction performance on ten articles in 2018-02. Articles are included papers from an update review: Sabharwal A., G.-F.I., Stellrecht E., Scannapieco F.A. Periodontal therapy to prevent the initiation and/or progression of common complex systemic diseases and conditions. An update. Periodontol 2000. In Press. The form was created in consultation with Linh Hoang and Jodi Schneider. To do the assessment, Frank Scannapieco entered PDFs for these ten articles into RobotReviewer and then filled in ten evaluation forms, based on the ten RobotReviewer automatic data extraction reports. Linh Hoang analyzed these ten evaluation forms and synthesized Frank Scannapieco’s comments to arrive at the evaluation results for the heuristic user evaluation.
keywords:
RobotReviewer; systematic review automation; data extraction
published: 2018-09-06
XSEDE-Extreme Science and Engineering Discovery Environment (2018): XSEDE: Allocations Awards for the NSF Cyberinfrastructure Portfolio, 2004-2017. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4817808_V1
The XSEDE program manages the database of allocation awards for the portfolio of advanced research computing resources funded by the National Science Foundation (NSF). The database holds data for allocation awards dating to the start of the TeraGrid program in 2004 to present, with awards continuing through the end of the second XSEDE award in 2021. The project data include lead researcher and affiliation, title and abstract, field of science, and the start and end dates. Along with the project information, the data set includes resource allocation and usage data for each award associated with the project. The data show the transition of resources over a fifteen year span along with the evolution of researchers, fields of science, and institutional representation.
keywords:
allocations; cyberinfrastructure; XSEDE
published: 2018-11-21
Clark, Lindsay V.; Lipka, Alexander E.; Sacks, Erik J. (2018): Scripts for testing the error rate of polyRAD. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9729830_V2
This set of scripts accompanies the manuscript describing the R package polyRAD, which uses DNA sequence read depth to estimate allele dosage in diploids and polyploids. Using several high-confidence SNP datasets from various species, allelic read depth from a typical RAD-seq dataset was simulated, then genotypes were estimated with polyRAD and other software and compared to the true genotypes, yielding error estimates.
keywords:
R programming language; genotyping-by-sequencing (GBS); restriction site-associated DNA sequencing (RAD-seq); polyploidy; single nucleotide polymorphism (SNP); Bayesian genotype calling; simulation
published: 2023-06-10
Cheng, Xi; Kontou, Eleftheria (2023): Data for Estimating the Electric Vehicle Charging Demand of Multi-Unit Dwelling Residents in the United States. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4230392_V1
Data and code supporting the paper titled "Estimating the Electric Vehicle Charging Demand of Multi-Unit Dwelling Residents in the United States" by Xi Cheng and Eleftheria Kontou at the University of Illinois Urbana-Champaign. The data and the code enable analytics and assessment of multi-unit dwelling residents' travel patterns and their electric vehicle charging demand.
keywords:
multi-unit residents; electric vehicles; home charging; travel patterns; energy use
published: 2023-01-12
Mischo, William; Schlembach, Mary C.; Cabada, Elisandro (2023): Data for: Relationships between Journal Publication, Citation, and Usage Metrics within a Carnegie R1 University Collection: A Correlation Analysis. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6810203_V1
This dataset was developed as part of a study that examined the correlational relationships between local journal authorship, local and external citation counts, full-text downloads, link-resolver clicks, and four global journal impact factor indices within an all-disciplines journal collection of 12,200 titles and six subject subsets at the University of Illinois at Urbana-Champaign (UIUC) Library. While earlier investigations of the relationships between usage (downloads) and citation metrics have been inconclusive, this study shows strong correlations in the all-disciplines set and most subject subsets. The normalized Eigenfactor was the only global impact factor index that correlated highly with local journal metrics. Some of the identified disciplinary variances among the six subject subsets may be explained by the journal publication aspirations of UIUC researchers. The correlations between authorship and local citations in the six specific subject subsets closely match national department or program rankings. The dataset includes all the raw data used in this analysis, in the form of relational database tables with multiple columns, which can be opened using MS Access. Descriptions of variables can be viewed through "Design View" (right-click the selected table and choose "Design View"). The 2 PDF files provide an overview of the tables included in each MDB file. In addition, the processing scripts and Pearson correlation code are available at https://doi.org/10.13012/B2IDB-0931140_V1.
keywords:
Usage and local citation relationships; publication; citation and usage metrics; publication; citation and usage correlation analysis; Pearson correlation analysis
published: 2022-09-29
Levine, Nathaniel (2022): 3DIFICE: A Synthetic Dataset for Training Computer Vision Algorithms to Recognize Earthquake Damage to Reinforced Concrete Structures. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6415287_V1
3DIFICE: 3-dimensional Damage Imposed on Frame structures for Investigating Computer vision-based Evaluation methods. This dataset contains 1,396 synthetic images and label maps with various types of earthquake damage imposed on reinforced concrete frame structures. Damage includes: cracking, spalling, exposed transverse rebar, and exposed longitudinal rebar. Each image has an associated label map that can be used for training machine learning algorithms to recognize the various types of damage.
keywords:
computer vision; earthquake engineering; structural health monitoring; civil engineering; structural engineering
published: 2023-06-01
Storms, Suzanna (2023): RT-LAMP as diagnostic tool for Influenza-A Virus detection in swine. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2079467_V1
Results of RT-LAMP reactions for influenza A virus diagnostic development.
keywords:
swine influenza; LAMP; gBlock
published: 2022-07-25
Jett, Jacob (2022): SBKS - Species Noisy Entity Mentions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7146216_V1
This dataset is derived from the raw dataset (https://doi.org/10.13012/B2IDB-4950847_V1) and collects entity mentions that were manually determined to be noisy, non-species entities.
keywords:
synthetic biology; NERC data; species mentions; noisy entities
published: 2022-07-25
Jett, Jacob (2022): SBKS - Species Not Found Entity Mentions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5491578_V1
This dataset is derived from the raw entity mention dataset (https://doi.org/10.13012/B2IDB-4950847_V1) for species entities and represents those that were determined to be species (i.e., were not noisy entities) but for which no corresponding concept could be found in the NCBI taxonomy database.
keywords:
synthetic biology; NERC data; species mentions; not found entities
published: 2022-07-25
Jett, Jacob (2022): SBKS - Chemical - Cleaned & Grounded Entity Mentions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3396059_V1
This dataset represents the results of manual cleaning and annotation of the entity mentions contained in the raw dataset (https://doi.org/10.13012/B2IDB-4163883_V1). Each mention has been consolidated and linked to an identifier for a matching concept from the NCBI's taxonomy database.
keywords:
synthetic biology; NERC data; chemical mentions; cleaned data; ChEBI ontology
published: 2022-07-25
Jett, Jacob (2022): SBKS - Chemical Noisy Entity Mentions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7228767_V1
This dataset is derived from the raw dataset (https://doi.org/10.13012/B2IDB-4163883_V1) and collects entity mentions that were manually determined to be noisy, non-chemical entities.
keywords:
synthetic biology; NERC data; chemical mentions; noisy entities
published: 2022-07-25
Jett, Jacob (2022): SBKS - Chemical Not Found Entity Mentions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4570128_V1
This dataset is derived from the raw entity mention dataset (https://doi.org/10.13012/B2IDB-4163883_V1) for chemical entities and represents those that were determined to be chemicals (i.e., were not noisy entities) but for which no corresponding concept could be found in the ChEBI ontology.
keywords:
synthetic biology; NERC data; chemical mentions; not found entities
published: 2022-07-25
Jett, Jacob (2022): SBKS - Genes Raw Entity Mentions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3887275_V1
A set of gene and gene-related entity mentions derived from an NERC dataset analyzing 900 synthetic biology articles published by the ACS. This data is associated with the Synthetic Biology Knowledge System repository (https://web.synbioks.org/). The data in this dataset are raw mentions from the NERC data.
keywords:
synthetic biology; NERC data; gene mentions
published: 2022-09-16
Zhong, Jia; Khanna, Madhu (2022): Model Code and Data for "Assessing the Efficiency Implications of Renewable Fuel Policy Design in the United States". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6803176_V1
This dataset contains model code (including input data) to replicate the outcomes for "Assessing the Efficiency Implications of Renewable Fuel Policy Design in the United States". The model consists of: (1) The replication codes and data for the model. To run the model, use GAMS to run the "Models.gms" file.
keywords:
Renewable Fuel Standard; Nested structure; cellulosic waiver credit; RIN
published: 2022-08-22
Pastrana-Otero, Isamar; Majumdar, Sayani; Kraft, Mary L. (2022): Raman spectra of individual, living hematopoietic stem and progenitor cells. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9950442_V1
This dataset contains Raman spectra, each acquired from an individual, living, primary murine cell belonging to one of the six most immature hematopoietic cell populations found in the body: hematopoietic stem cell (HSC), multipotent progenitor 1 (MPP1), multipotent progenitor 2 (MPP2), multipotent progenitor 3 (MPP3), common lymphoid progenitor (CLP), common myeloid progenitor (CMP). These spectra are useful for identifying spectral signatures that are characteristic of each hematopoietic stem or early progenitor cell population. *NOTE: The __MACOSX folder and files starting with “._[file name]” found in "Raman spectra of single cells text files.zip" were created by the computer's operating system in an unreadable format; they are not part of the data and can be removed or ignored when using the data.
keywords:
Raman spectroscopy; single-cell spectrum; hematopoietic cell; hematopoietic stem cell; multipotent progenitor cell; common myeloid progenitor; common lymphoid progenitor
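The note in the description above says the __MACOSX folder and the "._" files in the zip archive are not part of the data. One possible way to skip them when listing an archive's contents is sketched below in Python. The member names in the demo archive and the helper function names are hypothetical; the actual layout of "Raman spectra of single cells text files.zip" is not described in the record.

```python
import io
import zipfile
from pathlib import PurePosixPath


def is_macos_artifact(name):
    """True for entries under a __MACOSX folder or named like '._something'."""
    path = PurePosixPath(name)
    return "__MACOSX" in path.parts or path.name.startswith("._")


def data_members(zf):
    """List archive members that are files and not macOS artifacts."""
    return [
        n for n in zf.namelist()
        if not n.endswith("/") and not is_macos_artifact(n)
    ]


# Build a small in-memory archive with hypothetical member names so the
# sketch runs without the real download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("spectra/cell_001.txt", "placeholder")
    zf.writestr("__MACOSX/spectra/._cell_001.txt", "junk")
    zf.writestr("spectra/._cell_002.txt", "junk")

with zipfile.ZipFile(buffer) as zf:
    kept = data_members(zf)

print(kept)
```

With the real download, `zipfile.ZipFile` would be opened on the archive path instead of the in-memory buffer, and the kept names could be passed to `extract` one at a time.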