Illinois Data Bank Dataset Search Results
Results
published:
2022-12-28
Harmon, Gabriel T.; Harmon-Threatt, Alexandra N.; Anderson, Nicholas L.
(2022)
The effect of pesticide contamination on arthropod biomass and diversity in simulated prairie restorations depended on arthropod feeding guild (e.g., predator, herbivore, or pollinator). The pesticides used in this study were the neonicotinoid insecticide clothianidin and the phthalimide fungicide captan. This dataset includes two data files. The first contains information about the study sites ("plots") and pesticide treatments. The second contains information about arthropod biomass and morphospecies richness separated by feeding guild for each month-plot combination. R code in an R Markdown file for the analysis and data presentation in the associated publication is also provided. Detected effects included: predator biomass was 66% lower in plots treated with clothianidin, and this effect persisted across the growing season; the impact on herbivore biomass appeared to be inconsistent, with biomass being 51% lower with clothianidin in June but no detected difference in July or August; herbivore morphospecies richness was 12% lower in plots treated with both clothianidin and captan; pollinators appeared to be unaffected by clothianidin; and pollinator biomass increased by 71% when captan was applied to a plot.
keywords:
Arthropod decline; pesticide; clothianidin; captan; habitat restoration; trophic effects; insects
published:
2026-01-09
Schultz, J Carl; Cao, Mingfeng; Zhao, Huimin
(2026)
Rhodotorula toruloides has been increasingly explored as a host for bioproduction of lipids, fatty acid derivatives and terpenoids. Various genetic tools have been developed, but neither a centromere nor an autonomously replicating sequence (ARS), both necessary elements for stable episomal plasmid maintenance, has yet been reported. In this study, cleavage under targets and release using nuclease (CUT&RUN), a method used for genome-wide mapping of DNA–protein interactions, was used to identify R. toruloides IFO0880 genomic regions associated with the centromeric histone H3 protein Cse4, a marker of centromeric DNA. Fifteen putative centromeres ranging from 8 to 19 kb in length were identified and analyzed, and four were tested for, but did not show, ARS activity. These centromeric sequences contained below average GC content, corresponded to transcriptional cold spots, were primarily nonrepetitive and shared some vestigial transposon-related sequences but otherwise did not show significant sequence conservation. Future efforts to identify an ARS in this yeast can utilize these centromeric DNA sequences to improve the stability of episomal plasmids derived from putative ARS elements.
keywords:
Genome Engineering; Genomics
published:
2018-11-18
Kwang, Jeffrey; Parker, Gary
(2018)
This dataset contains experimental measurements used in the paper, "Ultra-sensitivity of Numerical Landscape Evolution Models to their Initial Conditions." (to be submitted).
The data is taken from experimental runs in a miniature landscape model named the eXperimental Landscape Evolution (XLE) facility. In this facility, we completed five >24 hr runs at 5-minute temporal resolution. Every five minutes, a planform image was captured and a digital elevation model (DEM) was generated. For each run, images and a corresponding animation of images are documented. In addition, ASCII-formatted DEMs along with color hillshade maps were generated. The hillshade map images were also made into an animation.
This dataset is associated with the following publication: https://doi.org/10.1029/2019GL083305
keywords:
landscape evolution model; digital elevation model; geomorphology
published:
2019-02-02
The bee visitation data include the percentage of each bee pollinator group both in bee bowls and in observations. The data are referenced in the article with the following citation:
Bennett, A.B., Lovell, S.T. 2019. Landscape and local site variables differentially influence pollinators and pollination services in urban agricultural sites. Accepted for publication in: PLOS ONE.
published:
2019-06-11
Wang, Wenrui; Wang, Tao; Amin, Vivek P.; Wang, Yang; Radhakrishnan, Anil; Davidson, Angie; Allen, Shane R.; Silva, T. J.; Ohldag, Hendrik; Balzar, Davor; Zink, Barry L.; Haney, Paul M.; Xiao, John Q.; Cahill, David G.; Lorenz, Virginia O.; Fan, Xin
(2019)
This dataset provides the raw data, code and related figures for the paper, "Anomalous Spin-Orbit Torques in Magnetic Single-Layer Films."
keywords:
spintronics; spin-orbit torques; magnetic materials
published:
2024-05-23
Xing, Yuqing; Bae, Seokjin; Ritz, Ethan; Yang, Fan; Birol, Turan; Salinas, Andrea N. Capa; Ortiz, Brenden R.; Wilson, Stephen D.; Wang, Ziqiang; Fernandes, Rafael M.; Madhavan, Vidya
(2024)
This dataset consists of all the figure files that are part of the main text and supplementary of the manuscript titled "Optical manipulation of the charge density wave state in RbV3Sb5". For detailed information on the individual files refer to the readme file.
keywords:
kagome superconductor; optics; charge density wave
published:
2017-12-04
Zaya, David N.; Leicht-Young, Stacey A.; Pavlovic, Noel; Hetrea, Christopher S.; Ashley, Mary V.
(2017)
Data used for Zaya et al. (2018), published in Invasive Plant Science and Management DOI 10.1017/inp.2017.37, are made available here. There are three spreadsheet files (CSV) available, as well as a text file that has detailed descriptions for each file ("readme.txt"). One spreadsheet file ("prices.csv") gives pricing information, associated with Figure 3 in Zaya et al. (2018). The other two spreadsheet files are associated with the genetic analysis, where one file contains raw data for biallelic microsatellite loci ("genotypes.csv") and the other ("structureResults.csv") contains the results of Bayesian clustering analysis with the program STRUCTURE. The genetic data may be especially useful for future researchers. The genetic data contain the genotypes of the horticultural samples that were the focus of the published article, and also genotypes of nearly 400 wild plants. More information on the location of the wild plant collections can be found in the Supplemental information for Zaya et al. (2015) Biological Invasions 17:2975–2988 DOI 10.1007/s10530-015-0926-z. See "readme.txt" for more information.
keywords:
Horticultural industry; invasive species; microsatellite DNA; mislabeling; molecular testing
published:
2018-05-16
Lewis, Quinn; Bruce, Rhoads
(2018)
These data are for two companion papers on the use of LSPIV obtained from UAS (i.e., drones) to measure flow structure in streams. The LSPIV1 folder contains spreadsheet data used in each case referred to in Table 1 in the manuscript. In the spreadsheets, there is a cell that denotes which figure was constructed with which data. The LSPIV2 folder contains spreadsheets with the data used to construct the figures; the spreadsheets are labeled by figure.
keywords:
LSPIV; drone; UAS; flow structure; rivers
published:
2018-06-02
Palmer, Ryan; Albarracin, Dolores
(2018)
keywords:
conspiracy theory; trust in science
published:
2019-07-27
Clark, Lindsay V.; Dwiyanti, Maria Stefanie; Anzoua, Kossonou G.; Brummer, Joe E.; Glowacka, Katarzyna; Hall, Megan; Heo, Kweon; Jin, Xiaoli; Lipka, Alexander E.; Peng, Junhua; Yamada, Toshihiko; Yoo, Ji Hye; Yu, Chang Yeon; Zhao, Hua; Long, Stephen P.; Sacks, Erik J.
(2019)
Genotype calls are provided for a collection of 583 Miscanthus sinensis clones across 1,108,836 loci mapped to version 7 of the Miscanthus sinensis reference genome. Sequence and alignment information for all unique RAD tags is also provided to facilitate cross-referencing to other genomes.
keywords:
variant call format (VCF); sequence alignment/map format (SAM); miscanthus; single nucleotide polymorphism (SNP); restriction site-associated DNA sequencing (RAD-seq); bioenergy; grass
published:
2018-12-13
Yin, Dandong; Wang, Shaowen
(2018)
The dataset contains a complete example (inputs, outputs, codes, intermediate results, visualization webpage) of executing the Height Above Nearest Drainage (HAND) workflow with CyberGIS-Jupyter.
keywords:
cybergis; hydrology; Jupyter
published:
2024-10-08
Mersich, Ina; Bishop, Rebecca; Diaz Yucupicio, Sandra; Nobrega, Ana D.; Austin, Scott; Barger, Anne; Fick, Megan E.; Wilkins, Pamela
(2024)
Acepromazine was administered to healthy adult horses to induce transient anemia secondary to splenic sequestration. Data was collected at baseline (T0), 1 hour (T1) and 12 hours (T2) post acepromazine administration. Data collection included PCV, TP, CBC, fibrinogen, PT, PTT and viscoelastic coagulation profiles (VCM Vet) as well as ultrasonographic measurements of the spleen at all 3 time points.
keywords:
horse; coagulation; viscoelastic testing; anemia; acepromazine
published:
2018-07-13
Hensley, Merinda Kaye; Johnson, Heidi R.
(2018)
Qualitative data collected from the websites of undergraduate research journals between October 2014 and May 2015. Two CSV files. The first file, "Sample", includes the sample of journals with secondary data collected. The second file, "Population", includes the remainder of the population for which secondary data was not collected. Note: The rows do not add up to 800 as indicated in the article; rows were deleted for journals that had broken links or defunct websites during the random sampling process.
keywords:
undergraduate research; undergraduate journals; scholarly communication; libraries; liaison librarianship
published:
2018-12-20
Dong, Xiaoru; Xie, Jingyi; Hoang, Linh; Schneider, Jodi
(2018)
File Name: Error_Analysis.xslx
Data Preparation: Xiaoru Dong
Date of Preparation: 2018-12-12
Data Contributions: Xiaoru Dong, Linh Hoang, Jingyi Xie, Jodi Schneider
Data Source: The classification prediction results on the testing data set
Associated Manuscript authors: Xiaoru Dong, Jingyi Xie, Linh Hoang, and Jodi Schneider
Associated Manuscript, Working title: Machine classification of inclusion criteria from Cochrane systematic reviews
Description: The file contains lists of the wrong and correct predictions of inclusion criteria of Cochrane Systematic Reviews from the testing data set and the length (number of words) of the inclusion criteria.
Notes: To reproduce the data in this file, please get the code of the project published on GitHub at: https://github.com/XiaoruDong/InclusionCriteria and run the code following the instructions provided.
keywords:
Inclusion criteria, Randomized controlled trials, Machine learning, Systematic reviews
published:
2021-03-15
Stodola, Alison P.; Lydeard, Charles; Lamer, James T.; Douglass, Sarah A.; Cummings, Kevin; Campbell, David
(2021)
Dataset associated with "Hiding in plain sight: genetic confirmation of putative Louisiana Fatmucket Lampsilis hydiana in Illinois" as submitted to Freshwater Mollusk Biology and Conservation by Stodola et al. Images are from cataloged specimens from the Illinois Natural History Survey (INHS) Mollusk Collection in Champaign, Illinois that were used for genetic research. File names indicate the species as confirmed in Stodola et al. (i.e., Lampsilis siliquoidea or Lampsilis hydiana) followed by the INHS Mollusk Collection catalog number, followed by the individual specimen number, followed by shell view (interior or exterior). If no specimen number is noted in the file name, there is only one specimen for that catalog number. For example: Lsiliquoidea_46515_1_2_3_exterior.
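The file-naming convention described above can be split into its parts with a short Python sketch. This is only an illustration, not part of the dataset: the function name `parse_image_name` is hypothetical, and it assumes that underscores separate the fields and that any tokens between the catalog number and the shell view are specimen numbers (as the example "Lsiliquoidea_46515_1_2_3_exterior" suggests). Files that do not follow this pattern, such as a combined interior/exterior image, would need separate handling.

```python
def parse_image_name(stem):
    """Split an image file name (without extension) into its documented parts.

    Assumed layout (hypothetical reading of the description):
        <species>_<catalog number>[_<specimen number>...]_<view>
    """
    parts = stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected file name: {stem!r}")
    # First token is the species, second the catalog number, last the shell view;
    # anything in between is treated as one or more specimen numbers.
    species, catalog_no, *specimens, view = parts
    if view not in ("interior", "exterior"):
        raise ValueError(f"unexpected shell view in: {stem!r}")
    return {
        "species": species,
        "catalog_no": catalog_no,
        "specimens": specimens,  # empty list: only one specimen for this catalog number
        "view": view,
    }


info = parse_image_name("Lsiliquoidea_46515_1_2_3_exterior")
print(info["species"], info["catalog_no"], info["specimens"], info["view"])
```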
Images were created by photographing specimens on a metric grid in an OrTech Photo-e-Box Plus with a Nikon D610 single lens reflex camera using a 60mm lens. Post-processing of images (cropping, image rotation, and auto contrast) was done in Adobe Photoshop, and the images were saved as TIFF files using no image compression, interleaved pixel order, and IBM PC Byte Order. One additional partial lot, INHS Mollusk Catalog No. 37059 (shown with both interior and exterior view in one image), is included for reference but was not genetically sequenced. A .csv file contains an index of all specimens photographed.
SPECIES: species confirmed using genetic analyses
GENE: cox1 or nad1 mitochondrial gene
ACCESSION: GenBank accession number
INHS CATALOG NO: Illinois Natural History Survey Mollusk Collection Catalog number
WATERBODY: waterbody where specimen was collected
PUTATIVE SPECIES: species determination based on morphological characters prior to genetic analysis
Phylogenetic sequence data (.nex files) were aligned using BioEdit (Hall, T.A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41:95-98.). Pertinent methodology for the analysis is contained within the manuscript submittal for Stodola et al. to Freshwater Mollusk Biology and Conservation. In these files, "N" is a standard symbol for an unknown base.
keywords:
Lampsilis hydiana; Lampsilis siliquoidea; unionid; Louisiana Fatmucket; Fatmucket; genetic confirmation
published:
2024-01-31
Wang, Xiudan; Dietrich, Christopher; Zhang, Yalin
(2024)
The included files were used to reconstruct the phylogeny of Coelidiinae using combined morphological and molecular data, estimate divergence times and reconstruct ancestral biogeographic areas as described in the manuscript submitted for publication. The file “Coelidiinae_dna_morph_combined.nex” is a text file in standard NEXUS format used by various phylogenetic analysis programs. This file includes the aligned and concatenated nucleotide sequences of five gene regions (mitochondrial COI and 16S, and nuclear 28S D-2, histone H3, histone H2A and wingless) indicated by standard “ACGT” nucleotide symbols with missing data indicated by “?”, and morphological character data as defined in Table S3 used in the analyses. The data partitions are indicated toward the end of the file by ranges of numbers (“charset Subset 1 – 4” for the DNA data and “charset morph” for the morphological characters) followed by commands for the phylogenetic analysis program MrBayes that specify the model settings for each data partition. Detailed data on species included (as rows) in the dataset, including collection localities and GenBank accession numbers, are provided in the Table_S1_Specimen_information.csv file. The file "TablesS2-S4.pdf" lists the primers used for polymerase chain reaction amplification, the list of morphological character definitions, and the morphological character matrix. The file “RASP_Distribution.csv” contains a list of the species included in the phylogenetic dataset (first column) and a code (second column) indicating their distributions as follows: (A) Oriental, (B) Palaearctic, (C) Australian, (D) Afrotropical, (E) Neotropical, and (F) Nearctic. More than one letter indicates that the species occurs in more than one region. The file "infile_for_BEAST.txt" is the input file in XML format used for the molecular divergence time analysis using the program BEAST (Bayesian Evolutionary Analysis by Sampling Trees) as described in the Methods section of the manuscript.
This file includes comments that document the steps of the analysis.
keywords:
leafhopper; phylogeny; DNA sequence; insect; timetree; biogeography
published:
2016-08-18
Copyright Review Management System renewals by year, data from Table 2 of the article "How Large is the ‘Public Domain’? A comparative Analysis of Ringer’s 1961 Copyright Renewal Study and HathiTrust CRMS Data."
keywords:
copyright; copyright renewals; HathiTrust
published:
2018-04-05
GBS data from Phaseolus accessions, for a study led by Dr. Glen Hartman, UIUC. The (zipped) fastq file can be processed with the TASSEL GBS pipeline or other pipelines for SNP calling. The related article has been submitted and the methods section describes the data processing in detail.
published:
2018-04-23
Contains a series of datasets that score pairs of tokens (words, journal names, and controlled vocabulary terms) based on how often they co-occur within versus across authors' collections of papers. The tokens derive from four different fields of PubMed papers: journal, affiliation, title, MeSH (medical subject headings). Thus, there are 10 different datasets, one for each pair of token type: affiliation-word vs affiliation-word, affiliation-word vs journal, affiliation-word vs mesh, affiliation-word vs title-word, mesh vs mesh, mesh vs journal, etc.
Using authors to link papers, and in turn pairs of tokens, is an alternative to the usual within-document co-occurrences and to using, e.g., citations to link papers. This is particularly striking for journal pairs because a paper almost always appears in a single journal and so within-document co-occurrences are 0, i.e., useless.
The tokens are taken from the Author-ity 2009 dataset which has a cluster of papers for each inferred author, and a summary of each field. For MeSH, title-words, affiliation-words that summary includes only the top-20 most frequent tokens after field-specific stoplisting (e.g., university is stoplisted from affiliation and Humans is stoplisted from MeSH). The score for a pair of tokens A and B is defined as follows. Suppose Ai and Bi are the number of occurrences of token A (and B, respectively) across the i-th author's papers, then
nA = sum(Ai); nB = sum(Bi)
nAB = sum(Ai*Bi) if A not equal B; nAA = sum(Ai*(Ai-1)/2) otherwise
nAnB = nA*nB if A not equal B; nAnA = nA*(nA-1)/2 otherwise
score = 1000000*nAB/nAnB if A is not equal B; 1000000*nAA/nAnA otherwise
Token pairs are excluded when: score < 5, or nA < cut-off, or nB < cut-off, or nAB < cut-offAB.
The cut-offs differ for token types and can be inferred from the datasets. For example, cut-off = 200 and cut-offAB = 20 for journal pairs.
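The scoring rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to build the datasets: the function name `pair_score` and the input layout (one dictionary of token counts per author) are hypothetical, and returning 0.0 when there are no possible pairs is an added convention. Cut-off filtering is not applied here.

```python
def pair_score(counts_by_author, a, b):
    """Score tokens a and b in parts per million.

    counts_by_author: list of dicts, one per author, mapping a token to its
    number of occurrences across that author's papers (hypothetical layout).
    """
    n_a = sum(c.get(a, 0) for c in counts_by_author)  # nA = sum(Ai)
    n_b = sum(c.get(b, 0) for c in counts_by_author)  # nB = sum(Bi)
    if a != b:
        # nAB = sum(Ai*Bi); nAnB = nA*nB
        n_ab = sum(c.get(a, 0) * c.get(b, 0) for c in counts_by_author)
        n_pairs = n_a * n_b
    else:
        # nAA = sum(Ai*(Ai-1)/2); nAnA = nA*(nA-1)/2
        n_ab = sum(c.get(a, 0) * (c.get(a, 0) - 1) // 2 for c in counts_by_author)
        n_pairs = n_a * (n_a - 1) // 2
    if n_pairs == 0:
        return 0.0  # assumption: no possible pairs means no score
    return 1_000_000 * n_ab / n_pairs


# Toy data for three authors (invented for illustration only).
authors = [{"A": 2, "B": 1}, {"A": 1, "B": 3}, {"B": 2}]
print(pair_score(authors, "A", "B"))
print(pair_score(authors, "A", "A"))
```

In the toy data, nA = 3, nB = 6 and nAB = 2*1 + 1*3 = 5, so the A-B score is 1000000*5/18.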
Each dataset has the following 7 tab-delimited all-ASCII columns
1: score: roughly the number of the tokens' co-occurrences divided by the total number of pairs, in parts per million (ppm), ranging from 5 to 1,000,000
2: nAB: total number of co-occurrences
3: nAnB: total number of pairs
4: nA: number of occurrences of token A
5: nB: number of occurrences of token B
6: A: token A
7: B: token B
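The seven-column layout above could be read with a short Python sketch like the one below. The function name `read_pairs` is hypothetical, and the sketch assumes the files have no header row and that the count columns hold integers; neither is confirmed by the description, so check a file before relying on it.

```python
import csv
import io


def read_pairs(handle):
    """Yield one dict per row of a 7-column tab-delimited token-pair file."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 7:
            continue  # skip blank or malformed lines
        score, n_ab, n_anb, n_a, n_b, tok_a, tok_b = row
        yield {
            "score": float(score),  # parts per million
            "nAB": int(n_ab),       # total number of co-occurrences
            "nAnB": int(n_anb),     # total number of pairs
            "nA": int(n_a),         # occurrences of token A
            "nB": int(n_b),         # occurrences of token B
            "A": tok_a,
            "B": tok_b,
        }


# Invented sample row, for illustration only.
sample = "250000\t5\t20\t4\t5\taspirin\theadache\n"
for pair in read_pairs(io.StringIO(sample)):
    print(pair["A"], pair["B"], pair["score"])
```

For a real file, pass an open file handle (for example `open(path, newline="", encoding="ascii")`) instead of the in-memory sample.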
We made some of these datasets as early as 2011 as we were working to link PubMed authors with USPTO inventors, where the vocabulary usage is strikingly different, but also more recently to create links from PubMed authors to their dissertations and NIH/NSF investigators, and to help disambiguate PubMed authors. Going beyond explicit (exact within-field match) is particularly useful when data is sparse (think old papers lacking controlled vocabulary and affiliations, or papers with metadata written in different languages) and when making links across databases with different kinds of fields and vocabulary (think PubMed vs USPTO records). We never published a paper on this but our work inspired the more refined measures described in:
D′Souza JL, Smalheiser NR (2014) Three Journal Similarity Metrics and Their Application to Biomedical Journals. PLOS ONE 9(12): e115681. https://doi.org/10.1371/journal.pone.0115681
Smalheiser, N., & Bonifield, G. (2016). Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation. DISCO: Journal of Biomedical Discovery and Collaboration, 7. doi:http://dx.doi.org/10.5210/disco.v7i0.6654
keywords:
PubMed; MeSH; token; name disambiguation
published:
2019-09-06
This is a dataset of 1101 comments from The New York Times (May 1, 2015-August 31, 2015) that contain a mention of the stemmed words vaccine or vaxx.
keywords:
vaccine; online comments
published:
2020-10-01
Fraterrigo, Jennifer; Rembelski, Mara
(2020)
We measured the effects of fire or drought treatment on plant, microbial and biogeochemical responses in temperate deciduous forests invaded by the annual grass Microstegium vimineum with a history of either frequent fire or fire exclusion.
Please note, on Documentation tab / Experimental or Sampling Design, “15 (XVI)” should be “16 (XVI)”.
keywords:
plant-soil interaction; grass-fire cycle; Microstegium; carbon and nitrogen cycling; microbial decomposers
published:
2025-10-10
Yang, Pan; Cai, Ximing; Leibensperger, Carrie; Khanna, Madhu
(2025)
The success of a bioenergy policy relies largely on the wide adoption of perennial energy crops at the farm scale. This study uses survey data to examine potential adoption decisions by farmers in the U.S. Midwest and the causal effects of various direct and indirect influencing factors, especially heterogeneous preferences of farmers. A Bayesian network (BN) model is developed to delineate the causal relationship between farmers' adoption decisions and the influencing factors. We find a dominating role of economic factors and a non-negligible impact of non-economic factors, such as the perceived environmental benefits and the extent of familiarity with perennial energy crops. To examine the effect of heterogeneity in farmer preferences, we classify the surveyed farmers into four categories based on their attitudes toward the economic, social, and environmental dimensions of perennial energy crops. We identified statistically significant between-group differences in the responses of the four types of farmers to the various influencing factors. Our findings contribute to disentangling the complicated motivations that will influence perennial energy crop adoption decisions and provide implications for more targeted policy development that needs to consider the heterogeneous drivers of farmer decisions about land use.
keywords:
Sustainability; Modeling
published:
2017-07-29
This dataset contains the PartMC-MOSAIC simulations used in the article “Plume-exit modeling to determine cloud condensation nuclei activity of aerosols from residential biofuel combustion”. The data is organized as a set of folders, each folder representing a different scenario modeled. Each folder contains a series of NetCDF files, which are the output of the PartMC-MOSAIC simulation. They contain information on particle and gas properties, both of the biofuel burning plume and background. Input files for PartMC-MOSAIC are also included. This dataset was used during the open review process at Atmospheric Chemistry and Physics (ACP) and supports both the discussion paper and final article.
keywords:
CCN; cloud condensation nuclei; activation; supersaturation; biofuel
published:
2017-10-10
Kozak, Derek L.; Luo, Jie; Olson, Scott M.; LaFave, James M.; Fahnestock, Larry A.
(2017)
This dataset contains ground motion data for Newmark Structural Engineering Laboratory (NSEL) Report Series 048, "Modification of ground motions for use in Central North America: Southern Illinois surface ground motions for structural analysis". The data are 20 individual ground motion time history records developed at each of the 10 sites (for a total of 200 ground motions). These accompanying ground motions are developed following the detailed procedure presented in Kozak et al. [2017].
keywords:
earthquake engineering; ground motion records; southern Illinois seismic hazard; dynamic structural analysis; conditional mean spectrum