Illinois Data Bank Dataset Search Results
Results
published:
2019-12-12
Kamuda, Mark; Huff, Kathryn
(2019)
This dataset contains gamma-ray spectra templates for a source interdiction and uranium enrichment measurement task. This dataset also contains Keras machine learning models trained using datasets created using these templates.
keywords:
gamma-ray spectroscopy; neural networks; machine learning; isotope identification; uranium enrichment; sodium iodide; NaI(Tl)
published:
2024-02-08
Martinez, Carlos; Pena, Gisselle; Wells, Kaylee K.
(2024)
This dataset contains transcribed entries from the "Prairie Directory of North America" (Adelman and Schwartz 2013) for the Tallgrass, Mixed Grass, and Shortgrass prairie regions of the united states. We identified the historical spatial extent of the Tallgrass, Mixed Grass, and Shortgrass prairie regions using Ricketts et al. (1999), Olson et al. (2001), and Dixon et al. (2014) and selected the counties entirely or partially within these boundaries from the USDA Forest Service (2022) file. The resulting lists of counties are included as separate files. The dataset contains information on publicly accessible grasslands and prairies in these regions including acreage and amenities like hunting access, restrooms, parking, and trails.
keywords:
grasslands; prairies; prairie directory of north america; site amenities; site attributes
published:
2020-03-13
Sweet, Andrew; Johnson, Kevin; Cameron, Stephen
(2020)
Data files associated with the assembly of mitochondrial minicircles from five species of parasitic lice. This includes data from four species in the genus Columbicola and from the human louse (Pediculus humanus). The files include FASTA sequences for all five species, reference sequences for read mapping approaches, resulting contigs produced by various assembly approaches, and alignments of human louse minicircles mapped to published sequences of the same species.
keywords:
mitochondria; FASTA; nucleotide sequences; alignment; Columbicola; Pediculus
published:
2021-09-06
Airglow images and Meteor radar data used in the paper "Mesospheric gravity wave activity estimated via airglow imagery, multistatic meteor radar, and SABER data taken during the SIMONe–2018 campaign".
keywords:
airglow; meteor radar; gravity waves; momentum flux;
published:
2021-05-12
Clem, Scott; Harmon-Threatt, Alexandra
(2021)
These are the data sets associated with our publication "Field borders provide winter refuge for beneficial predators and parasitoids: a case study on organic farms." For this project, we compared the communities of overwintering arthropod natural enemies in organic cultivated fields and wildflower-strip field borders at five different sites in central Illinois.
Abstract:
Semi-natural field borders are frequently used in midwestern U.S. sustainable agriculture. These habitats are meant to help diversify otherwise monocultural landscapes and provision them with ecosystem services, including biological control. Predatory and parasitic arthropods (i.e., potential natural enemies) often flourish in these habitats and may move into crops to help control pests. However, detailed information on the capacity of semi-natural field borders for providing overwintering refuge for these arthropods is poorly understood. In this study, we used soil emergence tents to characterize potential natural enemy communities (i.e., predacious beetles, wasps, spiders, and other arthropods) overwintering in cultivated organic crop fields and adjacent field borders. We found a greater abundance, species richness, and unique community composition of predatory and parasitic arthropods in field borders compared to arable crop fields, which were generally poorly suited as overwintering habitat. Furthermore, potential natural enemies tended to be positively associated with forb cover and negatively associated with grass cover, suggesting that grassy field borders with less forb cover are less well-suited as winter refugia. These results demonstrate that semi-natural habitats like field borders may act as a source for many natural enemies on a year-to-year basis and are important for conserving arthropod diversity in agricultural landscapes.
keywords:
Natural enemy; wildflower strips; conservation biological control; semi-natural habitat; field border; organic farming
published:
2024-07-11
Schneider, Amy; Suski, Cory
(2024)
published:
2020-08-25
Allan, Brian; Fredericks, Lisa
(2020)
The Allan Lab has published a Fluidigm pipeline online. This is the url: https://github.com/HPCBio/allan-fluidigm-pipeline.
This url includes a tutorial for running the pipeline. However it does not have test datasets yet.
This tarball hosted at the Illinois Data Bank is the dataset that completes the github tutorial.
It includes inputs (custom database of tick pathogens and fluidigm raw reads) and output files (tables of samples with taxonomic classifications).
keywords:
custom database of tick pathogens; fluidigm pipeline; fluidigm paired reads; fluidigm tutorial
published:
2020-04-02
Parker, Christine; Meador, Morgan; Hoover, Jeffrey
(2020)
Automatic and manual counts of black flies captured in Illinois.
keywords:
black flies; simuliids; ImageJ; count method
published:
2020-12-01
This is the data set from the published manuscript 'Vertebrate scavenger guild composition and utilization of carrion in an East Asian temperate forest' by Inagaki et al.
keywords:
Japan;Sika Deer
published:
2020-09-27
Data extracted from Text, Tables and Figures of publications in summarizing crop responses to Free-Air CO2 Elevation (FACE)
keywords:
Free Air CO2 Elevation; FACE; wheat, rice, soybean, cassava;
published:
2021-10-15
Jianhao, Peng; Idoia, Ochoa
(2021)
This is the 5 states 5000 cells synthetic expression file we used for validation of SimiC, a single cell gene regulatory network inference method with similarity constraints. Ground truth GRNs are stored in Numpy array format, and expression profiles of all states combined are stored in Pandas DataFrame in format of Pickle files.
keywords:
Numpy array; GRNs; Pandas DataFrame;
published:
2025-07-11
Zhixin, Zhang; Jinho, Lim; Haoyang, Ni; Jian-Min, Zuo; Axel, Hoffmann
(2025)
This dataset includes experimental data supporting the findings in the manuscript "Magnetostriction and Temperature Dependent Gilbert Damping in Boron Doped Fe80Ga20 Thin Films". It contains raw data for X-Ray diffraction, high resolution transmission electron microscopy, magnetic hysteresis loop measurement, magnetostriction measurement, and temperature dependent magnetic damping measurement.
keywords:
magnetostriction; magnetic damping; magnetoelasticity; magnon-phonon coupling
published:
2018-07-25
Scannapieco, Frank; Hoang, Linh; Schneider, Jodi
(2018)
The PDF describes the process and data used for the heuristic user evaluation described in the related article “<i>Evaluating an automatic data extraction tool based on the theory of diffusion of innovation</i>” by Linh Hoang, Frank Scannapieco, Linh Cao, Yingjun Guan, Yi-Yun Cheng, and Jodi Schneider (under submission).<br />
Frank Scannapieco assessed RobotReviewer data extraction performance on ten articles in 2018-02. Articles are included papers from an update review: Sabharwal A., G.-F.I., Stellrecht E., Scannapeico F.A. <i>Periodontal therapy to prevent the initiation and/or progression of common complex systemic diseases and conditions</i>. An update. Periodontol 2000. In Press. <br/>
The form was created in consultation with Linh Hoang and Jodi Schneider. To do the assessment, Frank Scannapieco entered PDFs for these ten articles into RobotReviewer and then filled in ten evaluation forms, based on the ten Robot Reviewer automatic data extraction reports. Linh Hoang analyzed these ten evaluation forms and synthesized Frank Scannapieco’s comments to arrive at the evaluation results for the heuristic user evaluation.
keywords:
RobotReviewer; systematic review automation; data extraction
published:
2020-02-01
Williams, Benjamin R.; Benson, Thomas J.
(2020)
This data describes habitat use, availability, landscape level influences, and daily movement of dabbling ducks in the Wabash River Valley of southeastern Illinois and southwestern Indiana. It contains triangulated locations of individual ducks, associated habitat assignments of those locations, flood survey data to determine water availability, and randomly generated points to assess landscape level questions.
keywords:
waterfowl; ducks; dabbling; mallard; teal; habitat
published:
2020-06-01
Hoover, Jeffrey P; Davros, Nicole M; Schelsky, Wendy; Brawn, Jeffry D
(2020)
Dataset associated with Hoover et al AUK-19-093 submission: Local conspecific density does not influence reproductive output in a secondary cavity-nesting songbird. Excel CSV with all of the data used in analyses.
Description of variables
YEARS: year
ORDINAL_DATE: number for what day of the year it is with 1 January = 1,……30 December = 365
SITE: acronym for each study site
BOX: unique nest box identifier on each study site
TREAT: designates whether nest box was in a high- or low- nest box density area within each study site
ACTUAL_NO_NEIGHBORS: number of pairs of warblers using a nest box within 200 m of a given pair’s nest box
CLUTCH_SIZE: number of warbler eggs in nest at the onset of incubation
PROWN: number of warbler nestlings once eggs have hatched
PROWF: number of warbler nestlings that fledged out of the nest box
HATCH_SUCCESS: proportion of eggs in the nest that hatched
FLEDG_SUCCESS: proportion of the nestlings that fledged from the nest box
HATCH_SUCCESS2: binary category where “0” indicates there was some, and “1” indicates there was no hatching failure
FLEDG_SUCCESS2: binary category where “0” indicates there was some, and “1” indicates there was no nestling failure (i.e. nestling death)
BHCO_PARASIT2: binary category where “0” indicates no cowbird parasitism, and “1” indicates there was cowbird parasitism
BHCOE: number of cowbird eggs in clutch
BHCOF: number of cowbird nestlings that fledged from the nest
PAIRID: unique number that identifies a male and female warbler that are together at a nest box and this number is the same in a subsequent nesting attempt or year if the same male and female are together again
FEMALE_ID: unique identifier for each female which represents her leg band combination. Each letter represents a band with letters preceding the hyphen being on the right leg and after the hyphen the left leg
FEM_AGE: binary category where “0” indicates a 1-year-old bird and “1” indicates a >1-year-old bird
FEMALE_BREEDING_ATTEMPT: “1” indicates first, “2” indicates second,……..breeding attempt within a given year
SECOND_ATTEMPT: for any female that fledged a brood in a given year, binary category where “0” represents that they did not, and “1” indicates that they did attempt a second brood that year
F_TOT_PROWF: total reproductive output (number of warbler fledglings produced) for a given female in a given year
MALE_ID: unique identifier for each male which represents his leg band combination. Each letter represents a band with letters preceding the hyphen being on the right leg and after the hyphen the left leg
MALE_AGE2: binary category where “0” indicates a 1-year-old bird and “1” indicates a >1-year-old bird
Provisioning_rate: total number of food provisions per nestling per hour by male and female warbler combined
BROOD_MASS: average nestling mass (g) for the brood
BROOD_TARSUS: average nestling tarsus length (mm) for the brood
Brood_condition: unit-less index of nestling condition that uses the residuals of the BROOD_MASS/BROOD_TARSUS relationship
A period (“.”) represents where data were not collected, not available, or because individual nest or female did not qualify for consideration of a category assignment.
An empty cell represents no data available for this particular cell.
keywords:
conspecific density; density dependence; food limitation; hatching success; nestling body condition; nestling provisioning; Prothonotary Warbler; reproductive output
published:
2022-02-09
Kansara, Yogeshwar; Hoang, Khanh Linh
(2022)
The data file contains a list of articles with PMIDs information, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".
keywords:
Cochrane reviews; Randomized controlled trials; RCT; Automation; Systematic reviews
published:
2016-05-16
This dataset contains the protein sequences and trees used to compare Non-Ribosomal Peptide Synthetase (NRPS) condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.
keywords:
NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin
published:
2023-01-12
Mischo, William; Schlembach, Mary C.; Cabada, Elisandro
(2023)
This dataset was developed as part of a study that examined the correlational relationships between local journal authorship, local and external citation counts, full-text downloads, link-resolver clicks, and four global journal impact factor indices within an all-disciplines journal collection of 12,200 titles and six subject subsets at the University of Illinois at Urbana-Champaign (UIUC) Library. While earlier investigations of the relationships between usage (downloads) and citation metrics have been inconclusive, this study shows strong correlations in the all-disciplines set and most subject subsets. The normalized Eigenfactor was the only global impact factor index that correlated highly with local journal metrics. Some of the identified disciplinary variances among the six subject subsets may be explained by the journal publication aspirations of UIUC researchers. The correlations between authorship and local citations in the six specific subject subsets closely match national department or program rankings.
All the raw data used in this analysis, in the form of relational database tables with multiple columns. Can be opned using MS Access. Description for variables can be viewed through "Design View" (by right clik on the selected table, choose "Design View"). The 2 PDF files provide an overview of tables are included in each MDB file.
In addition, the processing scripts and Pearson correlation code is available at <a href="https://doi.org/10.13012/B2IDB-0931140_V1">https://doi.org/10.13012/B2IDB-0931140_V1</a>.
keywords:
Usage and local citation relationships; publication; citation and usage metrics; publication; citation and usage correlation analysis; Pearson correlation analysis
published:
2022-10-13
Xue, Qingquan; Xue, Qingquan; Dietrich, Christopher H.; Dietrich, Christopher H.; Zhang, Yalin; Zhang, Yalin
(2022)
The text file contains the original DNA nucleotide sequence data used in the phylogenetic analyses of Xue et al. (in review), comprising the 13 protein-coding genes and 2 ribosomal gene subunits of the mitochondrial genome. The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 30 taxa (species) and 13078 characters, indicate that the characters are DNA sequence, that gaps inserted into the DNA sequence alignment are indicated by a dash, and that missing data are indicated by a question mark. The positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes (version 3.2.6) beginning near the end of the file. The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the Methods section of the submitted manuscript. Two supplementary tables in the provided PDF file provide additional information on the species in the dataset, including the GenBank accession numbers for the sequence data (Table S1) and the DNA substitution models used for each of the individual mitochondrial genes and for different codon positions of the protein-coding genes used for analyses in the programs MrBayes and IQ-Tree (version 1.6.8) (Table S2). Full citations for references listed in Table S1 can be found by searching GenBank using the corresponding accession number. The supplemental tables will also be linked to the article upon publication at the journal website.
keywords:
Hemiptera; phylogeny; mitochondrial genome; morphology; leafhopper
published:
2023-05-08
Stickley, Samuel; Fraterrigo, Jennifer
(2023)
This dataset includes microclimate species distribution models at a ~3 m2 spatial resolution and free-air temperature species distribution models at ~0.85 km2 spatial resolution for three plethodontid salamander species (Demognathus wrighti, Desmognathus ocoee, and Plethodon jordani) across Great Smoky Mountains National Park. We also include heatmaps representing the differences between microclimate and free-air species distribution models and polygon layers representing the fragmented habitat for each species' predicted range. All datasets include predictions for 2010, 2030, and 2050.
keywords:
Ecological niche modeling, microclimate, species distribution model, spatial resolution, range loss, suitable habitat, plethodontid salamanders, montane ecosystems
published:
2023-07-05
Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal; Lischwe Mueller, Natalie
(2023)
The salt controversy is the public health debate about whether a population-level salt reduction is beneficial. This dataset covers 82 publications--14 systematic review reports (SRRs) and 68 primary study reports (PSRs)--addressing the effect of sodium intake on cerebrocardiovascular disease or mortality. These present a snapshot of the status of the salt controversy as of September 2014 according to previous work by epidemiologists: The reports and their opinion classification (for, against, and inconclusive) were from Trinquart et al. (2016) (Trinquart, L., Johns, D. M., & Galea, S. (2016). Why do we think we know what we know? A metaknowledge analysis of the salt controversy. International Journal of Epidemiology, 45(1), 251–260. https://doi.org/10.1093/ije/dyv184 ), which collected 68 PSRs, 14 SRRs, 11 clinical guideline reports, and 176 comments, letters, or narrative reviews. Note that our dataset covers only the 68 PSRs and 14 SRRs from Trinquart et al. 2016, not the other types of publications, and it adds additional information noted below.
This dataset can be used to construct the inclusion network and the co-author network of the 14 SRRs and 68 PSRs. A PSR is "included" in an SRR if it is considered in the SRR's evidence synthesis. Each included PSR is cited in the SRR, but not all references cited in an SRR are included in the evidence synthesis or PSRs. Based on which PSRs are included in which SRRs, we can construct the inclusion network. The inclusion network is a bipartite network with two types of nodes: one type represents SRRs, and the other represents PSRs. In an inclusion network, if an SRR includes a PSR, there is a directed edge from the SRR to the PSR. The attribute file (report_list.csv) includes attributes of the 82 reports, and the edge list file (inclusion_net_edges.csv) contains the edge list of the inclusion network. Notably, 11 PSRs have never been included in any SRR in the dataset. They are unused PSRs. If visualized with the inclusion network, they will appear as isolated nodes.
We used a custom-made workflow (Fu, Y. (2022). Scopus author info tool (1.0.1) [Python]. https://github.com/infoqualitylab/Scopus_author_info_collection ) that uses the Scopus API and manual work to extract and disambiguate authorship information for the 82 reports. The author information file (salt_cont_author.csv) is the product of this workflow and can be used to compute the co-author network of the 82 reports.
We also provide several other files in this dataset. We collected inclusion criteria (the criteria that make a PSR eligible to be included in an SRR) and recorded them in the file systematic_review_inclusion_criteria.csv. We provide a file (potential_inclusion_link.csv) recording whether a given PSR had been published as of the search date of a given SRR, which makes the PSR potentially eligible for inclusion in the SRR. We also provide a bibliography of the 82 publications (supplementary_reference_list.pdf). Lastly, we discovered minor discrepancies between the inclusion relationships identified by Trinquart et al. (2016) and by us. Therefore, we prepared an additional edge list (inclusion_net_edges_trinquart.csv) to preserve the inclusion relationships identified by Trinquart et al. (2016).
<b>UPDATES IN THIS VERSION COMPARED TO V2</b> (Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal (2022): The Salt Controversy Systematic Review Reports and Primary Study Reports Network Dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6128763_V2)
- We added a new column "pub_date" to report_list.csv
- We corrected mistakes in supplementary_reference_list.pdf for report #28 and report #80. The author of report #28 is not Salisbury D but Khaw, K.-T., & Barrett-Connor, E. Report #80 was mistakenly mixed up with report #81.
keywords:
systematic reviews; evidence synthesis; network analysis; public health; salt controversy;
published:
2025-03-12
Jeong, Gangwon; Villa, Umberto; Park, Seonyeong; Anastasio, Mark A.
(2025)
References
- Jeong, Gangwon, Umberto Villa, and Mark A. Anastasio. "Revisiting the joint estimation of initial pressure and speed-of-sound distributions in photoacoustic computed tomography with consideration of canonical object constraints." Photoacoustics (2025): 100700.
- Park, Seonyeong, et al. "Stochastic three-dimensional numerical phantoms to enable computational studies in quantitative optoacoustic computed tomography of breast cancer." Journal of biomedical optics 28.6 (2023): 066002-066002.
Overview
- This dataset includes 80 two-dimensional slices extracted from 3D numerical breast phantoms (NBPs) for photoacoustic computed tomography (PACT) studies. The anatomical structures of these NBPs were obtained using tools from the Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE) project. The methods used to modify and extend the VICTRE NBPs for use in PACT studies are described in the publication cited above.
- The NBPs in this dataset represent the following four ACR BI-RADS breast composition categories:
> Type A - The breast is almost entirely fatty
> Type B - There are scattered areas of fibroglandular density in the breast
> Type C - The breast is heterogeneously dense
> Type D - The breast is extremely dense
- Each 2D slice is taken from a different 3D NBP, ensuring that no more than one slice comes from any single phantom.
File Name Format
- Each data file is stored as a .mat file. The filenames follow this format: {type}{subject_id}.mat where{type} indicates the breast type (A, B, C, or D), and {subject_id} is a unique identifier assigned to each sample. For example, in the filename D510022534.mat, "D" represents the breast type, and "510022534" is the sample ID.
File Contents
- Each file contains the following variables:
> "type": Breast type
> "p0": Initial pressure distribution [Pa]
> "sos": Speed-of-sound map [mm/μs]
> "att": Acoustic attenuation (power-law prefactor) map [dB/ MHzʸ mm]
> "y": power-law exponent
> "pressure_lossless": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, under the assumption of a lossless medium (corresponding to Studies I, II, and III).
> "pressure_lossy": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, incorporating a power-law acoustic absorption model to account for medium losses (corresponding to Study IV).
* The pressure data were simulated using a ring-array transducer that consists of 512 receiving elements uniformly distributed along a ring with a radius of 72 mm.
* Note: These pressure data are noiseless simulations. In Studies II–IV of the referenced paper, additive Gaussian i.i.d. noise were added to the measurement data. Users may add similar noise to the provided data as needed for their own studies.
- In Study I, all spatial maps (e.g., sos) have dimensions of 512 × 512 pixels, with a pixel size of 0.32 mm × 0.32 mm.
- In Study II and Study III, all spatial maps (sos) have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
- In Study IV, both the sos and att maps have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
keywords:
Medical imaging; Photoacoustic computed tomography; Numerical phantom; Joint reconstruction
published:
2019-09-17
Fraebel, David T.; Kuehn, Seppe
(2019)
BAM files for evolved strains from migration rate selection experiments conducted in low viscosity (0.2% w/v) agar plates containing M63 minimal medium with 1mM of mannose, melibiose, N-acetylglucosamine or galactose
published:
2019-10-16
Human annotations of randomly selected judged documents from the AP 88-89, Robust 2004, WT10g, and GOV2 TREC collections. Seven annotators were asked to read documents in their entirety and then select up to ten terms they felt best represented the main topic(s) of the document. Terms were chosen from among a set sampled from the document in question and from related documents.
keywords:
TREC; information retrieval; document topicality; document description
published:
2022-03-09
Rapti, Zoi; Rivera Quinones, Vanessa; Stewart Merrill, Tara
(2022)
MATLAB files for the analysis of an ODE model for disease transmission. The codes may be used to find equilibrium points, study transient dynamics, evaluate the basic reproductive number (R0), and simulate the model when parameters depend on the independent variables. In addition, the codes may be used to perform local sensitivity analysis of R0 on the model parameters.