Illinois Data Bank Dataset Search Results
Results
published:
2021-03-31
This archive contains the datasets used in the paper "Recursive MAGUS: scalable and accurate multiple sequence alignment".
- 16S.3, 16S.T, 16S.B.ALL
- HomFam
- RNASim
These can also be found at https://sites.google.com/eng.ucsd.edu/datasets/alignment/pastaupp
published:
2022-01-20
This dataset provides a 50-state (and DC) survey of state-level tax credits modeled after the federal New Markets Tax Credit program, including summaries of the tax credit amount and credit periods, key definitions, eligibility criteria, application process, and degree of conformity to federal law.
keywords:
New Markets Tax Credits; NMTC; tax incentives; state law
published:
2022-01-20
This dataset provides a 50-state (and DC) survey of state-level enterprise zone laws, including summaries and analyses of zone eligibility criteria, eligible investments, incentives to invest in human capital and affordable housing, and taxpayer eligibility.
keywords:
Enterprise Zones; tax incentives; state law
published:
2025-09-18
Saifuddin, Mustafa; Bhatnagar, Jennifer; Segrè, Daniel; Finzi, Adrien C.
(2025)
Respiration by soil bacteria and fungi is one of the largest fluxes of carbon (C) from the land surface. Although this flux is a direct product of microbial metabolism, controls over metabolism and their responses to global change are a major uncertainty in the global C cycle. Here, we explore an in silico approach to predict bacterial C-use efficiency (CUE) for over 200 species using genome-specific constraint-based metabolic modeling. We find that potential CUE averages 0.62 ± 0.17 with a range of 0.22 to 0.98 across taxa and phylogenetic structuring at the subphylum levels. Potential CUE is negatively correlated with genome size, while taxa with larger genomes are able to access a wider variety of C substrates. Incorporating the range of CUE values reported here into a next-generation model of soil biogeochemistry suggests that these differences in physiology across microbial taxa can feed back on soil-C cycling.
keywords:
Sustainability; Metabolomics; Modeling
published:
2021-08-28
Southey, Bruce; Rodriguez-Zas, Sandra
(2021)
Metabolite identifications and profiles of liver samples from 22-day-old male and female pigs from gilts that were either exposed to porcine reproductive and respiratory syndrome virus (P) or not (C); the pigs were either weaned at 21 days of age (W) or not (N). Profiles were obtained by the University of Illinois Carver Metabolomics Center. The spectrum for each sample was acquired using a gas chromatography mass spectrometry system consisting of an Agilent 7890 gas chromatograph, an Agilent 5975 MSD, and an HP 7683B auto sampler.
keywords:
gas chromatography; mass spectrometry; maternal immune activation; weaning; liver
published:
2025-01-29
Quiroz, Edwin; Ashley, Mary V.; Zaya, David N.
(2025)
These data record weekly aphid and monarch butterfly (Danaus plexippus) neonate counts on individual milkweed plants in multiple raised garden beds in Chicago during the summers of 2023 and 2024. Relationships between aphid infestation and monarch neonates can be investigated along with weekly trends of monarch oviposition and aphid abundances. All gardens included in this study were on the University of Illinois Chicago campus and within 100 meters of one another. Data are provided on three milkweed species in 2023 and one milkweed species in 2024.
keywords:
Aphis; Myzocallis; Danaus plexippus; urban gardens; Asclepias syriaca; milkweeds
published:
2021-04-22
All code in Matlab .m scripts or functions (version R2019b)
Affiliated with article “Temperate and chronic virus competition leads to low lysogen frequency” published in the Journal of Theoretical Biology (2021)
The code simulates and plots the solutions of an ordinary differential equation model and generates bifurcation diagrams.
published:
2022-02-20
Proescholdt, Randi; Hsiao, Tzu-Kun; Schneider, Jodi; Cohen, Aaron; McDonagh, Marian; Smalheiser, Neil
(2022)
This dataset contains the files used to perform the work savings and recall evaluation in the study titled "Data from Testing a filtering strategy for systematic reviews: Evaluating work savings and recall."
keywords:
systematic reviews; machine learning; work savings; recall; search results filtering
published:
2022-01-14
This dataset provides a 50-state (and DC) survey of state-level Opportunity Zones laws, including summaries of states' Opportunity Zone tax preferences, supplemental tax preferences, and approach to Opportunity Zones conformity. Data was last updated on January 14, 2022.
keywords:
Opportunity Zones; tax incentives; state law
published:
2024-12-01
Bishop, Rebecca C.; Kemper, Ann M.; Clark, Lindsay V.; Wilkins, Pamela A.; McCoy, Annette M.
(2024)
Healthy mares were kept at pasture for 3 weeks, stabled for 5 weeks, and then returned to pasture, with a final sample collected 6 weeks later. Samples were collected weekly: gastric fluid by double-tube nasogastric intubation and aspiration, feces by rectal palpation. Microbial DNA was isolated using the QIAamp PowerFecal Pro DNA kit. Full-length 16S, ITS and partial 23S rRNA gene libraries were created using the Shoreline Complete ID kit.
published:
2024-10-11
Zinnen, Jack; Barak, Rebecca; Matthews, Jeffrey
(2024)
This is the core data for "Influence of ecological characteristics and phylogeny on native plant species’ commercial availability," a manuscript pending publication in Ecological Applications. The data cover ecological characteristics, phenology, and phylogeny of plant species native to the Midwestern United States and how those factors relate to commercial availability.
keywords:
biodiversity; native plant nursery; plant trade; plant vendors; restoration
published:
2021-05-09
Zuckermann, Federico
(2021)
Raw data and analysis from a trial designed to test the impact of providing a Bacillus-based direct-fed microbial (DFM) on the syndrome resulting from orally infecting pigs with Salmonella enterica serotype Choleraesuis (S. Choleraesuis), either alone or in combination with an intranasal challenge, three days later, with porcine reproductive and respiratory syndrome virus (PRRSV).
keywords:
excel file
published:
2025-11-14
Asadian, Marisa; Croslow, Seth; Trinklein, Timothy; Rubakhin, Stanislav; Lam, Fan; Sweedler, Jonathan
(2025)
We developed a sequential single-cell matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) workflow that enables endogenous lipid profiling in the first step, followed by cell-type classification of the same cells via immunocytochemistry in the second step. This stepwise approach integrates high-throughput single-cell analysis enabled by microMS with multiplex immunolabeling using photocleavable mass tags (PCMTs), which are antibodies conjugated to peptide mass reporters that are photoreleased and then detected by MALDI-MS. This platform combines the strengths of untargeted chemical profiling with targeted marker-based cell identification, allowing characterization of the cells’ endogenous metabolic activity, followed by cell classification using well-established immunomarkers. Here, we provide the raw data, mzML-converted files, and LC-MS/MS data from rodent hippocampal cells as described in the manuscript.
keywords:
Single Cell Mass Spectrometry; MALDI; Hippocampal Cells; Lipidomics; Photocleavable Mass-tags
published:
2025-07-11
Xiang, Jingyi; Dinkel, Holly
(2025)
The MultiDLO data release supports the paper, "MultiDLO: Simultaneous Shape Tracking of Multiple Deformable Linear Objects with Global-Local Topology Preservation," presented at the IEEE International Conference on Robotics and Automation Workshop on Representing and Manipulating Deformable Objects in May 2023. The data release includes the raw image and depth data for simultaneously tracking multiple Deformable Linear Objects (DLOs). The released data are Robot Operating System (ROS1) bag files containing raw color images and point clouds. The data were collected using a static Intel RealSense D435 RGB-D camera while DLOs in the camera's field of view were manipulated. The data can be used to benchmark the performance of future DLO tracking or prediction algorithms in two manipulation scenarios relevant to DLOs and to verify existing DLO tracking algorithms. Please see the accompanying extended abstract, the code repository on GitHub, and the conference presentation video referenced in the `multidlo_data_release.pdf` document for more information.
keywords:
rosbag; perception for grasping and manipulation; RGBD perception; visual tracking; deformable linear objects; robotic manipulation
published:
2021-11-16
Prada, Cecilia M.; Turner, Benjamin L.; Dalling, James W.
(2021)
Data from a field experiment at El Velo, Chiriqui, Republic of Panama. The data contain information about functional traits of seedlings growing in different treatments, including type of forest, nitrogen addition, and organic matter.
keywords:
Mycorrhiza; nitrogen; oak forest; Panama; plant-soil feedbacks; seedling growth
published:
2022-02-04
Addepalli, Amulya; Ann Subin, Karen; Schneider, Jodi
(2022)
keywords:
retracted papers; knowledge maintenance; keystone citations; Wakefield; misinformation in science; Information Quality Lab
published:
2022-02-09
Kansara, Yogeshwar; Hoang, Khanh Linh
(2022)
The data file contains a list of articles and their RCT Tagger prediction scores, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".
keywords:
Cochrane reviews; automation; randomized controlled trial; RCT; systematic reviews
published:
2019-10-19
Corey, Ryan M.; Skarha, Matthew D.; Singer, Andrew C.
(2019)
Large, distributed microphone arrays could offer dramatic advantages for audio source separation, spatial audio capture, and human and machine listening applications. This dataset contains acoustic measurements and speech recordings from 10 loudspeakers and 160 microphones spread throughout a large, reverberant conference room.
The distributed microphone system contains two types of array: four wearable microphone arrays of 16 sensors each placed near the ears and across the upper body, and twelve tabletop arrays of 8 microphones each in enclosures designed to resemble voice-assistant speakers. The dataset includes recordings of chirps that can be used to measure impulse responses and of speech clips derived from the CSTR VCTK corpus. The speech clips are recorded both individually and as a mixture to support source separation experiments.
The uncompressed files are about 13.4 GB.
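The microphone counts in the description above can be cross-checked with a short calculation. This is only an arithmetic sketch: the array counts and sizes come from the description, while the idea of one impulse response per loudspeaker-microphone pair is an assumption about how the chirp recordings might be used, not a statement about the dataset's contents.

```python
# Counts taken from the dataset description above.
WEARABLE_ARRAYS, MICS_PER_WEARABLE = 4, 16
TABLETOP_ARRAYS, MICS_PER_TABLETOP = 12, 8
LOUDSPEAKERS = 10

wearable_mics = WEARABLE_ARRAYS * MICS_PER_WEARABLE  # microphones worn on the body
tabletop_mics = TABLETOP_ARRAYS * MICS_PER_TABLETOP  # microphones in tabletop enclosures
total_mics = wearable_mics + tabletop_mics           # should match the stated 160

# Assumption: one impulse response per loudspeaker-microphone pair.
source_mic_pairs = LOUDSPEAKERS * total_mics

print(wearable_mics, tabletop_mics, total_mics, source_mic_pairs)
```

The two array types together account for the 160 microphones stated in the description.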
keywords:
microphone arrays; audio source separation; augmented listening; wireless sensor networks
published:
2020-10-28
Curtis, Amanda; Tiemann, Jeremy; Douglass, Sarah; Davis, Mark; Larson, Eric
(2020)
We examined the role of stream flow on environmental DNA (eDNA) concentrations and detectability of an invasive clam (Corbicula fluminea), while also accounting for other abiotic and biotic variables. These data include the eDNA concentrations, quadrat estimates of clam density, and abiotic variables.
keywords:
Corbicula; detection probability; eDNA; invasive species; lotic; occupancy modeling
published:
2023-09-19
Salami, Malik Oyewale; Lee, Jou; Schneider, Jodi
(2023)
We used the following keywords files to identify categories for journals and conferences not in Scopus, for our STI 2023 paper "Assessing the agreement in retraction indexing across 4 multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science".
The first four text files each contain keywords/content words in the form: 'keyword1', 'keyword2', 'keyword3', .... The file title indicates the name of the category:
file1: healthscience_words.txt
file2: lifescience_words.txt
file3: physicalscience_words.txt
file4: socialscience_words.txt
The first four files were generated from a combination of software and manual review in an iterative process in which we:
- Manually reviewed venue titles that we were not able to categorize automatically using the Scopus categorization or by extending it as a resource.
- Iteratively reviewed uncategorized venue titles to manually curate additional keywords as content words indicating a venue title could be classified in the category healthscience, lifescience, physicalscience, or socialscience. We used English content words and added words we could automatically translate to identify content words. NOTE: Terms with multiple potential meanings, or containing non-English words that did not yield useful automatic translations (e.g., Al-Masāq), were not selected as content words.
The fifth text file is a list of stopwords in the form: 'stopword1', 'stopword2', 'stopword3', ...
file5: stopwords.txt
This file contains manually curated stopwords from venue titles to handle non-content words such as 'conference' and 'journal'.
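As an illustration of the file format described above, the following Python sketch parses a quoted, comma-separated list and uses it to suggest categories for a venue title. It is not the authors' code: the function names, the sample strings, and the single-word matching rule are all hypothetical, and the parser assumes straight single quotes with no embedded apostrophes.

```python
import re


def parse_quoted_list(text):
    """Return the items of a string shaped like: 'a', 'b', 'c'."""
    # Assumption: items use straight single quotes and contain no apostrophes.
    return [item.strip() for item in re.findall(r"'([^']*)'", text)]


def matching_categories(title, keywords_by_category, stopwords):
    """Return the sorted categories whose keywords occur among the title's content words."""
    tokens = re.findall(r"\w+", title.lower())
    stop = {word.lower() for word in stopwords}
    content_words = {token for token in tokens if token not in stop}
    return sorted(
        category
        for category, words in keywords_by_category.items()
        if content_words & {word.lower() for word in words}
    )


# Hypothetical stand-ins for the contents of the keyword and stopword files.
health_text = "'nursing', 'oncology', 'cardiology'"
social_text = "'sociology', 'education'"
stop_text = "'journal', 'conference', 'of', 'the'"

keywords = {
    "healthscience": parse_quoted_list(health_text),
    "socialscience": parse_quoted_list(social_text),
}
stopwords = parse_quoted_list(stop_text)

result = matching_categories("Journal of Oncology Nursing", keywords, stopwords)
print(result)
```

In practice the strings would be read from files such as healthscience_words.txt and stopwords.txt. Matching whole single-word tokens keeps the sketch simple, but it would miss any multi-word keywords.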
This dataset is a revision of the following dataset:
Version 1: Lee, Jou; Schneider, Jodi: Keywords for manual field assignment for Assessing the agreement in retraction indexing across 4 multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science. University of Illinois at Urbana-Champaign Data Bank.
Changes from Version 1 to Version 2:
- Added one author
- Added a stopwords file that was used in our data preprocessing.
- Thoroughly reviewed each of the 4 keywords lists. In particular, we added UTF-8 terminology, removed some non-content words and misclassified content words, and extensively reviewed non-English keywords.
keywords:
health science keywords; scientometrics; stopwords; field; keywords; life science keywords; physical science keywords; science of science; social science keywords; meta-science; RISRS
published:
2021-08-04
Sabrina, Sadia; Lewis, Quinn; Rhoads, Bruce
(2021)
This dataset contains data derived from large-scale particle velocimetry measurements obtained at the confluence of the Saline Branch and an unnamed tributary in Illinois. The data were collected using two cameras positioned about the confluence, one mounted on a cable and the other mounted on a tripod. A description of the content of the files can be found in Description of Files.rtf.
keywords:
confluence; hydrodynamics; LSPIV; flow structure; stagnation
published:
2021-03-23
Zhao, Yifan; Sharif, Hashim; Adve, Vikram; Misailovic, Sasa
(2021)
DNN weights used in the evaluation of the ApproxTuner system. Link to paper: https://dl.acm.org/doi/10.1145/3437801.3446108
published:
2022-07-10
Winogradoff, David; Chou, Han-Yi; Maffeo, Christopher; Aksimentiev, Aleksei
(2022)
keywords:
Nuclear pore complex; system files; trajectory files
published:
2022-07-25
A set of chemical entity mentions derived from an NERC dataset analyzing 900 synthetic biology articles published by the ACS. This data is associated with the Synthetic Biology Knowledge System repository (https://web.synbioks.org/). The data in this dataset are raw mentions from the NERC data.
keywords:
synthetic biology; NERC data; chemical mentions
published:
2025-06-03
Han, Jaeyeong; Ficca, Alyson; Lanzatella, Marissa; Leang, Kanika; Barnum, Matthew; Boudreaux, Jonathan; Schroeder, Nathan
(2025)
This dataset comprises image files used in the analysis for "Analysis of Nematode Ventral Nerve Cords Suggests Multiple Instances of Evolutionary Addition and Loss of Neurons" by Han et al. (bioRxiv, 2025; doi: https://doi.org/10.1101/2025.03.20.644414). It is separated into two folders. The first comprises data using DAPI staining to quantify the number of VNC nuclei in diverse nematodes. The second includes dye-filling data of Mononchus aquaticus.
keywords:
C. elegans; Mononchus; neuroanatomy; nematode nervous system; ventral nerve cord; secondary simplification