Illinois Data Bank Dataset Search Results
Results
published:
2022-02-20
Proescholdt, Randi; Hsiao, Tzu-Kun; Schneider, Jodi; Cohen, Aaron; McDonagh, Marian; Smalheiser, Neil
(2022)
This dataset contains the files used to perform the work savings and recall evaluation in the study titled "Data from Testing a filtering strategy for systematic reviews: Evaluating work savings and recall."
keywords:
systematic reviews; machine learning; work savings; recall; search results filtering
published:
2022-01-14
This dataset provides a 50-state (and DC) survey of state-level Opportunity Zones laws, including summaries of states' Opportunity Zone tax preferences, supplemental tax preferences, and approach to Opportunity Zones conformity. Data was last updated on January 14, 2022.
keywords:
Opportunity Zones; tax incentives; state law
published:
2024-12-01
Bishop, Rebecca C.; Kemper, Ann M.; Clark, Lindsay V.; Wilkins, Pamela A.; McCoy, Annette M.
(2024)
Healthy mares were kept at pasture for 3 weeks, stabled for 5 weeks, and then returned to pasture, with a final sample collected 6 weeks later. Samples were collected weekly: gastric fluid by double-tube nasogastric intubation and aspiration, feces by rectal palpation. Microbial DNA was isolated using the QIAamp PowerFecal Pro DNA kit. Full-length 16S, ITS and partial 23S rRNA gene libraries were created using the Shoreline Complete ID kit.
published:
2024-10-11
Zinnen, Jack; Barak, Rebecca; Matthews, Jeffrey
(2024)
This is the core data for "Influence of ecological characteristics and phylogeny on native plant species’ commercial availability," a manuscript pending publication in Ecological Applications. The data concern the ecological characteristics, phenology, and phylogeny of plant species native to the Midwestern United States and how those factors relate to commercial availability.
keywords:
biodiversity; native plant nursery; plant trade; plant vendors; restoration
published:
2021-05-09
Zuckermann, Federico
(2021)
Raw data and its analysis collected from a trial designed to test the impact of providing a Bacillus-based direct-fed microbial (DFM) on the syndrome resulting from orally infecting pigs with either Salmonella enterica serotype Choleraesuis (S. Choleraesuis) alone, or in combination with an intranasal challenge, three days later, with porcine reproductive and respiratory syndrome virus (PRRSV).
keywords:
excel file
published:
2025-11-14
Asadian, Marisa; Croslow, Seth; Trinklein, Timothy; Rubakhin, Stanislav; Lam, Fan; Sweedler, Jonathan
(2025)
We developed a sequential single-cell matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) workflow that enables endogenous lipid profiling in the first step, followed by cell-type classification of the same cells via immunocytochemistry in the second step. This stepwise approach integrates high-throughput single-cell analysis enabled by microMS with multiplex immunolabeling using photocleavable mass tags (PCMTs), which are antibodies conjugated to peptide mass reporters that are photoreleased and then detected by MALDI-MS. This platform combines the strengths of untargeted chemical profiling with targeted marker-based cell identification, allowing characterization of the cells’ endogenous metabolic activity, followed by cell classification using well-established immunomarkers. Here, we provide the raw data, mzML-converted files, and LC-MS/MS data from rodent hippocampal cells as described in the manuscript.
keywords:
Single Cell Mass Spectrometry; MALDI; Hippocampal Cells; Lipidomics; Photocleavable Mass-tags
published:
2025-07-11
Xiang, Jingyi; Dinkel, Holly
(2025)
The MultiDLO data release supports the paper, "MultiDLO: Simultaneous Shape Tracking of Multiple Deformable Linear Objects with Global-Local Topology Preservation," presented at the IEEE International Conference on Robotics and Automation Workshop on Representing and Manipulating Deformable Objects in May 2023. The data release includes the raw image and depth data for simultaneously tracking multiple Deformable Linear Objects (DLOs). The released data are Robot Operating System (ROS1) bag files containing raw color images and point clouds. The data were collected using a static Intel RealSense D435 RGB-D camera while DLOs in the field of view of the camera were manipulated. The data can be used to benchmark the performance of future DLO tracking or prediction algorithms in two manipulation scenarios relevant to DLOs and to verify existing DLO tracking algorithms. Please see the accompanying extended abstract, the code repository on GitHub, and the conference presentation video referenced in the `multidlo_data_release.pdf` document for more information.
keywords:
rosbag; perception for grasping and manipulation; RGBD perception; visual tracking; deformable linear objects; robotic manipulation
published:
2021-11-16
Prada, Cecilia M.; Turner, Benjamin L.; Dalling, James W.
(2021)
Data from a field experiment at El Velo, Chiriqui, Republic of Panama. The data contain information about functional traits of seedlings growing in different treatments, including type of forest, nitrogen addition, and organic matter.
keywords:
Mycorrhiza; nitrogen; oak forest; Panama; plant-soil feedbacks; seedling growth
published:
2022-02-04
Addepalli, Amulya; Ann Subin, Karen; Schneider, Jodi
(2022)
keywords:
retracted papers; knowledge maintenance; keystone citations; Wakefield; misinformation in science; Information Quality Lab
published:
2022-02-09
Kansara, Yogeshwar; Hoang, Khanh Linh
(2022)
The data file contains a list of articles and their RCT Tagger prediction scores, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".
keywords:
Cochrane reviews; automation; randomized controlled trial; RCT; systematic reviews
published:
2019-10-19
Corey, Ryan M.; Skarha, Matthew D.; Singer, Andrew C.
(2019)
Large, distributed microphone arrays could offer dramatic advantages for audio source separation, spatial audio capture, and human and machine listening applications. This dataset contains acoustic measurements and speech recordings from 10 loudspeakers and 160 microphones spread throughout a large, reverberant conference room.
The distributed microphone system contains two types of array: four wearable microphone arrays of 16 sensors each placed near the ears and across the upper body, and twelve tabletop arrays of 8 microphones each in enclosures designed to resemble voice-assistant speakers. The dataset includes recordings of chirps that can be used to measure impulse responses and of speech clips derived from the CSTR VCTK corpus. The speech clips are recorded both individually and as a mixture to support source separation experiments.
The uncompressed files are about 13.4 GB.
keywords:
microphone arrays; audio source separation; augmented listening; wireless sensor networks
published:
2020-10-28
Curtis, Amanda; Tiemann, Jeremy; Douglass, Sarah; Davis, Mark; Larson, Eric
(2020)
We examined the role of stream flow on environmental DNA (eDNA) concentrations and detectability of an invasive clam (Corbicula fluminea), while also accounting for other abiotic and biotic variables. These data include the eDNA concentrations, quadrat estimates of clam density, and abiotic variables.
keywords:
Corbicula; detection probability; eDNA; invasive species; lotic; occupancy modeling
published:
2023-09-19
Salami, Malik Oyewale; Lee, Jou; Schneider, Jodi
(2023)
We used the following keywords files to identify categories for journals and conferences not in Scopus, for our STI 2023 paper "Assessing the agreement in retraction indexing across 4 multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science".
The first four text files each contain keywords/content words in the form: 'keyword1', 'keyword2', 'keyword3', .... The file title indicates the name of the category:
file1: healthscience_words.txt
file2: lifescience_words.txt
file3: physicalscience_words.txt
file4: socialscience_words.txt
The first four files were generated from a combination of software and manual review in an iterative process in which we:
- Manually reviewed venue titles that we were not able to automatically categorize using the Scopus categorization or extending it as a resource.
- Iteratively reviewed uncategorized venue titles to manually curate additional keywords as content words indicating a venue title could be classified in the category healthscience, lifescience, physicalscience, or socialscience. We used English content words and added words we could automatically translate to identify content words. NOTE: Terminology with multiple potential meanings, or containing non-English words that did not yield useful automatic translations (e.g., Al-Masāq), was not selected as content words.
The fifth text file is a list of stopwords in the form: 'stopword1', 'stopword2', 'stopword3', ...
file5: stopwords.txt
This file contains manually curated stopwords from venue titles to handle non-content words such as 'conference' and 'journal'.
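The file format described above (comma-separated terms in single quotes) can be read with a few lines of Python. The following is a minimal sketch, not the authors' code: the function names `parse_quoted_terms` and `categorize` are hypothetical, the parser assumes no term contains an embedded single quote, and the word-count matching rule is only one plausible way to apply the keyword and stopword lists to a venue title.

```python
import re


def parse_quoted_terms(text):
    """Parse a string such as "'keyword1', 'keyword2'" into a list of terms.

    Assumption: no term contains an embedded single quote.
    """
    return [t.strip().lower() for t in re.findall(r"'([^']*)'", text) if t.strip()]


def categorize(title, keywords_by_category, stopwords):
    """Return the category whose keywords match the most words in the title.

    Hypothetical matching rule: lowercase the title, drop stopwords, and count
    single-word matches per category. Returns None when nothing matches.
    """
    words = [w for w in re.findall(r"\w+", title.lower()) if w not in stopwords]
    hits = {
        category: sum(word in keywords for word in words)
        for category, keywords in keywords_by_category.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


# Toy stand-ins for the contents of healthscience_words.txt, socialscience_words.txt,
# and stopwords.txt; the real files would be read with open(..., encoding="utf-8").
keywords_by_category = {
    "healthscience": set(parse_quoted_terms("'nursing', 'cardiology'")),
    "socialscience": set(parse_quoted_terms("'sociology', 'education'")),
}
stopwords = set(parse_quoted_terms("'journal', 'conference', 'of'"))

print(categorize("Journal of Cardiology Nursing", keywords_by_category, stopwords))
```

Multi-word keywords would need phrase matching rather than the per-word lookup used here.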
This dataset is a revision of the following dataset:
Version 1: Lee, Jou; Schneider, Jodi: Keywords for manual field assignment for Assessing the agreement in retraction indexing across 4 multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science. University of Illinois at Urbana-Champaign Data Bank.
Changes from Version 1 to Version 2:
- Added one author
- Added a stopwords file that was used in our data preprocessing.
- Thoroughly reviewed each of the 4 keywords lists. In particular, we added UTF-8 terminology, removed some non-content words and misclassified content words, and extensively reviewed non-English keywords.
keywords:
health science keywords; scientometrics; stopwords; field; keywords; life science keywords; physical science keywords; science of science; social science keywords; meta-science; RISRS
published:
2021-08-04
Sabrina, Sadia; Lewis, Quinn; Rhoads, Bruce
(2021)
This dataset contains data derived from large-scale particle velocimetry measurements obtained at the confluence of the Saline Branch and an unnamed tributary in Illinois. The data were collected using two cameras positioned about the confluence, one mounted on a cable and the other mounted on a tripod. A description of the content of the files can be found in Description of Files.rtf.
keywords:
confluence; hydrodynamics; LSPIV; flow structure; stagnation
published:
2021-03-23
Zhao, Yifan; Sharif, Hashim; Adve, Vikram; Misailovic, Sasa
(2021)
DNN weights used in the evaluation of the ApproxTuner system. Link to paper: https://dl.acm.org/doi/10.1145/3437801.3446108
published:
2022-07-10
Winogradoff, David; Chou, Han-Yi; Maffeo, Christopher; Aksimentiev, Aleksei
(2022)
keywords:
Nuclear pore complex; system files; trajectory files
published:
2022-07-25
A set of chemical entity mentions derived from an NERC dataset analyzing 900 synthetic biology articles published by the ACS. This data is associated with the Synthetic Biology Knowledge System repository (https://web.synbioks.org/). The data in this dataset are raw mentions from the NERC data.
keywords:
synthetic biology; NERC data; chemical mentions
published:
2025-06-03
Han, Jaeyeong; Ficca, Alyson; Lanzatella, Marissa; Leang, Kanika; Barnum, Matthew; Boudreaux, Jonathan; Schroeder, Nathan
(2025)
This data comprises image files used in the analysis for "Analysis of Nematode Ventral Nerve Cords Suggests Multiple Instances of Evolutionary Addition and Loss of Neurons" by Han et al. (bioRxiv, 2025; doi: https://doi.org/10.1101/2025.03.20.644414). It is separated into two folders. The first comprises data using DAPI staining to quantify the number of VNC nuclei in diverse nematodes. The second includes dye-filling data of Mononchus aquaticus.
keywords:
C. elegans; Mononchus; neuroanatomy; nematode nervous system; ventral nerve cord; secondary simplification
published:
2025-08-21
Lu, Yi; Sweedler, Jonathan; Zhou, Shuaizhen; Zhou, Yu
(2025)
Engineering efficient biocatalysts is essential for metabolic engineering to produce valuable bioproducts from renewable resources. However, due to the complexity of cellular metabolic networks, it is challenging to translate success in vitro into high performance in cells. To meet this challenge, an accurate and efficient quantification method is necessary to screen a large set of mutants from complex cell culture, and a careful correlation between the catalysis parameters in vitro and performance in cells is required. In this study, we employed a mass spectrometry-based high-throughput quantitative method to screen new mutants of 2-pyrone synthase (2PS) for triacetic acid lactone (TAL) biosynthesis through directed evolution in E. coli. From the process, we discovered two mutants with the highest improvement (46-fold) in titer and the fastest kcat (44-fold) over the wild-type 2PS, respectively, among those reported in the literature. A careful examination of the correlation between intracellular substrate concentration, Michaelis-Menten parameters, and TAL titer for these two mutants reveals that a fast reaction rate under limiting intracellular substrate concentrations is important for in-cell biocatalysis. Such properties can be tuned by protein engineering and synthetic biology to adapt these engineered proteins for maximum activity in different intracellular environments.
keywords:
catalysis; mass spectrometry; metabolic engineering
published:
2020-04-06
McCoy, Annette; Lopp, Christine; Kooy, Sarah; Migliorisi, Alessandro; Austin, Scott; Wilkins, Pamela
(2020)
Raw measurement data for umbilical remnants (umbilical vein, umbilical arteries and urachus) in support of Equine Veterinary Journal publication "Normal Regression of the Internal Umbilical Remnant Structures in Standardbred Foals."
keywords:
equine; umbilicus; ultrasound
published:
2025-03-13
ALMA Band 4 and 7 observations of the dust continuum in the Class 0 protostellar system L1448 IRS3B. We include the selfcal script, imaging scripts, fits files, and the python scripts for the figures in the paper.
keywords:
ALMA; Band 4; Band 6; polarization; L1448 IRS3B
published:
2025-04-29
Bose, Anish; Schuster, Keaton; Sonam, Surabhi; Kodali, Chandril; Smith-Bolton, Rachel
(2025)
This page contains the data for the publication "The pioneer transcription factor Zelda controls the exit from regeneration and restoration of patterning in Drosophila" published in the journal Science Advances.
keywords:
Drosophila; regeneration; wing imaginal disc; Zelda
published:
2022-04-19
Nowak, Romana; Yang, Shuhong; Li, Kailiang; Bi, Jiajia; Drnevich, Jenny
(2022)
List of differentially expressed genes in human endometrial stromal cells with knockdown of Basigin (BSG) gene expression during decidualization.
The BSG siRNA or negative scrambled control siRNA were transfected into human endometrial stromal cells (HESCs) following the protocol of siLentFect™ Lipid (Bio-Rad, Hercules, CA). Following complete knockdown of BSG in HESCs (72 hours after adding siRNA), HESCs were treated with medium containing estrogen, progesterone and cAMP to induce decidualization. BSG siRNA and negative control scrambled siRNA were added to the cells every four days (day 0, 4) over the course of the decidualization protocol. Total RNA was harvested at day 6 of the decidualization protocol for microarray analysis. Microarray analysis was performed at the University of Illinois at Urbana-Champaign Roy J. Carver Biotechnology Center. Briefly, 0.2 micrograms of total RNA were labeled using the Agilent two-color QuickAmp labeling kit (Agilent Technologies, Santa Clara, CA) according to the manufacturer’s protocol. The optional spike-in controls were not used. Samples were hybridized to Human Gene Expression 4x44K v2 Microarray (Agilent Technologies, Santa Clara, CA) in an Agilent Hybridization Cassette according to standard protocols. The arrays were then scanned on an Axon GenePix 4000B scanner and the images were quantified using Axon GenePix 6.1.
Microarray data pre-processing and statistical analyses were done in R (v3.6.2) using the limma package (v3.42.0; Ritchie et al., 2015). Median foreground and median background values from the 4 arrays were read into R and any spots that had been manually flagged (-100 values) were given a weight of zero. The background values were ignored because investigations showed that trying to use them to adjust for background fluorescence added more noise to the data; background was low and even for all arrays, therefore no background correction was done.
The individual Cy5 and Cy3 fluorescence for each array were normalized together using the quantile method 3 (Yang and Thorne, 2003). Agilent's Human Gene Expression 4x44K v2 Microarray has a total of 45,220 probes: 1224 probes for positive controls, 153 negative control, 823 labeled “ignore” and 43,118 labeled “cDNA”. The pos+neg+ignore probes were used to ascertain the background level of fluorescence (6, on the log2 scale) then discarded. The cDNA probes comprise 34,127 unique 60mer probes, of which 999 probes are spotted 10 times each and the rest one time each. We averaged the replicate probes for those spotted 10 times and then fit a mixed model that had treatment and dye as fixed effects and array pairing as a random effect (Phipson et al., 2016; Smyth et al., 2005). After fitting the model but before False Discovery Rate (FDR) correction (Benjamini and Hochberg, 1995), probes were filtered out by the following criteria: 1) did not have at least 4/8 samples with expression values > 6 (14,105 probes removed), 2) no longer had an assigned Entrez Gene ID in Bioconductor’s HsAgilentDesign026652.db annotation package (v3.2.3; 2,152 probes removed) (Huber et al., 2015), 3) mapped to the same Entrez Gene ID as another probe but had a larger p-value for treatment effect (4,141 probes removed). This left 13,729 probes representing 13,729 unique genes.
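The filtering steps above can be illustrated with a short sketch. The original analysis was done in R with limma; the Python below is not that code. It is a hypothetical illustration of criterion 1 (at least 4 of 8 samples with expression above 6) and criterion 3 (keep the probe with the smallest p-value per Entrez Gene ID), plus a bookkeeping check that the reported removal counts leave the reported number of probes. The function names and the tuple layout are assumptions made for illustration.

```python
def passes_expression_filter(values, threshold=6.0, min_samples=4):
    """Criterion 1: keep a probe only if enough samples exceed the threshold."""
    return sum(v > threshold for v in values) >= min_samples


def best_probe_per_gene(probes):
    """Criterion 3: among probes sharing a gene ID, keep the smallest p-value.

    `probes` is an iterable of (probe_id, gene_id, p_value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    best = {}
    for probe_id, gene_id, p_value in probes:
        if gene_id not in best or p_value < best[gene_id][1]:
            best[gene_id] = (probe_id, p_value)
    return {gene_id: probe_id for gene_id, (probe_id, _) in best.items()}


# Bookkeeping check using the counts reported in the description.
UNIQUE_PROBES = 34_127
REMOVED = {
    "low_expression": 14_105,   # criterion 1
    "no_entrez_id": 2_152,      # criterion 2
    "duplicate_gene": 4_141,    # criterion 3
}
REMAINING = UNIQUE_PROBES - sum(REMOVED.values())
print(REMAINING)  # 13729
```

The computed remainder agrees with the 13,729 probes stated in the description.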
*Please note that there is a discrepancy between the file and the readme, as this plain text is the actual data file of this dataset.
keywords:
Basigin; endometrium; decidualization; human
published:
2025-03-19
Bieri, Carolina A.; Dominguez, Francina; Miguez-Macho, Gonzalo; Fan, Ying
(2025)
This repository includes HRLDAS Noah-MP model output generated as part of Bieri et al. (2025) - Implementing deep soil and dynamic root uptake in Noah-MP (v4.5): Impact on Amazon dry-season transpiration.
These data are distributed in two different formats: Raw model output files and subsetted files that include data for a specific variable. All files are .nc format (NetCDF) and aggregated into .tar files to facilitate download. Given the size of these datasets, Globus transfer is the best way to download them.
Raw model output for four model experiments is available: FD (control), GW, SOIL, and ROOT. See the associated publication for information on the different experiments. These data span an approximately 20-year period from 01 Jun 2000 to 31 Dec 2019. The data have a spatial resolution of 4 km and a temporal frequency of 3 hours. These data are for a domain in the southern Amazon basin (see Figure 1 in the associated publication). Data for each experiment is available as a .tar file which includes 3-hourly NetCDF files. All default Noah-MP output variables are included in each file. As a result, the .tar files are quite large and may take many hours or even days to transfer depending on your network speed and local configurations. These files are named 'noahmp_output_2000_2019_EXP.tar', where EXP is the name of the experiment (FD, GW, SOIL, or ROOT).
Subsetted model output at a daily temporal resolution for all four model experiments is also available. These .tar files include the following variables: water table depth (ZWT), latent heat flux (LH), sensible heat flux (HFX), soil moisture (SOIL_M), canopy evaporation (ECAN), ground evaporation (EDIR), transpiration (ETRAN), rainfall rate at the surface (QRAIN), and two variables that are specific to the ROOT experiment: ROOTACTIVITY (root activity function) and GWRD (active root water uptake depth). There is one file for each variable within the tarred files. These files are named 'noahmp_output_subset_2000_2019_EXP.tar', where EXP is the name of the experiment (FD, GW, SOIL, or ROOT).
Finally, there is a sample dataset with raw 3-hourly output from the ROOT experiment for one day. The purpose of this sample dataset is to allow users to confirm if these data meet their needs before initiating a full transfer via Globus. This file is named 'noahmp_output_sample_ROOT.tar'.
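The naming scheme described above can be expressed as a small helper, which may be convenient when scripting transfers. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and `list_netcdf_members` assumes the archive members carry a `.nc` extension, as the description states that all files are in NetCDF format.

```python
import tarfile

EXPERIMENTS = ("FD", "GW", "SOIL", "ROOT")
SAMPLE_ARCHIVE = "noahmp_output_sample_ROOT.tar"


def _check_experiment(exp):
    """Reject experiment names that are not listed in the description."""
    if exp not in EXPERIMENTS:
        raise ValueError(f"unknown experiment {exp!r}; expected one of {EXPERIMENTS}")


def raw_archive(exp):
    """Name of the archive holding raw 3-hourly output for one experiment."""
    _check_experiment(exp)
    return f"noahmp_output_2000_2019_{exp}.tar"


def subset_archive(exp):
    """Name of the archive holding daily, per-variable output for one experiment."""
    _check_experiment(exp)
    return f"noahmp_output_subset_2000_2019_{exp}.tar"


def list_netcdf_members(tar_path):
    """List the .nc members of a downloaded archive without extracting it."""
    with tarfile.open(tar_path) as archive:
        return [name for name in archive.getnames() if name.endswith(".nc")]


for exp in EXPERIMENTS:
    print(raw_archive(exp), subset_archive(exp))
```

Calling `list_netcdf_members(SAMPLE_ARCHIVE)` on the downloaded sample archive is a quick way to inspect its contents before starting a full transfer.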
The README.txt file provides information on the Noah-MP output variables in these datasets, among other specifications.
Information on HRLDAS Noah-MP and names/definitions of model output variables that are useful in working with these data are available here: http://dx.doi.org/10.5065/ew8g-yr95. Note that some output variables may be listed in this document under a different variable name, so searching for the long name (e.g. 'baseflow' instead of 'QRF') is recommended.
Information on additional output variables that were added to the model as part of this study is available here: https://github.com/bieri2/bieri-et-al-2025-EGU-GMD/tree/DynaRoot.
Model code, configuration files, and forcing data used to carry out the model simulations are linked in the related resources section.
keywords:
Land surface model; NetCDF
published:
2025-04-26
Alvarez, Jennifer; Fraterrigo, Jennifer; Dalling, James; Edgington, John
(2025)
Historical census data collected at Trelease Woods from 1986 to 2004 with information on tree species, diameter at breast height (DBH), and plot location.
keywords:
old-growth; temperate forest; species composition; forest dynamics; historical data