Illinois Data Bank Dataset Search Results
published:
2026-02-25
Bayer, Hugo; Binette, Annalise; Sweck, Samantha; Juliano, Vitor; Plas, Samantha; Ferst, Lara; Hassell Jr, James; Maren, Stephen
(2026)
Raw data from the article "Locus Coeruleus-Amygdala Circuit Disrupts Prefrontal Control to Impair Fear Extinction", which has been accepted for publication in PNAS.
keywords:
Basolateral Amygdala; Fear conditioning; Infralimbic cortex; Learning and Memory; Norepinephrine
published:
2026-02-10
Ejiogu, Emmanuel; Peters, Baron
(2026)
This dataset contains the Jupyter notebook and Microsoft Excel data used to reproduce the results from the eponymous paper.
1. "pourahmady data.xlsx" contains NMR data for triad and dyad sequences in a PVC/Polyethylene copolymer.
V is a vinyl chloride segment (-CH2CHCl-) and E is an ethylene segment (-CH2CH2-)
VE is the dyad -CH2CHCl-CH2CH2-
VC_frac_1 = fraction of vinyl chloride segments obtained from 13C-NMR
VC_frac_2 = fraction of vinyl chloride segments obtained from elemental analysis
2. "Triad_Kinetics.ipynb" contains code that fits the data from "pourahmady data.xlsx"
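The V/E sequence notation above implies a simple bookkeeping relation between dyad fractions and overall composition. The sketch below is a generic illustration only: it assumes the dyad fractions are normalized to 1 and that the VE fraction lumps together the VE and EV orientations. The function name and the example values are hypothetical; they are not taken from the spreadsheet or from the notebook.

```python
def vc_fraction_from_dyads(vv, ve, ee, tol=1e-6):
    """Return the vinyl chloride (V) segment fraction implied by dyad fractions.

    Assumption: `ve` counts both VE and EV dyads. A VV dyad holds two V
    segments out of two, a VE dyad holds one out of two, and an EE dyad
    holds none, so the V fraction is (2*vv + ve) / 2 = vv + ve / 2.
    """
    total = vv + ve + ee
    if abs(total - 1.0) > tol:
        raise ValueError(f"dyad fractions should sum to 1, got {total}")
    return vv + 0.5 * ve


# Illustrative (made-up) dyad fractions, not values from the dataset.
example_fraction = vc_fraction_from_dyads(vv=0.5, ve=0.3, ee=0.2)
print(f"Implied V fraction: {example_fraction:.2f}")
```

A value computed this way could be compared against VC_frac_1 or VC_frac_2 as a consistency check, provided the spreadsheet reports dyads under the same convention.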
published:
2024-12-11
MMAudio pretrained models. These models can be used in the open-sourced codebase https://github.com/hkchengrex/MMAudio
Note: mmaudio_large_44k_v2.pth and Readme.txt were added in this V2. The other 4 files are unchanged.
published:
2022-03-25
Kudeki, Erhan; Reyes, Pablo
(2022)
Ground-based radar data sets collected during the 2013 NASA EVEX Campaign, conducted on Roi-Namur island of Kwajalein Atoll in the Republic of the Marshall Islands, are deposited in this databank. Radar data were collected with IRIS VHF and ALTAIR VHF/UHF systems.
published:
2026-02-20
Emran, Shah-Al; Petersen, Bryan M; Roney, Heather Elizabeth; Masters, Michael David; Varela, Sebastian; Hedrick, Travis; Leakey, Andrew D.B.; VanLoocke, Andy; Heaton, Emily A.
(2026)
This dataset contains biomass yield measurements and associated vegetation index data collected from commercial Miscanthus × giganteus fields in eastern Iowa during the 2022–2023 growing seasons.
The data support the analyses presented in the article:
“Yield From Iowa's First Commercial Miscanthus Fields: Implications of Spatial Variability for Productivity and Sustainability Beyond Research Plots.”
We collected 105 ground-truth biomass samples from four mature commercial fields (>4 years old) covering 92.81 ha.
Samples were taken from 3 m² quadrats that were hand-harvested in alignment with commercial harvest timing. Stem biomass (excluding leaves) was weighed, moisture-corrected, and converted to dry-matter yield expressed in Mg DM ha⁻¹.
Sampling locations were selected to capture spatial variability visible in aerial imagery and were recorded using RTK GPS.
Each biomass observation was paired with vegetation indices derived from high-resolution PlanetScope satellite imagery (3 m resolution).
Images were acquired throughout the growing season, and indices were calculated to evaluate their ability to predict end-of-season biomass yield.
Statistical and machine learning approaches were used to identify key predictors, and a linear regression model based on end-of-July Green Normalized Difference Vegetation Index (GNDVI) was developed and evaluated.
This repository includes the data used in that modeling workflow. Management practices, economic data, full imagery time series, and additional methodological details are described in the associated publication and are not included here.
The dataset consists of three comma-separated value (CSV) files:
1. Combine_Groundtruth_Yield_VI_22_23.csv
This file contains ground-truth biomass yield measurements and associated key vegetation index values collected during the 2022 and 2023 growing seasons.
Rows: 105 observations
Columns:
Year — Year of observation (2022 or 2023)
Field — Field location identifier
Sample_number — Unique sample identifier
GNDVI_End_Jul — Green Normalized Difference Vegetation Index calculated at end of July
GNDVI_End_Aug — Green Normalized Difference Vegetation Index calculated at end of August
NDRE_End_Aug — Normalized Difference Red Edge index calculated at end of August
Biomass_Stem_Yield_MgDM/ha — Measured stem biomass yield (megagrams dry matter per hectare)
2. trainData_GNDVI.csv
This file contains the subset of observations used to train the predictive relationship between July GNDVI and biomass yield.
Rows: 76 observations
Columns:
Unnamed: 0 — Row index retained from the original data processing workflow
GNDVI_End_Jul — GNDVI at end of July
Stem_Yield_MgDM/ha — Observed stem biomass yield (Mg DM ha⁻¹)
3. testData_GNDVI.csv
This file contains the test dataset used to evaluate model performance.
Rows: 29 observations
Columns:
Unnamed: 0 — Row index retained from the original data processing workflow
GNDVI_End_Jul — GNDVI at end of July
Predicted_Yield_MgDM/ha — Model-predicted stem biomass yield (Mg DM ha⁻¹)
Observed_Yield_MgDM/ha — Measured stem biomass yield (Mg DM ha⁻¹)
keywords:
Potential yield, yield gap, in-field management, yield prediction, remote sensing, spatial variability, profitability, Miscanthus × giganteus, M×g
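The train/test file layout described above lends itself to a quick reproduction of the GNDVI-to-yield workflow. The following is a minimal sketch under stated assumptions, not the authors' code: it uses only the Python standard library, fits an ordinary least-squares line, and runs on made-up rows that merely mimic the documented column names. The helper names (`read_xy`, `fit_line`, `rmse`) are hypothetical, and the first CSV column is assumed to have an empty header (which a pandas export would report as "Unnamed: 0").

```python
import csv
import io
import math


def read_xy(handle, x_col, y_col):
    """Read two numeric columns from a CSV file handle."""
    xs, ys = [], []
    for row in csv.DictReader(handle):
        xs.append(float(row[x_col]))
        ys.append(float(row[y_col]))
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up rows for illustration only; they are NOT values from the dataset.
TRAIN_CSV = """,GNDVI_End_Jul,Stem_Yield_MgDM/ha
0,0.60,10.0
1,0.70,15.0
2,0.80,20.0
"""
TEST_CSV = """,GNDVI_End_Jul,Observed_Yield_MgDM/ha
0,0.65,13.0
1,0.75,17.0
"""

train_x, train_y = read_xy(io.StringIO(TRAIN_CSV), "GNDVI_End_Jul", "Stem_Yield_MgDM/ha")
slope, intercept = fit_line(train_x, train_y)

test_x, test_y = read_xy(io.StringIO(TEST_CSV), "GNDVI_End_Jul", "Observed_Yield_MgDM/ha")
predicted = [slope * x + intercept for x in test_x]
test_rmse = rmse(test_y, predicted)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, RMSE={test_rmse:.2f}")
```

To run the same steps on the deposited files, the `io.StringIO(...)` handles would be replaced with `open("trainData_GNDVI.csv", newline="")` and `open("testData_GNDVI.csv", newline="")`.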
published:
2026-02-19
Gurumoorthi, Akshay; Peters, Baron
(2026)
The dataset contains a Jupyter notebook with a simple Python script, intended for anyone who wants to apply the Empirical Bayes method described in the paper titled 'Data for Improving individual committor estimates and data efficiency in reaction coordinate tests with the Empirical Bayes method' to committor data.
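For readers unfamiliar with the general idea, the sketch below shows a generic beta-binomial Empirical Bayes shrinkage applied to committor-like estimates (fractions of trial trajectories that commit to the product state). It is an assumption-laden illustration, not the formulation used in the paper or the notebook: the Beta prior, the method-of-moments fit, the equal number of trials per configuration, and all function names are hypothetical choices made here.

```python
from statistics import mean, pvariance


def fit_beta_prior(successes, n_trials):
    """Method-of-moments Beta(alpha, beta) prior from k_i successes of n_trials each.

    Uses Var(k/n) = Var(p) + E[p(1-p)]/n to remove binomial sampling noise
    from the observed spread of the raw fractions.
    """
    fractions = [k / n_trials for k in successes]
    m = mean(fractions)
    v = pvariance(fractions)
    binom = m * (1.0 - m)
    excess = n_trials * v - binom
    if excess <= 0:
        raise ValueError("spread does not exceed binomial noise; prior not identifiable")
    strength = n_trials * (binom - v) / excess  # alpha + beta
    return m * strength, (1.0 - m) * strength


def posterior_mean(k, n_trials, alpha, beta):
    """Posterior mean of a binomial probability under a Beta(alpha, beta) prior."""
    return (k + alpha) / (n_trials + alpha + beta)


def shrink(successes, n_trials):
    """Shrink every raw estimate k/n toward the fitted population prior."""
    alpha, beta = fit_beta_prior(successes, n_trials)
    return [posterior_mean(k, n_trials, alpha, beta) for k in successes]


# Made-up counts: committed trajectories out of 10 trials per configuration.
counts = [0, 2, 5, 8, 10]
for k, estimate in zip(counts, shrink(counts, 10)):
    print(f"raw={k / 10:.2f}  shrunk={estimate:.3f}")
```

Each shrunk estimate lies between the raw fraction and the population mean, which is the basic mechanism by which pooling information can improve individual estimates from few trials.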
published:
2026-02-11
Hanley, David; Lee, Jongwon; Choi, Su Yeon; Bretl, Timothy
(2026)
If you use this dataset, please cite both the dataset and the associated data paper (bibtex is below).
@ARTICLE{11386847,
author={Hanley, David and Lee, Jongwon and Choi, Su Yeon and Bretl, Timothy},
journal={IEEE Transactions on Instrumentation and Measurement},
title={The MagPIE2 Dataset for Mapping, Localization, and Simultaneous Localization and Mapping Using Magnetic Fields},
year={2026},
volume={},
number={},
pages={1-1},
keywords={Magnetometers;Magnetic field measurement;Magnetic fields;Pedestrians;Location awareness;Buildings;Simultaneous localization and mapping;Measurement errors;Hardware;Calibration;Localization;mapping;SLAM;dataset;benchmark;magnetometer;magnetic field},
doi={10.1109/TIM.2026.3662919}}
We present a dataset for the evaluation of magnetic field-based robotic and pedestrian localization, mapping, and SLAM methods. This dataset contains magnetometer and inertial measurement unit data collected inside three buildings by both a pedestrian and a ground robot. Data were collected at different heights simultaneously, both with and without changes in the placement of objects that may affect magnetometer measurements. In total, approximately 689 square meters of floor space was covered by this dataset.
This dataset is archivally stored. We provide a GitHub site that is meant to serve as a forum to post issues with the dataset, share code that uses the dataset, and resolve problems: https://github.com/hanley6/MagPIE2Forum
Note that while the dataset is meant to be permanently stored, this forum is not meant to guarantee perennial support and its existence will be dependent on the policies of GitHub.
How is the dataset organized? At a high level, the data is divided into the following parts; more detailed information can be found in the Readme:
1. The walking portion of the dataset: CSL_WLK.zip, DCL_WLK.zip, Talbot_WLK.zip, and WLK_Misc.zip.
2. The robot portion of the dataset: Robot_Dataset.zip.
3. Motor interference tests: Motor_Interference_Test.zip.
4. Ground truth evaluation: Ground_Truth_Evaluation.zip.
5. Quick start results: Quick_Start_Results.zip.
How is data recorded and stored? Data is generally collected in the form of ROS bag files. Each ROS bag has Intel RealSense camera images, magnetometer readings, IMU readings, timestamps, and more as applicable for each file in the dataset. Each bag file has an associated metadata file written in YAML. This contains general information about each bag file, including the start and stop time, who collected the bag file (during the pedestrian portion of the dataset), and the approximate location where data was collected. In several cases, additional comma-separated value (CSV) files were included, either as a convenient supplement to the ROS bag files (e.g., CSV files of magnetometer calibration data) or because they serve as human-readable quick start results.
How does one set up and run files on the dataset? The files are stored in ROS bags and are, therefore, meant to be run using the Robot Operating System. Information on how to use the Robot Operating System, as well as installation instructions, is available at: https://ros.org/
keywords:
Localization; mapping; SLAM; dataset; benchmark; magnetometer; magnetic field
published:
2025-12-23
Aly, Abdallah; A. Saif, M. Taher
(2025)
The uploaded data accompany the paper titled "Self-Modifying Percolation Governs Detachment in Soft Suction Wet Adhesion", which describes the detachment mechanism of liquid suction-based adhesion.
published:
2026-01-28
Nahid, Shahriar Muhammad; Dong, Haiyue; Nolan, Gillian; Nam, Sungwoo; Mason, Nadya; Huang, Pinshane; van der Zande, Arend
(2026)
Room-temperature transfer curves; Benchmarking conductance; STEM images of charged domain walls; Temperature-dependent transfer curves; Scaling of conductance, hopping length, threshold voltage, trap density, and field-effect mobility with temperature; Magnetotransport data; Optical, AFM, and PFM images of different field-effect transistors; STEM images of contacts; Output and transfer curves of FETs; Additional STEM images of charged domain walls; Temperature scaling of subthreshold swing and threshold voltage difference; Comparison of maximum field-effect mobility for different structures
published:
2025-10-29
Chen, Chu-Chun; Dominguez, Francina; Matus, Sean
(2025)
This dataset contains variables from the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5; Hersbach et al., 2020). These data were used for the analysis in “The impact of large-scale land surface conditions on the South American low-level jet” published in Geophysical Research Letters.
Acknowledgments:
This work was supported by NSF Award AGS-1852709. We thank Dr. Zhuo Wang and Dr. Divyansh Chug for their valuable feedback and insightful discussions.
References:
Hersbach H, Bell B, Berrisford P, et al. The ERA5 global reanalysis. Q J R Meteorol Soc. 2020; 146: 1999–2049. https://doi.org/10.1002/qj.3803
keywords:
atmospheric sciences; South American low-level jet; land-atmosphere interactions; soil moisture; regional atmospheric circulation; southeastern South America
published:
2026-01-14
Bansal, Prateek; Shukla, Diwakar
(2026)
This dataset contains the .npy and .pkl files required to reproduce the plots in the study.
keywords:
GPCR; activation; STE2; Class D; molecular dynamics
published:
2026-01-27
Trivellone, Valeria; Canuto, Francesca; Lucetti, Giulia; Dietrich, Christopher H.; Galetto, Luciana; Marzachì, Cristina
(2026)
Trivellone_etal_Full_PaperList_SystRev.xlsx: This dataset contains the list of peer-reviewed studies selected and critically appraised for a systematic review of quantitative PCR (qPCR) investigations tracking phytoplasma load dynamics in insect vectors. The dataset includes bibliographic information and selection status for each study, reflecting the inclusion and exclusion criteria applied during the review process. The literature search was completed on December 15, 2025. The inclusion and exclusion criteria are listed in the second spreadsheet.
Further methodological details, including search strategy, screening workflow, and appraisal criteria, are described in the associated paper, “Tracking the early spatio-temporal dynamics of phytoplasma multiplication within its leafhopper vector”, as well as in the Supplementary Materials (see below), by Valeria Trivellone, Francesca Canuto, Giulia Lucetti, Christopher H. Dietrich, Luciana Galetto, Cristina Marzachì.
keywords:
qPCR; systematic review; phytoplasma; multiplication; vector
published:
2025-05-07
Reves, Olivia; Larson, Eric
(2025)
Data collected at 71 study sites from 2023 to 2024 for Reves, Olivia P. (2025): Using Environmental DNA Metabarcoding to Inform Biodiversity Conservation in Agricultural Landscapes. Master's thesis, University of Illinois Urbana-Champaign. Files include study site information, taxa by site matrices for vertebrates from environmental DNA metabarcoding using multiple mitochondrial DNA primers (COI, 12S), and bird species audibly detected by a phone app at study sites.
keywords:
agricultural conservation; biodiversity; eDNA; environmental DNA; Illinois; metabarcoding; riparian buffers; stream flow; vertebrates
published:
2025-02-07
Wang, Binghui; Kudeki, Erhan
(2025)
Incoherent scatter radar datasets collected during the September 2016 campaign at Arecibo have been deposited in this databank. The lag products of the ISR data are stored as lag profile matrices with 5 minutes of integration time. The data is organized in a Python dictionary format, with each file containing 12 lag profile matrices representing one hour of observation. A sample Python script is provided to illustrate its usage.
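The stated layout (12 lag profile matrices per file, 5 minutes of integration each) means one file spans 12 × 5 = 60 minutes. The sketch below works out the time window of each matrix in a file. It assumes the matrices are contiguous and stored in time order, which the description does not state explicitly; the function name and the example start time are hypothetical, and the deposited sample script should be treated as the authoritative guide to the actual dictionary keys.

```python
from datetime import datetime, timedelta

INTEGRATION_MINUTES = 5  # integration time per lag profile matrix (from the description)
PROFILES_PER_FILE = 12   # lag profile matrices per file (from the description)


def integration_windows(file_start):
    """Return (start, end) datetimes for each lag profile matrix in one file.

    Assumes contiguous, time-ordered integration periods.
    """
    step = timedelta(minutes=INTEGRATION_MINUTES)
    return [
        (file_start + i * step, file_start + (i + 1) * step)
        for i in range(PROFILES_PER_FILE)
    ]


# Hypothetical start time; the real files carry their own timestamps.
windows = integration_windows(datetime(2016, 9, 1, 0, 0))
for index, (start, end) in enumerate(windows):
    print(index, start.strftime("%H:%M"), end.strftime("%H:%M"))
```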
published:
2025-12-18
Marshalla, Dan; Fraterrigo, Jennifer
(2025)
This dataset includes data from a study conducted in southern Illinois, USA, which was published in the Journal of Applied Ecology. The study investigated the interactive effects of fire history and invasion by the non-native grass Microstegium vimineum on fire intensity and oak regeneration in central hardwood forests. The dataset includes data on environmental conditions, historical fire occurrence, experimental fire intensity and fuel load, seedling and juvenile oak characteristics, Microstegium cover, and plot descriptions.
keywords:
Fire-grass-tree interactions; Historical fire regime; Invasive grasses; Microstegium vimineum; Post-fire oak survival; Prescribed fire
published:
2025-05-14
1228 hyperspectral images of eggs, covering wavelengths from 400 nm to 900 nm.
published:
2026-01-22
Edmonds, Devin; Du, Jane; Stickley, Samuel; Sucre, Samuel
(2026)
This dataset contains data and R scripts used to analyze the trade of non-native pet amphibians in the United States by integrating online classified advertisements with U.S. Fish and Wildlife Service import records. The data include records of amphibian advertisements, U.S. imports, taxonomic reference lists, and conservation status information. The dataset supports analyses identifying domestically produced species, species entering U.S. markets through unrecorded or unofficial trade pathways, and price differences associated with documented and undocumented trade. The dataset supports the analyses presented in an associated peer-reviewed publication in Biological Conservation.
keywords:
amphibian; biocommerce; biosecurity; conservation; LEMIS; pet trade; species laundering; wildlife trade
published:
2026-01-23
Kaman, Bobby; Lim, Jinho; Liu, Yingkai; Hoffmann, Axel
(2026)
Data related to the publication "Emulating 2D Materials with magnons", which is to be published and is also available as a preprint on arXiv (https://arxiv.org/abs/2601.03210).
It contains scripts for the simulation program Mumax3 and Python scripts for conversion and analysis.
keywords:
micromagnetics; mumax; tight-binding; spin waves; magnons
published:
2026-01-20
Willson, James; Warnow, Tandy
(2026)
Dataset from "CAMUS: Scalable Phylogenetic Network Estimation." This dataset contains simulated phylogenetic networks, gene trees, and sequence data.
- camus-dataset.tar.xz is the main archive containing all the simulated data. More details about the files and directories it contains can be found in README.md.
- scripts.zip contains various scripts used in the simulation study.
keywords:
evolution; computational biology; bioinformatics; phylogenetics
published:
2026-01-21
Suthers, Patrick; Maranas, Costas
(2026)
Growth-coupling product formation can facilitate strain stability by aligning industrial objectives with biological fitness. Organic acids make up many building block chemicals that can be produced from sugars obtainable from renewable biomass. Issatchenkia orientalis is a yeast strain tolerant to acidic conditions and is thus a promising host for industrial production of organic acids. Here, we use constraint-based methods to assess the potential of computationally designing growth-coupled production strains for I. orientalis that produce 22 different organic acids under aerobic or microaerobic conditions. We explore native and engineered pathways using glucose or xylose as the carbon substrates as proxy constituents of hydrolyzed biomass. We identified growth-coupled production strategies for 37 of the substrate-product pairs, with 15 pairs achieving production for any growth rate. We systematically assess the strain design solutions and categorize the underlying principles involved.
keywords:
Bioproducts; Modeling
published:
2025-09-18
Chen, Maosi; Parton, William J.; Hartman, Melannie D.; Del Grosso, Stephen J.; Smith, William K.; Knapp, Alan; Lutz, Susan; Derner, Justin; Tucker, Compton; Ojima, Dennis; Volesky, Jerry; Stephenson, Mitchell B.; Schacht, Walter H.; Gao, Wei
(2025)
Productivity throughout the North American Great Plains grasslands is generally considered to be water limited, with the strength of this limitation increasing as precipitation decreases. We hypothesize that cumulative actual evapotranspiration water loss (AET) from April to July is the precipitation‐related variable most correlated to aboveground net primary production (ANPP) in the U.S. Great Plains (GP). We tested this by evaluating the relationship of ANPP to AET, precipitation, and plant transpiration (Tr). We used multi‐year ANPP data from five sites ranging from semiarid grasslands in Colorado and Wyoming to mesic grasslands in Nebraska and Kansas, mean annual NRCS ANPP, and satellite‐derived normalized difference vegetation index (NDVI) data. Results from the five sites showed that cumulative April‐to‐July AET, precipitation, and Tr were well correlated (R²: 0.54–0.70) to annual changes in ANPP for all but the wettest site. AET and Tr were better correlated to annual changes in ANPP compared to precipitation for the drier sites, and precipitation in August and September had little impact on productivity in drier sites. April‐to‐July cumulative precipitation was best correlated (R² = 0.63) with interannual variability in ANPP in the most mesic site, while AET and Tr were poorly correlated with ANPP at this site. Cumulative growing season (May‐to‐September) NDVI (iNDVI) was strongly correlated with annual ANPP at the five sites (R² = 0.90). Using iNDVI as a surrogate for ANPP, we found that county‐level cumulative April–July AET was more strongly correlated to ANPP than precipitation for more than 80% of the GP counties, with precipitation tending to perform better in the eastern more mesic portion of the GP. Including the ratio of AET to potential evapotranspiration (PET) improved the correlation of AET to both iNDVI and mean county‐level NRCS ANPP.
Accounting for how different precipitation‐related variables control ANPP (AET in drier portion, precipitation in wetter portion) provides opportunity to develop spatially explicit forecasting of ANPP across the GP for enhancing decision‐making by land managers and use of grassland ANPP for biofuels.
keywords:
Sustainability; Field Data; Modeling
published:
2026-01-19
Fourkas, Austen; Looney, Leslie
(2026)
This dataset includes the FITS files for all ALMA images used in the ApJ publication "Multiband ALMA Polarization Observations of BHB 07-11 Reveal Aligned Dust Grains in Complex Spiral Arm Structures". Additionally, this dataset includes details regarding the data reduction process so that interested users can perform the reduction and imaging themselves.
keywords:
FITS files; ALMA data; reduction instructions
published:
2026-01-12
Yan, Qiang; Cordell, William; Jindra, Michael; Pfleger, Brian
(2026)
Microbial lipid metabolism is an attractive route for producing oleochemicals. The predominant strategy centers on heterologous thioesterases to synthesize desired chain-length fatty acids. To convert acids to oleochemicals (e.g., fatty alcohols, ketones), the narrowed fatty acid pool needs to be reactivated as coenzyme A thioesters at a cost of one ATP per reactivation, an expense that could be saved if the acyl-chain was directly transferred from ACP- to CoA-thioester. Here, we demonstrate such an alternative acyl-transferase strategy by heterologous expression of PhaG, an enzyme first identified in Pseudomonads, that transfers 3-hydroxy acyl-chains between acyl-carrier protein and coenzyme A thioester forms for creating polyhydroxyalkanoate monomers. We use it to create a pool of acyl-CoAs that can be redirected to oleochemical products. Through bioprospecting, mutagenesis, and metabolic engineering, we develop three strains of Escherichia coli capable of producing over 1 g/L of medium-chain free fatty acids, fatty alcohols, and methyl ketones.
keywords:
Bioproducts; Metabolomics
published:
2025-10-22
Yan, Qiang; Jacobson, Tyler B.; Ye, Zhou; Cortes-Peña, Yoel R.; Bhagwat, Sarang; Hubbard, Susan; Cordell, William T.; Oleniczak, Rebecca E.; Gambacorta, Francesca V.; Rivera-Vasquez, Julio; Shusta, Eric V.; Amador-Noguez, Daniel; Guest, Jeremy; Pfleger, Brian
(2025)
Plants produce many high-value oleochemical molecules. While oil-crop agriculture is performed at industrial scales, suitable land is not available to meet global oleochemical demand. Worse, establishing new oil-crop farms often comes with the environmental cost of tropical deforestation. The field of metabolic engineering offers tools to transplant oleochemical metabolism into tractable hosts while simultaneously providing access to molecules produced by non-agricultural plants. Here, we evaluate strategies for rewiring metabolism in the oleaginous yeast Yarrowia lipolytica to synthesize a foreign lipid, 3-acetyl-1,2-diacyl-sn-glycerol (acTAG). Oils made up of acTAG have a reduced viscosity and melting point relative to traditional triacylglycerol oils making them attractive as low-grade diesels, lubricants, and emulsifiers. This manuscript describes a metabolic engineering study that established acTAG production at g/L scale, exploration of the impact of lipid bodies on acTAG titer, and a techno-economic analysis that establishes the performance benchmarks required for microbial acTAG production to be economically feasible.
keywords:
Conversion; Sustainability; Biomass Analytics; Lipidomics; Metabolomics
published:
2025-11-20
Yan, Qiang; Cordell, William; Breckner, Christian; Chen, Xuanqi; Jindra, Michael; Pfleger, Brian
(2025)
Medium-chain length methyl ketones are potential blending fuels due to their cetane numbers and low melting temperatures. Biomanufacturing offers the potential to produce these molecules from renewable resources such as lignocellulosic biomass. In this work, we designed and tested metabolic pathways in Escherichia coli to specifically produce 2-heptanone, 2-nonanone and 2-undecanone. We achieved substantial production of each ketone by introducing chain-length specific acyl-ACP thioesterases, blocking the β-oxidation cycle at an advantageous reaction, and introducing active β-ketoacyl-CoA thioesterases. Using a bioprospecting approach, we identified 15 homologs of E. coli β-ketoacyl-CoA thioesterase (FadM) and evaluated the in vivo activity of each against various chain length substrates. The FadM variant from Providencia sneebia produced the most 2-heptanone, 2-nonanone, and 2-undecanone, suggesting it has the highest activity on the corresponding β-ketoacyl-CoA substrates. We tested enzyme variants, including acyl-CoA oxidases, thiolases, and bi-functional 3-hydroxyacyl-CoA dehydratases to maximize conversion of fatty acids to β-ketoacyl-CoAs for 2-heptanone, 2-nonanone, and 2-undecanone production. To address product loss during fermentation, we applied a 20% (v/v) dodecane layer in the bioreactor and built an external water-cooling condenser coupled to the bioreactor's heat-transfer condenser. Using these modifications, we were able to generate up to 4.4 g/L total medium-chain length methyl ketones.
keywords:
Metabolomics; Metabolic Engineering