Illinois Data Bank Dataset Search Results
Results
published:
2025-12-14
Fraterrigo, Jennifer; Chen, Weile
(2025)
This dataset contains information about absorptive roots from 170 plots along a latitudinal and temperature gradient in northern Alaska, including tussock sedges and deciduous alder, birch, and willow shrubs. This dataset accompanies the paper "Impacts of Arctic Shrubs on Root Traits and Belowground Nutrient Cycles Across a Northern Alaskan Climate Gradient," which was published in Frontiers in Plant Science.
keywords:
absorptive root traits; shrub expansion; Arctic; Alaskan tundra
published:
2025-12-10
Raghavan, Arjun; Bae, Seokjin; Delegan, Nazar; Heremans, F. Joseph; Madhavan, Vidya
(2025)
Data for 'Atomic-scale imaging and charge state manipulation of NV centers by scanning tunneling microscopy' to be published in Nature Communications.
keywords:
STM; scanning tunneling microscopy; nitrogen-vacancy; NV centers
published:
2025-12-09
Hsu, Felicity Ting-Yu; Smith-Bolton, Rachel
(2025)
This page contains the data for the publication "Myc and Tor drive growth and cell competition in the regeneration blastema of Drosophila wing imaginal discs" published in Development, 2025.
keywords:
Drosophila; regeneration; Myc; Tor; blastema; translation; cell competition
published:
2025-12-01
Park, Minhyuk; Yi, Haotian; Warnow, Tandy; Chacko, George
(2025)
This dataset principally consists of four synthetic citation networks that were generated during the preparation of the manuscript Park M, Yi H, Warnow T, and Chacko G (2025). Modeling the Global Citation Network using the Scalable Agent-based Simulator for Citation Analysis with Recency-emphasized Sampling (SASCA-ReS). A preprint is available on Zenodo (below) and the manuscript has been submitted to the MetaRoR platform for review and feedback.
@misc{park_2025_17789558,
author = {Park, Minhyuk and
Yi, Haotian and
Warnow, Tandy and
Chacko, George},
title = {Modeling the Global Citation Network using the
Scalable Agent-based Simulator for Citation
Analysis with Recency-emphasized Sampling
(SASCA-ReS)},
month = dec,
year = 2025,
publisher = {Zenodo},
doi = {10.5281/zenodo.17789558},
url = {https://doi.org/10.5281/zenodo.17789558},
}
The networks contain roughly 14, 76, 161, and 218 million nodes, respectively. Node lists with attributes and edge lists are provided as gzipped Parquet files, along with a copy of the configuration file that was passed to the SASCA-ReS software to generate each network. The software can be accessed at https://github.com/illinois-or-research-analytics/SASCA-ReS. For example: abm14_config.ini; abm14_edgelist.parquet.gz; and abm14_nodelist.parquet.gz. The column headers in the edge lists and node lists and the fields in the configuration file are explained in the GitHub repository for SASCA-ReS.
In addition, we provide sj_reccount, a table of real-world citation frequencies that is an input to the SASCA-ReS software. The first column (diff) of sj_reccount lists the difference between the publication year of a citing document and the publication year of a cited document. The second column (count) reports the frequency of such citations across the dataset of 77,879,427 observations, which is derived from the biomedical literature. Finally, we share data, composite_maverick_disruption.csv, from the mavericks (unconventional citing strategies) experiment reported in the Park et al. (2025) manuscript available at https://zenodo.org/records/17772113. The columns in the composite_maverick_disruption.csv file are:
node_id -> of agents in the various simulations
n_i, n_j, n_k -> terms used to compute disruption per Wu, L., Wang, D. & Evans, J.A. "Large teams develop and small teams disrupt science and technology." Nature 566, 378–382 (2019). https://doi.org/10.1038/s41586-019-0941-9
disruption -> the disruption metric of Wu, Wang, and Evans (2019)
type -> maverick type (maximizer, randomnik, or minimizer)
year -> virtual year in the simulation when the maverick was created
alpha -> the alpha parameter of the control agent
pa_weight -> the preferential attachment weight of the control agent phenotype
fit_peak_value -> the fitness value assigned to the control agent
in_degree -> the count of citations accumulated by the maverick or control agent at the end of the simulation
out_degree -> the count of references made by the maverick
tag -> a label for the experiment, e.g. od249_f1 indicates that the mavericks in this experiment made 249 citations and were assigned a fitness value of 1.
keywords:
synthetic networks; agent based models; SASCA-ReS; citation networks
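The disruption column in composite_maverick_disruption.csv can be cross-checked against the n_i, n_j, and n_k columns. The following is a minimal sketch, assuming the standard definition from Wu, Wang, and Evans (2019), D = (n_i - n_j) / (n_i + n_j + n_k). The sample rows are made-up values for illustration only, and whether the file follows exactly this convention should be confirmed against the SASCA-ReS repository.

```python
import csv
import io


def disruption(n_i: int, n_j: int, n_k: int) -> float:
    """Disruption index D = (n_i - n_j) / (n_i + n_j + n_k).

    n_i: later papers citing the focal paper but none of its references
    n_j: later papers citing both the focal paper and its references
    n_k: later papers citing the references but not the focal paper
    """
    total = n_i + n_j + n_k
    if total == 0:
        raise ValueError("disruption is undefined when n_i + n_j + n_k == 0")
    return (n_i - n_j) / total


# Hypothetical rows that reuse three of the documented column names.
SAMPLE_CSV = """node_id,n_i,n_j,n_k
1,10,5,5
2,0,4,0
"""

recomputed = {}
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    recomputed[row["node_id"]] = disruption(
        int(row["n_i"]), int(row["n_j"]), int(row["n_k"])
    )
```

Values near 1 indicate that later work cites the focal node without citing its references, and values near -1 indicate the opposite.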
published:
2025-12-09
Chase, Marissa H.; Fraterrigo, Jennifer M.; Charles, Brian; Harmon-Threatt, Alexandra
(2025)
The dataset includes bee community data from a study conducted in southern Illinois across three forested public land sites. Bee diversity and abundance data, as well as environmental variables, are included for each plot. Each plot was visited a total of four times.
keywords:
wild bees; forest management; resource availability
published:
2023-09-20
Chase, Marissa H.; Charles, Brian; Harmon-Threatt, Alexandra; Fraterrigo, Jennifer
(2023)
Dataset includes bee trait information and species abundance information for bees collected at 29 forest plots in southern Illinois, USA. Plots are located within three public land sites. Environmental data were also collected for each of the 29 plots.
keywords:
wild bees; forest management; functional traits
published:
2020-09-17
Refsland, Tyler; Knapp, Benjamin; Stephan, Kirsten; Fraterrigo, Jennifer
(2020)
Data are from a long-term fire manipulation experiment in the Missouri Ozarks, USA. Data include the raw, annual ring-width increment (rwl), basal area increment (BAI), population-level annual growth resistance (Drs) and resilience (Drl) to drought, intrinsic water use efficiency values (WUEi) and oxygen isotopic composition of individual radial growth rings (δ18O) from southern red oak (Quercus falcata) and post oak (Q. stellata) trees.
----------------------
TITLE:
Data for "Sixty-five years of fire manipulation reveals climate and fire interact to determine growth rates of Quercus spp."
----------------------
FILE OVERVIEW:
This dataset contains four (4) CSV files as described below:
Refsland_et_al_ECS20-0465_BAI.csv: annual basal area increment between 1948-2015 for trees across the fire manipulation experiment
Refsland_et_al_ECS20-0465_DroughtIndices.csv: population-level drought resistance and resilience of trees during each target drought period
Refsland_et_al_ECS20-0465_WUEi.csv: carbon isotope indicators of drought stress for trees across the fire manipulation experiment
Refsland_et_al_ECS20-0465_d18Or.csv: oxygen isotope indicators of drought stress for trees across the fire manipulation experiment
----------------------
VARIABLE EXPLANATION:
The variables in these four files are explained below:
treeID: unique character string that identifies subject tree
block: integer (1, 2) that identifies the study block
plot: integer (1-12) that identifies the plot nested within each study block
trt: character string (Annual, Control, Periodic) that identifies the fire treatment of a given plot
species: character string (Quercus falcata, Quercus stellata) that identifies species of subject tree
year: integer (1948-2015) that identifies the dated year of each tree ring
rwl_mm: numerical value representing the annual tree ring-width, in mm
bai_cm2: numerical value representing the annual basal area increment, in cm2
timeperiod: integer value (1953, 1964, 2007, 2012) representing the periods encompassing target dry and wet years
Drs_2yr: numerical value representing the drought resistance, defined as the population-level annual growth of trees during drought years relative to pre-drought years for a given time period
Drl_2yr: numerical value representing the drought resilience, defined as the population-level annual growth of trees following drought years relative to pre-drought years for a given time period
stand_ba_m2ha: numerical value representing the total basal area of a given plot, in m2 per ha
stand_density_stems_ha: numerical value representing the total stem density of a given plot, in stems per ha
pool: numerical value (1-40) identifying the set of tree ring samples pooled for analysis. Samples were pooled by block, plot, year and species
period: integer value (1953, 1964, 1980, 2007, 2012) representing the periods encompassing target dry and wet years
type: character string (Dry, Wet) indicating the water availability of a given year
d13C: numerical value representing the carbon isotopic composition of radial growth rings within a given sample pool, in per mil
WUEi: numerical value representing the annual intrinsic water use efficiency of radial growth rings within a given sample pool
d18O: numerical value representing the oxygen isotopic composition of radial growth rings within a given sample pool, in per mil
keywords:
climate change adaptation; drought; fire; nitrogen availability; oak-hickory; radial growth; resilience; resistance; stand density; temperate broadleaf forest; water stress
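The relationships among rwl_mm, bai_cm2, Drs_2yr, and Drl_2yr in the Missouri Ozarks record above can be illustrated with a short sketch. It assumes concentric circular rings measured outward from the pith and simple ratio-of-means indices over caller-supplied windows. The dataset's own reconstruction method and window definitions are not stated here, so the function names and sample values are hypothetical.

```python
import math


def basal_area_increment_cm2(ring_widths_mm):
    """Annual basal area increment (cm2) from ring widths (mm), pith outward.

    Assumes circular, concentric rings: BAI_t = pi * (r_t**2 - r_{t-1}**2).
    """
    increments = []
    radius_prev = 0.0
    for width in ring_widths_mm:
        radius = radius_prev + width
        area_mm2 = math.pi * (radius ** 2 - radius_prev ** 2)
        increments.append(area_mm2 / 100.0)  # 100 mm2 = 1 cm2
        radius_prev = radius
    return increments


def _mean(values):
    return sum(values) / len(values)


def drought_resistance(pre_drought_growth, drought_growth):
    """Growth during drought years relative to pre-drought years."""
    return _mean(drought_growth) / _mean(pre_drought_growth)


def drought_resilience(pre_drought_growth, post_drought_growth):
    """Growth following drought years relative to pre-drought years."""
    return _mean(post_drought_growth) / _mean(pre_drought_growth)


# Hypothetical growth values (cm2 per year) for one population and one drought.
pre, during, post = [4.0, 4.0], [2.0, 2.0], [3.0, 5.0]
example_resistance = drought_resistance(pre, during)
example_resilience = drought_resilience(pre, post)
```

A resistance below 1 means growth dropped during the drought, and a resilience near 1 means growth returned to its pre-drought level.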
published:
2021-10-15
Perez, Sierra; Dalling, James; Fraterrigo, Jennifer
(2021)
Information on the location, dimensions, time of treefall or death, decay state, wood nutrient, wood pH and wood density data, and soil moisture, slope, distance from forest edge and soil nutrient data associated with the publication "Interspecific wood trait variation predicts decreased carbon residence time in changing forests" authored by Sierra Perez, Jennifer Fraterrigo, and James Dalling.
Note: Blank cells indicate that no data were collected.
keywords:
wood decay; carbon residence time; coarse woody debris; decomposition; temperate forests
published:
2025-04-26
Alvarez, Jennifer; Fraterrigo, Jennifer; Dalling, James; Edgington, John
(2025)
Historical census data collected at Trelease Woods from 1986 to 2004 with information on tree species, diameter at breast height (DBH), and plot location.
keywords:
old-growth; temperate forest; species composition; forest dynamics; historical data
published:
2025-04-27
Alvarez, Jennifer; Fraterrigo, Jennifer; Dalling, James
(2025)
Soil data for ten soil cores collected at Trelease Woods in 2022. Soil samples were analyzed with an elemental analyzer via combustion to obtain total carbon (C) and nitrogen. A subset of these samples was analyzed using the Walkley-Black method to obtain organic C. A calibration curve relating organic C and total C was created using these data.
keywords:
old-growth; temperate forest; soil carbon; soil nitrogen; nutrient cycling
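A calibration curve like the one described for the Trelease Woods soil cores can be sketched as an ordinary least-squares line that predicts organic C from total C. This is an illustration under assumptions: the record does not state the functional form the authors used, and the function names and paired values below are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_organic_c(total_c, slope, intercept):
    """Apply the calibration line to a total C measurement."""
    return slope * total_c + intercept


# Hypothetical paired measurements (percent C) from the calibration subset.
total_c_subset = [1.0, 2.0, 3.0, 4.0]
organic_c_subset = [0.7, 1.5, 2.3, 3.1]

slope, intercept = fit_line(total_c_subset, organic_c_subset)
estimated = predict_organic_c(2.5, slope, intercept)
```

Fitting on the Walkley-Black subset and predicting for the remaining samples is the usual reason to build such a curve.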
published:
2025-04-28
Alvarez, Jennifer; Fraterrigo, Jennifer; Dalling, James
(2025)
Dataset of the standing dead trees at Trelease Woods in 2022. Dataset contains volume, biomass, decay class, and GPS coordinates for each standing dead tree.
keywords:
old-growth; temperate forest; standing deadwood; census data
published:
2020-10-01
Fraterrigo, Jennifer; Rembelski, Mara
(2020)
We measured the effects of fire or drought treatment on plant, microbial and biogeochemical responses in temperate deciduous forests invaded by the annual grass Microstegium vimineum with a history of either frequent fire or fire exclusion.
Please note, on Documentation tab / Experimental or Sampling Design, “15 (XVI)” should be “16 (XVI)”.
keywords:
plant-soil interaction; grass-fire cycle; Microstegium; carbon and nitrogen cycling; microbial decomposers
published:
2025-12-08
Li, Shuai; Moller, Christopher; Mitchell, Noah G.; Martin, Duncan; Sacks, Erik; Saikia, Sampurna; Labonte, Nicholas R.; Baldwin, Brian S.; Morrison, Jesse; Ferguson, John; Leakey, Andrew; Ainsworth, Elizabeth
(2025)
The leaf economics spectrum (LES) describes multivariate correlations in leaf structural, physiological and chemical traits, originally based on diverse C3 species grown under natural ecosystems. However, the specific contribution of C4 species to the global LES is studied less widely. C4 species have a CO2 concentrating mechanism which drives high rates of photosynthesis and improves resource use efficiency, thus potentially pushing them towards the edge of the LES. Here, we measured foliage morphology, structure, photosynthesis, and nutrient content for hundreds of genotypes of the C4 grass Miscanthus × giganteus grown in two common gardens over two seasons. We show substantial trait variations across M. × giganteus genotypes and robust genotypic trait relationships. Compared to the global LES, M. × giganteus genotypes had higher photosynthetic rates, lower stomatal conductance, and less nitrogen content, indicating greater water and photosynthetic nitrogen use efficiency in the C4 species. Additionally, tetraploid genotypes produced thicker leaves with greater leaf mass per area and lower leaf density than triploid genotypes. By expanding the LES relationships across C3 species to include C4 crops, these findings highlight that M. × giganteus occupies the boundary of the global LES and suggest the potential for ploidy to alter LES traits.
keywords:
Feedstock Production;Biomass Analytics;Field Data
published:
2025-12-08
Maitra, Shraddha; Viswanathan, Mothi Bharath; Park, Kiyoul; Kannan, Baskaran; Cano Alfanar, Sofia; McCoy, Scott M.; Cahoon, Edgar; Altpeter, Fredy; Leakey, Andrew; Singh, Vijay
(2025)
Plant oils are increasingly in demand as renewable feedstocks for biodiesel and biochemicals. Currently, oilseeds are the primary source of plant oils. Although the vegetative tissues of plants express lipid metabolism pathways, they do not hyper-accumulate lipids. Elevated synthesis, storage, and accumulation of lipids in vegetative tissues have been achieved by metabolic engineering of sugarcane to produce “oilcane.” This study evaluates the potential of oilcane as a renewable feedstock for the co-production of lipids and fermentable sugars. Oilcane was grown under favorable climatic and field conditions in Florida (FLOC) as well as during an abbreviated growing season, outside its typical growing region, in Illinois (ILOC). The potential lipid yield of 0.35 tons/ha was projected from the hyperaccumulation of fatty acids in the stored vegetative biomass of FLOC, which is approaching the lipid yield of soybean (0.44 tons/ha). Processing of the vegetative tissues of oilcane recovered 0.20 tons/ha, which represents the recovery of 55% of the total lipids from FLOC. Chemical-free hydrothermal bioprocessing of ILOC and FLOC bagasse and leaves at 180 °C for 10 min prevented the degeneration of in situ plant lipids. This allowed the recovery of lipids at the end of the bioprocess with a major fraction of lipids remaining in the biomass residues after pretreatment and saccharification. Improvements through refined biomass processing, crop management, and metabolic engineering are expected to boost lipid yields and make oilcane a prime feedstock for the production of biodiesel.
keywords:
Conversion;Feedstock Production;Feedstock Bioprocessing;Lipidomics;Metabolomics
published:
2025-12-05
Sahbaz, Furkan; Bogdanov, Simeon
(2025)
This dataset contains all raw data corresponding to the figures in the main text and appendices of the paper "Dispersion Engineering of Planar Sub-millimeter Wave Waveguides and Resonators with Low Radiation Loss."
keywords:
thz science; quantum information processing; quantum transduction; high energy physics; axion detection; ultra-sensitive detection
published:
2025-12-05
Zhao, Huimin; Litman, Zachary C.; Wang, Yajie; Hartwig, John F.
(2025)
Living organisms rely on simultaneous reactions catalysed by mutually compatible and selective enzymes to synthesize complex natural products and other metabolites. To combine the advantages of these biological systems with the reactivity of artificial chemical catalysts, chemists have devised sequential, concurrent, and cooperative chemoenzymatic reactions that combine enzymatic and artificial catalysts. Cooperative chemoenzymatic reactions consist of interconnected processes that generate products in yields and selectivities that cannot be obtained when the two reactions are carried out sequentially with their respective substrates. However, such reactions are difficult to develop because chemical and enzymatic catalysts generally operate in different media at different temperatures and can deactivate each other. Owing to these constraints, the vast majority of cooperative chemoenzymatic processes that have been reported over the past 30 years can be divided into just two categories: chemoenzymatic dynamic kinetic resolutions of racemic alcohols and amines, and enzymatic reactions requiring the simultaneous regeneration of a cofactor. New approaches to the development of chemoenzymatic reactions are needed to enable valuable chemical transformations beyond this scope. Here we report a class of cooperative chemoenzymatic reaction that combines photocatalysts that isomerize alkenes with ene-reductases that reduce carbon–carbon double bonds to generate valuable enantioenriched products. This method enables the stereoconvergent reduction of E/Z mixtures of alkenes or reduction of the unreactive stereoisomers of alkenes in yields and enantiomeric excesses that match those obtained from the reduction of the pure, more reactive isomers. The system affords a range of enantioenriched precursors to biologically active compounds. 
More generally, these results show that the compatibility between photocatalysts and enzymes enables chemoenzymatic processes beyond cofactor regeneration and provides a general strategy for converting stereoselective enzymatic reactions into stereoconvergent ones.
keywords:
Conversion;Catalysis
planned publication date:
2026-02-01
Edmonds, Devin A.; Fanomezantsoa, Rebecca E.; Rabibisoa, Nirhy H. C.; Roberts, Sam H.
(2026)
This dataset contains ecological and demographic data for William's bright-eyed frog (Boophis williamsi), a critically endangered amphibian restricted to the Ankaratra Massif in Madagascar's central highlands. Field surveys were conducted from September 2018 to March 2019 and in July 2021 across ten 100-m stream transects to estimate abundance and identify habitat associations for both tadpoles and adult frogs. Data include repeated counts of individuals and associated habitat variables (e.g., canopy cover, substrate type, stream depth, discharge, and temperature). Abundance was estimated using N-mixture models implemented in R (version 4.3.1) with the ubms package, with separate models for tadpoles and frogs to account for differences in detection probability. The dataset consists of multiple CSV files capturing microhabitat, environmental variables, and raw survey count data (y_frogs.csv and y_tadpoles.csv), and an R script (boophis_abundance.R) used for model fitting. The dataset was compiled for an article accepted in the Herpetological Journal by the British Herpetological Society and is intended to support long-term monitoring and conservation planning for B. williamsi and other threatened amphibians in Madagascar.
keywords:
amphibian conservation; biodiversity conservation; detection probability; endangered species; N-mixture model
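The idea behind an N-mixture model, as used for the Boophis williamsi counts above, can be sketched with the basic Poisson-binomial likelihood: each transect has a latent abundance N drawn from a Poisson distribution, and each repeated count is a binomial draw from N with detection probability p. The sketch below assumes constant abundance rate and detection with no covariates. It is not the authors' boophis_abundance.R script, and the counts are hypothetical.

```python
import math


def poisson_pmf(n, lam):
    """P(N = n) for a Poisson distribution with mean lam > 0."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binomial_pmf(y, n, p):
    """P(Y = y) for y detections out of n individuals, detection prob p."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)


def nmixture_log_likelihood(counts, lam, p, n_max=200):
    """Log-likelihood of repeated counts under a basic N-mixture model.

    counts: one list of repeated counts per site (transect).
    n_max:  upper bound used to truncate the sum over latent abundance.
    """
    log_lik = 0.0
    for site_counts in counts:
        site_lik = 0.0
        # Latent abundance cannot be smaller than the largest observed count.
        for n in range(max(site_counts), n_max + 1):
            detection = 1.0
            for y in site_counts:
                detection *= binomial_pmf(y, n, p)
            site_lik += poisson_pmf(n, lam) * detection
        log_lik += math.log(site_lik)
    return log_lik


# Hypothetical counts: two transects, three visits each.
example_counts = [[3, 2, 4], [5, 4, 6]]
ll_plausible = nmixture_log_likelihood(example_counts, lam=8.0, p=0.5)
ll_implausible = nmixture_log_likelihood(example_counts, lam=1.0, p=0.5)
```

Comparing the log-likelihood across candidate values of lam and p shows which combinations are more consistent with the counts; a full analysis would add covariates and a fitting routine.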
published:
2025-11-06
Sweedler, Jonathan; Rosado Rosa, Joenisse M.
(2025)
SCiLS MSI data files, images used in the figures, and table contents for the tables found in the manuscript. The figures are labeled by figure and by their title on each figure set, including those found in the Supplementary Information. The tables are in an MS Excel sheet with the corresponding contents. The tables list the metabolites found in the images. To reduce the number of images in the manuscript, the tables complete the metabolite information not observed in the images. The images can be found using the SCiLS data files. A software license is needed to open these files. The SCiLS data files contain the processed MSI data for all obtained images. All files in the corresponding SCiLS data file must be present to open the individual data file. The feature list used for MSI analysis should be saved in the attached bookmark inside the SCiLS file, so it should be available once the file is opened. SCiLS files can only be opened with the Bruker SCiLS software. If an outdated version (before Version 13.01.17218) is used, the files may not open or may display with poor quality.
keywords:
Tendrils; Pyocyanin; Quinolones; Spatiochemical; Metabolomics
published:
2021-03-06
Lim, Teck Yian; Markowitz, Spencer Abraham; Do, Minh
(2021)
This dataset consists of raw ADC readings from a 3-transmitter, 4-receiver 77 GHz FMCW radar, together with synchronized RGB camera and depth (active stereo) measurements.
The data is grouped into 4 distinct radar configurations:
- "indoor" configuration with range <14m
- "30m" with range <38m
- "50m" with range <63m
- "high_res" with doppler resolution of 0.043m/s
# Related code
https://github.com/moodoki/radical_sdk
# Hardware Project Page
https://publish.illinois.edu/radicaldata
keywords:
radar; FMCW; sensor-fusion; autonomous driving; dataset; RGB-D; object detection; odometry
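The range and Doppler figures quoted for the radar configurations above follow from standard FMCW relations: range resolution is c / (2B) for sweep bandwidth B, and velocity resolution is wavelength / (2 * N * T) for N chirps spaced T seconds apart in one frame. The sketch below only illustrates these textbook formulas. The bandwidth, chirp count, and chirp spacing are hypothetical and are not taken from the dataset's configuration files.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def range_resolution(bandwidth_hz):
    """Smallest separable range difference (m) for a given sweep bandwidth."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)


def velocity_resolution(carrier_hz, n_chirps, chirp_period_s):
    """Doppler velocity resolution (m/s) for one frame.

    chirp_period_s is the spacing between the chirps that enter one
    Doppler FFT, so n_chirps * chirp_period_s is the observation time.
    """
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return wavelength / (2.0 * n_chirps * chirp_period_s)


# Hypothetical configuration: 1 GHz sweep, 128 chirps spaced 350 microseconds.
example_range_res = range_resolution(1.0e9)
example_velocity_res = velocity_resolution(77.0e9, 128, 350e-6)
```

Longer frames give finer velocity resolution, which is consistent with a dedicated "high_res" configuration trading frame time for Doppler detail.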
published:
2016-05-19
Donovan, Brian; Work, Dan
(2016)
This dataset contains records of four years of taxi operations in New York City and includes 697,622,444 trips. Each trip records the pickup and drop-off dates, times, and coordinates, as well as the metered distance reported by the taximeter. The trip data also includes fields such as the taxi medallion number, fare amount, and tip amount. The dataset was obtained through a Freedom of Information Law request from the New York City Taxi and Limousine Commission.
The files in this dataset are optimized for use with the ‘decompress.py’ script included in this dataset. This file has additional documentation and contact information that may be of help if you run into trouble accessing the content of the zip files.
keywords:
taxi;transportation;New York City;GPS
published:
2020-06-26
Gasparik, Jessica T.; Ye, Qing; Curtis, Jeffrey H.; Presto, Albert A.; Donahue, Neil M.; Sullivan, Ryan C.; West, Matthew; Riemer, Nicole
(2020)
This dataset contains the PartMC-MOSAIC simulations used in the article "Quantifying Errors in the Aerosol Mixing-State Index Based on Limited Particle Sample Size". The output data from the 1000 simulations are organized into a series of archived folders, each containing 100 scenarios. Each scenario directory contains 25 NetCDF files, which are the hourly output of a PartMC-MOSAIC simulation and contain all information regarding the environment, particle, and gas state. This dataset was used to investigate the impact of sample size on determining aerosol mixing state. These data may also be useful as a data set for applying different types of estimators.
keywords:
Atmospheric aerosols; single-particle measurements; sampling uncertainty; NetCDF
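The mixing-state index studied with these simulations can be sketched from per-particle species masses. The sketch assumes the entropy-based definition of Riemer and West (2013), chi = (D_alpha - 1) / (D_gamma - 1), where D_alpha is the average per-particle species diversity and D_gamma is the bulk diversity. The input layout (a list of per-particle species masses) and the example populations are hypothetical and do not mirror the NetCDF variable names.

```python
import math


def _entropy(fractions):
    """Shannon entropy of mass fractions, skipping zero entries."""
    return -sum(f * math.log(f) for f in fractions if f > 0.0)


def mixing_state_index(particles):
    """Mixing-state index chi for a particle population.

    particles: list of per-particle species masses, e.g. [[m_a, m_b], ...].
    Every particle is assumed to have positive total mass.
    """
    n_species = len(particles[0])
    total_mass = sum(sum(p) for p in particles)

    # Mass-weighted average of per-particle entropies.
    h_alpha = 0.0
    for p in particles:
        mass = sum(p)
        h_alpha += (mass / total_mass) * _entropy([m / mass for m in p])

    # Entropy of the bulk composition.
    bulk = [sum(p[a] for p in particles) / total_mass for a in range(n_species)]
    d_alpha = math.exp(h_alpha)
    d_gamma = math.exp(_entropy(bulk))
    if abs(d_gamma - 1.0) < 1e-12:
        raise ValueError("chi is undefined for a single-species population")
    return (d_alpha - 1.0) / (d_gamma - 1.0)


# Each particle holds one species (externally mixed) versus
# every particle sharing the bulk composition (internally mixed).
chi_external = mixing_state_index([[1.0, 0.0], [0.0, 1.0]])
chi_internal = mixing_state_index([[1.0, 1.0], [2.0, 2.0]])
```

Evaluating the same function on random subsamples of a population is one way to explore how sample size affects the estimate.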
published:
2024-02-16
Mohasel Arjomandi, Hossein; Korobskiy, Dmitriy; Chacko, George
(2024)
This dataset contains five files. (i) open_citations_jan2024_pub_ids.csv.gz, open_citations_jan2024_iid_el.csv.gz, open_citations_jan2024_el.csv.gz, and open_citation_jan2024_pubs.csv.gz represent a conversion of Open Citations to an edge list using integer ids assigned by us. The integer ids can be mapped to omids, pmids, and dois using the open_citation_jan2024_pubs.csv and open_citations_jan2024_pub_ids.csv files. The network consists of 121,052,490 nodes and 1,962,840,983 edges. Code for generating these data can be found at https://github.com/chackoge/ERNIE_Plus/tree/master/OpenCitations.
(ii) The fifth file, baseline2024.csv.gz, provides information about the metadata of PubMed papers. A 2024 version of PubMed was downloaded using Entrez and parsed into a table restricted to records that contain a pmid, a doi, a title, and an abstract. A value of 1 in a column indicates that the information exists in the metadata, and a value of 0 indicates otherwise. Code for generating these data: https://github.com/illinois-or-research-analytics/pubmed_etl. If you use these data or code in your work, please cite https://doi.org/10.13012/B2IDB-5216575_V1.
keywords:
PubMed
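An integer-id edge list of this size is usually processed as a stream rather than loaded whole. The sketch below counts citations received per node from a gzipped CSV. It assumes a header row and a two-column citing,cited layout, which is a guess rather than the documented schema, and it runs on a tiny in-memory example instead of the real file.

```python
import csv
import gzip
import io
from collections import Counter


def in_degree_counts(gz_source, cited_column=1, has_header=True):
    """Stream a gzipped CSV edge list and count incoming edges per node.

    gz_source:    a path or binary file object containing gzip data.
    cited_column: index of the column holding the cited node id (assumed).
    """
    counts = Counter()
    with gzip.open(gz_source, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row
        for row in reader:
            counts[int(row[cited_column])] += 1
    return counts


# Tiny in-memory stand-in for an edge list file (hypothetical column names).
raw_edges = "citing,cited\n1,2\n3,2\n3,1\n"
demo_source = io.BytesIO(gzip.compress(raw_edges.encode("utf-8")))
demo_counts = in_degree_counts(demo_source)
```

For the full network of roughly 121 million nodes, an in-memory counter may need several gigabytes, so an array indexed by integer id or an on-disk aggregation may be a better fit.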
published:
2025-09-29
Blanc-Betes, Elena
(2025)
DayCent MUVP version (Methanogenesis, UV litter degradation and Photosynthesis). DAYCENT is the daily time-step version of the CENTURY biogeochemical model (Parton et al., 1994). DAYCENT simulates fluxes of C and N among the atmosphere, vegetation, and soil (Del Grosso et al., 2001a; Parton et al., 1998). Key submodels include soil water content and temperature by layer, plant production and allocation of net primary production (NPP), decomposition of litter and soil organic matter, mineralization of nutrients, N gas emissions from nitrification and denitrification, and CH4 oxidation in non-saturated soils.
keywords:
biogeochemical model
published:
2025-05-01
Wang, Weiwei; Khanna, Madhu
(2025)
BEPAM (Biofuel and Environmental Policy Analysis Model) models the agricultural sector and determines the economically optimal land-use and feedstock mix at the US scale. It maximizes the sum of agricultural sector consumers' and producers' surplus subject to various resource balances, land availability, and technological constraints under a range of biomass prices, from zero to $140 Mg-1, over the 2016-2030 period. Here, BEPAM is used to model SAF production using energy crops and crop residues. BEPAM uses the GAMS format and uses yield and GHG balance projections from the biogeochemical model DayCent.
keywords:
BEPAM; Energy crops; direct and indirect land use change; soil carbon sequestration; fossil fuel displacement; economic incentives
published:
2025-09-25
Vu-Le, The-Anh; Park, Minhyuk; Chen, Ian; Warnow, Tandy
(2025)
Dataset for "Using Stochastic Block Models for Community Detection". This contains synthetic networks with ground-truth community structure generated using synthetic network generators (specifically, ABCD+o) based on real-world networks and computed clusterings on these real-world networks.
Note:
* networks.zip contains the synthetic networks