Illinois Data Bank Dataset Search Results
Results
published:
2026-05-06
Haas, Benjamin; Saif, Faaiza; Doran, Lynn; Burgess, Steven; Long, Stephen
(2026)
Scripts for the manuscript "A fluorescence-based transient expression assay for the analysis of upstream open reading frames in plant" by Haas et al.
Upstream open reading frames (uORFs) are regulatory elements present in the 5′ leaders of mRNA that can significantly impact downstream gene expression in eukaryotes. In crop engineering, editing of uORFs can provide an avenue to upregulate expression of native genes without the need to add persistent transgenic copies. Even with genome-wide methods to identify translated uORFs such as ribosome profiling, their functional characterization depends on validation through reporter gene assays and mutagenesis studies. Current screening methods for plants use luciferases or protoplasts to measure differential gene expression between wild-type and mutated transcript leaders, which requires tissue processing and/or substrate addition. Here, we present a time- and cost-efficient alternative to investigate transcript leaders by co-expression of two fluorescent proteins in Nicotiana benthamiana leaf tissue and test our assay on genes involved in photoprotection, editing of which could provide a pathway to increase CO2 assimilation during sun–shade transitions.
keywords:
Gene Editing; Photosynthesis; Plant Transformation; Transient Expression
published:
2026-05-06
Park, Minhyuk; Yi, Haotian; Chen, Ian; Warnow, Tandy; Chacko, George
(2026)
The dataset contains a sample of the data generated for the manuscript "Modeling citations and cartels" by Park et al. (2026), who describe the use of the SASCA-ReSA agent-based model to simulate the growth of citation networks and to mimic citation cartels. The manuscript is presently under review. SASCA-ReSA is the latest stage in a series of progressively complex models of citation dynamics (Chacko et al. 2026 Applied Network Science, Park et al. 2025 Proceedings of the XIV International Conference on Complex Networks and their Applications, Park et al. 2025 MetaRoR). The model is implemented for high performance computing environments and all the results were generated on the Illinois Campus Cluster. The standard simulation reported in this manuscript results in roughly 1.2M nodes. The input to a simulation is a seed network, a configuration file, and real-world distributions for the number of references made per article and the count of authors per article. The output of a simulation is a larger citation network that includes the input network. Details of the model are described in the manuscript and instructions on how to use the software are available on the SASCA-ReSA GitHub site. We have included annotated nodelists from three different simulations.
<b>a) bsl1 (bsl1.csv.tar.xz):</b> has 1,193,102 rows, output of a standard simulation.
<b>b) p5_1 (p5_1.csv.tar.xz):</b> has 1,193,102 rows, output of a standard simulation with 5 agents "planted" in year 1 of the simulation.
<b>c) ps5_1 (ps5_1.csv.tar.xz):</b> has 1,193,102 rows, output of a standard simulation with one agent planted in each of the first five years of a simulation.
<b>d) sample_config.ini:</b> contains configuration parameters for a simulation.
<b>e) louvain.parquet.gz:</b> has 160,714,032 rows and two columns, node_id and cluster_id, with a header row. The data represent a Louvain clustering of the ABM161 network (<a href="https://doi.org/10.13012/B2IDB-9265079_V1">https://doi.org/10.13012/B2IDB-9265079_V1</a>). Generated using the louvain module through kuzu and compressed using the to_parquet method of pandas with gzip internal compression. The largest cluster (cluster id 5) has 81,675,241 nodes. This network was generated under the SASCA-ReS model.
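The nodelists above are distributed as xz-compressed tar archives. As a minimal sketch of one way to check a row count without unpacking to disk, the function below streams the first CSV member of such an archive using only the Python standard library. It assumes each archive holds a single CSV file whose first row is a header; the function name is illustrative and is not part of the SASCA-ReSA software.

```python
import csv
import io
import tarfile


def count_nodelist_rows(archive):
    """Return (header, n_data_rows) for the first file inside a .tar.xz archive.

    `archive` is a filesystem path or a binary file object.
    Assumption: the archive holds one CSV whose first row is a header.
    """
    if hasattr(archive, "read"):
        tar = tarfile.open(fileobj=archive, mode="r:xz")
    else:
        tar = tarfile.open(name=archive, mode="r:xz")
    with tar:
        # Take the first regular file in the archive.
        member = next(m for m in tar if m.isfile())
        raw = tar.extractfile(member)
        with io.TextIOWrapper(raw, encoding="utf-8", newline="") as text:
            reader = csv.reader(text)
            header = next(reader)
            # Rows are streamed one at a time, so memory use stays small.
            n_rows = sum(1 for _ in reader)
    return header, n_rows
```

For example, `count_nodelist_rows("bsl1.csv.tar.xz")` would return the column names and the number of data rows, which can be compared against the row count stated above. Whether the stated count includes the header row is not specified here.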
keywords:
citation dynamics; agent-based models
published:
2026-05-05
Lin, Xiaoying; Kim, Chansong; Vo, Thi; Waltmann, Tommy; Liu, Haihua; Lu, Jun; Li, Jiahui; Liu, Yu-Shen; Kannur, Suraj; Lee, Junseo; Hwang, Chu-Yun; Kalutantirige, Falon C.; Yao, Lehan; Kotov, Nicholas A.; Glotzer, Sharon; Chen, Qian
(2026)
This dataset contains the raw transmission electron microscopy (TEM) and scanning electron microscopy (SEM) images used in the main figures of the paper “The Importance of Nano-edges in Atomic Stencilling and Chiroptically Active Assembly of Patchy Gold Tetrahedra (2026).” All the images were acquired at the Materials Research Laboratory, University of Illinois at Urbana-Champaign, by the Qian Chen group.
1. We provide five subfolders, each named according to the corresponding figure numbers in the paper.
2. All files in the subfolders for Figures 1–3 and 5 are named as "Panel [letter]_*", where [letter] (e.g., a, b, c) represents the raw images used for the corresponding panels.
3. All files in the subfolder for Figure 4 correspond to panel f and show the configurations of patchy tetrahedra synthesized at varying concentrations of iodide and 2-naphthalenethiol. They are named "Experiment_[number]", where [number] represents the corresponding data points in the phase diagram.
4. In TEM images, the bright and dark regions indicate the polymer patches and nanoparticle cores, respectively.
5. In SEM images, the bright and dark regions indicate the nanoparticle cores and polymer patches, respectively.
6. Abbreviations in file names: HAADF-STEM (high-angle annular dark-field scanning transmission electron microscopy), PINEM (photon-induced near-field electron microscopy), and RCP/LCP (right-/left-handed circularly polarized).
keywords:
Patchy nanoparticle; polymer; synthesis; self-assembly; chirality
published:
2026-04-30
Mitchell, Cheyenne; Dhruva, Dhananjay; Burke, Zachary; Durden, David; Dingilian, Armine; Backlund, Mikael
(2026)
Raw and analyzed data, analysis code for "Quantum-inspired super-resolution of fluorescent point-like sources" (Nature Communications, accepted, 2026).
published:
2026-04-29
Park, Seonyeong; Jeong, Gangwon; Villa, Umberto; Anastasio, Mark
(2026)
This dataset is a subset of a companion dataset to the manuscript:
Seonyeong Park, Gangwon Jeong, Umberto Villa, Mark A. Anastasio, "A Virtual Imaging Framework for Three-Dimensional Quantitative Optoacoustic Tomography Using Stochastic Numerical Breast Phantoms," arXiv preprint arXiv:2510.00189 (2025) <a href="https://doi.org/10.48550/arXiv.2510.00189">https://doi.org/10.48550/arXiv.2510.00189</a>
This subset was specifically used in the following publication:
Refik Mert Cam, Seonyeong Park, Umberto Villa, Mark A. Anastasio, "Application of a Virtual Imaging Framework for Investigating a Deep Learning-based Reconstruction Method for 3D Quantitative Photoacoustic Computed Tomography," Photoacoustics 100792 (2025) <a href="https://doi.org/10.1016/j.pacs.2025.100792">https://doi.org/10.1016/j.pacs.2025.100792</a>
The dataset contains 64 sets of three-dimensional (3D) numerical breast phantoms (NBPs) for use in virtual imaging studies of optoacoustic tomography (OAT), along with the corresponding simulated multi-wavelength optical fluence distributions, induced initial pressure distributions, and OAT measurement data. Each set corresponds to a distinct breast anatomy and includes four anatomy-matched variants: (i) a healthy breast with Fitzpatrick skin tone 1, and (ii-iv) lesion-inserted breasts with Fitzpatrick skin tones 1, 3, and 5.
More detailed information is provided in the accompanying README.txt file.
keywords:
Virtual imaging; In silico imaging; Numerical breast phantoms; Optoacoustic tomography; Photoacoustic computed tomography; Breast imaging
published:
2026-05-04
Tan, Shih-I; Ng, I-Son; Zhao, Huimin
(2026)
Biological production of 5‐aminolevulinic acid (5‐ALA) has received growing attention over the years. However, there is a tradeoff between 5‐ALA biosynthesis and cell growth because the fermentation broth will become acidic due to the production of 5‐ALA. To address this limitation, we engineered an acid‐tolerant yeast, Issatchenkia orientalis SD108, for 5‐ALA production. We first discovered that the cell growth rate of I. orientalis SD108 was boosted by 5‐ALA and its endogenous ALA synthetase (ALAS) showed higher activity than homologs from other yeasts. The titer of 5‐ALA was improved from 28 mg/L to 120, 150, and 300 mg/L by optimizing plasmid design, overexpressing a transporter, and increasing gene copy number, respectively. After redirecting the metabolic flux using the pyruvate decarboxylase (PDC) knockout strain (SD108ΔPDC) and culturing with urea, we increased the titer of 5‐ALA to 510 mg/L, a 13‐fold enhancement, proving the importance of the newly identified IoALAS with higher activity and the strategic selection of nitrogen sources for knockout strains. This study demonstrates that the acid‐tolerant I. orientalis SD108ΔPDC has a high potential for 5‐ALA production at a large scale in the future.
keywords:
Bioproducts; Gene Editing; Genome Engineering; Metabolic Engineering
published:
2026-02-17
Peyton, Buddy; Bajjalieh, Joseph; Martin, Michael; Gerald, Andrea
(2026)
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019, Chin, Carter and Wright 2021). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy), the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader.
Version 2.2.2 corrects an error in version 2.2.1 in which the “conspiracy” designation was mistakenly assigned to coup_id: 40411262025. Version 2.2.2 resolves this issue by removing the incorrect designation.
Version 2.2.1 adds 67 additional coup events. 47 of these came from examining the Colpus dataset (Chin, Carter, and Wright 2021), and 20 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding of events in Mali in 2012, Serbia in 2000 and Chad in 1979.
Version 2.2.0 adds 94 additional coup events. 66 of these came from examining Powell and Thyne’s “discarded” events and 28 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding of events in Brazil in 1945 and the Congo in 1968.
Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 as a conspiracy.
Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022.
Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup.
Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data set. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event.
Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include:
• Reconciling missing event data
• Removing events with irreconcilable event dates
• Removing events with insufficient sourcing (each event needs at least two sources)
• Removing events that were inaccurately coded as coup events
• Removing variables that fell below the threshold of inter-coder reliability required by the project
• Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citations in the event summaries
• Extending the period covered from 1945-2005 to 1945-2019
• Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011)
Version 1.0.0 was released in 2013. This version consolidated coup data taken from the following sources:
• The Center for Systemic Peace (Marshall and Marshall, 2007)
• The World Handbook of Political and Social Indicators (Taylor and Jodice, 1983)
• Coup d’État: A Practical Handbook (Luttwak, 1979)
• The Cline Center’s Social, Political and Economic Event Database (SPEED) Project (Nardulli, Althaus and Hayes, 2015)
• Government Change in Authoritarian Regimes – 2010 Update (Svolik and Akcinaroglu, 2006)
<br>
<b>Items in this Dataset</b>
1. <i>Cline Center Coup d'État Codebook v.2.2.2 Codebook.pdf</i> - This 18-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. <i>Revised February 2026</i>
2. <i>Coup Data 2.2.2.csv</i> - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1,161 observations. <i>Revised February 2026</i>
3. <i>Source Document v2.2.2.pdf</i> - This 365-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. <i>Revised February 2026</i>
4. <i>README.md</i> - This file contains useful information for the user about the dataset. It is a text file written in Markdown language. <i>Revised February 2026</i>
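The two coup_id values quoted in the version notes above (40411262025 and 00201062021) look like a three-digit country code followed by an eight-digit date in MMDDYYYY order. That reading is an assumption made here for illustration; the codebook is the authoritative description of the variable. Under that assumption, a minimal Python sketch for splitting an identifier could be:

```python
from datetime import date
from typing import NamedTuple


class CoupId(NamedTuple):
    country_code: str  # kept as text so leading zeros survive
    event_date: date


def parse_coup_id(coup_id: str) -> CoupId:
    """Split an 11-digit coup_id into a country code and an event date.

    Assumed layout (inferred, not confirmed): CCC + MM + DD + YYYY.
    """
    text = str(coup_id).strip()
    if len(text) != 11 or not text.isdigit():
        raise ValueError(f"expected 11 digits, got {coup_id!r}")
    month = int(text[3:5])
    day = int(text[5:7])
    year = int(text[7:11])
    # date() raises ValueError if the digits do not form a valid calendar date.
    return CoupId(text[:3], date(year, month, day))
```

When loading the CSV file, it helps to read coup_id as text rather than as a number, since an identifier such as 00201062021 would otherwise lose its leading zeros.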
<br>
<b>Citation Guidelines</b>
1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation:
Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2026. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.2. February 17. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V10
2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access):
Peyton, Buddy, Joseph Bajjalieh, Michael Martin, and Andrea Gerald. 2026. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.2. February 17. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V10
published:
2026-04-28
Lee, Jaejin; Villanueva, Paul; Glanville, Kate; VanLoocke, Andy; Yang, Wendy; Kent, Angela; McDaniel, Marshall; Hall, Steven; Howe, Adina
(2026)
Nutrient inputs influence the sustainability of bioenergy crop production through contemporary (shortly after addition) and legacy effects (persisting over years) on microbial nitrogen (N) and carbon cycling, which contribute to greenhouse gas emissions. However, the relative importance of contemporary and legacy effects and how that could vary by crop functional types are poorly understood. Considering its rhizomatous roots and perennial growth, we hypothesized that Miscanthus × giganteus (M×g) would be more sensitive to legacy N fertilization and the historical context of its environment than an annual crop like maize. To test this hypothesis, we examined the effects of legacy and contemporary N inputs on nitrous oxide (N2O) and carbon dioxide (CO2) emissions, as well as key N cycling genes in soils where M×g and maize were grown. A 150-day soil incubation experiment was conducted using soils from a long-term M×g and maize fertility experiment with three historic N fertilization rates (0, 112, and 336 kg N ha−1 year−1) and a contemporary amendment (60 mg N kg−1) with negative control (0 mg N kg−1). We observed significant increases in cumulative N2O emissions in M×g soils relative to maize soils, particularly at higher legacy fertilization rates, while contemporary N had no significant effect. Bacterial amoA gene abundance, which plays a significant role in nitrification in nutrient-rich soils, also increased with higher legacy fertilization rates in M×g soils but was unaffected by the contemporary N. In maize soils, legacy and contemporary N did not significantly affect N2O emissions, but cumulative CO2 emissions and amoA gene abundance significantly increased. The abundances of norB genes were not significantly influenced by either legacy fertilization or contemporary N amendments in either soil. Our findings demonstrate the greater importance of fertilization history over contemporary N in mediating soil N2O emissions, particularly for perennial bioenergy crops.
keywords:
Carbon; Field Data; Nitrogen; Soil
published:
2016-05-19
Donovan, Brian; Work, Dan
(2016)
This dataset contains records of four years of taxi operations in New York City and includes 697,622,444 trips. Each trip records the pickup and drop-off dates, times, and coordinates, as well as the metered distance reported by the taximeter. The trip data also includes fields such as the taxi medallion number, fare amount, and tip amount. The dataset was obtained through a Freedom of Information Law request from the New York City Taxi and Limousine Commission.
The files in this dataset are optimized for use with the ‘decompress.py’ script included in this dataset. This file has additional documentation and contact information that may be of help if you run into trouble accessing the content of the zip files.
keywords:
taxi; transportation; New York City; GPS
published:
2026-04-23
Lu, Wenyun; McBride, Matthew; Lee, Won Dong; Xing, Xi; Xu, Xincheng; Li, Xi; Oschmann, Anna; Shen, Yihui; Bartman, Caroline; Rabinowitz, Joshua
(2026)
Orbitrap mass spectrometry in full scan mode enables the simultaneous detection of hundreds of metabolites and their isotope-labeled forms. Yet, sensitivity remains limiting for many metabolites, including low-concentration species, poor ionizers, and low-fractional-abundance isotope-labeled forms in isotope-tracing studies. Here, we explore selected ion monitoring (SIM) as a means of sensitivity enhancement. The analytes of interest are enriched in the orbitrap analyzer by using the quadrupole as a mass filter to select particular ions. In tissue extracts, SIM significantly enhances the detection of ions of low intensity, as indicated by improved signal-to-noise (S/N) ratios and measurement precision. In addition, SIM improves the accuracy of isotope-ratio measurements. SIM, however, must be deployed with care, as excessive accumulation in the orbitrap of similar m/z ions can lead, via space-charge effects, to decreased performance (signal loss, mass shift, and ion coalescence). Ion accumulation can be controlled by adjusting settings including injection time and target ion quantity. Overall, we suggest using a full scan to ensure broad metabolic coverage, in tandem with SIM, for the accurate quantitation of targeted low-intensity ions, and provide methods deploying this approach to enhance metabolome coverage.
keywords:
Mass Spectrometry; Metabolomics
published:
2026-03-01
Edmonds, Devin A.; Fanomezantsoa, Rebecca E.; Rabibisoa, Nirhy H. C.; Roberts, Sam H.
(2026)
This dataset contains ecological and demographic data for William’s bright‑eyed frog (Boophis williamsi), a critically endangered amphibian restricted to the Ankaratra Massif in Madagascar’s central highlands. Field surveys were conducted from September 2018 to March 2019 and in July 2021 across ten 100‑m stream transects to estimate abundance and identify habitat associations for both tadpoles and adult frogs. Data include repeated counts of individuals and associated habitat variables (e.g., canopy cover, substrate type, stream depth, discharge, and temperature). Abundance was estimated using N‑mixture models implemented in R (version 4.3.1) with the ubms package, with separate models for tadpoles and frogs to account for differences in detection probability. The dataset consists of multiple CSV files capturing microhabitat, environmental variables, and raw survey count data (y_frogs.csv and y_tadpoles.csv) and an R script (boophis_abundance.R) used for model fitting. The dataset was compiled for an article accepted in the Herpetological Journal by the British Herpetological Society and is intended to support long‑term monitoring and conservation planning for B. williamsi and other threatened amphibians in Madagascar. The article is available at https://doi.org/10.33256/36.2.8797
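For readers unfamiliar with N-mixture models, the core idea can be sketched in a few lines. In the basic form, the latent abundance N at a site is Poisson with mean lam, and each repeated count is Binomial(N, p), where p is the detection probability; the likelihood sums over the unobserved N. The Python sketch below shows only this basic likelihood with constant lam and p. It is not the dataset's R script, which fits models with the ubms package and habitat covariates, and the function and parameter names are illustrative.

```python
import math


def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site (basic N-mixture model).

    N ~ Poisson(lam) is the latent abundance; each count y_j ~ Binomial(N, p).
    The latent N is summed out from max(counts) up to the truncation n_max.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        # Poisson probability of abundance n, computed on the log scale.
        log_prior = -lam + n * math.log(lam) - math.lgamma(n + 1)
        # Probability of every observed count given abundance n.
        detection = 1.0
        for y in counts:
            detection *= math.comb(n, y) * p**y * (1 - p) ** (n - y)
        total += math.exp(log_prior) * detection
    return total
```

Multiplying this quantity across sites gives the full likelihood. The truncation n_max needs to be large relative to the plausible abundance, otherwise the sum is cut short.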
keywords:
amphibian conservation; biodiversity conservation; detection probability; endangered species; N-mixture model
published:
2026-04-22
Tang, Wenhan; Arabas, Sylwester; Curtis, Jeffrey H.; Knopf, Daniel A.; West, Matthew; Riemer, Nicole
(2026)
This dataset contains the values directly shown in the figures of the article "The impact of aerosol mixing state on immersion freezing: Insights from classical nucleation theory and particle-resolved simulations". This article is in preparation for submission to the journal Atmospheric Chemistry and Physics. The dataset consists of 12 NetCDF files processed from the raw output of the PartMC model. It does not include the theoretical values of frozen fraction, which can be computed using the equations provided in the paper.
*New in V2: adding data for a newly included figure (INP_spectrum.nc), removing files that are no longer used in the revised manuscript figures (e.g., UNC_A_ratio=0.9_Dp=0.1.nc, UNC_A_ratio=0.9_Dp=10.0.nc, UNC_A_ratio=0.1_Dp=0.1.nc, and UNC_A_ratio=0.1_Dp=10.0.nc), and updating README.pdf accordingly.
keywords:
Aerosol mixing state; Ice nucleating particles; Classical nucleation theory
published:
2026-03-23
Han, Myung-Ja (MJ); Heng, Greta; Lampron, Patricia; Kudeki, Deren
(2026)
The dataset includes data used for the MARC-to-BIBFRAME conversion, as well as code developed and used for reconciling BIBFRAME Work and Hub data with the Library of Congress BIBFRAME database.
The dataset is organized into three ZIP files: MARC, BIBFRAME, and Work-Reconciliation.
The MARC and BIBFRAME ZIP files each contain three sets of records: Concerto (86 records), Hamlet (8,678 records), and Local (237 records).
The Work-Reconciliation ZIP file includes the following components:
1. reconcileWorks.py: a script that adds links to BIBFRAME records generated using the marc2bibframe2 tool
2. README.md: documentation describing how to run the script, required inputs, and the methodology for selecting links from Library of Congress search results
3. requirements.txt: a list of Python dependencies required to execute the script
4. notes.txt: supplementary notes on converting MARCXML to BIBFRAME using marc2bibframe2, including input requirements and setup considerations
keywords:
MARC to BIBFRAME conversion; reconciliation at scale; BIBFRAME Work; BIBFRAME Hub
published:
2026-04-13
Lin, Oliver; Lyu, Zhiheng; Ni, Hsu-Chih; Wang, Xiaokang; Jia, Yetong; Hwang, Chu-Yun; Yao, Lehan; Mandal, Sohini; Zuo, Jian-Min; Chen, Qian
(2026)
Raw and processed 4D-STEM datasets, organized by the particles that appear in each figure of the publication.
1. Figure 1.
2. Figure 2.
3. Figure 3.
4. Figure S7.
5. Readme.txt
keywords:
4D-STEM strain mapping; decahedral nanoparticles; five-twinned nanostructure; geometric frustration; size-dependent pseudosymmetry
published:
2026-04-09
Kessler, Ethan; Colatskie, Shelly; Neier, Brittany; Jellen, Benjamin
(2026)
Code and data to replicate analysis of northern copperhead movement and habitat selection in response to anthropogenic linear features in an urban nature park.
keywords:
Road ecology; step selection function; random steps; wildlife movement; habitat selection; radio telemetry; snake
published:
2025-08-13
Tang, Wenhan; Arabas, Sylwester; Curtis, Jeffrey H.; Knopf, Daniel A.; West, Matthew; Riemer, Nicole
(2025)
This dataset contains the values directly shown in the figures of the article "The impact of aerosol mixing state on immersion freezing: Insights from classical nucleation theory and particle-resolved simulations". This article is in preparation for submission to the journal Atmospheric Chemistry and Physics. The dataset consists of 15 NetCDF files processed from the raw output of the PartMC model. It does not include the theoretical values of frozen fraction, which can be computed using the equations provided in the paper. (Four files, UNC_A_ratio=0.1_Dp=0.1.nc, UNC_A_ratio=0.1_Dp=10.0.nc, UNC_A_ratio=0.9_Dp=0.1.nc, and UNC_A_ratio=0.9_Dp=10.0.nc, were not used in the manuscript. They have the same format and serve the same function as the other UNC_A_ratio=*_Dp=*.nc files, and contain the sensitivity maps for the corresponding combinations of A_ratio and Dp.)
keywords:
Aerosol mixing state; Ice nucleating particles; Classical nucleation theory
published:
2026-04-17
Kleiman, Diego; Feng, Jiangyan; Xue, Zhengyuan; Shukla, Diwakar
(2026)
This repository contains data and model weights associated with the publication "ESMDynamic: Fast and Accurate Prediction of Protein Dynamic Contact Maps from Single Sequences". It includes the datasets used for training and evaluating a dynamic contact prediction model, ESMDynamic, as well as a script for conversion and usage.
keywords:
Computational biology; Structural biology; Molecular dynamics; Machine learning; Protein modeling; Bioinformatics; Biophysics; Artificial intelligence
published:
2026-04-17
Wang, Shiyuan; Christopher, Tessum; Justin, Johnson; Sumil, Thakrar
(2026)
<b>Data Description:</b>
This dataset provides country- and sector-level estimates of air-pollution–related health impacts, economic externalities, and associated spatial concentration patterns derived from multi-regional input–output (MRIO) modeling and atmospheric simulations (GTAP and EORA frameworks). Files include production- and consumption-based mortality matrices, gridded PM₂.₅ concentration maps, trade-linked net export metrics, externalities, uncertainty analyses, and cross-model correlation summaries used to generate the figures and tables in the manuscript.
<b>Citation Requirement:</b>
If you use this dataset in your research, presentations, or derivative works, please cite both the associated paper and the dataset:
Wang, S., Thakrar, S., Johnson, J. et al. International trade and air-quality-related mortality. Nature Communications, 17, 3518 (2026). https://doi.org/10.1038/s41467-026-71408-w
Wang, Shiyuan; Christopher, Tessum; Justin, Johnson; Sumil, Thakrar (2026): Data Accompanying "International Trade and Air-Quality-Related Mortality". University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-0064792_V2
published:
2026-04-15
Singh, Nilmani; Lane, Stephan; Yu, Tianhao; Lu, Jingxia; Ramos, Adrianna; Cui, Haiyang (Ocean); Zhao, Huimin
(2026)
Proteins are the molecular machines of life with numerous applications in energy, health, and sustainability. However, engineering proteins with desired functions for practical applications remains slow, expensive, and specialist-dependent. Here we report a generally applicable platform for autonomous enzyme engineering that integrates machine learning and large language models with biofoundry automation to eliminate the need for human intervention, judgement, and domain expertise. Requiring only an input protein sequence and a quantifiable way to measure fitness, this automated platform can be applied to engineer a wide array of proteins. As a proof of concept, we engineer Arabidopsis thaliana halide methyltransferase (AtHMT) for a 90-fold improvement in substrate preference and 16-fold improvement in ethyl-transferase activity, along with developing a Yersinia mollaretii phytase (YmPhytase) variant with 26-fold improvement in activity at neutral pH. This is accomplished in four rounds over 4 weeks, while requiring construction and characterization of fewer than 500 variants for each enzyme. This platform for autonomous experimentation paves the way for rapid advancements across diverse industries, from medicine and biotechnology to renewable energy and sustainable chemistry.
keywords:
AI/ML; Automation
published:
2026-04-15
Li, Kaiyuan; Jiang, Congya; Ma, Zewei; Wang, Sheng; Chen, Jing; Chen, Min; Guan, Kaiyu
(2026)
The clumping index (CI) quantifies the spatial distribution of foliage elements and is essential for accurately estimating the plant area index (PAI), canopy radiative transfer, and photosynthesis. Traditionally, the finite-length averaging method (LX), the gap size distribution method (CC), and a combined approach of CC and LX (CLX) have been applied to instruments like TRAC and digital hemispherical photography to estimate CI. However, a comprehensive evaluation of these methods in row crops remains limited, especially regarding the influence of segment size on CI. Meanwhile, digital cameras offer a cost-effective and user-friendly solution for canopy measurements in row crops, yet their application in this context remains underexplored. In this study, we employed a new approach using a 30°-tilted digital camera to estimate CI in corn and soybean fields, applying the LX, CC, and CLX methods. We systematically assessed the performance of these three methods by combining field measurements in real-world fields with simulations using the LESS 3D radiative transfer model. Our results showed that CLX applied to the whole image and 45° segment offered accurate estimation of CI (bias within ±0.1, RMSE < 0.2) and PAI (bias within ±0.4, RMSE < 1) in real-world fields and LESS simulations. The accuracy of the LX method was highly sensitive to segment size, with the best performance observed at the 15° segment (PAI bias within ±0.4). In contrast, the CC method remained stable across different segment sizes, and its performance was generally comparable to that of LX, except at the 15° segment. Across view zenith angles, CI derived from CC generally showed a continuous increase, while those from LX and CLX followed a rising trend at small zenith angles but began to decline at 68°, likely due to an increasing proportion of no-gap segments. 
Seasonally, LX tended to show decreasing CI during early growth stages but increased as the canopy matured, whereas CC and CLX showed gradually increasing CI before plateauing at peak PAI. The 30°-tilted camera effectively captured CI variations across different angles and growth stages, making it a practical and robust instrument for row crop canopy structure analysis. Applying these CI methods to digital cameras offers a low-cost and accessible CI estimation alternative, improving canopy structure monitoring accuracy in row crops.
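To make the finite-length averaging (LX) idea concrete, the clumping index is commonly written as CI = ln(mean(P)) / mean(ln(P)), where P is the gap fraction measured in each segment. The Python sketch below implements that common form only. The segment definition, the handling of no-gap segments, and the function and parameter names are assumptions for illustration and may differ from the processing used in the study.

```python
import math


def clumping_index_lx(gap_fractions, min_gap=None):
    """Clumping index from per-segment gap fractions, in the common LX form.

    CI = ln(mean(P)) / mean(ln(P)), with P the gap fraction of each segment.
    A segment with no gaps (P == 0) makes ln(P) undefined, so it raises an
    error unless `min_gap` is given, in which case P is raised to that floor.
    """
    values = []
    for p in gap_fractions:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"gap fraction out of range: {p!r}")
        if p == 0.0:
            if min_gap is None:
                raise ValueError("no-gap segment; pass min_gap to substitute a floor")
            p = min_gap
        values.append(p)
    if not values:
        raise ValueError("no segments given")
    mean_p = sum(values) / len(values)
    mean_log_p = sum(math.log(p) for p in values) / len(values)
    if mean_log_p == 0.0:
        raise ValueError("all segments are fully open; CI is undefined")
    return math.log(mean_p) / mean_log_p
```

Identical gap fractions in every segment give a CI of 1 (a random canopy), while uneven gap fractions give a CI below 1. The need for a substitute value in no-gap segments is one reason the result can be sensitive to segment size, as the abstract notes for large zenith angles.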
keywords:
Modeling
published:
2026-02-18
Ward, Michael; Slayton, Sarah
(2026)
The datasets are associated with the paper "The Windy City rookery: Movement and activity patterns of Black-crowned Night Herons (Nycticorax nycticorax) in a human-dominated landscape" that will soon be published in the journal "Ecology and Evolution". These are data associated with the movements, behaviors, and morphology of black-crowned night herons.
keywords:
black-crowned night heron; urban ecology; avian movement
published:
2026-04-14
Chen, Yunzhu; Park, Kiyoul; Jang, Chunhwa; Lee, Jung Woo; Wang, Mengyuan (Mary); Kim, Hyojin; Quach, Truyen; Guo, Minghao; Sonawane, Balasaheb; Gosa, Sanbon; Clemente, Thomas; Leakey, Andrew; Cahoon, Edgar; Lee, DoKyoung
(2026)
Oil sorghum (OS) has been developed by engineering grain (TX430) and sweet (Ramada) genetic backgrounds to accumulate triacylglycerols (TAG) in vegetative tissues as an energy-dense feedstock for sustainable aviation fuel (SAF) and other biofuels. This study evaluated two TX430 OS lines (TxHO-2, TxHO-3) and two Ramada OS lines (RmHO-1, RmHO-2) alongside wild-type (WT) lines in NE and IL over 2 years (2023–2024) to quantify genotype × environment effects on agronomic performance and TAG accumulation. Across four environments, TX430 OS lines showed average TAG concentrations of 15.0 g kg−1 in leaves and 12.8 g kg−1 in stems, approximately 19-fold higher than WT. Ramada OS lines accumulated 26.1 g kg−1 in leaves and 12.3 g kg−1 in stems, approximately 25-fold and 13-fold increases over WT, respectively. TX430 OS lines exhibited an 18% reduction in biomass (8.4 vs. 9.9 Mg ha−1 for WT), while Ramada OS lines had biomass similar to WT (18.3 vs. 19.9 Mg ha−1 for WT). Among TX430 OS lines, TxHO-2 achieved the highest TAG yield (190 kg ha−1), while RmHO-1 led the Ramada lines (335 kg ha−1) due to higher biomass and similar TAG concentration. Enhanced TAG accumulation increased N, P, and K removal in TX430 lines but not in Ramada lines. Structural carbohydrate and ash concentrations were unaffected. Overall, results confirm vegetative lipid accumulation as a viable strategy for high-biomass sorghum, supporting its potential as a dual-purpose feedstock for SAF. Future work should focus on minimizing biomass yield penalties and improving nutrient use efficiency in oil sorghum systems.
keywords:
Agronomy; Field Data; Oil Sorghum; Sorghum; Sustainable Aviation Fuel; Vegetative Oils
published:
2021-03-06
Lim, Teck Yian; Markowitz, Spencer Abraham; Do, Minh
(2021)
This dataset consists of raw ADC readings from a 3-transmitter, 4-receiver 77 GHz FMCW radar, together with synchronized RGB camera and depth (active stereo) measurements.
The data is grouped into 4 distinct radar configurations:
- "indoor" configuration with range <14 m
- "30m" with range <38 m
- "50m" with range <63 m
- "high_res" with Doppler resolution of 0.043 m/s
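
For orientation only, the Doppler (velocity) resolution of an idealized FMCW radar is often approximated as v_res = λ / (2 · T), where λ is the carrier wavelength and T is the total chirp observation time in a frame. The sketch below applies that textbook relation to the 77 GHz carrier and the 0.043 m/s figure listed above. The function names are hypothetical, and the resulting observation time is a back-of-the-envelope inference, not a parameter documented by the dataset.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def wavelength(carrier_hz):
    """Carrier wavelength in metres."""
    return SPEED_OF_LIGHT / carrier_hz


def velocity_resolution(carrier_hz, observation_time_s):
    """Idealized Doppler resolution (m/s) for a given observation time."""
    return wavelength(carrier_hz) / (2.0 * observation_time_s)


def observation_time_for(carrier_hz, v_res):
    """Observation time (s) that would give the requested Doppler resolution."""
    return wavelength(carrier_hz) / (2.0 * v_res)


if __name__ == "__main__":
    carrier = 77e9  # nominal carrier; the actual chirp start frequency may differ
    t_obs = observation_time_for(carrier, 0.043)
    print(f"implied observation time: {t_obs * 1e3:.1f} ms")
    print(f"round-trip resolution: {velocity_resolution(carrier, t_obs):.3f} m/s")
```

Under these assumptions the stated resolution corresponds to an observation window of roughly 45 ms per frame; the actual chirp timing should be taken from the configuration files distributed with the data.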
# Related code
https://github.com/moodoki/radical_sdk
# Hardware Project Page
https://publish.illinois.edu/radicaldata
keywords:
radar; FMCW; sensor-fusion; autonomous driving; dataset; RGB-D; object detection; odometry
published:
2025-06-03
Han, Jaeyeong; Ficca, Alyson; Lanzatella, Marissa; Leang, Kanika; Barnum, Matthew; Boudreaux, Jonathan; Schroeder, Nathan
(2025)
This dataset comprises image files used in the analysis for "Analysis of Nematode Ventral Nerve Cords Suggests Multiple Instances of Evolutionary Addition and Loss of Neurons" by Han et al. (bioRxiv, 2025; doi: https://doi.org/10.1101/2025.03.20.644414). It is separated into two folders. The first comprises data from DAPI staining used to quantify the number of VNC nuclei in diverse nematodes. The second includes dye-filling data for Mononchus aquaticus.
keywords:
C. elegans; Mononchus; neuroanatomy; nematode nervous system; ventral nerve cord; secondary simplification
published:
2026-04-13
Tan, Shi-I; Bhagwat, Sarang; Martin, Teresa; Suthers, Patrick; Tran, Vinh; Tang, Wuying; Fatma, Zia; Maranas, Costas; Guest, Jeremy; Zhao, Huimin
(2026)
Biomanufacturing provides a more sustainable alternative to fossil-based chemical manufacturing. 3-Hydroxypropionic acid (3HP) is a top Department of Energy value-added chemical and precursor to bioplastics, yet cost-effective microbial production remains elusive. Here, we establish the acid-tolerant yeast Issatchenkia orientalis as a robust host for low-pH 3HP biosynthesis. Genome-scale modeling identifies the β-alanine pathway as optimal, offering the highest theoretical yield and lowest oxygen requirement. Thermodynamic analysis confirms its favorability under acidic conditions. Using sequence similarity network analysis, we discover highly active aspartate 1-decarboxylase (PAND), β-alanine-pyruvate aminotransferase (BAPAT), and 3HP dehydrogenase (YDFG), which significantly improve pathway efficiency. Further pathway optimization through multi-copy PAND integration, byproduct elimination (knockouts of pyruvate decarboxylase and glycerol-3-phosphate dehydrogenase), and reinforcement of aspartate flux by overexpression of pyruvate carboxylase and aspartate aminotransferase improves the titer to 29 g/L in shake flasks. Fed-batch fermentation at pH 4 with low-cost corn steep liquor medium further increases production to 92 g/L with 0.7 g/g yield and 0.55 g/L/h productivity. Techno-economic analysis indicates that such performance could potentially enable a financially viable process for sustainable acrylic acid production. This work establishes I. orientalis as a next-generation platform for cost-effective 3HP production and paves the way toward industrial commercialization.
keywords:
Bioproducts; Metabolic Engineering; Technoeconomic Analysis