Illinois Data Bank
Dataset Search Results

published: 2025-09-15
 
Chemical-free pretreatments are attracting increased interest because they generate less inhibitor in hydrolysates. In this study, pilot-scale continuous hydrothermal (PCH) pretreatment followed by disk refining was evaluated and compared to laboratory-scale batch hot water (LHW) pretreatment. Bioenergy sorghum bagasse (BSB) was pretreated at 160-190 °C for 10 min with and without subsequent disk milling. Hydrothermal pretreatment and disk milling synergistically improved glucose and xylose release by 10-20% compared to hydrothermal pretreatment alone. Maximum yields of glucose and xylose of 82.55% and 70.78%, respectively, were achieved when BSB was pretreated at 190 °C and 180 °C followed by disk milling. LHW pretreated BSB had 5-15% higher sugar yields compared to PCH for all pretreatment conditions. The improvement in surface area was also evaluated. PCH pretreatment combined with disk milling increased BSB surface area by 31.80-106.93%, which was greater than observed using LHW pretreatment.
keywords: Conversion;Sustainability;Genomics;Hydrolysate
published: 2025-09-15
 
The oleaginous yeast Rhodosporidium toruloides is considered a promising candidate for production of chemicals and biofuels thanks to its ability to grow on lignocellulosic biomass, and its high production of lipids and carotenoids. However, efforts to engineer this organism are hindered by a lack of suitable genetic tools. Here we report the development of a CRISPR/Cas9 system for genome editing in R. toruloides based on a fusion 5S rRNA–tRNA promoter for guide RNA (gRNA) expression, capable of greater than 95% gene knockout for various genetic targets. Additionally, multiplexed double‐gene knockout mutants were obtained using this method with an efficiency of 78%. This tool can be used to accelerate future metabolic engineering work in this yeast.
keywords: Conversion;Genome Engineering;Genomics;Transcriptomics
published: 2025-09-15
 
Recent advancements in monocot transformation, using leaf tissue as explant material, have expanded the number of grass species capable of transgenesis. However, the complexity of vectors and reliance on inducible excision of essential morphogenic regulators have so far limited widespread application. Plant RNA viruses, such as Foxtail Mosaic Virus (FoMV), present a unique opportunity to express morphogenic regulator genes, such as Babyboom (Bbm), Wuschel2 (Wus2), Wuschel-like homeobox protein 2a (Wox2a) and the GROWTH-REGULATING FACTOR 4 (GRF4) GRF-INTERACTING FACTOR 1 (GIF1) fusion protein transiently in leaf explant tissues. Furthermore, altruistic delivery of conventional and viral vectors could provide opportunities to simplify vectors used for leaf transformation—facilitating vector optimization and reducing reliance on morphogenic regulator gene integration. In this study, both viral and conventional T-DNA vectors were tested for their ability to promote the formation of embryonic calli, a critical step in leaf transformation protocols, using Sorghum bicolor leaf explants. Although conventional leaf transformation vectors yielded viable embryonic calli (43.2 ± 2.9%: GRF4-GIF1, 50.2 ± 3%: Bbm/Wus2), altruistic conventional vectors employing the GRF4-GIF1 morphogenic regulator resulted in improved efficiencies (61.3 ± 4.7%). Altruistic delivery was further enhanced with the use of viral vectors employing both GRF4-GIF1 and Bbm/Wus2 regulators, resulting in 75.1 ± 2.3% and 79.2 ± 2.5% embryonic calli formation, respectively. Embryonic calli generated from both conventional and viral vectors produced shoots expressing fluorescent reporters, which were confirmed using molecular analysis. This work provides an important proof-of-concept for the use of both altruistic vectors and viral-expressed morphogenic regulators for improving plant transformation.
keywords: gene editing; sorghum
published: 2025-09-15
 
Golden Gate assembly is one of the most widely used DNA assembly methods due to its robustness and modularity. However, despite its popularity, the need for BsaI-free parts, the introduction of scars between junctions, and the lack of a comprehensive study on the linkers hinder its more widespread use. Here, we first developed a novel sequencing scheme to test the efficiency and specificity of 96 linkers of 4-bp length and experimentally verified these linkers and their effects on Golden Gate assembly efficiency and specificity. We then used this sequencing data to generate 200 distinct linker sets that can be used by the community to perform efficient Golden Gate assemblies of different sizes and complexity. We also present a single-pot scarless Golden Gate assembly and BsaI removal scheme and its accompanying assembly design software to perform point mutations and Golden Gate assembly. This assembly scheme enables scarless assembly without compromising efficiency by choosing optimized linkers near assembly junctions.
keywords: Conversion;Genome Engineering;Genomics
published: 2025-09-15
 
Sugarcane, a tropical C4 grass in the genus Saccharum (Poaceae), accounts for nearly 80% of sugar produced worldwide and is also an important feedstock for biofuel production. Generating transgenic sugarcane with predictable and stable transgene expression is critical for crop improvement. In this study, we generated a highly expressed single-copy locus as a landing pad for transgene stacking. In NPTII ELISA analysis, transgenic sugarcane lines with stable integration of a single-copy nptII expression cassette flanked by insulators supported higher transgene expression and reduced line-to-line variation compared to single-copy events without insulators. Subsequently, the nptII selectable marker gene was efficiently excised from the sugarcane genome by the FLPe/FRT site-specific recombination system to create selectable marker-free plants. This study provides valuable resources for future gene stacking using site-specific recombination or genome editing tools.
keywords: Feedstock Production;Biomass Analytics;Genomics
published: 2025-09-15
 
Data sets for material included in "A 13-year record indicates differences in the duration and depth of soil carbon accrual among potential bioenergy crops" by Kantola et al., 2025, in Global Change Biology Bioenergy. Data include soil organic carbon (SOC), carbon stable isotope ratios, annual belowground biomass, and annual post-harvest litter for four crops, maize/soybean, miscanthus, switchgrass, and prairie, between 2008 and 2021.
keywords: bioenergy crops; soil organic carbon; miscanthus; switchgrass; prairie
published: 2025-09-12
 
Overwintering ability is an important selection criterion for Miscanthus breeding in temperate regions. Insufficient overwintering ability of the currently leading Miscanthus biomass cultivar, M. ×giganteus (M×g) ‘1993–1780’, in regions where average annual minimum temperatures are −26.1°C (USDA hardiness zone 5) or lower poses a pressing need to develop new cultivars with superior cold tolerance. To facilitate breeding of Miscanthus, this study characterized phenotypic and genetic variation of overwintering ability in an M. sinensis germplasm panel consisting of 564 accessions, evaluated in field trials at three locations in North America and two in Asia. Genome‐wide association (GWA) and genomic prediction analyses were performed. The Korea/N China M. sinensis genetic group is a valuable gene pool for cold tolerance. The Yangtze‐Qinling, Southern Japan, and Northern Japan genetic groups were also potential sources of cold tolerance. A total of 73 marker–trait associations were detected for overwintering ability. Estimated breeding value for overwintering ability based on these 73 markers could explain 55% of the variation for first winter overwintering ability among M. sinensis. Average genomic prediction ability for overwintering ability across 50 fivefold cross‐validations was high (~0.73) after accounting for population structure. Common genomic regions for overwintering ability were detected by GWA analyses and a previous parallel QTL mapping study using three interconnected biparental F1 populations. One QTL on Miscanthus LG 8 encompassed five GWA hits and a known cold‐responsive gene, COR47. The other overwintering ability QTL on Miscanthus LG 11 contained two GWA hits and three known cold stress‐related genes, carboxylesterase 13 (CEX13), WRKY2 transcription factor, and cold shock domain (CSDP1).
Miscanthus accessions collected from high latitude locations with cold winters had higher rates of overwintering, and more alleles for overwintering, than accessions collected from southern locations with mild winters.
keywords: Feedstock Production;Biomass Analytics;Genomics
published: 2025-09-11
 
Yarrowia lipolytica has been used to produce both citric acid and lipid-based bioproducts at high titers. In this study, we found that pH differentially affects citric acid and lipid production in Y. lipolytica W29, with citric acid production enhanced at more neutral pH values and lipid production enhanced at more acidic pH values. To determine the mechanism governing this pH-dependent switch between citric acid and lipid production, we profiled gene expression at different pH values and found that the relative expression of multiple transporters is increased at neutral pH. These results suggest that this pH-dependent switch is mediated at the level of citric acid transport rather than changes in the expression of the enzymes involved in citric acid and lipid metabolism. In further support of this mechanism, thermodynamic calculations suggest that citric acid secretion is more energetically favorable at neutral pH values, assuming the fully protonated acid is the substrate for secretion. Collectively, these results provide new insights regarding citric acid and lipid production in Y. lipolytica and may offer new strategies for metabolic engineering and process design.
keywords: Conversion;RNA Sequencing;Transcriptomics
published: 2025-09-11
 
We present a three-year archival, longitudinal dataset of YouTube Trending videos, collected from July 1, 2022, to June 30, 2025, with four retrievals per day. This collection, a unique historical record of digital culture in transition, includes 446,971 snapshots from 104 countries, encompassing 726,627 unique videos and their associated metadata. Each record includes collection timestamp, geographic region, video ranking, core identifiers (video ID, channel ID, category), content metadata (title, description, tags, localization), language information, live status, and view and comment counts. Unlike previous datasets with limited geographic scope or short timeframes, our data offers exceptional coverage for cross-national and longitudinal analyses of digital culture. This non-personalized data corpus provides an irreplaceable baseline for understanding crisis communication, platform governance, or temporal shifts in content popularity.
keywords: YouTube; Trending Videos; Digital Culture; Global Trend
published: 2025-09-10
 
Enzymatic reduction of oxyanions such as sulfite (SO₃²⁻) requires the delivery of multiple electrons and protons, a feat accomplished by cofactors tailored for catalysis and electron transport. Replicating this strategy in protein scaffolds may expand the range of enzymes that can be designed de novo. Mirts et al. selected a scaffold protein containing a natural heme cofactor and then engineered a cavity suitable for binding a second cofactor, an iron-sulfur cluster (see the Perspective by Lancaster). The resulting designed enzyme was optimized through rational mutation into a catalyst with spectral characteristics and activity similar to that of natural sulfite reductases.
keywords: Conversion;Catalysis
published: 2025-09-10
 
Conversion of corn fiber to ethanol in the dry grind process could increase ethanol yields, reduce downstream processing costs, and improve overall process profitability. This work investigates the in-situ conversion of corn fiber into ethanol (cellulase addition during simultaneous saccharification and fermentation) during the dry grind process. Addition of 30 FPU/g fiber cellulase resulted in a 4.6% increase in ethanol yield compared to the conventional process. Use of excess cellulase (120 FPU/g fiber) resulted in incomplete fermentation and lower ethanol yield compared to the conventional process. Multiple factors, including high concentrations of ethanol and phenolic compounds, were responsible for yeast stress and incomplete fermentation in the excess cellulase experiments.
keywords: Conversion;Feedstock Bioprocessing
published: 2025-09-09
 
Most native producers of ribosomally synthesized and post-translationally modified peptides (RiPPs) utilize N-terminal leader peptides to avoid potential cytotoxicity of mature products to the hosts. Unfortunately, the native machinery of leader peptide removal is often difficult to reconstitute in heterologous hosts. Here we devised a general method to produce bioactive lanthipeptides, a major class of RiPP molecules, in Escherichia coli colonies using synthetic biology principles, where leader peptide removal is programmed temporally by protease compartmentalization and inducible cell autolysis. We demonstrated the method for producing two lantibiotics, haloduracin and lacticin 481, and performed analog screening for haloduracin. This method enables facile, high throughput discovery, characterization, and engineering of RiPPs.
keywords: Conversion;Genome Engineering;Genomics
published: 2024-06-04
 
This dataset contains files and relevant metadata for real-world and synthetic LFR networks used in the manuscript "Well-Connectedness and Community Detection" (2024) by Park et al., presently under review at PLOS Complex Systems. The manuscript is an extended version of Park, M. et al. (2024). Identifying Well-Connected Communities in Real-World and Synthetic Networks. In Complex Networks & Their Applications XII. COMPLEX NETWORKS 2023. Studies in Computational Intelligence, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-031-53499-7_1. The Overview of Real-World Networks image provides high-level information about the seven real-world networks. TSVs of the seven real-world networks are provided as [network-name]_cleaned to indicate that duplicated edges and self-loops were removed, where column 1 is source and column 2 is target. LFR datasets are contained within the zipped file.
# LFR datasets for the Connectivity Modifier (CM) paper
### File organization
Each directory `[network-name]_[resolution-value]_lfr` includes the following files:
* `network.dat`: LFR network edge-list
* `community.dat`: LFR ground-truth communities
* `time_seed.dat`: time seed used in the LFR software
* `statistics.dat`: statistics generated by the LFR software
* `cmd.stat`: command used to run the LFR software as well as time and memory usage information
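The cleaned TSV edge lists described above (column 1 is source, column 2 is target) can be read with the Python standard library. The following is a minimal sketch, not code from the dataset authors: the function names `read_edge_list` and `is_cleaned` are hypothetical, the sample edges are made up, and treating edges as undirected when checking for duplicates is an assumption.

```python
import csv
import io


def read_edge_list(handle):
    """Read a two-column TSV edge list: column 1 = source, column 2 = target."""
    edges = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        edges.append((row[0], row[1]))
    return edges


def is_cleaned(edges):
    """Return True if there are no self-loops and no duplicate edges.

    Assumption: edges are compared as undirected pairs, so (a, b) and
    (b, a) count as duplicates of each other.
    """
    seen = set()
    for source, target in edges:
        if source == target:
            return False  # self-loop
        key = frozenset((source, target))
        if key in seen:
            return False  # duplicate edge
        seen.add(key)
    return True


# Made-up sample standing in for a [network-name]_cleaned TSV file.
sample = io.StringIO("1\t2\n2\t3\n3\t1\n")
edges = read_edge_list(sample)
print(len(edges), is_cleaned(edges))
```

For a real file, an open file handle (for example `open(path, newline="")`) would take the place of the in-memory sample.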
published: 2023-03-16
 
Curated networks and clustering output from the manuscript: Well-Connected Communities in Real-World Networks https://arxiv.org/abs/2303.02813
keywords: Community detection; clustering; open citations; scientometrics; bibliometrics
published: 2024-02-16
 
This dataset contains five files. (i) open_citations_jan2024_pub_ids.csv.gz, open_citations_jan2024_iid_el.csv.gz, open_citations_jan2024_el.csv.gz, and open_citation_jan2024_pubs.csv.gz represent a conversion of Open Citations to an edge list using integer ids assigned by us. The integer ids can be mapped to omids, pmids, and dois using the open_citation_jan2024_pubs.csv and open_citations_jan2024_pub_ids.csv files. The network consists of 121,052,490 nodes and 1,962,840,983 edges. Code for generating these data can be found at https://github.com/chackoge/ERNIE_Plus/tree/master/OpenCitations. (ii) The fifth file, baseline2024.csv.gz, provides information about the metadata of PubMed papers. A 2024 version of PubMed was downloaded using Entrez and parsed into a table restricted to records that contain a pmid, a doi, a title, and an abstract. A value of 1 in a column indicates that the information exists in the metadata, and a zero indicates otherwise. Code for generating this data: https://github.com/illinois-or-research-analytics/pubmed_etl. If you use these data or code in your work, please cite https://doi.org/10.13012/B2IDB-5216575_V1.
keywords: PubMed
published: 2024-07-29
 
This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15. The dataset comprises two compressed (.xz) files.
1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors)
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).
2) filename: citation_table.tsv.xz. This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.
Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541
The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/
keywords: citation networks; Open Alex
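Because the citation edgelist is an .xz-compressed, headerless, two-column table of citing and cited integer_id values, it can be streamed line by line without decompressing it to disk. The sketch below is illustrative only: the function name `count_citations` is hypothetical, the tab delimiter is assumed from the .tsv extension, and the demo bytes are made up so the example runs on its own.

```python
import io
import lzma
from collections import Counter


def count_citations(xz_source):
    """Count how often each integer_id appears in the cited column.

    xz_source may be a path or a binary file object holding xz data.
    Assumption: columns are tab-separated, citing first and cited second.
    """
    cited_counts = Counter()
    with lzma.open(xz_source, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            citing, cited = line.split("\t")
            cited_counts[int(cited)] += 1
    return cited_counts


# Made-up three-edge table, compressed in memory to mimic the file format.
demo = lzma.compress(b"1\t3\n2\t3\n3\t4\n")
counts = count_citations(io.BytesIO(demo))
print(counts.most_common(1))
```

Streaming keeps memory use proportional to the number of distinct cited documents rather than the number of edges, which matters for a table with roughly two billion rows.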
published: 2025-08-16
 
The data within consist of compressed output files in the form of edgelists (*.edgelist.gz) and nodelists (*.aux.parquet) from large citation network simulations using an agent-based model. The code and instructions are available at: https://github.com/illinois-or-research-analytics/SASCA. In addition, we provide a distribution of citation frequencies drawn from a random sample of PubMed journal articles (pooled_50k_pubmed_unique.csv) and a table of recencies, that is, the frequency with which citations are made to the previous year, the year before that, and so on (recency_probs_percent_stahl_filled.csv). A manuscript describing the SASCA-s simulator has been submitted for review and will be referenced in a future version of this data repository if it is accepted. The prefixes sj and er refer to the real-world network and the Erdos-Renyi random graph, respectively, that were used to initiate simulations. These 'seed' networks are available from the GitHub site referenced above.
keywords: benchmark networks; agent-based models; simulation; citation
published: 2025-08-17
 
These codes implement the master equation microkinetic modeling (ME-MKM) calculations of Adams et al. (J. Phys. Chem. C 2025, 129, 15, 7285–7294), as well as the automatic derivatives for activation energies and reaction orders in their follow-up work (in review).
keywords: Microkinetic model; master equation; periodic tiling; catalysis; adsorption
published: 2025-09-08
 
This is the data set for the article entitled "Pollinator seed mixes are phenologically dissimilar to prairie remnants," a manuscript pending publication in Restoration Ecology. This represents the core phenology data of prairie remnant and pollinator seed mixes that were used for the main analyses. Note that additional data associated with the manuscript are intended to be published as a supplement in the journal. * In this V2, a second tab was added to the Rest.Ecol.data.xlsx file. This new sheet lists original data source citations that match the RELIX database, a sister project.
keywords: native plants; ecological restoration; tallgrass prairie; native plant materials
published: 2025-09-08
 
Purpose-grown perennial herbaceous species are nonfood crops specifically cultivated for bioenergy production and have the potential to secure bioenergy feedstock resources while enhancing ecosystem services. This study assessed soil greenhouse gas emissions (CO2 and N2O), nitrate (NO3-N) leaching reduction potential, evapotranspiration (ET), and water-use efficiency (WUE) of bioenergy switchgrass (Panicum virgatum L.) in comparison to corn (Zea mays L.). The study was conducted on field-scale plots in Urbana, IL, during the 2020–2022 growing seasons. Switchgrass was established in 2020 and urea-fertilized at 56 kg N ha−1 year−1. Corn management followed best management practices for the US Midwest, including no-till and 202 kg N ha−1 year−1 fertilization, applied as urea–ammonium nitrate (32%). Our results showed lower direct N2O emissions in switchgrass compared to corn. Although soil CO2 emissions did not differ significantly during the establishment year, emissions in subsequent years were over 50% higher in switchgrass than in corn, likely due to increased belowground biomass, which was over five times higher in switchgrass. Nitrate-N leaching decreased as the switchgrass stand matured, reaching 80% lower than in corn by the third year. Differences in ET and WUE between corn and switchgrass were not significant; however, results indicate a trend toward reduced WUE in switchgrass under drought, driven by lower aboveground biomass production. Our study demonstrates that switchgrass can be implemented at a commercial scale without negatively impacting the hydrological cycle, while potentially reducing N losses through nitrate-N leaching and soil N2O emissions, and enhancing belowground C storage.
keywords: field data; perennial bioenergy grasses; soil; switchgrass
published: 2025-09-08
 
Miscanthus x giganteus (Mxg) is a promising perennial crop for producing natural colorants, renewable fuels, and bioproducts. However, natural recalcitrance and high pretreatment cost are major barriers to their complete conversion. In this study, a green processing method has been investigated for efficient recovery of natural pigments (anthocyanins), fermentable sugars, and pure lignin from Mxg genotypes using choline chloride-based natural deep eutectic solvents (NADES) systems. Interestingly, choline chloride: lactic acid (ChCl: LA) NADES-processed biomass resulted in 67.8 ± 2.1 μg g−1 of anthocyanins from dry biomass. A maximum of 87.4%–94.1% glucose yield was achieved after enzymatic saccharification. The effective extraction of lignin with high purity with higher β-aryl ether (βO4) bonds from advanced crops is crucial for lignin valorization. Notably, highly pure lignin (≈93.4% ± 1.4%) is achieved after low-temperature NADES pretreatment while retaining lignin’s native structure. 31P nuclear magnetic resonance demonstrated that total phenolics for ChCl: LA-lignin resulted in 1.20 mmol g−1 hydroxyls. The relative monolignol composition of syringyl (S), guaiacyl (G), and p-hydroxyphenyl (H) is 19.0, 65.7, and 14.3%, respectively, as evidenced by heteronuclear single quantum coherence analysis. This study provides a novel approach for obtaining high-purity lignin for catalytic depolymerization for oligomers and bifunctional monoaromatics production and leverages current cellulosic biorefinery technologies.
keywords: biomass analytics; feedstock bioprocessing; inter-brc; miscanthus
published: 2025-09-06
 
4D-STEM datasets for solution-treated (CrCoNi)93Al4Ti2Nb MEA in [111], [112], and [114] zone. Data used for Ultramicroscopy article "Differentiating electron diffuse scattering via 4D-STEM spatial fluctuation and correlation analysis in complex FCC alloys". Experiment details can be found in the paper. Data-specific details are listed in the Readme file.
keywords: 4D-STEM; MEA; Electron Diffuse-Scattering; FluCor
published: 2025-05-27
 
This dataset contains all raw and processed data used to generate the figures in the main text and supplementary material of the paper "High dynamic-range quantum sensing of magnons and their dynamics using a superconducting qubit." The data can be used to reproduce the plots and validate the analysis. Accompanying Jupyter notebooks provide step-by-step analysis pipelines for figure generation. The dataset also includes drawings for the mechanical samples used to perform the experiment. In addition, the dataset provides ANSYS HFSS electromagnetic simulation files used to design and analyze the resonator structures and estimate field distributions.
keywords: superconducting qubit; magnon sensing; hybrid quantum systems; spin-photon coupling; magnon decay; cavity QED
published: 2025-05-21
 
Raw data of Auchenorrhyncha (Hemiptera) species presence and abundance from samples collected as part of Morgan Brown's M.S. thesis entitled "Investigating changes in Auchenorrhyncha (Hemiptera) communities in Illinois prairies over 25 years." Collection_Events_MBrown.pdf contains information that corresponds to each collection event code listed in the raw data files, including coordinates, date of collection, collection method, and name of collector. Each CSV file contains Auchenorrhyncha species presence and abundance data from each sampling area in Illinois: Route 45 Railroad Prairie, Richardson Wildlife Foundation, Mason County nature preserves, and Twelve Mile Prairie. Variables included in the CSV files:
Family: Taxonomic family to which each species belongs
Subfamily: Taxonomic subfamily to which each species belongs
Tribe: Taxonomic tribe to which each species belongs
Species: Lowest taxonomic level to which individuals were identified
The first row, from column 5 to the end, contains collection event codes that correspond to the codes listed in the PDF.
* New in V2: The CSV files originally uploaded in V1 contained outdated species names. V2 provides updated CSV files with the corrected names.
keywords: Biodiversity; Entomology; Conservation
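The CSV layout described for the Auchenorrhyncha data (four taxonomy columns, then one column per collection event code) is a wide table that is often easier to analyze in long form. The following sketch shows one way to reshape it, under stated assumptions: the function name `to_long_form` is hypothetical, the sample rows and event codes are placeholders rather than real data, and it is assumed that cells hold integer abundances with empty cells meaning the species was not recorded.

```python
import csv
import io

TAXON_COLUMNS = 4  # Family, Subfamily, Tribe, Species


def to_long_form(handle):
    """Turn the wide species-by-event table into (species, event, count) rows.

    Assumption: every cell from column 5 onward is an integer abundance,
    and an empty cell means zero.
    """
    reader = csv.reader(handle)
    header = next(reader)
    event_codes = header[TAXON_COLUMNS:]  # collection event codes
    records = []
    for row in reader:
        species = row[TAXON_COLUMNS - 1]
        for code, value in zip(event_codes, row[TAXON_COLUMNS:]):
            count = int(value) if value.strip() else 0
            if count > 0:
                records.append((species, code, count))
    return records


# Placeholder table; names and event codes are invented for illustration.
sample = io.StringIO(
    "Family,Subfamily,Tribe,Species,EV01,EV02\n"
    "FamA,SubA,TribeA,Species one,3,0\n"
    "FamA,SubA,TribeB,Species two,,5\n"
)
records = to_long_form(sample)
print(records)
```

Each event code in the long-form rows can then be matched against the collection event details in Collection_Events_MBrown.pdf.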