Illinois Data Bank Dataset Search Results
Results
published:
2021-06-28
Shen, Chengze; Zaharias, Paul; Warnow, Tandy
(2021)
This dataset contains 1) the cleaned version of 11 CRW datasets, 2) RNASim10k dataset in high fragmentation and 3) three CRW datasets (16S.3, 16S.T, 16S.B.ALL) in high fragmentation.
keywords:
MAGUS;UPP;Multiple Sequence Alignment;PASTA;eHMMs
published:
2022-01-31
Dominguez, Francina
(2022)
This dataset contains results from WRF simulations over northern South America. The Orinoco Low-Level Jet (OLLJ) and the Cross-Equatorial Moisture Transport are important circulation structures of the climate of tropical South America. We explore the sensitivity of the OLLJ and cross-equatorial transport to the representation of surface fluxes and turbulence by using two different Land Surface Model (LSM) schemes (Noah and CLM) and three Planetary Boundary Layer (PBL) schemes (YSU, QNSE and MYNN).
keywords:
WRF; Orinoco LLJ; precipitation
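The sensitivity design described above pairs two LSM schemes with three PBL schemes. A minimal sketch of enumerating those pairings follows; it assumes a full factorial design, which the description does not confirm, and the variable names are illustrative rather than taken from the dataset.

```python
from itertools import product

# Scheme names as given in the dataset description.
LSM_SCHEMES = ["Noah", "CLM"]
PBL_SCHEMES = ["YSU", "QNSE", "MYNN"]

# Assumption: every LSM scheme is combined with every PBL scheme.
runs = list(product(LSM_SCHEMES, PBL_SCHEMES))

for lsm, pbl in runs:
    print(f"LSM={lsm:<4} PBL={pbl}")
```

Under that assumption the design has six configurations.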
published:
2024-07-09
Storms, Suzanna; Shisler, Joanna; Nguyen, Thanh H.; Zuckermann, Federico; Lowe, James
(2024)
This dataset includes the RT-PCR results, RT-LAMP results, and the minutes-to-positive ROC curve calculations. This dataset includes data for the synthetic gBlock, cell culture, and clinical sample assays (nasal swabs and nasal wipes). Also included is a list of FDA-approved point-of-care tests for influenza A virus to date (2-16-2024). MIQE guidelines are also included.
published:
2024-08-17
Storms, Suzanna; Leonardi-Cattolica, Antonio; Prezioso, Tara; Varga, Csaba; Wang, Leyi; Lowe, James
(2024)
This dataset includes the RT-PCR shedding data and primers used for whole genome sequencing of Influenza A virus in swine. It also includes the GenBank accession numbers for all segments generated by Influenza A virus sequencing from nasal swab samples. Additionally, all nucleotide changes are listed by sample.
published:
2024-11-15
BL30K is a synthetic dataset rendered using Blender with ShapeNet data. We break the dataset into six segments, each with approximately 5K videos. The videos are organized in a format similar to DAVIS and YouTubeVOS, so dataloaders for those datasets can be used directly. Each video is 160 frames long, and each frame has a resolution of 768*512. There are 3-5 objects per video, and each object has a random smooth trajectory. We tried to optimize the trajectories in a greedy fashion to minimize object intersection (not guaranteed), so occlusions are still possible (they happen often in reality). See [Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion (MiVOS), CVPR 2021] for details.
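As a rough sizing aid, the figures in the description above can be multiplied out. This is a back-of-the-envelope sketch only: the per-segment video count is approximate in the source, so the totals are approximate too, and the constant names are illustrative.

```python
# Figures taken from the dataset description; VIDEOS_PER_SEGMENT is approximate.
SEGMENTS = 6
VIDEOS_PER_SEGMENT = 5_000
FRAMES_PER_VIDEO = 160
WIDTH, HEIGHT = 768, 512

total_videos = SEGMENTS * VIDEOS_PER_SEGMENT
total_frames = total_videos * FRAMES_PER_VIDEO
pixels_per_frame = WIDTH * HEIGHT

print(f"videos ~ {total_videos:,}")
print(f"frames ~ {total_frames:,}")
print(f"pixels per frame = {pixels_per_frame:,}")
```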
published:
2025-10-01
Lyu, Mingkuan; Kong, Linggen; Yang, Zhenglin; Wu, Yuting; McGhee, Claire E.; Lu, Yi
(2025)
DNAzymes have been widely used in many sensing and imaging applications but have rarely been used for genetic engineering since their discovery in 1994, because their substrate scope is mostly limited to single-stranded DNA or RNA, whereas genetic information is stored mostly in double-stranded DNA (dsDNA). To overcome this major limitation, we herein report peptide nucleic acid (PNA)-assisted double-stranded DNA nicking by DNAzymes (PANDA) as the first example to expand DNAzyme activity toward dsDNA. We show that PANDA is programmable in efficiently nicking or causing double strand breaks on target dsDNA, which mimics protein nucleases and can act as restriction enzymes in molecular cloning. In addition to being much smaller than protein enzymes, PANDA has a higher sequence fidelity compared with CRISPR/Cas under the condition we tested, demonstrating its potential as a novel alternative tool for genetic engineering and other biochemical applications.
keywords:
Conversion;Genomics;Genome Engineering
published:
2021-07-20
Fu, Yuanxi; Schneider, Jodi
(2021)
This dataset contains data from extreme-disagreement analysis described in paper “Aaron M. Cohen, Jodi Schneider, Yuanxi Fu, Marian S. McDonagh, Prerna Das, Arthur W. Holt, Neil R. Smalheiser, 2021, Fifty Ways to Tag your Pubtypes: Multi-Tagger, a Set of Probabilistic Publication Type and Study Design Taggers to Support Biomedical Indexing and Evidence-Based Medicine.” In this analysis, our team experts carried out an independent formal review and consensus process for extreme disagreements between MEDLINE indexing and model predictive scores. “Extreme disagreements” included two situations: (1) an abstract was MEDLINE indexed as a publication type but received low scores for this publication type, and (2) an abstract received high scores for a publication type but lacked the corresponding MEDLINE index term. “High predictive score” is defined as the top 100 high-scoring, and “low predictive score” is defined as the bottom 100 low-scoring. Three publication types were analyzed, which are CASE_CONTROL_STUDY, COHORT_STUDY, and CROSS_SECTIONAL_STUDY. Results were recorded in three Excel workbooks, named after the publication types: case_control_study.xlsx, cohort_study.xlsx, and cross_sectional_study.xlsx.
The analysis shows that, when the tagger gave a high predictive score (>0.9) on articles that lacked a corresponding MEDLINE indexing term, independent review suggested that the model assignment was correct in almost all cases: CROSS_SECTIONAL_STUDY (99%), CASE_CONTROL_STUDY (94.9%), and COHORT_STUDY (92.2%). Conversely, when articles received MEDLINE indexing but model predictive scores were very low (<0.1), independent review suggested that the model assignment was correct in the majority of cases: CASE_CONTROL_STUDY (85.4%), COHORT_STUDY (76.3%), and CROSS_SECTIONAL_STUDY (53.6%).
Based on the extreme disagreement analysis, we identified a number of false-positives (FPs) and false-negatives (FNs). For case control study, there were 5 FPs and 14 FNs. For cohort study, there were 7 FPs and 22 FNs. For cross-sectional study, there were 1 FP and 45 FNs. We reviewed and grouped them based on patterns noticed, providing clues for further improving the models. This dataset reports the instances of FPs and FNs along with their categorizations.
keywords:
biomedical informatics; machine learning; evidence based medicine; text mining
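The false-positive and false-negative counts reported in the description above can be tallied per publication type. The sketch below only restates those published counts; the dictionary layout is an illustrative assumption and does not reflect the structure of the Excel workbooks.

```python
# Counts copied from the dataset description: (false positives, false negatives).
ERROR_COUNTS = {
    "CASE_CONTROL_STUDY": (5, 14),
    "COHORT_STUDY": (7, 22),
    "CROSS_SECTIONAL_STUDY": (1, 45),
}

total_fp = sum(fp for fp, _ in ERROR_COUNTS.values())
total_fn = sum(fn for _, fn in ERROR_COUNTS.values())

for pub_type, (fp, fn) in ERROR_COUNTS.items():
    print(f"{pub_type}: {fp} FPs, {fn} FNs, {fp + fn} total")
print(f"All types: {total_fp} FPs, {total_fn} FNs")
```

Across the three publication types, false negatives outnumber false positives, which matches the per-type counts given in the description.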
published:
2025-06-03
Okyem, Samuel; Trinklein, Timothy; Rubakhin, Stanislav; Sweedler, Jonathan
(2025)
This dataset contains peptide imaging data obtained by matrix-assisted laser desorption/ionization trapped ion mobility (MALDI-TIMS) from the central nervous system and select ganglia of Aplysia californica.
keywords:
Neuropeptides, Isomerization, D-amino acids, MALDI-TIMS
published:
2025-12-01
Mori, Jameson; Zilinger, Amber; Neumann, Julia; Pentrak, Martin; Paton, Tim; Novakofski, Jan; Mateus-Pinilla, Nohra
(2025)
This dataset contains measurements of the following soil components from soil samples collected in northern Illinois between 2023 and 2024. Two file formats containing the same data are offered (Excel spreadsheet and CSV):
1. Soil clay minerals (illite, kaolinite, chlorite, and smectite)
2. pH
3. Other soil minerals: aluminum (Al), arsenic (As), barium (Ba), boron aluminide (Bal), calcium (Ca), cadmium (Cd), chloride (Cl), cobalt (Co), chromium (Cr), copper (Cu), iron (Fe), magnesium (Mg), manganese (Mn), mercury (Hg), molybdenum (Mo), niobium (Nb), nickel (Ni), potassium (K), phosphorus (P), lead (Pb), palladium (Pd), rubidium (Rb), silver (Ag), sulfur (S), thorium (Th), titanium (Ti), uranium (U), vanadium (V), yttrium (Y), zinc (Zn), and zirconium (Zr)
Samples were collected on the side of public roads within the right of way. X-ray diffraction was used to quantify soil clay components, while other soil minerals were measured using a Niton XL5 Plus Analyzer. pH was measured using a Yinmik YK-S01 Digital Soil pH Tester. Samples were collected as part of a project funded by the United States Department of Agriculture Animal and Plant Health Inspection Service (USDA-APHIS) to examine the role of soil characteristics in chronic wasting disease (CWD) persistence in northern Illinois, USA.
keywords:
CWD; chronic wasting disease; soil; clay; pH; mineral; environmental transmission; X-ray diffraction
published:
2025-12-08
Maitra, Shraddha; Viswanathan, Mothi Bharath; Park, Kiyoul; Kannan, Baskaran; Cano Alfanar, Sofia; McCoy, Scott M.; Cahoon, Edgar; Altpeter, Fredy; Leakey, Andrew; Singh, Vijay
(2025)
Plant oils are increasingly in demand as renewable feedstocks for biodiesel and biochemicals. Currently, oilseeds are the primary source of plant oils. Although the vegetative tissues of plants express lipid metabolism pathways, they do not hyper-accumulate lipids. Elevated synthesis, storage, and accumulation of lipids in vegetative tissues have been achieved by metabolic engineering of sugarcane to produce “oilcane.” This study evaluates the potential of oilcane as a renewable feedstock for the co-production of lipids and fermentable sugars. Oilcane was grown under favorable climatic and field conditions in Florida (FLOC) as well as during an abbreviated growing season, outside its typical growing region, in Illinois (ILOC). The potential lipid yield of 0.35 tons/ha was projected from the hyperaccumulation of fatty acids in the stored vegetative biomass of FLOC, which is approaching the lipid yield of soybean (0.44 tons/ha). Processing of the vegetative tissues of oilcane recovered 0.20 tons/ha, which represents the recovery of 55% of the total lipids from FLOC. Chemical-free hydrothermal bioprocessing of ILOC and FLOC bagasse and leaves at 180 °C for 10 min prevented the degeneration of in situ plant lipids. This allowed the recovery of lipids at the end of the bioprocess with a major fraction of lipids remaining in the biomass residues after pretreatment and saccharification. Improvements through refined biomass processing, crop management, and metabolic engineering are expected to boost lipid yields and make oilcane a prime feedstock for the production of biodiesel.
keywords:
Conversion;Feedstock Production;Feedstock Bioprocessing;Lipidomics;Metabolomics
published:
2021-05-10
This dataset contains data used in publication "Institutional Data Repository Development, a Moving Target" submitted to Code4Lib Journal. It is a tabular data file describing attributes of data files in datasets published in Illinois Data Bank 2016-04-01 to 2021-04-01.
keywords:
institutional repository
published:
2022-01-27
Li, Shuai; Moller, Christopher A.; Mitchell, Noah G.; Lee, DoKyoung; Sacks, Erik J.; Ainsworth, Elizabeth A.
(2022)
Twenty-two genotypes of C4 species grown under ambient and elevated O3 concentration were studied at the SoyFACE (40°02’N, 88°14’W) in 2019. This dataset contains leaf morphology, photosynthesis and nutrient contents measured at three time points. The results of CO2 response curves are also included.
keywords:
C4, O3, photosynthesis
published:
2024-10-18
Exhaustive species inventory of a suburban wetland complex in northeast Ohio (Cuyahoga County).
keywords:
floristic survey; wetland complex; comprehensive species list
published:
2019-10-03
Choi, Sang Hyun; Rao, Vikyath D.; Gernat, Tim; Hamilton, Adam R.; Robinson, Gene E.; Goldenfeld, Nigel
(2019)
Dataset for F2F events of honeybees. F2F events are defined as face-to-face encounters of two honeybees that are close in distance and facing each other but not connected by the proboscis, thus not engaging in trophallaxis.
The first and second columns show the unique IDs of the honeybees participating in each F2F event. The third column shows the time at which the F2F event started, and the fourth column shows the time at which it ended. Each time is a Unix epoch timestamp in milliseconds.
keywords:
honeybee;face-to-face interaction
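The four-column layout described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the file is assumed to be comma-delimited with no header row, which the description does not specify, and the sample rows and the `parse_f2f` helper are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical sample rows: bee ID 1, bee ID 2, start time (ms), end time (ms).
SAMPLE = (
    "12,47,1500000000000,1500000002500\n"
    "47,98,1500000010000,1500000010400\n"
)

def parse_f2f(text, delimiter=","):
    """Parse F2F rows into dicts with UTC datetimes and durations in seconds."""
    events = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        start_ms, end_ms = int(row[2]), int(row[3])
        events.append({
            "bee_a": int(row[0]),
            "bee_b": int(row[1]),
            # Unix epoch milliseconds -> timezone-aware UTC datetime.
            "start": datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc),
            "end": datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc),
            "duration_s": (end_ms - start_ms) / 1000,
        })
    return events

events = parse_f2f(SAMPLE)
for e in events:
    print(e["bee_a"], e["bee_b"], e["start"].isoformat(), e["duration_s"])
```

Dividing by 1000 before calling `datetime.fromtimestamp` matters because that function expects seconds, not milliseconds.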
published:
2021-05-14
Liu, Menglin; Gramig, Benjamin
(2021)
Please cite as: Menglin Liu and Benjamin M. Gramig. "Survey of Cover Crop, Conservation Tillage and Nutrient Management Practice Usage in Illinois and 2020 Fall Covers for Spring Savings Crop Insurance Discount Program Participation." Report to the Illinois Department of Agriculture and Fall Covers for Spring Savings working group. Center for the Economics of Sustainability and Department of Agricultural and Consumer Economics, University of Illinois at Urbana-Champaign. 2021. https://doi.org/10.13012/B2IDB-5222984_V1
keywords:
cover crops; Illinois; 2020; conservation tillage; nutrient management practices; farmer survey; NLRS
published:
2025-10-17
Cai, Yingqi; Zhai, Zhiyang; Blanford, Jantana; Liu, Hui; Shi, Hai; Schwender, Jorg; Xu, Changcheng; Shanklin, John
(2025)
Storage lipids (mostly triacylglycerols, TAGs) serve as an important energy and carbon reserve in plants, and hyperaccumulation of TAG in vegetative tissues can have negative effects on plant growth. Purple acid phosphatase2 (PAP2) was previously shown to affect carbon metabolism and boost plant growth. However, the effects of PAP2 on lipid metabolism remain unknown. Here, we demonstrated that PAP2 can stimulate a futile cycle of fatty acid (FA) synthesis and degradation, and mitigate negative growth effects associated with high accumulation of TAG in vegetative tissues. Constitutive expression of PAP2 in Arabidopsis thaliana enhanced both lipid synthesis and degradation in leaves and led to a substantial increase in seed oil yield. Suppressing lipid degradation in a PAP2-overexpressing line by disrupting sugar-dependent1 (SDP1), a predominant TAG lipase, significantly elevated vegetative TAG content and improved plant growth. Diverting FAs from membrane lipids to TAGs in PAP2-overexpressing plants by constitutively expressing phospholipid:diacylglycerol acyltransferase1 (PDAT1) greatly increased TAG content in vegetative tissues without compromising biomass yield. These results highlight the potential of combining PAP2 with TAG-promoting factors to enhance carbon assimilation, FA synthesis and allocation to TAGs for optimized plant growth and storage lipid accumulation in vegetative tissues.
keywords:
Feedstock Production;Biomass Analytics;Lipidomics
published:
2025-10-17
Deewan, Anshu; Liu, Jing-Jing; Jagtap, Sujit Sadashiv; Yun, Eun Ju; Walukiewicz, Hanna E.; Jin, Yong-Su; Rao, Christopher V.
(2025)
Oleaginous yeasts have received significant attention due to their substantial lipid storage capability. The accumulated lipids can be utilized directly or processed into various bioproducts and biofuels. Lipomyces starkeyi is an oleaginous yeast capable of using multiple plant-based sugars, such as glucose, xylose, and cellobiose. It is, however, a relatively unexplored yeast due to limited knowledge about its physiology. In this study, we have evaluated the growth of L. starkeyi on different sugars and performed transcriptomic and metabolomic analyses to understand the underlying mechanisms of sugar metabolism. Principal component analysis showed clear differences resulting from growth on different sugars. We have further reported various metabolic pathways activated during growth on these sugars. We also observed non-specific regulation in L. starkeyi and have updated the gene annotations for the NRRL Y-11557 strain. This analysis provides a foundation for understanding the metabolism of these plant-based sugars and potentially valuable information to guide the metabolic engineering of L. starkeyi to produce bioproducts and biofuels.
keywords:
Conversion;Metabolomics;Transcriptomics
published:
2019-08-29
Nardulli, Peter; Peyton, Buddy; Bajjalieh, Joseph; Singh, Ajay; Martin, Michael; Shalmon, Dan; Althaus, Scott
(2019)
This is part of the Cline Center’s ongoing Social, Political and Economic Event Database (SPEED) project. Each observation represents an event involving civil unrest, repression, or political violence in Sierra Leone, Liberia, and the Philippines (1979-2009). These data were produced in an effort to describe the relationship between exploitation of natural resources and civil conflict, and to identify policy interventions that might address resource-related grievances and mitigate civil strife.
This work is the result of a collaboration between the US Army Corps of Engineers’ Construction Engineering Research Laboratory (ERDC-CERL), the Swedish Defence Research Agency (FOI) and the Cline Center for Advanced Social Research (CCASR). The project team selected case studies focused on nations with a long history of civil conflict, as well as lucrative natural resources.
The Cline Center extracted these events from country-specific articles published in English by the British Broadcasting Corporation (BBC) Summary of World Broadcasts (SWB) from 1979-2008 and the CIA’s Foreign Broadcast Information Service (FBIS) 1999-2004. Articles were selected if they mentioned a country of interest, and were tagged as relevant by a Cline Center-built machine learning-based classification algorithm. Trained analysts extracted nearly 10,000 events from nearly 5,000 documents. The codebook—available in PDF form below—describes the data and production process in greater detail.
keywords:
Cline Center for Advanced Social Research; civil unrest; Social Political Economic Event Dataset (SPEED); political; event data; war; conflict; protest; violence; social; SPEED; Cline Center; Political Science
published:
2021-03-31
This archive contains the datasets used in the paper "Recursive MAGUS: scalable and accurate multiple sequence alignment".
- 16S.3, 16S.T, 16S.B.ALL
- HomFam
- RNASim
These can also be found at https://sites.google.com/eng.ucsd.edu/datasets/alignment/pastaupp
published:
2022-01-20
This dataset provides a 50-state (and DC) survey of state-level tax credits modeled after the federal New Markets Tax Credit program, including summaries of the tax credit amount and credit periods, key definitions, eligibility criteria, application process, and degree of conformity to federal law.
keywords:
New Markets Tax Credits; NMTC; tax incentives; state law
published:
2022-01-20
This dataset provides a 50-state (and DC) survey of state-level enterprise zone laws, including summaries and analyses of zone eligibility criteria, eligible investments, incentives to invest in human capital and affordable housing, and taxpayer eligibility.
keywords:
Enterprise Zones; tax incentives; state law
published:
2025-09-18
Saifuddin, Mustafa; Bhatnagar, Jennifer; Segrè, Daniel; Finzi, Adrien C.
(2025)
Respiration by soil bacteria and fungi is one of the largest fluxes of carbon (C) from the land surface. Although this flux is a direct product of microbial metabolism, controls over metabolism and their responses to global change are a major uncertainty in the global C cycle. Here, we explore an in silico approach to predict bacterial C-use efficiency (CUE) for over 200 species using genome-specific constraint-based metabolic modeling. We find that potential CUE averages 0.62 ± 0.17 with a range of 0.22 to 0.98 across taxa and phylogenetic structuring at the subphylum levels. Potential CUE is negatively correlated with genome size, while taxa with larger genomes are able to access a wider variety of C substrates. Incorporating the range of CUE values reported here into a next-generation model of soil biogeochemistry suggests that these differences in physiology across microbial taxa can feed back on soil-C cycling.
keywords:
Sustainability;Metabolomics;Modeling
published:
2021-08-28
Southey, Bruce; Rodriguez-Zas, Sandra
(2021)
Metabolite identifications and profiles of liver samples from 22-day-old male and female pigs born to gilts that were exposed to porcine reproductive and respiratory syndrome virus (P) or not (C), and that were weaned at 21 days of age (W) or not (N). Profiles were obtained by the University of Illinois Carver Metabolomics Center. The spectrum for each sample was acquired using a gas chromatography mass spectrometry system consisting of an Agilent 7890 gas chromatograph, an Agilent 5975 MSD, and an HP 7683B auto sampler.
keywords:
gas chromatography; mass spectrometry; maternal immune activation; weaning; liver
published:
2025-01-29
Quiroz, Edwin; Ashley, Mary V.; Zaya, David N.
(2025)
These data record weekly aphid and monarch butterfly (Danaus plexippus) neonate counts on individual milkweed plants in multiple raised garden beds in Chicago during the summers of 2023 and 2024. Relationships between aphid infestation and monarch neonates can be investigated, along with weekly trends in monarch oviposition and aphid abundance. All gardens included in this study were on the University of Illinois Chicago campus and within 100 meters of one another. Data are provided on three milkweed species in 2023 and one milkweed species in 2024.
keywords:
Aphis; Myzocallis; Danaus plexippus; urban gardens; Asclepias syriaca; milkweeds
published:
2021-04-22
All code is in MATLAB .m scripts or functions (version R2019b).
Affiliated with the article “Temperate and chronic virus competition leads to low lysogen frequency” published in the Journal of Theoretical Biology (2021).
The code simulates and plots the solutions of an ordinary differential equation (ODE) model and generates bifurcation diagrams.