Illinois Data Bank
Displaying 301 - 325 of 754 in total
Illinois Data Bank Dataset Search Results
published: 2023-06-29
Pandit, Akshay; Karakoc, Deniz Berfin; Konar, Megan (2023): Data for: Spatially detailed agricultural and food trade between China and the United States. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3649756_V1
This database provides estimates of agricultural and food commodity flows [in both tons and $US] between the US and China for the year 2017. Pairwise information is provided between US states and Chinese provinces, and US counties and Chinese provinces for 7 Standardized Classification of Transported Goods (SCTG) commodity categories. Additionally, crosswalks are provided to match Harmonized System (HS) codes and China's Multi-Regional Input Output (MRIO) commodity sectors to their corresponding SCTG commodity codes. The included SCTG commodities are:
- SCTG 01: live animals and fish
- SCTG 02: cereal grains
- SCTG 03: agricultural products (except for animal feed, cereal grains, and forage products)
- SCTG 04: animal feed, eggs, honey, and other products of animal origin
- SCTG 05: meat, poultry, fish, seafood, and their preparations
- SCTG 06: milled grain products and preparations, and bakery products
- SCTG 07: other prepared foodstuffs, fats and oils
For additional information, please see the related paper by Pandit et al. (2022) in Environmental Research Letters. ADD DOI WHEN RECEIVED
keywords:
Food flows; High-resolution; County-scale; Bilateral; United States; China
published: 2024-03-25
Suski, Cory; Dai, Qihong (2024): Data for "Differing physiological performance of coexisting cool- and warmwater fish species under heatwaves in the Midwestern United States". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1022017_V1
This is the dataset for the manuscript titled, "Differing physiological performance of coexisting cool- and warmwater fish species under heatwaves in the Midwestern United States"
keywords:
climate change; heat wave; metabolic rate; swimming; predator-prey interaction; thermal tolerance; Sander vitreus; walleye; largemouth bass; species distributions
published: 2024-01-19
Digrado, Anthony; Montes, Christopher; Baxter, Ivan; Ainsworth, Elizabeth (2024): Soybean seed quality response to eCO2 data files. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6453957_V2
This data set is related to a SoyFACE experiment conducted in 2004, 2006, 2007, and 2008 with the soybean cultivars Loda and HS93-4118. The experiment looked at how seed elements were affected by elevated CO2 and yield. In this V2, 2 new files were added per journal requirement. In total, there are 5 data files in text format within digrado_et_al_gcb_data_V2 and 1 readme file. The file names are listed below. Details about headers are explained in the readme.txt file.
1. ionomic_data.txt contains the ionomic data (mg/kg) for the two cultivars. The file contains all six technical replicates for each plot. The cultivar, year, treatment, and the plot from which the samples were collected are given for each entry.
2. yield_data.txt contains the yield data for the two cultivars (seed yield in kg/ha, seed yield in bu/a, Protein (%), Oil (%)). The file contains yield data for every plot. The cultivar, year, treatment, and the plot from which the samples were collected are given for each entry.
3. mineral_pro_oil_yield.txt contains the yield per hectare for each mineral (g/ha) along with the yield per hectare for protein and oil (t/ha). This was obtained by multiplying the seed content of each element (minerals, protein, and oil) by the total seed yield. The file contains yield data for every plot. The cultivar, year, treatment, and the plot from which the samples were collected are given for each entry.
4. economic_assessment.txt contains data used to assess the financial impact of altered seed oil content on soybean oil production.
5. meteorological_data.txt contains the meteorological data recorded by a weather station located ~3 km from the experimental site (Willard Airport, Champaign). Data covering the period between May 28 and September 24 were used for 2004; between May 25 and September 24 for 2006; between May 23 and September 17 for 2007; and between June 16 and October 24 for 2008.
keywords:
protein; oil; mineral; SoyFACE; nutrient; Glycine max; soybean; yield; CO2; agriculture; climate change
published: 2024-03-28
Zhang, Yue; Zhao, Helin; Huang, Siyuan; Hossain, Mohhamad Abir; van der Zande, Arend (2024): Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process induced Strain. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7519929_V1
Readme file for the data repository
This repository has raw data for the publication "Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain". We arrange the data following the figure in which it first appeared. For all electrical transfer measurements, we provide the up-sweep and down-sweep data, with voltage in units of V and conductance in units of S. All Raman modes are in units of cm^-1.
How to use this dataset
All data in this dataset are stored in binary NumPy array format as .npy files. To read a .npy file, use the NumPy module for Python and its np.load() function. Example: suppose the filename is example_data.npy. To load it, run the following in a Jupyter notebook or a Python program: import numpy as np, then data = np.load("example_data.npy"). The contents of the example file are then stored in the data object.
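The loading step described in this readme can be sketched as below. This is a minimal illustration, assuming NumPy is installed; the helper names `load_npy` and `demo_round_trip` are hypothetical, and the small array written here is a placeholder so the sketch runs without the real dataset files.

```python
import os
import tempfile

import numpy as np


def load_npy(path):
    """Load one .npy array; allow_pickle=False restricts loading to plain arrays."""
    return np.load(path, allow_pickle=False)


def demo_round_trip():
    """Write a placeholder array to a temporary .npy file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        # "example_data.npy" is the example filename used in the readme.
        path = os.path.join(tmp, "example_data.npy")
        np.save(path, np.array([[0.0, 1.0], [2.0, 3.0]]))
        data = load_npy(path)
    return data.shape, data.dtype.name


if __name__ == "__main__":
    print(demo_round_trip())
```

With the real dataset, `load_npy` would be pointed at one of the deposited .npy files; checking `data.shape` first is a quick way to see how the up-sweep and down-sweep values are laid out.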
published: 2016-12-13
Fraebel, David T.; Kuehn, Seppe (2016): Sequencing data for motility selection experiments. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3958294_V1
BAM files for founding strain (MG1655-motile) as well as evolved strains from replicate motility selection experiments in low-viscosity agar plates containing either rich medium (LB) or minimal medium (M63+0.18mM galactose)
published: 2022-03-25
Shen, Chengze; Park, Minhyuk; Warnow, Tandy (2022): The 16S.B.ALL dataset in 100-HF condition. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6604429_V1
This upload includes the 16S.B.ALL in 100-HF condition (referred to as 16S.B.ALL-100-HF) used in Experiment 3 of the WITCH paper (currently accepted in principle by the Journal of Computational Biology). 100-HF condition refers to making sequences fragmentary with an average length of 100 bp and a standard deviation of 60 bp. Additionally, we required all fragmentary sequences to have lengths > 50 bp. Thus, the final average length of the fragments is slightly higher than 100 bp (~120 bp). In this case (i.e., 16S.B.ALL-100-HF), 1,000 sequences with lengths within 25% of the median length are retained as "backbone sequences", while the remaining sequences are considered "query sequences" and made fragmentary using the "100-HF" procedure. Backbone sequences are aligned using MAGUS (or we extract their reference alignment). Then, the fragmentary versions of the query sequences are added back to the backbone alignment using either MAGUS+UPP or WITCH. More details of the tar.gz file are described in README.txt.
keywords:
MAGUS;UPP;Multiple Sequence Alignment;eHMMs
published: 2016-06-06
Fegley, Brent D. (2016): Datasets for modeling collaborative formation and collaborative "success". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/J81Z429G
These datasets represent first-time collaborations between first and last authors (with mutually exclusive publication histories) on papers with 2 to 5 authors in years [1988,2009] in PubMed. Each record of each dataset captures aspects of the similarity, nearness, and complementarity between two authors about the paper marking the formation of their collaboration.
published: 2018-05-06
Sukenik, Shahar; Salam, Mohammed; Wang, Yuhan; Gruebele, Martin (2018): Dataset for: In-cell titration of small solutes controls protein stability and aggregation. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4308433_V1
This deposit contains all raw data and analysis from the paper "In-cell titration of small solutes controls protein stability and aggregation". Data is collected into several types:
1) analysis*.tar.gz are the analysis scripts and the resulting data for each cell. The numbers correspond to the numbers shown in Fig. S1 (in publication).
2) scripts.tar.gz contains helper scripts to create the dataset in bash format.
3) input.tar.gz contains headers and other information that is fed into bash scripts to create the dataset.
4) All rawData*.tar.gz are tarballs of the data of cells in different solutes in .mat files readable by MATLAB, as follows:
- Each experiment included in the publication is represented by two MATLAB files: (1) a calibration jump under amber illumination (_calib.mat suffix) and (2) a full jump under blue illumination (FRET data).
- Each file contains the following fields:
coordleft - coordinates of cropped and aligned acceptor channel on the original image
coordright - coordinates of cropped and aligned donor channel on the original image
dataleft - a 3d 12-bit integer matrix containing acceptor channel fluorescence for each pixel and time step. Not available in _calib files
dataright - a 3d 12-bit integer matrix containing donor channel fluorescence for each pixel and time step. This will be mCherry in _calib files and AcGFP in data files.
frame1 - original image size
imgstd - cropped dimensions
numFrames - number of frames in dataleft and dataright
videos - a structure file containing camera data. Specifically, videos.TimeStamp includes the time from each frame.
keywords:
Live cell; FRET microscopy; osmotic challenge; intracellular titrations; protein dynamics
published: 2022-08-08
Shen, Chengze; Liu, Baqiao; Williams, Kelly P.; Warnow, Tandy (2022): Datasets for EMMA: A New Method for Computing Multiple Sequence Alignments given a Constraint Subset Alignment. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2567453_V1
This upload contains all datasets used in Experiment 2 of the EMMA paper (appeared in WABI 2023): Shen, Chengze, Baqiao Liu, Kelly P. Williams, and Tandy Warnow. "EMMA: A New Method for Computing Multiple Sequence Alignments given a Constraint Subset Alignment". The zip file has the following structure (presented as an example):
salma_paper_datasets/
|_README.md
|_10aa/
|_crw/
|_homfam/
   |_aat/
   |  |_...
   |_...
|_het/
   |_5000M2-het/
   |  |_...
   |_5000M3-het/
   ...
|_rec_res/
Generally, the structure can be viewed as: [category]/[dataset]/[replicate]/[alignment files]
# Categories:
1. 10aa: There are 10 small biological protein datasets within the `10aa` directory, each with just one replicate.
2. crw: There are 5 selected CRW datasets, namely 5S.3, 5S.E, 5S.T, 16S.3, and 16S.T, each with one replicate. These are the cleaned version from Shen et al. 2022 (MAGUS+eHMM).
3. homfam: There are the 10 largest Homfam datasets, each with one replicate.
4. het: There are three newly simulated nucleotide datasets from this study, 5000M2-het, 5000M3-het, and 5000M4-het, each with 10 replicates.
5. rec_res: It contains the Rec and Res datasets. Detailed dataset generation can be found in the supplementary materials of the paper.
# Alignment files
There are at most 6 `.fasta` files in each sub-directory:
1. `all.unaln.fasta`: All unaligned sequences.
2. `all.aln.fasta`: Reference alignments of all sequences. If not all sequences have reference alignments, only the sequences that have them will be included.
3. `all-queries.unaln.fasta`: All unaligned query sequences. Query sequences are sequences that do not have lengths within 25% of the median length (i.e., not full-length sequences).
4. `all-queries.aln.fasta`: Reference alignments of query sequences. If not all queries have reference alignments, only the sequences that have them will be included.
5. `backbone.unaln.fasta`: All unaligned backbone sequences. Backbone sequences are sequences that have lengths within 25% of the median length (i.e., full-length sequences).
6. `backbone.aln.fasta`: Reference alignments of backbone sequences. If not all backbone sequences have reference alignments, only the sequences that have them will be included.
>If all sequences are full-length sequences, then `all-queries.unaln.fasta` will be missing.
>If fewer than two query sequences have reference alignments, then `all-queries.aln.fasta` will be missing.
>If fewer than two backbone sequences have reference alignments, then `backbone.aln.fasta` will be missing.
# Additional file(s)
1. `350378genomes.txt`: the file contains all 350,378 bacterial and archaeal genome names that were used by Prodigal (Hyatt et al. 2010) to search for protein sequences.
keywords:
SALMA;MAFFT;alignment;eHMM;sequence length heterogeneity
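The backbone/query rule described for the EMMA datasets above (sequences with lengths within 25% of the median are backbone sequences, the rest are queries) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_backbone_queries`, the inclusive bounds, and the use of raw unaligned sequence length are all assumptions.

```python
from statistics import median


def split_backbone_queries(seqs, tolerance=0.25):
    """Split {name: unaligned_sequence} into (backbone, queries) by length.

    A sequence is treated as backbone when its length lies within
    `tolerance` (25% by default) of the median length, bounds inclusive.
    """
    lengths = {name: len(seq) for name, seq in seqs.items()}
    med = median(lengths.values())
    low, high = (1 - tolerance) * med, (1 + tolerance) * med
    backbone = {n: s for n, s in seqs.items() if low <= lengths[n] <= high}
    queries = {n: s for n, s in seqs.items() if n not in backbone}
    return backbone, queries


if __name__ == "__main__":
    # Toy sequences with lengths 100, 110, 40, 200, and 90 (median 100).
    toy = {"a": "A" * 100, "b": "A" * 110, "c": "A" * 40,
           "d": "A" * 200, "e": "A" * 90}
    backbone, queries = split_backbone_queries(toy)
    print(sorted(backbone), sorted(queries))
```

For the toy input the window is 75 to 125 characters, so the two outliers (lengths 40 and 200) end up as queries.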
published: 2022-02-14
Yao, Yu; Curtis, Jeffrey; Ching, Joseph; Zheng, Zhonghua; Riemer, Nicole (2022): Data for: Quantifying the effects of mixing state on aerosol optical properties. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8157303_V1
This dataset contains simulation results from the numerical model PartMC-MOSAIC used in the article "Quantifying the effects of mixing state on aerosol optical properties". This article has been submitted to the journal Atmospheric Chemistry and Physics. There are 100 scenario directories in total in this dataset, numbered 00-99. Each scenario contains 25 NetCDF files of hourly output from PartMC-MOSAIC simulations containing the simulated gas and particle information. The data was produced using version 2.5.0 of PartMC-MOSAIC. Instructions to compile and run PartMC-MOSAIC are available at https://github.com/compdyn/partmc. The chemistry code MOSAIC is available by request from Rahul.Zaveri@pnl.gov. For more details on reproducing the cases, please contact nriemer@illinois.edu and yuyao3@illinois.edu.
keywords:
Aerosol mixing state; Aerosol optical properties; Mie calculation; Black Carbon
published: 2015-12-16
Nguyen, Nam-phuong; Mirarab, Siavash; Kumar, Keerthana; Warnow, Tandy (2015): Data for Ultra-Large Alignments Using Phylogeny-Aware Profiles. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3174395_V1
This dataset contains the data for PASTA and UPP. PASTA data was used in the following articles: Mirarab, Siavash, Nam Nguyen, Sheng Guo, Li-San Wang, Junhyong Kim, and Tandy Warnow. “PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.” Journal of Computational Biology 22, no. 5 (2015): 377–86. doi:10.1089/cmb.2014.0156. Mirarab, Siavash, Nam Nguyen, and Tandy Warnow. “PASTA: Ultra-Large Multiple Sequence Alignment.” Edited by Roded Sharan. Research in Computational Molecular Biology, 2014, 177–91. UPP data was used in: Nguyen, Nam-phuong D., Siavash Mirarab, Keerthana Kumar, and Tandy Warnow. “Ultra-Large Alignments Using Phylogeny-Aware Profiles.” Genome Biology 16, no. 1 (December 16, 2015): 124. doi:10.1186/s13059-015-0688-z.
published: 2019-08-13
Nowak, Jennifer E.; Sweet, Andrew D.; Weckstein, Jason D.; Johnson, Kevin P. (2019): Data for: A molecular phylogenetic analysis of the genera of fruit doves and their allies using dense taxonomic sampling. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9797270_V1
Multiple sequence alignments from concatenated nuclear and mitochondrial genes and resulting phylogenetic tree files of fruit doves and their close relatives. Files include: BEAST input XML file (fruit_dove_beast_input.xml); a maximum clade credibility tree from a BEAST analysis (fruit_dove_beast_mcc.tre); concatenated multiple sequence alignment NEXUS files for the novel dataset (fruit_dove_concatenated_alignment.nex, 76 taxa, 4,277 characters) and the dataset with additional sequences (fruit_dove_plus_cibois_data_concatenated_alignment.nex, 204 taxa, 4,277 characters), both of which contain a MrBayes block including partition information; and 50% majority-rule consensus trees generated from MrBayes analyses, using the NEXUS alignment files as inputs (fruit_dove_mrbayes_consensus.tre, fruit_dove_plus_cibois_data_mrbayes_consensus.tre).
keywords:
fruit doves; multiple sequence alignment; phylogeny; Aves: Columbidae
published: 2020-06-02
Xue, Qingquan; Dietrich, Christopher; Zhang, Yalin (2020): NEXUS file for phylogenetic analysis of Eurymelinae (Hemiptera: Cicadellidae). University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3573054_V1
The text file contains the original data used in the phylogenetic analyses of Xue et al. (2020: Systematic Entomology, in press). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 89 taxa (species) and 2676 characters, indicate that the first 2590 characters are DNA sequence and the last 86 are morphological, that gaps inserted into the DNA sequence alignment and inapplicable morphological characters are indicated by a dash, and that missing data are indicated by a question mark. The file contains aligned nucleotide sequence data for 5 gene regions and 86 morphological characters. The positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes at the end of the file (Subset1 = 16S gene; Subset2 = 28S gene; Subset3 = COI gene; Subset 4 = Histone H3 and H2A genes). The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the original publication. Descriptions of the morphological characters and more details on the species and specimens included in the dataset are provided in the supplementary document included as a separate pdf, also available from the journal website. The original raw DNA sequence data are available from NCBI GenBank under the accession numbers indicated in the supplementary file.
keywords:
phylogeny; DNA sequence; morphology; Insecta; Hemiptera; Cicadellidae; leafhopper; evolution; 28S rDNA; 16S rDNA; histone H3; histone H2A; cytochrome oxidase I; Bayesian analysis
published: 2020-02-12
Asplund, Joshua; Karahalios, Karrie (2020): Data for: Auditing Race and Gender Discrimination in Online Housing Markets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1408573_V1
This dataset contains the results of a three month audit of housing advertisements. It accompanies the 2020 ICWSM paper "Auditing Race and Gender Discrimination in Online Housing Markets". It covers data collected between Dec 7, 2018 and March 19, 2019. There are two json files in the dataset: The first contains a list of json objects representing advertisements separated by newlines. Each object includes the date and time it was collected, the image and title (if collected) of the ad, the page on which it was displayed, and the training treatment it received. The second file is a list of json objects representing a visit to a housing lister separated by newlines. Each object contains the url, training treatment applied, the location searched, and the metadata of the top sites scraped. This metadata includes location, price, and number of rooms. The dataset also includes the raw images of ads collected in order to code them by interest and targeting. These were captured by selenium and named using a perceptive hash to de-duplicate images.
keywords:
algorithmic audit; advertisement audit;
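Both JSON files in the housing audit dataset above are described as JSON objects separated by newlines, so they can be read one line at a time. The sketch below assumes exactly that layout; the function name `iter_json_lines` and the `title` field in the toy input are hypothetical, since the actual key names are not listed in the description.

```python
import json


def iter_json_lines(lines):
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


if __name__ == "__main__":
    # Toy input standing in for an opened file; real key names may differ.
    toy_lines = ['{"title": "ad 1"}\n', "\n", '{"title": "ad 2"}\n']
    for record in iter_json_lines(toy_lines):
        print(record["title"])
```

Because an open text file is itself an iterable of lines, the same function can be used as `iter_json_lines(open(path, encoding="utf-8"))` without loading the whole file into memory.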
published: 2021-10-27
de Jesús Astacio, Luis Miguel; Prabhakara, Kaumudi Hassan; Li, Zeqian; Mickalide, Harry; Kuehn, Seppe (2021): Closed microbial communities self-organize to persistently cycle carbon -- 16S Sequencing data. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8967648_V1
Shared dataset consists of 16S sequencing data of microbial communities. Each community is composed of heterotrophic bacteria derived from one of two soil samples and the model algae Chlamydomonas reinhardtii. Each community was placed in a materially closed environment with an initial supply of carbon in the media and subjected to light-dark cycles. The closed microbial ecosystems (CES) survived via carbon cycling. Each CES was subjected to rounds of dilution, after which the community was sequenced (data provided here). The shared dataset allowed us to conclude that CES consistently self-assembled to cycle carbon (data not provided) via conserved metabolic capabilities (data not provided) despite differences in taxonomic composition (data provided).
Naming convention: [soil sample = A or B][CES replicate = 1,2,3, or 4]_[round number = 1,2,3, or 4]_[reverse read = R or forward read = F]_filt.fastq
Example -- A1_r1_F_filt.fastq means soil sample A, CES replicate 1, end of round 1, forward read
keywords:
16S seq; .fastq; closed microbial ecosystems; carbon cycling
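The file naming convention given for the 16S dataset above can be parsed with a short regular expression. This is a sketch under assumptions: the example name `A1_r1_F_filt.fastq` carries an `r` before the round number while the written convention does not, so the pattern treats that prefix as optional; the function name and the returned key names are hypothetical.

```python
import re

# Pattern follows the stated convention; the "r" before the round number
# is optional because only the example filename shows it.
FILENAME_RE = re.compile(
    r"^(?P<soil>[AB])(?P<replicate>[1-4])_r?(?P<round>[1-4])"
    r"_(?P<read>[FR])_filt\.fastq$"
)


def parse_ces_filename(name):
    """Return the fields encoded in a file name, or None if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    return {
        "soil_sample": match.group("soil"),
        "replicate": int(match.group("replicate")),
        "round": int(match.group("round")),
        "read": "forward" if match.group("read") == "F" else "reverse",
    }


if __name__ == "__main__":
    print(parse_ces_filename("A1_r1_F_filt.fastq"))
```

Returning `None` for names that do not match makes it easy to skip unrelated files when walking a download directory.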
published: 2018-07-29
Molloy, Erin K.; Warnow, Tandy (2018): NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1424746_V1
This repository includes scripts, datasets, and supplementary materials for the study, "NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from GitHub: https://github.com/ekmolloy/njmerge.
***When downloading datasets, please note the following errors.***
In README.txt, lines 37 and 38 should read:
+ fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre
+ fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre
Note that the file names (fasttree-exon.tre and fasttree-intron.tre) are swapped.
In tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the "symmetric difference error rate" as the "Robinson-Foulds error rate". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.
In njmerge-supplementary-materials.pdf, the alpha parameter shown in Supplementary Table S2 is actually the divisor D, which is used to compute alpha for each gene as follows.
1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.
2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).
Note that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.
keywords:
phylogenomics; species trees; incomplete lineage sorting; divide-and-conquer
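The alpha computation described in the NJMerge erratum above (draw X uniformly between 0 and 1, then take -log(X) / D) can be sketched as below. This is an illustration only, not the authors' simulation code: the function names and the `DIVISORS` table layout are hypothetical, and the natural logarithm is assumed because it reproduces the quoted figures. The second function evaluates the formula at X = 0.5, the value the note above uses.

```python
import math
import random

# Divisor D per gene type, as stated in the description (Table S2).
DIVISORS = {"exon": 4.2, "UCE": 1.0, "intron": 0.4}


def alpha_from_uniform(x, divisor):
    """Alpha = -log(X) / D for a given uniform draw X in (0, 1]."""
    return -math.log(x) / divisor


def draw_alpha(gene_type, rng):
    """Draw X uniformly and convert it to alpha for the given gene type."""
    # rng.random() lies in [0, 1); 1 - value lies in (0, 1], which avoids log(0).
    x = 1.0 - rng.random()
    return alpha_from_uniform(x, DIVISORS[gene_type])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen only for repeatability
    for gene_type, divisor in DIVISORS.items():
        at_half = alpha_from_uniform(0.5, divisor)
        print(gene_type, round(at_half, 3), draw_alpha(gene_type, rng))
```

Evaluated at X = 0.5, the formula gives roughly 0.165, 0.693, and 1.733 for exons, UCEs, and introns, which agrees with the values quoted in the description to within rounding.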
published: 2024-01-04
Kim, Hyunchul; Zhao, Helin; van der Zande, Arend (2024): Stretchable thin-film transistors based on wrinkled graphene and MoS2 heterostructures. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7325893_V1
This data set includes all of the data related to stretchable TFTs based on 2D heterostructures, including optical images of TFTs, Raman and photoluminescence characterization data, transport measurement data, and AFM topography data. Abstract: Two-dimensional (2D) materials are outstanding candidates for stretchable electronics, but a significant challenge is their heterogeneous integration into stretchable geometries on soft substrates. Here, we demonstrate a strategy for stretchable thin film transistors (2D S-TFT) based on wrinkled heterostructures on elastomer substrates where 2D materials formed the gate, source, drain, and channel, and characterized them with Raman spectroscopy and transport measurements.
keywords:
2D materials; 2D heterostructures; Stretchable electronics; transistors; buckling engineering
published: 2014-10-29
Nguyen, Nam-phuong; Mirarab, Siavash; Bo, Liu; Pop, Mihai; Warnow, Tandy (2014): Data for Taxonomic Identification and Phylogenetic Profiling. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8783447_V1
This dataset provides the data for Nguyen, Nam-phuong, et al. "TIPP: taxonomic identification and phylogenetic profiling." Bioinformatics 30.24 (2014): 3548-3555.
published: 2023-03-08
Majeed, Fahd; Khanna, Madhu (2023): Code and Data for "Carbon Mitigation Payments Can Reduce the Riskiness of Bioenergy Crop Production". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6296964_V1
A stochastic domination analysis model was developed to examine the effect that emerging carbon markets can have on the spatially varying returns and risk profiles of bioenergy crops relative to conventional crops. The code is written in MATLAB, and includes the calculated output. See the README file for instructions to run the code.
keywords:
bioenergy crops; economic modeling; stochastic domination analysis model;
published: 2018-12-20
Dong, Xiaoru; Xie, Jingyi; Linh, Hoang (2018): Inclusion_Criteria_Annotation. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5958960_V1
File Name: Inclusion_Criteria_Annotation.csv
Data Preparation: Xiaoru Dong
Date of Preparation: 2018-12-14
Data Contributions: Jingyi Xie, Xiaoru Dong, Linh Hoang
Data Source: Cochrane systematic reviews published up to January 3, 2018 by 52 different Cochrane groups in 8 Cochrane group networks.
Associated Manuscript authors: Xiaoru Dong, Jingyi Xie, Linh Hoang, and Jodi Schneider.
Associated Manuscript, Working title: Machine classification of inclusion criteria from Cochrane systematic reviews.
Description: The file contains lists of inclusion criteria of Cochrane Systematic Reviews and the manual annotation results. 5420 inclusion criteria were annotated, out of 7158 inclusion criteria available. Annotations are either "Only RCTs" or "Others". There are 2 columns in the file:
- "Inclusion Criteria": Content of inclusion criteria of Cochrane Systematic Reviews.
- "Only RCTs": Manual annotation results, in which "x" means the inclusion criteria is classified as "Only RCTs". Blank means that the inclusion criteria is classified as "Others".
Notes:
1. "RCT" stands for Randomized Controlled Trial, which, by definition, is "a work that reports on a clinical trial that involves at least one test treatment and one control treatment, concurrent enrollment and follow-up of the test- and control-treated groups, and in which the treatments to be administered are selected by a random process, such as the use of a random-numbers table." [Randomized Controlled Trial publication type definition from https://www.nlm.nih.gov/mesh/pubtypes.html].
2. To reproduce the data relevant to this, please get the code of the project published on GitHub at https://github.com/XiaoruDong/InclusionCriteria and run the code following the instructions provided.
keywords:
Inclusion criteria, Randomized controlled trials, Machine learning, Systematic reviews
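The two-column layout described for Inclusion_Criteria_Annotation.csv above ("x" marks "Only RCTs", blank marks "Others") can be tallied with the standard `csv` module. This is a minimal sketch, assuming the header names are exactly "Inclusion Criteria" and "Only RCTs" as stated; the function name and the two toy rows are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Toy rows for illustration only; the real file has thousands of criteria.
SAMPLE_TEXT = (
    "Inclusion Criteria,Only RCTs\n"
    '"Randomised controlled trials only",x\n'
    '"Any study design",\n'
)


def count_annotations(csv_file):
    """Count rows marked "x" (Only RCTs) versus blank (Others)."""
    counts = {"Only RCTs": 0, "Others": 0}
    for row in csv.DictReader(csv_file):
        mark = (row.get("Only RCTs") or "").strip().lower()
        counts["Only RCTs" if mark == "x" else "Others"] += 1
    return counts


if __name__ == "__main__":
    print(count_annotations(io.StringIO(SAMPLE_TEXT)))
```

For the real file, passing `open("Inclusion_Criteria_Annotation.csv", newline="", encoding="utf-8")` in place of the `StringIO` object should work, though the file's actual encoding is an assumption.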
published: 2019-03-25
Clark, Lindsay V.; Dwiyanti, Maria Stefanie; Anzoua, Kossonou G.; Brummer, Joe E.; Ghimire, Bimal Kumar; Głowacka, Katarzyna; Hall, Megan; Heo, Kweon; Jin, Xiaoli; Lipka, Alexander E.; Peng, Junhua; Yamada, Toshihiko; Yoo, Ji Hye; Yu, Chang Yeon; Zhao, Hua; Long, Stephen P.; Sacks, Erik J. (2019): Miscanthus sinensis multi-location trial: phenotypic analysis, genome-wide association, and genomic prediction. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0790815_V3
This dataset contains genotypic and phenotypic data, R scripts, and the results of analysis pertaining to a multi-location field trial of Miscanthus sinensis. Genome-wide association and genomic prediction were performed for biomass yield and 14 yield-component traits across six field trial locations in Asia and North America, using 46,177 single-nucleotide polymorphism (SNP) markers mined from restriction site-associated DNA sequencing (RAD-seq) and 568 M. sinensis accessions. Genomic regions and candidate genes were identified that can be used for breeding improved varieties of M. sinensis, which in turn will be used to generate new M. xgiganteus clones for biomass.
keywords:
miscanthus; genotyping-by-sequencing (GBS); genome-wide association studies (GWAS); genomic selection
published: 2024-02-15
Hoggatt, Meredith; Starbuck, Clarissa; O'Keefe, Joy (2024): Data for "Acoustic monitoring yields informative bat population density estimates". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7001459_V1
This deposit includes the data for estimating bat density from acoustic data and the R code. The data support a publication by Meredith L. Hoggatt, Clarissa A. Starbuck, and Joy M. O'Keefe entitled "Acoustic monitoring yields informative bat population density estimates".
keywords:
acoustics; bats; monitoring; population density; random encounter model
published: 2019-02-22
Fernández, Roberto; Parker, Gary; Stark, Colin (2019): Experiments on patterns of alluvial cover and bedrock erosion in a meandering channel. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2-3044828_V1
This dataset includes measurements taken during the experiments on patterns of alluvial cover over bedrock. The dataset includes an hour's worth of timelapse images taken every 10 s for eight different experimental conditions. It also includes the instantaneous water surface elevations measured with eTapes at a frequency of 10 Hz for each experiment. The 'Read me Data.txt' file explains the contents of the dataset in more detail.
keywords:
bedrock; erosion; alluvial; meandering; alluvial cover; sinuosity; flume; experiments; abrasion;
published: 2024-02-16
Zhang, Mingxiao; Sutton, Bradley (2024): Sample Data for “Measuring CSF Shunt Flow with MRI Using Flow Enhancement of Signal Intensity (FENSI)”. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7252521_V1
Sample data from one typical phantom test and one deidentified shunt patient test (shown in Fig. 8 of the MRM paper), with the corresponding analysis code for the Shunt-FENSI technique. For the MRM paper “Measuring CSF Shunt Flow with MRI Using Flow Enhancement of Signal Intensity (FENSI)”
keywords:
Shunt-FENSI; MRM; Hydrocephalus; VP Shunt; Flow Quantification; Pediatric Neurosurgery; Pulse Sequence; Signal Simulation
published: 2016-12-20
Wickes, Elizabeth; Nakamura, Katia (2016): Supporting data processing scripts and example data for Peru AIDData analysis. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7860393_V1
Scripts and example data for AIDData (aiddata.org) processing in support of forthcoming Nakamura dissertation. This dataset includes two sets of scripts and example data files from an aiddata.org data dump. Fuller documentation about the functionality for these scripts is within the readme file. Additional background information and description of usage will be in the forthcoming Nakamura dissertation (link will be added when available). Data originally supplied by Nakamura. Python code and this readme file created by Wickes. Data included within this deposit are examples to demonstrate execution. Roughly, there are two python scripts in here: keyword_search.py, designed to assist in finding records matching specific keywords, and matching_tool.ipynb, designed to assist in detection of which records are and are not contained within a keyword results file and an aiddata project data file.
keywords:
aiddata; natural resources