Illinois Data Bank Dataset Search Results
published:
2020-08-22
Qiu, Haoran; Banerjee, Subho S.; Jha, Saurabh; Kalbarczyk, Zbigniew T.; Iyer, Ravishankar K.
(2020)
We are releasing the tracing dataset of four microservice benchmarks deployed on our dedicated Kubernetes cluster consisting of 15 heterogeneous nodes. The dataset is not sampled and is from selected types of requests in each benchmark, i.e., compose-posts in the social network application, compose-reviews in the media service application, book-rooms in the hotel reservation application, and reserve-tickets in the train ticket booking application.
The four microservice applications come from [DeathStarBench](https://github.com/delimitrou/DeathStarBench) and [Train-Ticket](https://github.com/FudanSELab/train-ticket). The performance anomaly injector is from [FIRM](https://gitlab.engr.illinois.edu/DEPEND/firm.git).
The dataset was preprocessed from the raw data generated in FIRM's tracing system. The dataset is separated into files according to the microservice component in which the performance anomaly is located (as the file names suggest). Each file is in CSV format, with fields separated by commas. Each line consists of the tracing ID and the duration (in 10^(-3) ms) of each component. Execution paths are specified in `execution_paths.txt` in each directory.
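As a minimal sketch of reading one of these files, the Python below converts the recorded durations to milliseconds. It assumes the first field is the tracing ID and every remaining field is one component's duration as an integer; the function name and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Durations are recorded in units of 10^(-3) ms, i.e. microseconds.
UNITS_PER_MS = 1000


def read_trace_durations_ms(handle):
    """Yield (trace_id, [durations in ms]) for each CSV row.

    Assumption: field 0 is the tracing ID and all later fields are
    per-component durations. A real file may carry a header row, which
    this sketch does not handle.
    """
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        trace_id, raw = row[0], row[1:]
        yield trace_id, [int(value) / UNITS_PER_MS for value in raw]


# Hypothetical sample rows for illustration only.
sample = io.StringIO("t1,1500,250\nt2,3000,1000\n")
traces = list(read_trace_durations_ms(sample))
```

The component order within a row would have to be taken from `execution_paths.txt`, which this sketch does not parse.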
keywords:
Microservices; Tracing; Performance
published:
2021-07-22
Hsiao, Tzu-Kun; Schneider, Jodi
(2021)
This dataset includes five files. Descriptions of the files are given as follows:
<b>FILENAME: PubMed_retracted_publication_full_v3.tsv</b>
- Bibliographic data of retracted papers indexed in PubMed (retrieved on August 20, 2020, searched with the query "retracted publication" [PT] ).
- Except for the information in the "cited_by" column, all the data is from PubMed.
- PMIDs in the "cited_by" column that meet either of the two conditions below have been excluded from analyses:
[1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file).
[2] Citing paper and the cited retracted paper have the same PMID.
ROW EXPLANATIONS
- Each row is a retracted paper. There are 7,813 retracted papers.
COLUMN HEADER EXPLANATIONS
1) PMID - PubMed ID
2) Title - Paper title
3) Authors - Author names
4) Citation - Bibliographic information of the paper
5) First Author - First author's name
6) Journal/Book - Publication name
7) Publication Year
8) Create Date - The date the record was added to the PubMed database
9) PMCID - PubMed Central ID (if applicable, otherwise blank)
10) NIHMS ID - NIH Manuscript Submission ID (if applicable, otherwise blank)
11) DOI - Digital object identifier (if applicable, otherwise blank)
12) retracted_in - Information of retraction notice (given by PubMed)
13) retracted_yr - Retraction year identified from "retracted_in" (if applicable, otherwise blank)
14) cited_by - PMIDs of the citing papers. (if applicable, otherwise blank) Data collected from iCite.
15) retraction_notice_pmid - PMID of the retraction notice (if applicable, otherwise blank)
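The two exclusion conditions for the "cited_by" column can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and it assumes the PMIDs in "cited_by" have already been split into a list of strings (the delimiter used in the file is not stated here).

```python
def filter_citing_pmids(retracted_pmid, citing_pmids, notice_pmids):
    """Return citing PMIDs that survive both exclusion conditions.

    [1] drop PMIDs that belong to retraction notices
    [2] drop PMIDs identical to the cited retracted paper's PMID
    """
    return [
        pmid
        for pmid in citing_pmids
        if pmid not in notice_pmids and pmid != retracted_pmid
    ]


# Hypothetical PMIDs for illustration only.
kept = filter_citing_pmids("100", ["100", "200", "300"], notice_pmids={"300"})
```

In practice, `notice_pmids` would be loaded from the retraction_notice_PMID.csv file.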
<b>FILENAME: PubMed_retracted_publication_CitCntxt_withYR_v3.tsv</b>
- This file contains citation contexts (i.e., citing sentences) where the retracted papers were cited. The citation contexts were identified from the XML version of PubMed Central open access (PMCOA) articles.
- This is part of the data from: Hsiao, T.-K., & Torvik, V. I. (manuscript in preparation). Citation contexts identified from PubMed Central open access articles: A resource for text mining and citation analysis.
- Citation contexts that meet either of the two conditions below have been excluded from analyses:
[1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file).
[2] Citing paper and the cited retracted paper have the same PMID.
ROW EXPLANATIONS
- Each row is a citation context associated with one retracted paper that's cited.
- In the manuscript, we count each citation context once, even if it cites multiple retracted papers.
COLUMN HEADER EXPLANATIONS
1) pmcid - PubMed Central ID of the citing paper
2) pmid - PubMed ID of the citing paper
3) year - Publication year of the citing paper
4) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, tbl_fig_caption = tables and table/figure captions)
5) IMRaD - IMRaD section of the citation context (I = Introduction, M = Methods, R = Results, D = Discussions/Conclusion, NoIMRaD = not identified)
6) sentence_id - The ID of the citation context in a given location. For location information, please see column 4. The first sentence in the location gets the ID 1, and subsequent sentences are numbered consecutively.
7) total_sentences - Total number of sentences in a given location
8) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper.
9) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper.
10) citation - The citation context
11) progression - Position of a citation context by centile within the citing paper.
12) retracted_yr - Retraction year of the retracted paper
13) post_retraction - 0 = not post-retraction citation; 1 = post-retraction citation. A post-retraction citation is a citation made after the calendar year of retraction.
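The post_retraction rule in column 13 can be written as a one-line comparison. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and returning None for a missing retraction year is a choice made here, not something the file description specifies.

```python
def post_retraction_flag(citing_year, retracted_year):
    """Return 1 if the citation was made after the calendar year of retraction, else 0.

    Returns None when the retraction year is unknown (an assumption of this sketch).
    """
    if retracted_year is None:
        return None
    return 1 if citing_year > retracted_year else 0
```

Under this rule, a citation published in the same calendar year as the retraction is not counted as post-retraction.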
<b>FILENAME: 724_knowingly_post_retraction_cit.csv</b> (updated)
- The 724 post-retraction citation contexts that we determined knowingly cited the 7,813 retracted papers in "PubMed_retracted_publication_full_v3.tsv".
- Two citation contexts from retraction notices have been excluded from analyses.
ROW EXPLANATIONS
- Each row is a citation context.
COLUMN HEADER EXPLANATIONS
1) pmcid - PubMed Central ID of the citing paper
2) pmid - PubMed ID of the citing paper
3) pub_type - Publication type collected from the metadata in the PMCOA XML files.
4) pub_type2 - Specific article types. Please see the manuscript for explanations.
5) year - Publication year of the citing paper
6) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, table_or_figure_caption = tables and table/figure captions)
7) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper.
8) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper.
9) citation - The citation context
10) retracted_yr - Retraction year of the retracted paper
11) cit_purpose - Purpose of citing the retracted paper. This is from human annotations. Please see the manuscript for further information about annotation.
12) longer_context - An extended version of the citation context. (if applicable, otherwise blank) Manually pulled from the full-texts in the process of annotation.
<b>FILENAME: Annotation manual.pdf</b>
- The manual for annotating the citation purposes in column 11) of the 724_knowingly_post_retraction_cit.tsv.
<b>FILENAME: retraction_notice_PMID.csv</b> (new file added for this version)
- A list of 8,346 PMIDs of retraction notices indexed in PubMed (retrieved on August 20, 2020, searched with the query "retraction of publication" [PT] ).
keywords:
citation context; in-text citation; citation to retracted papers; retraction
published:
2021-07-15
Castro, Daniel; Sweedler, Jonathan
(2021)
The dataset contains the high-throughput matrix-assisted laser desorption/ionization mass spectrometry XML files for the atrial gland and red hemiduct of Aplysia californica.
keywords:
Dense-core vesicle; High-throughput; Mass Spectrometry; MALDI; Organelle; Image-Guided; Atrial gland; red hemiduct; Lucent Vesicle
published:
2024-02-16
Zhang, Mingxiao; Sutton, Bradley
(2024)
Sample data from one typical phantom test and one deidentified shunt patient test (shown in Fig. 8 of the MRM paper), with the corresponding analysis code for the Shunt-FENSI technique.
For the MRM paper “Measuring CSF Shunt Flow with MRI Using Flow Enhancement of Signal Intensity (FENSI)”
keywords:
Shunt-FENSI; MRM; Hydrocephalus; VP Shunt; Flow Quantification; Pediatric Neurosurgery; Pulse Sequence; Signal Simulation
published:
2022-04-20
This is the core data for Zinnen et al., "Functional traits and responses to nutrient and mycorrhizal addition are inconsistently related to wetland plant species’ coefficients of conservatism." This is submitted to Wetlands Ecology and Management.
Two datasets are submitted here. The first is greenhouse-collected data on 9 plant traits and concurrent treatment responses of Illinois wetland plant species. The second is field-collected leaf trait data of Illinois wetland plant species. These data are analyzed in the paper. Please refer to the main manuscript to see how these data were produced and for the specific analyses.
keywords:
ecological indicators; Floristic Quality Assessment; Floristic Quality Index; wetland degradation
published:
2022-09-08
Hartman, Jordan; Larson, Eric
(2022)
Data associated with the manuscript "Overlooked invaders? Ecological impacts of non-game, native transplant fishes in the United States" by Jordan H. Hartman and Eric R. Larson
keywords:
freshwater; non-game; native transplant; impacts; invasive species
published:
2022-10-27
Holiman, Haley; Kitaif, J. Carson; Fournier, Auriel M.V.; Iglay, Ray; Woodrey, Mark S.
(2022)
keywords:
marsh birds; automated recording units
published:
2023-10-22
Davidson, Ruth; Vachaspati, Pranjal; Mirarab, Siavash; Warnow, Tandy
(2023)
HGT+ILS datasets from Davidson, R., Vachaspati, P., Mirarab, S., & Warnow, T. (2015). Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC genomics, 16(10), 1-12. Contains model species trees, true and estimated gene trees, and simulated alignments.
keywords:
evolution; computational biology; bioinformatics; phylogenetics
published:
2021-11-05
Keralis, Spencer D. C.; Yakin, Syamil
(2021)
This data set contains survey results from a 2021 survey of University of Illinois University Library employees conducted as part of the Becoming A Trans Inclusive Library Project to evaluate the awareness of University of Illinois faculty, staff, and student employees regarding transgender identities, and to assess the professional development needs of library employees to better serve trans and gender non-conforming patrons. The survey instrument is available in the IDEALS repository: http://hdl.handle.net/2142/110080.
keywords:
transgender awareness; academic library; gender identity awareness; professional development opportunities
published:
2021-09-17
Stern, Jessica; Herman, Brook D.; Matthews, Jeffrey
(2021)
We studied vegetation metric robustness to environmental (season, interannual, and regional) and methodological (observer) variables, as well as adequate sample size for vegetation metrics across four regions of the United States.
keywords:
coefficients of conservatism; floristic quality assessment; restoration; vegetation metric
published:
2022-03-31
Crawford, Reed D.; Dodd, Luke E.; Tillman, Frank E.; O'Keefe, Joy M.
(2022)
This dataset contains our bi-hourly temperature recordings from 40 rocket box style artificial roosts of 5 designs deployed in Indiana and Kentucky, USA from April through September 2019. This dataset also includes our endothermic and facultatively heterothermic daily energy expenditure datasets used in our bioenergetic analysis, which were calculated from the bi-hourly rocket box temperature data. Lastly, we include our overheating counts dataset, which summarizes daily overheating events (i.e., temperatures > 40 Celsius) in each rocket box style bat box over the course of the study period; these daily summaries were also calculated from the bi-hourly rocket box temperature recordings.
keywords:
artificial roost; bat box; microclimate; temperature
published:
2024-01-01
Christensen, Jacob; Bettler, Simon; Qu, Kejian; Huang, Jeffrey; Kim, Soyeun; Lu, Yinchuan; Zhao, Chengxi; Chen, Jin; Krogstad, Matthew; Woods, Toby; Mahmood, Fahad; Huang, Pinshane; Abbamonte, Peter; Shoemaker, Daniel
(2024)
Contains scattering data obtained for (TaSe4)2I at the Advanced Photon Source at Argonne National Laboratory. Beamline 6ID-D was used with a beam energy of 64.8 keV in a transmission geometry. Data was obtained at temperatures between 28 and 300 K. See the readme.txt file for more information.
keywords:
X-ray diffraction
published:
2025-11-06
Salmonella HilD 3'UTR GRIL-seq sequencing data
keywords:
Salmonella; SPI1; hilD
published:
2023-04-12
Han, Edmund; Nahid, Shahriar Muhammad; Rakib, Tawfiqur; Nolan, Gillian; F. Ferrari, Paolo; Hossain, M. Abir; Schleife, André; Nam, SungWoo; Ertekin, Elif; van der Zande, Arend; Huang, Pinshane
(2023)
STEM images of kinks in α-In2Se3, DFT calculations of the bending of α-In2Se3, and PFM on as-exfoliated and controllably bent α-In2Se3
published:
2022-11-09
Wang, Junren; Konar, Megan; Dalin, Carole; Liu, Yu; Stillwell, Ashlynn S.; Xu, Ming; Zhu, Tingju
(2022)
This dataset includes the blue water intensity by sector (41 industries and service sectors) for provinces in China, economic and virtual water network flow for China in 2017, and the corresponding network properties for these two networks.
keywords:
Economic network; Virtual water; Supply chains; Network analysis; Multilayer; MRIO
published:
2023-03-13
Yang, Joyce; Zhao, Lei; Oleson, Keith
(2023)
This dataset contains the historical and future (SSP3 and RCP7.0) CESM climate simulations used in the article "Large humidity effects on urban heat exposure and cooling challenges under climate change" (upcoming). Further details about these simulations can be found in the article. This dataset documents the monthly mean projections of air temperature, wet-bulb temperature, precipitation, relative humidity, and numerous other climatic variables for 2000-2009 (for the historical run) and for 2015-2100 (for the future projection under SSP3-RCP7). This dataset may be useful for urban planners, climate scientists, and decision-makers interested in changes in urban and rural climate under climate change.
keywords:
urban climate; climate change; heat stress; urban heat
published:
2023-03-28
Hsiao, Tzu-Kun; Torvik, Vetle
(2023)
Sentences and citation contexts identified from the PubMed Central open access articles
----------------------------------------------------------------------
The dataset is delivered as 24 tab-delimited text files. The files contain 720,649,608 sentences, 75,848,689 of which are citation contexts. The dataset is based on a snapshot of articles in the XML version of the PubMed Central open access subset (i.e., the PMCOA subset). The PMCOA subset was collected in May 2019.
The dataset is created as described in: Hsiao TK., & Torvik V. I. (manuscript) OpCitance: Citation contexts identified from the PubMed Central open access articles.
<b>Files</b>:
• A_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with A.
• B_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with B.
• C_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with C.
• D_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with D.
• E_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with E.
• F_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with F.
• G_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with G.
• H_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with H.
• I_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with I.
• J_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with J.
• K_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with K.
• L_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with L.
• M_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with M.
• N_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with N.
• O_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with O.
• P_p1_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with P (part 1).
• P_p2_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with P (part 2).
• Q_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with Q.
• R_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with R.
• S_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with S.
• T_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with T.
• UV_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with U or V.
• W_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with W.
• XYZ_journal_IntxtCit.tsv – Sentences and citation contexts identified from articles published in journals with journal titles starting with X, Y or Z.
Each row in the file is a sentence/citation context and contains the following columns:
• pmcid: PMCID of the article
• pmid: PMID of the article. If an article does not have a PMID, the value is NONE.
• location: The article component (abstract, main text, table, figure, etc.) to which the citation context/sentence belongs.
• IMRaD: The type of IMRaD section associated with the citation context/sentence. I, M, R, and D represent introduction/background, method, results, and conclusion/discussion, respectively; NoIMRaD indicates that the section type is not identifiable.
• sentence_id: The ID of the citation context/sentence in the article component
• total_sentences: The number of sentences in the article component.
• intxt_id: The ID of the citation.
• intxt_pmid: PMID of the citation (as tagged in the XML file). If a citation does not have a PMID tagged in the XML file, the value is "-".
• intxt_pmid_source: The sources where the intxt_pmid can be identified. xml indicates that the PMID is identified only from the XML file; xml,pmc indicates that the PMID appears both in the XML file and in the citation data collected from the NCBI Entrez Programming Utilities. If a citation does not have an intxt_pmid, the value is "-".
• intxt_mark: The citation marker associated with the inline citation.
• best_id: The best source link ID (e.g., PMID) of the citation.
• best_source: The sources that confirm the best ID.
• best_id_diff: The comparison result between the best_id column and the intxt_pmid column.
• citation: A citation context. If no citation is found in a sentence, the value is the sentence.
• progression: Text progression of the citation context/sentence.
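A minimal Python sketch of reading rows from one of the *_journal_IntxtCit.tsv files and normalizing the placeholder values described above (NONE for a missing pmid, "-" for a missing intxt_pmid or intxt_pmid_source). It assumes the columns appear in the order listed and that the file has no header row; both are assumptions, and the function names and sample row are hypothetical.

```python
import csv
import io

# Column order as listed in the description (assumed to match the files).
COLUMNS = [
    "pmcid", "pmid", "location", "IMRaD", "sentence_id", "total_sentences",
    "intxt_id", "intxt_pmid", "intxt_pmid_source", "intxt_mark",
    "best_id", "best_source", "best_id_diff", "citation", "progression",
]


def normalize_row(row):
    """Replace the documented placeholder strings with None."""
    cleaned = dict(row)
    if cleaned["pmid"] == "NONE":
        cleaned["pmid"] = None
    for key in ("intxt_pmid", "intxt_pmid_source"):
        if cleaned[key] == "-":
            cleaned[key] = None
    return cleaned


def read_rows(handle):
    """Parse tab-delimited rows; quoting is disabled so sentence text is kept verbatim."""
    reader = csv.DictReader(
        handle, fieldnames=COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [normalize_row(row) for row in reader]


# Hypothetical row for illustration only.
fields = ["PMC1", "NONE", "body", "I", "1", "10", "c1", "-", "-",
          "[1]", "123", "x", "y", "Some sentence [1].", "0.1"]
rows = read_rows(io.StringIO("\t".join(fields) + "\n"))
```

For files of this size, iterating over the reader rather than building a list would keep memory use bounded.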
<b>Supplementary Files</b>
• PMC-OA-patci.tsv.gz – This file contains the best source link IDs for the references (e.g., PMID). Patci [1] was used to identify the best source link IDs. The best source link IDs are mapped to the citation contexts and displayed in the *_journal_IntxtCit.tsv files as the best_id column.
Each row in the PMC-OA-patci.tsv.gz file is a citation (i.e., a reference extracted from the XML file) and contains the following columns:
• pmcid: PMCID of the citing article.
• pos: The citation's position in the reference list.
• fromPMID: PMID of the citing article.
• toPMID: Source link ID (e.g., PMID) of the citation. This ID is identified by Patci.
• SRC: The sources that confirm the toPMID.
• MatchDB: The origin bibliographic database of the toPMID.
• Probability: The match probability of the toPMID.
• toPMID2: PMID of the citation (as tagged in the XML file).
• SRC2: The sources that confirm the toPMID2.
• intxt_id: The ID of the citation.
• journal: The first letter of the journal title. This maps to the *_journal_IntxtCit.tsv files.
• same_ref_string: Whether the citation string appears in the reference list more than once.
• DIFF: The comparison result between the toPMID column and the toPMID2 column.
• bestID: The best source link ID (e.g., PMID) of the citation.
• bestSRC: The sources that confirm the best ID.
• Match: Matching result produced by Patci.
[1] Agarwal, S., Lincoln, M., Cai, H., & Torvik, V. (2014). Patci – a tool for identifying scientific articles cited by patents. GSLIS Research Showcase 2014. http://hdl.handle.net/2142/54885
• intxt_cit_license_fromPMC.tsv – This file contains the CC licensing information for each article. The licensing information is from PMC's file lists [2], retrieved on June 19, 2020, and March 9, 2023. It should be noted that the license information for 189,855 PMCIDs is <b>NO-CC CODE</b> in the file lists, and 521 PMCIDs are absent in the file lists. The absence of CC licensing information does not indicate that the article lacks a CC license. For example, PMCID: 6156294 (<b>NO-CC CODE</b>) and PMCID: 6118074 (absent in the PMC's file lists) are under CC-BY licenses according to their PDF versions of articles.
The intxt_cit_license_fromPMC.tsv file has two columns:
• pmcid: PMCID of the article.
• license: The article’s CC license information provided in PMC’s file lists. The value is nan when an article is not present in the PMC’s file lists.
[2] https://www.ncbi.nlm.nih.gov/pmc/tools/ftp/
• Supplementary_File_1.zip – This file contains the code for generating the dataset.
keywords:
citation context; in-text citation; inline citation; bibliometrics; science of science
published:
2025-04-24
Smith, Rebecca; Chakraborty, Sulagna; Lyons, Lee Ann; Winata, Fikriyah; Mateus-Pinilla, Nohra
(2025)
These are the datasets underlying the figures in the manuscript "Methods of active surveillance for hard ticks and associated tick-borne pathogens of public health importance in the contiguous United States: A Comprehensive Systematic Review".
The review considered only publications reporting on active tick or tick-borne pathogen surveillance in the contiguous United States published between 1944 and 2018. For the purposes of this review, we were only concerned with studies of Ixodidae (hard ticks) and/or studies of tick-borne pathogens (in humans, animals, or hard ticks) of public health importance to humans. Study designs included cross-sectional, serological, epidemiological, ecological, or observational studies. Only peer-reviewed publications published in the English language were included. Studies were excluded if they focused on a tick that is not a vector of a human pathogen or on a pathogen that does not cause disease in humans, if the tick or tick-borne pathogen findings were incidental, or if they did not include quantitative surveillance data. For the purpose of this study, we defined surveillance data as information on ticks or pathogens provided through active sampling in natural areas; it should be noted that this does not match the strict definition used by the CDC, which requires sustained sampling efforts across time. Studies were also excluded if they: explored regions other than the contiguous US; focused on treatment, vaccine, or therapeutics development and/or diagnostics of human disease; focused on tick or pathogen genetics; focused on experimental studies with ticks or hosts; were tick control and/or management studies; performed only passive surveillance; were review articles; were not peer reviewed; were in a language other than English; the full text was not available; and if the disease was not a risk to the general public. In addition, for articles which reported data that had previously been published, we only included previously unreported information collected by the authors, and we referenced the specific period of collection for these data to ensure we were not double-recording data. 
Due to publication delays, we also performed a non-systematic review of the literature of articles published between 2019 – 2023 on tick and tickborne pathogen surveillance methods conducted in the contiguous United States.
Keyword search was performed in PubMed Central and Web of Science Core Collection databases. The search algorithm keywords included tick(s), Amblyomma, Dermacentor, Ixodes, Rhipicephalus, Acari Ixodidea, tick host(s), Lyme disease, Rocky Mountain Spotted Fever, Spotted Fever Group, Rickettsiosis, Ehrlichiosis, Anaplasmosis, Borreliosis, Tularemia, Babesiosis, tick-borne pathogen, Powassan, Heartland, Bourbon, Colorado tick fever, Pacific Coast tick fever, tick surveillance, surveillance, (sero)epidemiology, prevalence, distribution, ecology, United States. The search algorithm utilized is provided as follows:
TI= ((ticks OR Ixodes OR Amblyomma OR Dermacentor OR Rhipicephalus OR "Acari Ixodidi" OR "tick hosts" OR "tick host") OR ("Lyme Disease" OR "Rocky Mountain Spotted Fever" OR "Spotted Fever Group" OR Rickettsiosis OR Rickettsial OR Ehrlichiosis OR Anaplasmosis OR Borreliosis OR Tularemia OR Babesiosis OR Borrelia OR Ehrlichia OR Anaplasma OR Rickettsia OR Babesia OR "tick-borne pathogen" OR "tick borne pathogen")) AND TS= ("tick surveillance" OR surveillance OR epidemiology OR seroepidemiology OR ecology) AND CU=("United States of America" OR "USA" OR "United States" OR United-States).
These datasets are the collated data underlying the figures in the manuscript. For more details, please see the publication.
The following are explanations for variables used in all the CSV files:
Tick: Species of tick collected
Tick_Method: Method of collecting ticks
Pathogen: Species of pathogen tested for
Path_Method: Method of testing for pathogens
Decade: Decade of publication
n: Number of publications
STATE: state in which study was conducted
COUNTY: county in which study was conducted
1944 - 2018 (Was surveillance performed?): was there at least one publication included with a publication date within the 1944-2018 period in this geographic region?
2019 - 2023 (Was surveillance performed?): was there at least one publication included with a publication date within the 2019-2023 period in this geographic region?
keywords:
ticks; systematic review; surveillance
published:
2016-12-19
Files in this dataset represent an investigation into use of the Library mobile app Minrva during the months of May 2015 through December 2015. During this time interval 45,975 API hits were recorded by the Minrva web server. The dataset included herein is an analysis of the following: 1) a delineation of API hits to mobile app module use in the Minrva app by month, 2) a general analysis of Minrva app downloads to module use, and 3) the annotated data file providing associations from API hits to specific modules used, organized by month (May 2015 – December 2015).
keywords:
API analysis; log analysis; Minrva Mobile App
published:
2021-06-16
Warnow, Tandy; Wedell, Eleanor
(2021)
Thank you for using these datasets.
These RNAsim aligned fragmentary sequences were generated from the query sequences selected by Balaban et al. (2019) in their variable-size datasets (https://doi.org/10.5061/dryad.78nf7dq). They were created for use for phylogenetic placement with the multiple sequence alignments and backbone trees provided by Balaban et al. (2019).
The file structures included here also correspond with the data Balaban et al. (2020) provided.
This includes:
Directories for five varying backbone tree sizes, shown as 5000, 10000, 50000, 100000, and 200000. These directory names are also used by Balaban et al. (2019), and indicate the size of the backbone tree included in their data.
Subdirectories for each replicate from the backbone tree size labelled 0 through 4. For the smaller four backbone tree sizes there are five replicates, and for the largest there is one replicate.
Each replicate contains 200 text files with one aligned query sequence fragment in fasta format.
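The directory layout described above can be enumerated with a short Python sketch. It assumes that the single replicate of the largest backbone size is labelled 0, which the description does not state explicitly; the function name and the root path are hypothetical.

```python
from pathlib import PurePosixPath

# Backbone tree sizes used as top-level directory names.
BACKBONE_SIZES = ["5000", "10000", "50000", "100000", "200000"]
FILES_PER_REPLICATE = 200


def replicate_dirs(root):
    """List the expected replicate directories under root.

    Five replicates (0 through 4) for the four smaller sizes and one
    replicate (assumed to be labelled 0) for the largest size.
    """
    dirs = []
    for size in BACKBONE_SIZES:
        n_replicates = 1 if size == "200000" else 5
        for replicate in range(n_replicates):
            dirs.append(PurePosixPath(root) / size / str(replicate))
    return dirs


dirs = replicate_dirs("data")
expected_file_count = len(dirs) * FILES_PER_REPLICATE
```

Under these assumptions there are 21 replicate directories and 4,200 query fragment files in total.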
keywords:
Fragmentary Sequences; RNAsim
published:
2023-03-16
Aishwarya, Anuva; Madhavan, Vidya
(2023)
This dataset consists of all the figure files that are part of the main text of the manuscript titled "Magnetic-field sensitive charge density waves in the superconductor UTe2". For detailed information on the individual files refer to the readme file.
keywords:
superconductor; spin-triplet; topological; unconventional; CDW; PDW; magnetic field
published:
2021-03-05
Beilke, Elizabeth; Blakey, Rachel; O'Keefe, Joy
(2021)
Datasets that accompany Beilke, Blakey, and O'Keefe 2021 publication (Title: Bats partition activity in space and time in a large, heterogeneous landscape; Journal: Ecology and Evolution).
keywords:
spatiotemporal; chiroptera
published:
2021-04-18
Lyu, Fangzheng; Kang, Jeon-Young; Wang, Shaohua; Han, Su; Li, Zhiyu; Wang, Shaowen; Padmanabhan, Anand
(2021)
This dataset contains all the code, notebooks, and datasets used in the study conducted for the research publication titled "Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19 Data". Specifically, this package includes the artifacts used to conduct spatial-temporal analysis with space time kernel density estimation (STKDE) using COVID-19 data, which should help readers reproduce some of the analysis and learn about the methods used in the associated book chapter.
## What’s inside - A quick explanation of the components of the zip file
* Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19.ipynb is a Jupyter notebook for this project. It contains code for preprocessing, space time kernel density estimation, postprocessing, and visualization.
* data is a folder containing all data needed for the notebook
* data/county.txt: US county information and FIPS codes from the Natural Resources Conservation Service.
* data/us-counties.txt: County-level COVID-19 data collected from the New York Times COVID-19 GitHub repository on August 9th, 2020.
* data/covid_death.txt: COVID-19 death information derived from the preprocessing step, which prepares the input data for STKDE. Each record is of the following format (fips, spatial_x, spatial_y, date, number of deaths).
* data/stkdefinal.txt: result obtained by conducting STKDE.
* wolfram_mathmatica is a folder for 3D visualization code.
* wolfram_mathmatica/Visualization.nb: code for visualization of the STKDE result via Wolfram Mathematica.
* img is a folder for figures.
* img/above.png: 3-D visualization result, view from above.
* img/side.png: 3-D visualization result, side view.
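For readers unfamiliar with STKDE, the following Python sketch shows one common formulation: a product of a spatial and a temporal Epanechnikov kernel, averaged over events and scaled by the bandwidths. It is an illustration only. The kernel choice, the bandwidth parameters `hs` and `ht`, and the function names are assumptions made here, the sketch ignores the per-record death counts, and the notebook in this package may implement the estimator differently.

```python
import math


def epanechnikov_2d(u, v):
    """Bivariate Epanechnikov kernel on the unit disc (integrates to 1)."""
    r2 = u * u + v * v
    return (2.0 / math.pi) * (1.0 - r2) if r2 < 1.0 else 0.0


def epanechnikov_1d(w):
    """Univariate Epanechnikov kernel on [-1, 1] (integrates to 1)."""
    return 0.75 * (1.0 - w * w) if abs(w) < 1.0 else 0.0


def stkde(x, y, t, events, hs, ht):
    """Space-time kernel density estimate at location (x, y) and time t.

    events: non-empty list of (xi, yi, ti) tuples
    hs, ht: spatial and temporal bandwidths (assumed positive)
    """
    total = 0.0
    for xi, yi, ti in events:
        spatial = epanechnikov_2d((x - xi) / hs, (y - yi) / hs)
        temporal = epanechnikov_1d((t - ti) / ht)
        total += spatial * temporal
    return total / (len(events) * hs * hs * ht)


# Hypothetical single event at the origin, evaluated at and away from it.
density_at_event = stkde(0.0, 0.0, 0.0, [(0.0, 0.0, 0.0)], hs=1.0, ht=1.0)
density_far_away = stkde(5.0, 0.0, 0.0, [(0.0, 0.0, 0.0)], hs=1.0, ht=1.0)
```

Events farther than `hs` in space or `ht` in time from the evaluation point contribute nothing, which is what makes the bandwidths the main tuning choice.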
keywords:
CyberGIS; COVID-19; Space-time kernel density estimation; Spatiotemporal patterns
published:
2022-04-11
Liu, Shanshan; Kontou, Eleftheria
(2022)
This data set contains all the map data used for "Quantifying transportation energy vulnerability and its spatial patterns in the United States". The multiple dimensions (i.e., exposure, sensitivity, adaptive capacity) of transportation energy vulnerability (TEV) at the census tract level in the United States, the changes in TEV with electric vehicles adoption, and the detailed data for Chicago, Los Angeles, and New York are in the dataset.
keywords:
Transport energy; Vulnerability; Fuel costs; Electric vehicles
published:
2022-11-28
Zhang, Na; Sharma, Bijay P.; Khanna, Madhu
(2022)
The compiled datasets include county-level variables used for simulating miscanthus and switchgrass production in 2287 counties across the rainfed US including 5-year (2012-2016) averaged growing season degree days (GDD), 5-year (2012-2016) averaged growing season cumulative precipitation, National Commodity Crop Productivity Index (NCCPI) values, regional dummies (only for miscanthus), the regional-level random effect of the yield response function, N price, land cash rent, the first year fixed cost (only for switchgrass), and separate datasets for simulating an alternative model assuming a constant N rate.
The GAMS codes are used to run the simulation to obtain the main results including the age-varying profit-maximizing N rate, biomass yields, and annual profits for miscanthus and switchgrass production across counties in the rainfed US. The STATA codes are used to merge and analyze simulation results and create summary statistics tables and key figures.
keywords:
Age; Miscanthus; Net present value; Nitrogen; Optimal lifespan; Profit maximization; Switchgrass; Yield; Center for Advanced Bioenergy and Bioproducts Innovation