TY  - JOUR
A1  - Wittig, Alice
A1  - Miranda, Fabio Malcher
A1  - Hölzer, Martin
A1  - Altenburg, Tom
A1  - Bartoszewicz, Jakub Maciej
A1  - Beyvers, Sebastian
A1  - Dieckmann, Marius Alfred
A1  - Genske, Ulrich
A1  - Giese, Sven Hans-Joachim
A1  - Nowicka, Melania
A1  - Richard, Hugues
A1  - Schiebenhoefer, Henning
A1  - Schmachtenberg, Anna-Juliane
A1  - Sieben, Paul
A1  - Tang, Ming
A1  - Tembrockhaus, Julius
A1  - Renard, Bernhard Y.
A1  - Fuchs, Stephan
T1  - CovRadar
BT  - continuously tracking and filtering SARS-CoV-2 mutations for genomic surveillance
JF  - Bioinformatics
N2  - The ongoing pandemic caused by SARS-CoV-2 emphasizes the importance of genomic surveillance to understand the evolution of the virus, to monitor the viral population, and plan epidemiological responses. Detailed analysis, easy visualization and intuitive filtering of the latest viral sequences are powerful for this purpose. We present CovRadar, a tool for genomic surveillance of the SARS-CoV-2 Spike protein. CovRadar consists of an analytical pipeline and a web application that enable the analysis and visualization of hundreds of thousand sequences. First, CovRadar extracts the regions of interest using local alignment, then builds a multiple sequence alignment, infers variants and consensus and finally presents the results in an interactive app, making accessing and reporting simple, flexible and fast.
Y1  - 2022
U6  - https://doi.org/10.1093/bioinformatics/btac411
SN  - 1367-4803
SN  - 1367-4811
VL  - 38
IS  - 17
SP  - 4223
EP  - 4225
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Ulrich, Jens-Uwe
A1  - Lutfi, Ahmad
A1  - Rutzen, Kilian
A1  - Renard, Bernhard Y.
T1  - ReadBouncer
BT  - precise and scalable adaptive sampling for nanopore sequencing
JF  - Bioinformatics
N2  - Motivation: 
Nanopore sequencers allow targeted sequencing of interesting nucleotide sequences by rejecting other sequences from individual pores. This feature facilitates the enrichment of low-abundant sequences by depleting overrepresented ones in-silico. Existing tools for adaptive sampling either apply signal alignment, which cannot handle human-sized reference sequences, or apply read mapping in sequence space relying on fast graphical processing units (GPU) base callers for real-time read rejection. Using nanopore long-read mapping tools is also not optimal when mapping shorter reads as usually analyzed in adaptive sampling applications. 

Results: 
Here, we present a new approach for nanopore adaptive sampling that combines fast CPU and GPU base calling with read classification based on Interleaved Bloom Filters. ReadBouncer improves the potential enrichment of low abundance sequences by its high read classification sensitivity and specificity, outperforming existing tools in the field. It robustly removes even reads belonging to large reference sequences while running on commodity hardware without GPUs, making adaptive sampling accessible for in-field researchers. Readbouncer also provides a user-friendly interface and installer files for end-users without a bioinformatics background.
Y1  - 2022
U6  - https://doi.org/10.1093/bioinformatics/btac223
SN  - 1367-4803
SN  - 1460-2059
VL  - 38
IS  - SUPPL 1
SP  - 153
EP  - 160
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Piro, Vitor C.
A1  - Renard, Bernhard Y.
T1  - Contamination detection and microbiome exploration with GRIMER
JF  - GigaScience
N2  - Background:
Contamination detection is a important step that should be carefully considered in early stages when designing and performing microbiome studies to avoid biased outcomes. Detecting and removing true contaminants is challenging, especially in low-biomass samples or in studies lacking proper controls. Interactive visualizations and analysis platforms are crucial to better guide this step, to help to identify and detect noisy patterns that could potentially be contamination. Additionally, external evidence, like aggregation of several contamination detection methods and the use of common contaminants reported in the literature, could help to discover and mitigate contamination. 

Results: 
We propose GRIMER, a tool that performs automated analyses and generates a portable and interactive dashboard integrating annotation, taxonomy, and metadata. It unifies several sources of evidence to help detect contamination. GRIMER is independent of quantification methods and directly analyzes contingency tables to create an interactive and offline report. Reports can be created in seconds and are accessible for nonspecialists, providing an intuitive set of charts to explore data distribution among observations and samples and its connections with external sources. Further, we compiled and used an extensive list of possible external contaminant taxa and common contaminants with 210 genera and 627 species reported in 22 published articles. 

Conclusion:
GRIMER enables visual data exploration and analysis, supporting contamination detection in microbiome studies. The tool and data presented are open source and available at https://gitlab.com/dacs-hpi/grimer.
KW  - Contamination
KW  - Microbiome
KW  - Visualization
KW  - Taxonomy
Y1  - 2023
U6  - https://doi.org/10.1093/gigascience/giad017
SN  - 2047-217X
VL  - 12
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Piro, Vitor C.
A1  - Dadi, Temesgen H.
A1  - Seiler, Enrico
A1  - Reinert, Knut
A1  - Renard, Bernhard Y.
T1  - ganon
BT  - precise metagenomics classification against large and up-to-date sets of reference sequences
JF  - Bioinformatics
N2  - Motivation:
The exponential growth of assembled genome sequences greatly benefits metagenomics studies. However, currently available methods struggle to manage the increasing amount of sequences and their frequent updates. Indexing the current RefSeq can take days and hundreds of GB of memory on large servers. Few methods address these issues thus far, and even though many can theoretically handle large amounts of references, time/memory requirements are prohibitive in practice. As a result, many studies that require sequence classification use often outdated and almost never truly up-to-date indices. 

Results: 
Motivated by those limitations, we created ganon, a k-mer-based read classification tool that uses Interleaved Bloom Filters in conjunction with a taxonomic clustering and a k-mer counting/filtering scheme. Ganon provides an efficient method for indexing references, keeping them updated. It requires <55 min to index the complete RefSeq of bacteria, archaea, fungi and viruses. The tool can further keep these indices up-to-date in a fraction of the time necessary to create them. Ganon makes it possible to query against very large reference sets and therefore it classifies significantly more reads and identifies more species than similar methods. When classifying a high-complexity CAMI challenge dataset against complete genomes from RefSeq, ganon shows strongly increased precision with equal or better sensitivity compared with state-of-the-art tools. With the same dataset against the complete RefSeq, ganon improved the F1-score by 65% at the genus level. It supports taxonomy- and assembly-level classification, multiple indices and hierarchical classification.
Y1  - 2020
U6  - https://doi.org/https://doi.org/10.1093/bioinformatics/btaa458
SN  - 1367-4811
SN  - 1367-4803
VL  - 36
SP  - 12
EP  - 20
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Hiort, Pauline
A1  - Schlaffner, Christoph N.
A1  - Steen, Judith A.
A1  - Renard, Bernhard Y.
A1  - Steen, Hanno
T1  - multiFLEX-LF: a computational approach to quantify the modification stoichiometries in label-free proteomics data sets
JF  - Journal of proteome research
N2  - In liquid-chromatography-tandem-mass-spectrometry-based proteomics, information about the presence and stoichiometry ofprotein modifications is not readily available. To overcome this problem,we developed multiFLEX-LF, a computational tool that builds uponFLEXIQuant, which detects modified peptide precursors and quantifiestheir modification extent by monitoring the differences between observedand expected intensities of the unmodified precursors. multiFLEX-LFrelies on robust linear regression to calculate the modification extent of agiven precursor relative to a within-study reference. multiFLEX-LF cananalyze entire label-free discovery proteomics data sets in a precursor-centric manner without preselecting a protein of interest. To analyzemodification dynamics and coregulated modifications, we hierarchicallyclustered the precursors of all proteins based on their computed relativemodification scores. We applied multiFLEX-LF to a data-independent-acquisition-based data set acquired using the anaphase-promoting complex/cyclosome (APC/C) isolated at various time pointsduring mitosis. The clustering of the precursors allows for identifying varying modification dynamics and ordering the modificationevents. Overall, multiFLEX-LF enables the fast identification of potentially differentially modified peptide precursors and thequantification of their differential modification extent in large data sets using a personal computer. Additionally, multiFLEX-LF candrive the large-scale investigation of the modification dynamics of peptide precursors in time-series and case-control studies.multiFLEX-LF is available athttps://gitlab.com/SteenOmicsLab/multiflex-lf.
KW  - bioinformatics tool
KW  - label-free quantification
KW  - LC-MS
KW  - MS
KW  - post-translational modification
KW  - modification stoichiometry
KW  - PTM
KW  - quantification
Y1  - 2022
U6  - https://doi.org/10.1021/acs.jproteome.1c00669
SN  - 1535-3893
SN  - 1535-3907
VL  - 21
IS  - 4
SP  - 899
EP  - 909
PB  - American Chemical Society
CY  - Washington
ER  -