Authors:
Jolanta Kawulok and Michal Kawulok
Affiliation:
Institute of Informatics, Silesian University of Technology, Gliwice, Poland
Keyword(s):
Metagenome, Metagenomic Reads, Hierarchical Clustering, Urban Microbiome, k-mers.
Abstract:
The analysis of metagenomic samples aims to extract relevant information about them, including their composition and origin. To determine where a sample comes from, it is commonly compared with a set of reference samples obtained from known locations. However, if such reference samples are unavailable, or if the origins of the investigated samples are not covered by the reference set, it may be helpful to identify groups of similar samples that may share a common origin. In this paper, we tackle this problem by applying hierarchical clustering to a matrix of mutual similarities obtained using the Mash program and our CoMeta program. We report initial, yet encouraging, results of an experimental study performed on metagenomic data from two large metropolises, downloaded from the Sequence Read Archive repository. The results indicate that the proposed approach is effective, which justifies further exploration of the topic using more extensive data.
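
As a rough illustration of the general idea, hierarchical clustering of a sample-to-sample similarity matrix can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the linkage criterion, so average linkage is assumed here. The similarity values are invented for illustration and are not Mash or CoMeta output. The function name average_linkage_clusters and the conversion of similarity to distance as 1 - similarity are likewise hypothetical choices.

```python
# Minimal sketch: agglomerative (bottom-up) hierarchical clustering of samples
# from a symmetric similarity matrix. Average linkage is an ASSUMPTION; the
# abstract does not state which linkage criterion was used.

def average_linkage_clusters(similarity, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    similarity: square, symmetric list of lists with values in [0, 1].
    Distance between two samples is taken as 1 - similarity (assumption).
    Returns a list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(similarity))]

    def distance(a, b):
        # Average pairwise distance between members of clusters a and b.
        total = sum(1.0 - similarity[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]

    return [sorted(c) for c in clusters]


# Hypothetical similarity matrix for five samples (made-up values):
# samples 0-2 resemble one another, as do samples 3-4.
similarity = [
    [1.00, 0.90, 0.85, 0.20, 0.15],
    [0.90, 1.00, 0.80, 0.25, 0.10],
    [0.85, 0.80, 1.00, 0.30, 0.20],
    [0.20, 0.25, 0.30, 1.00, 0.95],
    [0.15, 0.10, 0.20, 0.95, 1.00],
]

groups = average_linkage_clusters(similarity, n_clusters=2)
print(groups)  # [[0, 1, 2], [3, 4]]
```

With this toy matrix, the two recovered groups correspond to the two blocks of mutually similar samples, which is the kind of grouping that could hint at a common origin when no reference set is available.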