André Atanasio M. Almeida, Zanoni Dias


Sequence alignment is among the most common tasks in bioinformatics. It is a required step in a wide range of procedures, such as searching a database for homologous sequences or predicting protein structure. The main goal of this work was to improve the accuracy of multiple sequence alignments. Our experiments focused on the MUMMALS multiple aligner and tested three distinct modifications to its algorithm. The first changed the substring length used by the k-mer counting method. The second replaced the commonly used Dayhoff(6) alphabet with alternative compressed alphabets. The third modified the distance matrix computation and the guide tree construction. Each modification improved the accuracy of the resulting alignments.
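To make the first two modifications concrete, the following is a minimal Python sketch of k-mer counting over a compressed alphabet. It is an illustration under stated assumptions, not the MUMMALS implementation: the six residue groups are the standard Dayhoff classes, while the function names, the single-letter group labels, the handling of unknown residues, and the shared-k-mer distance formula are all hypothetical choices made for this example.

```python
from collections import Counter

# Standard Dayhoff six-class grouping of the 20 amino acids.
# The labels 'a'..'f' are arbitrary and chosen only for this sketch.
DAYHOFF6_GROUPS = {
    "a": "AGPST",
    "b": "C",
    "c": "DENQ",
    "d": "FWY",
    "e": "HKR",
    "f": "ILMV",
}
RESIDUE_TO_GROUP = {
    residue: label
    for label, residues in DAYHOFF6_GROUPS.items()
    for residue in residues
}


def compress(sequence: str) -> str:
    """Map each residue to its group label; unknown residues become 'x' (assumption)."""
    return "".join(RESIDUE_TO_GROUP.get(r, "x") for r in sequence.upper())


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_distance(seq_a: str, seq_b: str, k: int) -> float:
    """Distance as 1 minus the fraction of shared k-mers (illustrative formula only)."""
    comp_a, comp_b = compress(seq_a), compress(seq_b)
    n_kmers = min(len(comp_a), len(comp_b)) - k + 1
    if n_kmers <= 0:
        return 1.0  # sequences too short to contain any k-mer
    counts_a, counts_b = kmer_counts(comp_a, k), kmer_counts(comp_b, k)
    shared = sum((counts_a & counts_b).values())  # multiset intersection
    return 1.0 - shared / n_kmers
```

In this sketch, changing `k` corresponds to varying the substring length, and replacing `DAYHOFF6_GROUPS` with a different grouping corresponds to trying an alternative compressed alphabet. The pairwise distances it returns are the kind of values that would populate a distance matrix before guide tree construction.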


