Authors:
Helena Cristina da Gama Leitão (1); Rafael Felipe Veiga Saracchini (2) and Jorge Stolfi (3)
Affiliations:
(1) Federal Fluminense University, Brazil; (2) Technical Institute of Castilla y León, Spain; (3) State University of Campinas, Brazil
Keyword(s):
Bio-sequence Analysis, Signal Analysis, Visualization, Filtering, Multi-scale.
Related Ontology Subjects/Areas/Topics:
Abstract Data Visualization; Computer Vision, Visualization and Computer Graphics; General Data Visualization; Interpretation and Evaluation Methods; Large Data Visualization; Visual Data Analysis and Knowledge Discovery
Abstract:
This article describes a three-channel encoding of nucleotide sequences, together with formulas for filtering and downsampling the encoded sequences for multi-scale signal analysis. With suitable interpolation, the encoded sequences can be visualized as curves in three-dimensional space. The filtering uses Gaussian-like smoothing kernels, chosen so that all levels of the multi-scale pyramid (except the original curve) are practically free from aliasing artifacts and have the same degree of smoothing. With these precautions, the overall shape of the space curve is robust under small changes in the DNA sequence, such as single-point mutations, insertions, deletions, and shifts.
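The encode, filter, and downsample steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract gives no formulas: the tetrahedron-vertex encoding in `CODE`, the binomial (Gaussian-like) kernel, the downsampling factor of 2, and all function names are hypothetical choices for illustration. In particular, reusing one kernel at every level does not reproduce the article's property that all levels have the same degree of smoothing.

```python
# Assumed encoding (not taken from the abstract): each nucleotide maps to a
# vertex of a regular tetrahedron, giving three channels per base.
CODE = {
    "A": (+1.0, +1.0, +1.0),
    "T": (+1.0, -1.0, -1.0),
    "C": (-1.0, +1.0, -1.0),
    "G": (-1.0, -1.0, +1.0),
}


def encode(seq):
    """Turn a nucleotide string into a list of 3-channel samples."""
    return [CODE[base] for base in seq.upper()]


def binomial_kernel(order):
    """Row `order` of Pascal's triangle, normalized to sum to 1.

    Binomial weights approximate a Gaussian; this is a stand-in for the
    article's kernels, which the abstract does not specify.
    """
    row = [1.0]
    for _ in range(order):
        row = [a + b for a, b in zip([0.0] + row, row + [0.0])]
    total = sum(row)
    return [w / total for w in row]


def smooth_and_downsample(points, kernel, step=2):
    """Filter each channel with `kernel` and keep every `step`-th result.

    Only windows that fit entirely inside the sequence are used, so the
    output is shorter than len(points) / step.
    """
    n, m = len(points), len(kernel)
    out = []
    for start in range(0, n - m + 1, step):
        window = points[start:start + m]
        out.append(tuple(
            sum(w * p[c] for w, p in zip(kernel, window)) for c in range(3)
        ))
    return out


def pyramid(seq, levels, order=8):
    """Level 0 is the raw encoding; each later level is smoothed and halved."""
    kernel = binomial_kernel(order)
    level = encode(seq)
    result = [level]
    for _ in range(levels):
        level = smooth_and_downsample(level, kernel)
        if not level:  # sequence too short for another level
            break
        result.append(level)
    return result
```

Calling `pyramid("ACGT" * 8, 2, order=4)` returns a list of levels, each a list of 3-tuples that can be interpolated and drawn as a space curve. Smoothing before discarding samples is what limits aliasing; a wider kernel suppresses more high-frequency content at the cost of shorter levels.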