dots and there are no differences with the common
genotype-based PC plots.
When using the proposed method to compare the accuracy
of different in-silico phasing algorithms, we always
need to know the true, or quasi-true, haplotypes.
Nowadays there are few publicly available data sets
of true haplotypes from healthy individuals. The most
widely used data sets come from HapMap and include
quasi-true haplotypes, inferred from familial trios,
for Caucasians from Northwest Europe (CEU), Africans
from Nigeria (YRI), Kenya (MKK, Maasai in Kinyawa)
and a less-specific origin (ASW, African Ancestry in
SW USA), and an admixed population (MXL, Mexican
Ancestry in LA, CA, USA). The more recent 1000 Genomes
Project (The 1000 Genomes Project Consortium, 2010)
does not include trios. Therefore, for samples from
other European or African regions, or for Asian
individuals, it would be more difficult to find a large
enough data set of true haplotypes to apply the
proposed method.
4 CONCLUSIONS
With this work we have extended genotype-based PC
plots to use phased genotypes, and we have shown how
phased-PC plots may shed new light on this kind of
graph, helping to understand not only population
drift, stratification and admixture but also individual
genetic differences. Moreover, they may be used as a
visual way to check the accuracy of phasing methods at
a long-range haplotype level.
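As a rough illustration of the idea behind phased-PC plots, the sketch below treats every haplotype as its own observation, so that each individual contributes two points to the plot instead of one. The 0/1 allele encoding, the row layout (two consecutive rows per individual), the function name `phased_pc_scores` and the toy data are assumptions made for this example; it is not necessarily the exact procedure used in this work.

```python
import numpy as np

def phased_pc_scores(haplotypes, n_components=2):
    """Project haplotypes (rows) onto their leading principal components.

    haplotypes: array of shape (2 * n_individuals, n_snps) with 0/1 alleles.
    Assumption: rows 2*i and 2*i + 1 are the two haplotypes of individual i.
    """
    X = np.asarray(haplotypes, dtype=float)
    X = X - X.mean(axis=0)  # centre each SNP column
    # SVD of the centred matrix; U * S gives the PC scores of each row
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

# Toy data (illustrative only): 3 individuals, 5 SNPs, 2 haplotypes each
toy = np.array([
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
])

scores = phased_pc_scores(toy)
# Regroup as (individual, haplotype, PC) so the two dots of one
# individual can be drawn together, e.g. joined by a line segment.
pairs = scores.reshape(-1, 2, scores.shape[1])
```

In this layout `pairs[i, 0]` and `pairs[i, 1]` are the two dots of individual `i`; the further apart they lie, the more the individual's two haplotypes differ along the plotted components.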
Based on phased-PC plots, we plan to design a
statistical test to assess the accuracy of the phased
genotypes returned by an in-silico phasing algorithm
against the true or quasi-true phased genotypes.
ACKNOWLEDGEMENTS
The authors were supported by projects CEI-mic2013-2,
CEI-IDi-2013-15, TIN2010-20900-C04-1 and P08-TIC-03717
and the European Regional Development Fund (ERDF).
REFERENCES
Brisbin, A. (2010). Linkage analysis for categorical traits
and ancestry assignment in admixed individuals. PhD
thesis, Cornell University, Ithaca, New York.
Browning, B. L. and Browning, S. R. (2009). A unified ap-
proach to genotype imputation and haplotype-phase
inference for large data sets of trios and unrelated in-
dividuals. The American Journal of Human Genetics,
84(2):210–223.
The 1000 Genomes Project Consortium (2010). A map of human
genome variation from population-scale sequencing. Nature,
467:1061–73.
Delaneau, O., Marchini, J., and Zagury, J.-F. (2011). A
linear complexity phasing method for thousands of
genomes. Nature Methods, 9(2):179–81.
HapMap-Consortium, T. I. (2003). The International
HapMap Project. Nature, 426:789–796.
HapMap-Consortium, T. I. (2010). Integrating common and
rare genetic variation in diverse human populations.
Nature, 467(7311):52–58.
Jombart, T., Pontier, D., and Dufour, A.-B. (2009). Ge-
netic markers in the playground of multivariate analy-
sis. Heredity, 102:330–41.
Lao, O., Lu, T. T., Nothnagel, M., et al. (2008). Correlation
between genetic and geographic structure in Europe.
Curr. Biol., 18:1241–8.
Nicholson, G., Smith, A., Johnson, F., et al. (2002). As-
sessing population differentiation and isolation from
single-nucleotide polymorphism data. JRSS (B),
64:695–715.
Novembre, J., Johnson, T., Bryc, K., et al. (2008). Genes mirror
geography within Europe. Nature, 456(7218):98–101.
Pariset, L., Savarese, M., Cappuccio, I., and Valentini,
A. (2003). Use of microsatellites for genetic variation
and inbreeding analysis in Sarda sheep flocks of
central Italy. Journal of Animal Breeding and Genetics,
120:425–32.
Patterson, N., Price, A. L., and Reich, D. (2006). Pop-
ulation structure and eigenanalysis. PLoS Genetics,
2(12):2074–93.
Sebastiani, P., Abad-Grau, M., Alpargu, G., and Ramoni,
M. F. (2004). Robust transmission/disequilibrium
test for incomplete family genotypes. Genetics,
168(4):2329–37.
Silva-Zolezzi, I., Hidalgo-Miranda, A., Estrada-Gil, J., et al.
(2009). Analysis of genomic diversity in Mexican
Mestizo populations to develop genomic medicine in
Mexico. PNAS, 106(21):8611–16.
Turner, D. J. and Hurles, M. E. (2003). High-throughput
haplotype determination over long distances by
haplotype fusion PCR and ligation haplotyping.
Nature Protocols, 4:1771–83.
Wang, C., Szpiech, Z., Degnan, J., et al. (2010). Comparing
spatial maps of human population-genetic variation
using Procrustes analysis. Stat. Appl. Genet. Molec.
Biol., 9(1):13.
Summarizing Genome-wide Phased Genotypes using Phased PC Plots