Fully non-homogeneous hidden Markov model double net: a generative model for haplotype reconstruction and block discovery

PERINA, Alessandro; CRISTANI, Marco; XUMERLE, Luciano; MURINO, Vittorio; PIGNATTI, Pierfranco; MALERBA, Giovanni
2009-01-01

Abstract

OBJECTIVE: Over the last decade, haplotype reconstruction in unrelated individuals and haplotype block discovery have attracted the attention of computer scientists because of the substantial computational challenges they pose. These tasks are usually addressed separately, but statistical techniques have recently allowed them to be solved jointly. Following this trend, we propose a generative model that solves the two problems jointly.

METHOD: Model inference is based on variational learning, which allows the model parameters to be estimated quickly while remaining robust to local minima. The model parameters are then used to segment genotypes into blocks by thresholding a quantitative measure of boundary presence.

RESULTS: Experiments on real data are presented, with state-of-the-art systems for haplotype reconstruction and strategies for block estimation used for comparison.

CONCLUSIONS: The proposed method can be used for fast and reliable estimation of haplotype frequencies and the corresponding block structure. Moreover, the method can easily be used as part of a more complex system. The threshold used for block discovery can be related to the quality of fit reached in model learning, resulting in an unsupervised strategy for block estimation.
2009
haplotypes gene reconstruction; generative modeling; Hidden Markov Model

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/334046