Minimum Mosaic Inference of a Set of Recombinants
RIZZI, ROMEO;
2013-01-01
Abstract
In this paper, we investigate the central problem of finding recombination events. It is commonly assumed that a present population descends from a small number of specific sequences called founders. Recombination takes two sequences of equal length and generates a third sequence of the same length by concatenating a prefix of one sequence with the suffix of the other. Due to recombination, a present sequence (called a recombinant) is thus composed of blocks from the founders. A major question related to founder sequences is the so-called Minimum Mosaic problem: using the natural parsimony criterion for the number of recombinations, find the "best" founders. In this article, we prove that the Minimum Mosaic problem, given haplotype recombinants with no missing values, is NP-hard when the number of founders is given as part of the input, and we propose some exact exponential-time algorithms for the problem, which can be considered polynomial provided some extra information is given. Note that Rastas and Ukkonen proved that the Minimum Mosaic problem is NP-hard using a somewhat unrealistic mutation cost function. The aim of this paper is to provide better insight into the complexity of the problem.
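As an illustration of the recombination operation described in the abstract, the following minimal Python sketch builds a recombinant from two equal-length sequences and counts the fewest founder switches needed to spell a recombinant when the founders are already known. It is not the authors' algorithm: the function names `recombine` and `min_breakpoints`, the string encoding of haplotypes, and the column-by-column dynamic program are assumptions made for this example only. The sketch covers only the easy subproblem with fixed founders; the Minimum Mosaic problem itself asks for the founders, which the paper shows to be NP-hard.

```python
from typing import List


def recombine(a: str, b: str, k: int) -> str:
    """Return the first k symbols of a followed by the remaining symbols of b.

    Hypothetical helper: models a single recombination between two
    equal-length sequences with the crossover placed at position k.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not 0 <= k <= len(a):
        raise ValueError("crossover position out of range")
    return a[:k] + b[k:]


def min_breakpoints(recombinant: str, founders: List[str]) -> int:
    """Fewest switches between founders needed to spell the recombinant.

    Assumes the founders are given and have the same length as the
    recombinant. cost[i] holds the minimum number of switches for the
    columns read so far when the current column is copied from founder i.
    """
    if not recombinant:
        return 0
    inf = float("inf")
    # First column: a founder is usable only if it matches the symbol.
    cost = [0 if f[0] == recombinant[0] else inf for f in founders]
    for j in range(1, len(recombinant)):
        best = min(cost)
        # Either stay on the same founder or switch from the best one (+1).
        cost = [
            min(c, best + 1) if f[j] == recombinant[j] else inf
            for c, f in zip(cost, founders)
        ]
    result = min(cost)
    if result == inf:
        raise ValueError("recombinant cannot be built from these founders")
    return int(result)
```

For example, with the hypothetical founders `"0000"` and `"1111"`, `recombine("0000", "1111", 2)` yields `"0011"`, which `min_breakpoints` explains with a single switch between founders.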