Multiple Sequence Alignment (MSA for short) is a well known problem in the field of computational biology. In order to evaluate the quality of a solution, many different scoring functions have been introduced, the most widely used being the Sum-of-pairs score (SP-score). It is known that computing the best MSA under the SP-score measure is NP-hard. In this paper, we introduce a variant of the Column score (defined in Thompson et al. 1999), which we refer to as Selective Column score: Given a symbol a ∈ Σ, the score of the i-th column is one if and only if all symbols of the same column are a, and otherwise zero. The acolumn score of an alignment is then the number of columns made of only character a. We show that finding the optimal MSA under the Selective Column Score is NP-hard for all alphabets of size |Σ| ≥ 2 by reducing from MIN-2-SAT.
Hardness of MSA with Selective Column Scoring
Caucchiolo, A.;Cicalese, F.
2021-01-01
Abstract
Multiple Sequence Alignment (MSA for short) is a well known problem in the field of computational biology. In order to evaluate the quality of a solution, many different scoring functions have been introduced, the most widely used being the Sum-of-pairs score (SP-score). It is known that computing the best MSA under the SP-score measure is NP-hard. In this paper, we introduce a variant of the Column score (defined in Thompson et al. 1999), which we refer to as Selective Column score: Given a symbol a ∈ Σ, the score of the i-th column is one if and only if all symbols of the same column are a, and otherwise zero. The acolumn score of an alignment is then the number of columns made of only character a. We show that finding the optimal MSA under the Selective Column Score is NP-hard for all alphabets of size |Σ| ≥ 2 by reducing from MIN-2-SAT.File | Dimensione | Formato | |
---|---|---|---|
ICTCS2021-cameraready.pdf
accesso aperto
Tipologia:
Versione dell'editore
Licenza:
Creative commons
Dimensione
261.35 kB
Formato
Adobe PDF
|
261.35 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.