Hardness and approximation of multiple sequence alignment with column score
Caucchiolo, A; Cicalese, F
2023-01-01
Abstract
Multiple Sequence Alignment (MSA) is a well-known problem in computational biology. To evaluate the quality of a solution, many scoring functions have been introduced, the most widely used being the Sum-of-Pairs score (SP-score). Computing the best MSA under the SP-score is known to be NP-hard.

In this paper, we introduce a variant of the Column score (defined in Thompson et al. 1999), which we refer to as the Selective Column score: given a symbol a ∈ Σ, the score of the i-th column is one if and only if all symbols in that column are a, and zero otherwise. The a-column score of an alignment is then the number of columns consisting only of the character a.

We show that finding the optimal MSA under the Selective Column score is NP-hard for all alphabets of size |Σ| ≥ 2, and that the associated maximization problem is poly-APX-hard. We also give an approximation algorithm that almost matches the inapproximability bound. © 2023 Elsevier B.V. All rights reserved.

File | Size | Format |
---|---|---|
1-s2.0-S0304397522007654-main.pdf (embargoed until 01/01/2025; description: paper-postprint; type: publisher's version; license: publisher's copyright) | 329.81 kB | Adobe PDF |
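
The Selective Column score defined in the abstract can be illustrated with a minimal Python sketch. It only evaluates the score of an alignment that is already given; it does not search for an optimal alignment, which the abstract states is NP-hard. The function name `selective_column_score`, the representation of an alignment as equal-length strings, and the use of `-` as a gap symbol in the example are assumptions made for illustration and do not come from the paper.

```python
from typing import Sequence


def selective_column_score(alignment: Sequence[str], a: str) -> int:
    """Count the columns of `alignment` whose symbols are all equal to `a`.

    `alignment` is assumed to be a list of equal-length rows, one per sequence.
    """
    if not alignment:
        return 0
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows of an alignment must have the same length")
    # A column scores one iff every row carries `a` at that position.
    return sum(all(row[i] == a for row in alignment) for i in range(width))


# Hypothetical alignment of three sequences; "-" marks a gap in this example.
rows = ["ab-a", "a-ba", "abba"]
print(selective_column_score(rows, "a"))  # first and last columns are all "a"
```

A gap in a column makes that column score zero for every symbol a, since the column then contains a character other than a.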
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.