We study the complexity of constructing an optimal parsing φ of a string s = s1...sn under the constraint that given a position p in the original text, and the LZ76-like (Lempel Ziv 76) encoding of T based on φ, it is possible to identify/decompress the character sp by performing at most c accesses to the LZ encoding, for a given integer c. We refer to such a parsing φ as a c-bounded access LZ parsing or c-BLZ parsing of s. We show that for any constant c the problem of computing the optimal c-BLZ parsing of a string, i.e., the one with the minimum number of phrases, is NP-hard and also APX-hard, i.e., no PT AS can exist under the standard complexity assumption P ̸= NP. We also study the ratio between the sizes of an optimal c-BLZ parsing of a string s and an optimal LZ76 parsing of s (which can be greedily computed in polynomial time). For this we establish a non-trivial lower bound Ω(c+1 √|s|) on the size of an optimal parsing for a square free string s, and also show that such a lower bound is tight for a large class of (square free) morphic words. 1 Finally, after showing that under ETH, every algorithm for c-BLZ requires time 2Ω(|s| c ) , hence strongly exponential in the special case c = 1, we show an algorithm matching this bound that can compute an optimal 1-BLZ parsing of a string s in time O∗(1.755|s|)
Hardness and approximability of bounded access Lempel Ziv coding
Cicalese, Ferdinando;
2025-01-01
Abstract
We study the complexity of constructing an optimal parsing φ of a string s = s1...sn under the constraint that given a position p in the original text, and the LZ76-like (Lempel Ziv 76) encoding of T based on φ, it is possible to identify/decompress the character sp by performing at most c accesses to the LZ encoding, for a given integer c. We refer to such a parsing φ as a c-bounded access LZ parsing or c-BLZ parsing of s. We show that for any constant c the problem of computing the optimal c-BLZ parsing of a string, i.e., the one with the minimum number of phrases, is NP-hard and also APX-hard, i.e., no PT AS can exist under the standard complexity assumption P ̸= NP. We also study the ratio between the sizes of an optimal c-BLZ parsing of a string s and an optimal LZ76 parsing of s (which can be greedily computed in polynomial time). For this we establish a non-trivial lower bound Ω(c+1 √|s|) on the size of an optimal parsing for a square free string s, and also show that such a lower bound is tight for a large class of (square free) morphic words. 1 Finally, after showing that under ETH, every algorithm for c-BLZ requires time 2Ω(|s| c ) , hence strongly exponential in the special case c = 1, we show an algorithm matching this bound that can compute an optimal 1-BLZ parsing of a string s in time O∗(1.755|s|)| File | Dimensione | Formato | |
|---|---|---|---|
|
BLZ-SpecialIssue-revision 2.pdf
accesso aperto
Descrizione: BLZ pdf preprint
Tipologia:
Documento in Pre-print
Licenza:
Copyright dell'editore
Dimensione
450 kB
Formato
Adobe PDF
|
450 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.



