CATALOGO DEI PRODOTTI DELLA RICERCA

Given a string X = X[1..n] of length n, and integers m and d, such that n > m >= 2d > 0, we consider the problem of compressing the string S formed by concatenating the substrings of X of length m starting at positions i congruent 1 mod d. In particular, we provide an upper bound of (2n-m)/d + 2z + (m - d) on the size of the Lempel-Ziv (LZ77) parsing of S, where z is the size of the parsing of X. We also show that a related bound holds regardless of the order in which the substrings are concatenated in the formation of S. If X is viewed as a genome sequence, the above substring sampling process corresponds to an idealized model of short read DNA sequencing.

On Compressing Collections of Substring Samples

Golnaz Badkobeh;Sara Giuliani;Zsuzsanna Liptak;Simon J. Puglisi:

2022-01-01

Abstract

Given a string X = X[1..n] of length n, and integers m and d, such that n > m >= 2d > 0, we consider the problem of compressing the string S formed by concatenating the substrings of X of length m starting at positions i congruent 1 mod d. In particular, we provide an upper bound of (2n-m)/d + 2z + (m - d) on the size of the Lempel-Ziv (LZ77) parsing of S, where z is the size of the parsing of X. We also show that a related bound holds regardless of the order in which the substrings are concatenated in the formation of S. If X is viewed as a genome sequence, the above substring sampling process corresponds to an idealized model of short read DNA sequencing.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2022
			
	Parole Chiave
	
				Lempel-Ziv, LZ77, data compression, strings, repetitiveness, combinatorics on words, parsing, short reads
			
	Appare nelle tipologie:
	
				04.01 Contributo in atti di convegno

File in questo prodotto:

File	Dimensione	Formato
ICTCS2022_1532.pdf accesso aperto Tipologia: Versione dell'editore Licenza: Copyright dell'editore Dimensione 1.5 MB Formato Adobe PDF Visualizza/Apri	1.5 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/1091828

Citazioni

ND

1

ND

social impact