The Parikh vector of a string s over a finite ordered alphabet Σ = {a1,...,aσ} is defined as the vector of multiplicities of the characters, i.e. p(s) = (p1,...,pσ), where pi = |{j | sj = ai}|. Parikh vector q occurs in s if s has a substring t with p(t) = q. The problem of searching for a query q in a text s of length n can be solved simply and optimally with a sliding window approach in O(n) time. We present two new algorithms for the case where the text is fixed and many queries arrive over time. The first algorithm finds all occurrences of a given Parikh vector in a text (over a fixed alphabet of size σ ≥ 2) and appears to have a sub-linear expected time complexity. The second algorithm only decides whether a given Parikh vector appears in a binary text; it iteratively constructs a linear size data structure which then allows answering queries in constant time, for many queries even during the construction phase.

Searching for Jumbled Patterns in Strings

Cicalese, Ferdinando;Liptak, Zsuzsanna
2009-01-01

Abstract

The Parikh vector of a string s over a finite ordered alphabet Σ = {a1,...,aσ} is defined as the vector of multiplicities of the characters, i.e. p(s) = (p1,...,pσ), where pi = |{j | sj = ai}|. Parikh vector q occurs in s if s has a substring t with p(t) = q. The problem of searching for a query q in a text s of length n can be solved simply and optimally with a sliding window approach in O(n) time. We present two new algorithms for the case where the text is fixed and many queries arrive over time. The first algorithm finds all occurrences of a given Parikh vector in a text (over a fixed alphabet of size σ ≥ 2) and appears to have a sub-linear expected time complexity. The second algorithm only decides whether a given Parikh vector appears in a binary text; it iteratively constructs a linear size data structure which then allows answering queries in constant time, for many queries even during the construction phase.
2009
978-80-01-04403-2
Average case analysis; Parikh vectors; Pattern matching; Permuted strings; String algorithms
File in questo prodotto:
File Dimensione Formato  
PSC2009_article10.pdf

accesso aperto

Tipologia: Versione dell'editore
Licenza: Copyright dell'editore
Dimensione 649.09 kB
Formato Adobe PDF
649.09 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/391091
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 47
  • ???jsp.display-item.citation.isi??? 34
social impact