Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of Parikh vector q (a “jumbled string”) in the text s requires to find a substring t of s with p(t) = q. The corresponding decision problem is to ver- ify whether at least one such match exists. So, for example for the al- phabet Σ = {a,b,c}, the string s = abaccbabaaa has Parikh vector p(s) = (6, 3, 2), and the Parikh vector q = (2, 1, 1) appears once in s in position (1, 4). Like its more precise counterpart, the renown Exact String Matching, Jumbled Pattern Matching has ubiquitous applications, e.g., string matching with a dyslectic word processor, table rearrangements, anagram checking, Scrabble playing and, allegedly, also analysis of mass spectrometry data. We consider two simple algorithms for Jumbled Pat- tern Matching and use very complicated data structures and analytic tools to show that they are not worse than the most obvious algorithm. We also show that we can achieve non-trivial efficient average case behav- ior, but that’s less fun to describe in this abstract so we defer the details to the main part of the article, to be read at the reader’s risk. . . well, at the reader’s discretion.

On Table Arrangements, Scrabble Freaks, and Jumbled Pattern Matching

Cicalese, Ferdinando;Liptak, Zsuzsanna
2010-01-01

Abstract

Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of Parikh vector q (a “jumbled string”) in the text s requires to find a substring t of s with p(t) = q. The corresponding decision problem is to ver- ify whether at least one such match exists. So, for example for the al- phabet Σ = {a,b,c}, the string s = abaccbabaaa has Parikh vector p(s) = (6, 3, 2), and the Parikh vector q = (2, 1, 1) appears once in s in position (1, 4). Like its more precise counterpart, the renown Exact String Matching, Jumbled Pattern Matching has ubiquitous applications, e.g., string matching with a dyslectic word processor, table rearrangements, anagram checking, Scrabble playing and, allegedly, also analysis of mass spectrometry data. We consider two simple algorithms for Jumbled Pat- tern Matching and use very complicated data structures and analytic tools to show that they are not worse than the most obvious algorithm. We also show that we can achieve non-trivial efficient average case behav- ior, but that’s less fun to describe in this abstract so we defer the details to the main part of the article, to be read at the reader’s risk. . . well, at the reader’s discretion.
2010
9783642131219
Parikh vectors, jumbled pattern matching, scrabble, approximate pattern matching
File in questo prodotto:
File Dimensione Formato  
FUN2010.pdf

solo utenti autorizzati

Tipologia: Versione dell'editore
Licenza: Copyright dell'editore
Dimensione 275.21 kB
Formato Adobe PDF
275.21 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/391090
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 36
  • ???jsp.display-item.citation.isi??? 30
social impact