We present a new class of binary words: the prefix normal words. They are defined by the property that for any given length k, no factor of length k has more a’s than the prefix of the same length. These words arise in the context of indexing for jumbled pattern match- ing (a.k.a. permutation matching or Parikh vector matching), where the aim is to decide whether a string has a factor with a given multiplicity of characters, i.e., with a given Parikh vector. Using prefix normal words, we give the first non-trivial characterization of binary words having the same set of Parikh vectors of factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We discuss further properties and state open problems.
On prefix normal words
Liptak, Zsuzsanna
2011-01-01
Abstract
We present a new class of binary words: the prefix normal words. They are defined by the property that for any given length k, no factor of length k has more a’s than the prefix of the same length. These words arise in the context of indexing for jumbled pattern match- ing (a.k.a. permutation matching or Parikh vector matching), where the aim is to decide whether a string has a factor with a given multiplicity of characters, i.e., with a given Parikh vector. Using prefix normal words, we give the first non-trivial characterization of binary words having the same set of Parikh vectors of factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We discuss further properties and state open problems.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.