A 1-prefix normal word is a binary word with the property that no factor has more 1s than the prefix of the same length; a 0-prefix normal word is defined analogously. These words arise in the context of indexed binary jumbled pattern matching, where the aim is to decide whether a word has a factor with a given number of 1s and 0s (a given Parikh vector). Each binary word has an associated set of Parikh vectors of the factors of the word. Using prefix normal words, we provide a characterization of the equivalence class of binary words having the same set of Parikh vectors of their factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We give enumeration results on pnw(n), the number of prefix normal words of length n, showing that, for sufficiently large n, 2^{n−4\sqrt{n lg n}} ≤ pnw(n) ≤ 2^{n−lg n+1}. For fixed density (number of 1s), we show that the ordinary generating function of the number of prefix normal words of length n and density d is a rational function. Finally, we give experimental results on pnw(n), discuss further properties, and state open problems.

On prefix normal words and prefix normal forms

Liptak, Zsuzsanna;
2017-01-01

Abstract

A 1-prefix normal word is a binary word with the property that no factor has more 1s than the prefix of the same length; a 0-prefix normal word is defined analogously. These words arise in the context of indexed binary jumbled pattern matching, where the aim is to decide whether a word has a factor with a given number of 1s and 0s (a given Parikh vector). Each binary word has an associated set of Parikh vectors of the factors of the word. Using prefix normal words, we provide a characterization of the equivalence class of binary words having the same set of Parikh vectors of their factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We give enumeration results on pnw(n), the number of prefix normal words of length n, showing that, for sufficiently large n, 2^{n−4\sqrt{n lg n}} ≤ pnw(n) ≤ 2^{n−lg n+1}. For fixed density (number of 1s), we show that the ordinary generating function of the number of prefix normal words of length n and density d is a rational function. Finally, we give experimental results on pnw(n), discuss further properties, and state open problems.
2017
prefix normal words, prefix normal forms, binary languages, binary jumbled pattern matching, pre-necklaces, Lyndon words, enumeration
File in questo prodotto:
File Dimensione Formato  
1-s2.0-S0304397516305680-main.pdf

solo utenti autorizzati

Tipologia: Versione dell'editore
Licenza: Copyright dell'editore
Dimensione 450.76 kB
Formato Adobe PDF
450.76 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/953122
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 13
  • ???jsp.display-item.citation.isi??? 9
social impact