Average profile and limiting distribution for a phrase size in thw Lempel-Ziv parsing algorithm - INRIA - Institut National de Recherche en Informatique et en Automatique Accéder directement au contenu
Rapport (Rapport De Recherche) Année : 1993

Average profile and limiting distribution for a phrase size in thw Lempel-Ziv parsing algorithm

Guy Louchard
  • Fonction : Auteur
Wojciec Szpankowski
  • Fonction : Auteur

Résumé

Consider the parsing algorithm due to Lempel and Ziv that partitions a sequence of length n into variable phrases (blocks) such that a new block is the shortest substring not seen in the past as a phrase. In practice the following parameters are of interest : number of phrases, the size of a phrase, the number of phrases of given size and so forth. In this paper, we focus on the size of a randomly selected phrases and the average number of phrases of a given size (the so called average profile of phrase sizes). These parameters can be efficiently analyzed through a digital search tree representation. For a memoryless source with unequal probabilities of symbols generation (the so called asymmetric Bernoulli model) we prove that the size of a typical phrase is asymptotically normally distributed with the average value and the variance explicity computed. In terms of digital search trees, we prove the normal limiting distribution of the typical depth (i.e., the length of a path from the root to a randomly selected node). The latter finding is proved by a technique that belongs to the toolkit of the "analytical analysis of algorithms" but which seems to be novel in the context of data compression.

Domaines

Autre [cs.OH]
Fichier principal
Vignette du fichier
RR-1886.pdf (848.01 Ko) Télécharger le fichier

Dates et versions

inria-00074786 , version 1 (24-05-2006)

Identifiants

  • HAL Id : inria-00074786 , version 1

Citer

Guy Louchard, Wojciec Szpankowski. Average profile and limiting distribution for a phrase size in thw Lempel-Ziv parsing algorithm. [Research Report] RR-1886, INRIA. 1993. ⟨inria-00074786⟩
46 Consultations
27 Téléchargements

Partager

Gmail Facebook X LinkedIn More