Universal lossless compression via multilevel pattern matching

@article{Kieffer2000UniversalLC,
  title={Universal lossless compression via multilevel pattern matching},
  author={John C. Kieffer and En-hui Yang and G. Nelson and Pamela C. Cosman},
  journal={IEEE Trans. Inf. Theory},
  year={2000},
  volume={46},
  pages={1227-1245},
  url={https://api.semanticscholar.org/CorpusID:8191526}
}
An O(1/log n) maximal redundancy/sample upper bound is established for the multilevel pattern matching code with respect to any class of finite-state sources of uniformly bounded complexity in processing a finite-alphabet data string of length n.

Context-dependent multilevel pattern matching for lossless image compression

It is shown that among all images of n pixels, the context-dependent 2D MPM code has an O(1/log n) worst-case redundancy against any finite-template-based arithmetic code satisfying a mild condition; this redundancy is better than that of the 2D MPM code without context models.

Structured grammar-based codes for universal lossless data compression

A coding theorem is proved which shows that a structured grammar-based code has maximal redundancy/sample O(1/log n) provided that a weak regular structure condition is satisfied.

Universal Lossless Data Compression Via Binary Decision Diagrams

A lossless data compression algorithm is described in which a binary string whose length is a power of two is compressed by compressing the ROBDD associated with it; the maximal pointwise redundancy/sample with respect to any s-state binary information source is shown to be bounded above.

Universal lossless data compression with side information by using a conditional MPM grammar transform

A universal lossless data compression algorithm with side information, called the CMPM algorithm, is proposed; it has linear time and storage complexity and asymptotically achieves the conditional entropy rate of any stationary, ergodic source pair.

Grammar-based codes: A new class of universal lossless source codes

It is shown that, subject to some mild restrictions, a grammar-based code is a universal code with respect to the family of finite-state information sources over the finite alphabet.

Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform. II. With context models

It is proved that for some nonstationary sources, the proposed context-dependent algorithms can achieve better expected redundancies than any existing CFG-based codes, including the Lempel-Ziv (1978) algorithm, the multilevel pattern matching algorithm, and the context-free algorithms in Part I of this series of papers.

Efficient Variable-to-Fixed Length Coding Algorithms for Text Compression

This thesis focuses on lossless compression for text data, that is, text compression, and Variable-to-Fixed-length coding, a coding scheme that segments an input text into a consecutive sequence of substrings and then assigns a fixed length codeword to each substring.

Effective Variable-Length-to-Fixed-Length Coding via a Re-Pair Algorithm

This study proposes a new VF coding method that applies a fixed-length code to the set of rules extracted by the Re-Pair algorithm, a simple off-line grammar-based compression method that has good compression-ratio performance with moderate compression speed.

Lossless Data Compression Via Guided Approximate Bisections

It is shown that the modified bisection method yields maximal redundancy/sample O(1/log n) for n data samples, regardless of the manner in which the approximate bisections are guided.

A grammar-based compression using a variation of Chomsky normal form of context free grammar

The proposed method can improve the compression performance of these algorithms through a unified procedure, and it has the advantage that the transformation from a given sequence to the grammar is quite simple, using the three-step algorithm through semi-CNF.
...

Redundancy of the Lempel-Ziv incremental parsing rule

It is demonstrated that for unifilar or Markov sources, the redundancy of encoding the first n letters of the source output with the Lempel-Ziv incremental parsing rule, the Welch modification, or a new variant is O(1/ln n), and the exact form of convergence is upper-bounded.

Compression of individual sequences via variable-rate coding

The proposed concept of compressibility is shown to play a role analogous to that of entropy in classical information theory where one deals with probabilistic ensembles of sequences rather than with individual sequences.

Universal codeword sets and representations of the integers

An application is the construction of a uniformly universal sequence of codes for countable memoryless sources, in which the nth code has a ratio of average codeword length to source rate bounded by a function of n for all sources with positive rate.

Redundancy of MPM data compression system

J. Kieffer, E. Yang. Computer Science, 1998.
A finite-state information source is losslessly encoded via the multilevel pattern matching data compression system, yielding a pointwise redundancy bound better than the bound established by Plotnik et al. for the 1978 version of the Lempel-Ziv algorithm.

Progressive lossless image coding via self-referential partitions

The progressive image coder is fast and has a worst-case redundancy performance better than the best currently known worst-case redundancy upper bound for the 2-D Lempel-Ziv algorithm.

Arithmetic coding revisited

A new implementation of arithmetic coding is described that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. A modular structure that separates the coding, modeling, and probability estimation components of a compression system is also described.

On the average redundancy rate of the Lempel-Ziv code

It is proved that for a memoryless source the average redundancy rate attains asymptotically E r_n = (A + δ(n))/log n + O(log log n / log^2 n), where A is an explicitly given constant that depends on source characteristics, and δ(x) is a fluctuating function with a small amplitude.

Upper Bounds On The Probability Of Sequences Emitted By Finite-state Sources And On The Redundancy Of The Lempel-Ziv Algorithm

An upper bound on the probability of a sequence drawn from a finite-state source is derived. The bound is given in terms of the number of phrases obtained by parsing the sequence according to the…

Elements of Information Theory

The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.