Data compression with arithmetic coding and source modeling
Abstract (summary)
This thesis considers the problem of splitting date compression into coding and modeling. The design and performance of a practical data compression coding scheme is considered. A more efficient multialphabet multiplication free arithmetic code is developed. Such a code admits simple and fast hardware and software implementations. The redundancy of arithmetic codes is studied. Bounds on the redundancy of the multialphabet arithmetic code presented here and the different approximations of arithmetic codes that appear in the literature are derived and compared. A finite state representation for the Q-coder to eliminate the repeated arithmetic operations in encoding/decoding is studied.
For the modeling aspect of data compression, a state construction technique to model source sequences for adaptive data compression is developed. It increases the flexibility of the source model to allow the statistical structure of the source to be captured more efficiently. Properties of the finite state models, that are constructed using the cloning technique used in the Dynamic Markov Compression (DMC) scheme, have been investigated and the conditions to increase the memory of the model are studied. It is shown that infinite memory can be achieved asymptotically as the number of states approaches infinity.
The problem of splitting data compression into coding and modeling is demonstrated in the study of the CCITT's Group 4 facsimile coding scheme. It is modified by use of the adaptive multiplication free arithmetic coding and the finite state modeling to provide better compression ratios.