next up previous contents index
Next: fvq Up: Reference Previous: flbg   Contents   Index

flbg_train

$ \bigcirc$Name


flbg_train Generates a sequence of codebooks from a training set of vectors using the LBG (generalized Lloyd) algorithm




$ \bigcirc$Command Synopsis


flbg_train [-s CodeBookSize] [-W Weight] [-M] [-i InitCodeBook] [-p] [-f NResCB] [-g NResResCB] [-a ResCodeBook] [-b ResResCodeBook] TrainSet . CodeBook



-s CodeBookSize : Size of output codebook

-W Weight : Weighting factors for the components of vector (fsignal)

-M : Generate a sequence of codebooks of size equal to a power of two and <= CodeBookSize

-i InitCodeBook : Initial codebook (fimage)

-p : Print number of iterations for each loop instead of distortion rate results

-f NResCB : Index of first residual codebook (in ResCodeBook)

-g NResResCB : Index of second residual codebook (in ResResCodeBook)

-a ResCodeBook : First residual codebook (fimage)

-b ResResCodeBook : Second residual codebook (fimage)

TrainSet : Training set of vectors (fimage)

. (screen output) : Quantization mean square error

CodeBook : Output codebook (fimage)




$ \bigcirc$Function Summary


void flbg_train (Size , Weight , MultiCB , InitCodeBook , NResCB , NResResCB , ResCodeBook , ResResCodeBook , PrintSNR , TrainSet , MSE , CodeBook )

int *Size ;

Fsignal Weight ;

int *MultiCB ;

Fimage InitCodeBook ;

int *NResCB , *NResResCB ;

Fimage ResCodeBook , ResResCodeBook ;

int *PrintSNR ;

Fimage TrainSet ;

float *MSE ;

Fimage CodeBook ;




$ \bigcirc$Description


This module generates a codebook (or a set of codebooks) adapted to a given training set of vectors using the LBG algorithm. The training set is read in the TrainSet file which must have been created with the mk_trainset module (see mk_trainset documentation for further details). The generated codebook is stored in the CodeBook file. It can be used for example for the vector quantization of images with the fvq module.

The LBG or generalized Lloyd algorithm is a two-step iterative algorithm designed to adapt an initial codebook to a given probability density function or training set of vectors (see [GG92], [LBG80]). Only the training set version of this algorithm is considered here.

If $\mathcal{Y} = \{Y_i\}_{1 \leq i \leq I}$ is the training set of $K$-dimensional vectors $Y_i = (Y_{i,k})_{1 \leq k \leq K}$, and if $\mathcal{C} = \{C_m\}_{1 \leq m \leq M}$ is a codebook, then the LBG algorithm searches for a local minimum of

$$ D(\mathcal{Y}, Q) = \frac{1}{I} \sum_{i=1}^{I} d(Y_i, Q(Y_i)), $$

over all possible codebooks of a given size $M$, where $Q(Y_i)$ is the quantized approximation of $Y_i$ (see the fvq module documentation) and $d$ is the mean square error (m.s.e.). This is done by repeating the following steps until convergence:
  1. Compute $Q(Y_i)$, or more precisely the index of $Q(Y_i)$, for each $i$.
  2. Denote $P_m = \{i : Q(Y_i) = C_m\}$, and compute for each $m$

     $$ \tilde{C}_m = \frac{1}{\mathrm{Card}\, P_m} \sum_{i \in P_m} Y_i. $$

     Replace $C_m$ by $\tilde{C}_m$ for each $m$ and go back to step 1.
In practice the convergence test is made on the average m.s.e.: one fixes a threshold $\epsilon$ and stops the algorithm when

$$ \frac{D(\mathcal{Y}, Q) - D(\mathcal{Y}, \tilde{Q})}{D(\mathcal{Y}, Q)} $$

falls below the threshold $\epsilon$ ($Q$ and $\tilde{Q}$ being the vector quantizers associated with $\{C_m\}$ and $\{\tilde{C}_m\}$ respectively).
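The two-step iteration and the relative-distortion stopping test above can be sketched in C as follows. This is only an illustration of the algorithm: the helper names (`lbg_pass`, `lbg_refine`) and the flat-array layout are assumptions, not the module's actual internals.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* One LBG pass over a training set of I vectors of dimension K:
 * step 1 assigns each vector to its nearest codeword (squared error),
 * step 2 replaces each codeword by the centroid of its cell P_m.
 * Returns the average distortion D(Y, Q) measured in step 1. */
static double lbg_pass(const double *train, int I, int K, double *cb, int M)
{
    double *sum  = calloc((size_t)M * K, sizeof *sum);  /* cell sums  */
    int    *card = calloc((size_t)M, sizeof *card);     /* Card P_m   */
    double dist = 0.0;

    for (int i = 0; i < I; i++) {
        int best = 0;
        double bestd = DBL_MAX;
        for (int m = 0; m < M; m++) {          /* nearest codeword */
            double d = 0.0;
            for (int k = 0; k < K; k++) {
                double e = train[i * K + k] - cb[m * K + k];
                d += e * e;
            }
            if (d < bestd) { bestd = d; best = m; }
        }
        dist += bestd;
        card[best]++;
        for (int k = 0; k < K; k++)
            sum[best * K + k] += train[i * K + k];
    }
    for (int m = 0; m < M; m++)                /* centroid update */
        if (card[m] > 0)
            for (int k = 0; k < K; k++)
                cb[m * K + k] = sum[m * K + k] / card[m];
    free(sum);
    free(card);
    return dist / I;
}

/* Iterate until the relative distortion decrease falls below eps. */
static double lbg_refine(const double *train, int I, int K,
                         double *cb, int M, double eps)
{
    double prev = lbg_pass(train, I, K, cb, M);
    for (;;) {
        double cur = lbg_pass(train, I, K, cb, M);
        if (prev <= 0.0 || (prev - cur) / prev < eps)
            return cur;
        prev = cur;
    }
}
```

On a toy two-cluster training set, a few passes suffice: the codewords move to the cluster centroids and the relative distortion drop reaches zero.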

One has to find an initial codebook to feed the iterative LBG algorithm. To do this, the flbg_train module offers two possibilities: provide an initial codebook explicitly with the -i option, or build one with the so-called ``splitting trick''.

Basically, the splitting trick consists in starting with a size $M/2$ codebook, optimizing it with the LBG algorithm, and then ``splitting'' each vector $C_m$ of the resulting codebook into two vectors $C_m + \epsilon_m$ and $C_m - \epsilon_m$. One thus obtains a size $M$ codebook which serves as initial data for the LBG algorithm. One can iterate this process, starting with a size one codebook (or in fact any user-specified codebook), and get an optimized codebook of arbitrary size $M$ (if $M$ is not a power of 2, then only the ad hoc number of vectors is split at the last step). The resulting algorithm is a nested two-layer iterative algorithm.
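The splitting step itself is simple; a minimal sketch is given below. The function name and the choice of a uniform perturbation $\epsilon$ are illustrative assumptions — the module may well perturb each coordinate differently.

```c
/* "Split" a codebook of M codewords of dimension K into 2*M codewords:
 * each codeword C_m produces C_m + eps and C_m - eps.  The enlarged
 * codebook then seeds a new round of LBG iterations.  (Sketch only;
 * the actual module's perturbation scheme may differ.) */
static void split_codebook(const double *cb, int M, int K,
                           double *out, double eps)
{
    for (int m = 0; m < M; m++)
        for (int k = 0; k < K; k++) {
            out[(2 * m)     * K + k] = cb[m * K + k] + eps;
            out[(2 * m + 1) * K + k] = cb[m * K + k] - eps;
        }
}
```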

The size of the codebook (M) is specified by the -s option. CodeBookSize must be a strictly positive integer.

It is possible to replace the plain m.s.e. by a weighted version of it using the -W option. Weight is an Fsignal file containing the weights to be applied to the components of the vectors.

If the -M option is selected, then all the optimized codebooks of size equal to a power of two (or a power of two multiplied by the size of the initial codebook if the -i option has been selected) which are generated before each new splitting are kept in memory and stored together with the final codebook of size CodeBookSize in the CodeBook file. This means that this file contains not one, but several codebooks of different sizes. These codebooks are indexed, starting from 1 for the smallest one, up to $\lceil \log_2 \mathrm{CodeBookSize} \rceil + 1$ (this indexing is the one used in the examples below). If the -M option is not selected, then only the final codebook of size CodeBookSize is stored in the CodeBook file.

It is possible to generate codebooks which are specially adapted to residual vector quantization (see the fvq documentation and [GG92]) with the help of the -a, -b, -f and -g options. If the -a option is selected, then the training set is first quantized using the codebook read in the ResCodeBook file, and each vector of the training set is replaced by the residue of this quantization, i.e. $Y_i$ is replaced by $Y_i - Q_{\mathrm{Res}}(Y_i)$. If the -b option is selected, then this resulting training set is further quantized using the codebook read in the ResResCodeBook file, and each vector is again replaced by the residue of this quantization, i.e. $Y_i - Q_{\mathrm{Res}}(Y_i)$ is replaced by $Y_i - Q_{\mathrm{Res}}(Y_i) - Q_{\mathrm{ResRes}}(Y_i - Q_{\mathrm{Res}}(Y_i))$. The resulting training set serves as input for the splitting/LBG loop.

If the -f option is selected, this means that ResCodeBook in fact contains several codebooks of different sizes (i.e. it has been generated with the -M option); NResCB is then the index of the codebook to choose in ResCodeBook in order to quantize the training set. The -g option does the same for ResResCodeBook. To be consistent, ResCodeBook, ResResCodeBook and CodeBook should be generated using the same training set file TrainSet. NResCB and NResResCB must of course be positive integers. Notice that the -a, -b, -f and -g options do not modify the LBG or splitting algorithms at all; they only change the input training set.
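The replacement of each training vector by its quantization residue (the effect of the -a option on the input data) can be sketched as follows; the function name is hypothetical and the nearest-neighbour search assumes the plain m.s.e. metric.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Replace each of the I training vectors (dimension K, stored
 * contiguously in `train`) by its residue Y_i - Q_res(Y_i), where
 * Q_res quantizes against the M codewords in `cb`.  (Illustrative
 * sketch of what the -a option does to the training set.) */
static void residual_train_set(double *train, int I, int K,
                               const double *cb, int M)
{
    for (int i = 0; i < I; i++) {
        int best = 0;
        double bestd = DBL_MAX;
        for (int m = 0; m < M; m++) {          /* nearest codeword */
            double d = 0.0;
            for (int k = 0; k < K; k++) {
                double e = train[i * K + k] - cb[m * K + k];
                d += e * e;
            }
            if (d < bestd) { bestd = d; best = m; }
        }
        for (int k = 0; k < K; k++)            /* subtract Q_res(Y_i) */
            train[i * K + k] -= cb[best * K + k];
    }
}
```

Applying the same function a second time with a second codebook yields the doubly residual training set produced by the -b option.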

The -p option chooses which information is printed during the generation process. If it is selected, the module only prints the number of iterations of the LBG algorithm for each outer (splitting) loop. If it is not selected, the module prints, for each generated codebook, the constant rate ($= \frac{1}{K} \log_2 M$), the entropic rate, and the signal to noise ratio.



$ \bigcirc$Example


In order to generate a set of codebooks of size 1, 2, $2^2$, ..., $2^8$, adapted to the training set contained in the file TrainSet, and put them in the file CodeBook, run



flbg_train -s 256 -M TrainSet CodeBook



In order to generate a set of codebooks of size 1, 2, $2^2$, ..., $2^8$, for the residual vector quantization after quantization with the codebook with index 9 in the CodeBook file (i.e. the codebook of size 256), and put them in the file ResCodeBook, run



flbg_train -s 256 -M -a CodeBook -f 9 TrainSet ResCodeBook



In order to generate a set of codebooks of size 1, 2, $2^2$, ..., $2^8$ and 347, for the residual vector quantization after quantization with the codebook with index 9 in the CodeBook file (i.e. the codebook of size 256) and the codebook with index 7 in the ResCodeBook file (i.e. the codebook of size 64), and put them in the file ResResCodeBook, run



flbg_train -s 347 -M -a CodeBook -f 9 -b ResCodeBook -g 7 TrainSet ResResCodeBook







$ \bigcirc$See Also


flbg_adap, flbg.


$ \bigcirc$Version 2.00


Last Modification date : Thu Nov 29 20:23:57 2001


$ \bigcirc$Author


Jean-Pierre D'Ales





