**Final Project 1998.2**

**Multiresolution Query by Image Content**

**Objective**

The purpose of the project was to implement a system, based on wavelets, for querying image databases by content.

**Description**

The project was based on the article "**Fast
Multiresolution Image Querying**" by Charles E. Jacobs, Adam Finkelstein,
and David H. Salesin on SIGGRAPH'97 Proceedings.

The following students participated on the project

Alexandre Ferreira (alexgf@tecgraf.puc-rio.br)

Carlos Vitor Alencar (alencar@tecgraf.puc-rio.br)

Marcos Machado (mmachado@tecgraf.puc-rio.br)

Paulo Mattos (pmattos@tecgraf.puc-rio.br)

Rodrigo Toledo (rtoledo@tecgraf.puc-rio.br)

William Lira (william@tecgraf.puc-rio.br)

The source code will be available soon.

The program was implemented in C with IUP (user interface), CD (graphical package) and IM (image file persistence) libraries. It works across all major-computing platforms: Windows 95/98/NT, Silicon Graphics, Sun, Linux e IBM RISC600.

The libraries are freeware for academical purpose. To acquire it, send a message to TeCGraf: tecgraf@tecgraf.puc-rio.br.

The documentation of the used libraries is available at :

IUP: http://www.tecgraf.puc-rio.br/manuais/iup

CD: http://www.tecgraf.puc-rio.br/manuais/cd

IM: http://www.tecgraf.puc-rio.br/manuais/im

The key to the algorithm is the establishment of an effective and
efficient metric capable of computing the distance between a query image Q and a potential
target image T. Wavelet decomposition proved to be a good foundation for this metric, for
several reasons, such as: few coefficients provides a good approximation of the original
image retaining information from existing edges; presents relative invariance to
resolution changes; it is fast to compute, running in linear time in size of the image;
spatial localization of the frequencies. The chosen metric use the YIQ color space and the
Haar wavelet.

For each image we compute its Haar wavelet transform, truncating and quantizing its
coeffecients. Those remaining coeffecients represent the image signature.

A painting | its truncated and quantized wavelet
decomposition with 2000 coefficients (Y color channel) |
the actual decomposition used with 60
coefficients (Y color channel) |

The metric for each color channel can be represented as:

Where Q[0,0] and T[0,0] correspond to the overall average intensity of that color channel; Q'[i,j] and T'[i.j] represent the [i,j]th truncated (the mth greatest), quantized (-1, 0, 1) wavelet coefficients (terms) of Q and T; and w{i,j} the weight of the [i,j]th coefficient.

Some simplifications can be applied:

- Actually, the expression |Q'[i,j] - T'[i.j]|, can be replaced by (Q'[i,j] = T'[i.j]), where it means 0 if it's really different, and 1 otherwise.
- The terms can be grouped into "buckets" so that each group has a weigh instead of each term.
- In order to make the metric even faster, the only terms considered are the non-zero coefficients in the query wavelet (Q'[i,j] is non-zero). (Note that the metric becomes asymmetric after this simplification.)
- Q[0,0] and T[0,0] are reals in the range [0,1], where 0 represents the lowest average color of all the target images and 1 the greatest.

The resulting metric is:

The function bin(i,j) provides a way of grouping different coefficients into a small number of bins, with each bin weighted by some constant w [b]. For a given set of bins, the best weights w [b] were found experimentally,

The application implements both the pre-processing and query interface of the algorithm. The pre-process phase consists in creating an image database containing the signatures of each image. The program offers two query options: the user inputs a pre-existing image or the user draws a sketch of the intended image on the canvas.

The figure below show the screen interface of the program:

**Some improvements**

In our implementation we made two basic changes from the original paper:

- To use a low-resolution database, and also use a low-resolution version of the image provided for querying. This speeds up the process substantially and gives essentially the same results since the continuous wavelet transform is invariant under change of scale and almost all of the largest m coefficients are located in the low resolution.
- We implement a estimated percentual of aproximation between the query image and the images in the data base.
- We also tried to use another Haar decomposition, which is
faster than the one used by Salesin et al.. The results were good, in spite of using the
same weigths of the Haar initially used.

**Some examples are given below**

Data base: Animals (100 images):

Data base: Van Gogh (500 images):

Data base: Clipart (100 images):

The image query algorithm described above can be extended to volumetric graphical objects. The extension is straightforward:

- The volumetric object is given by a matrix representation (voxels);
- We extend the Haar transform to volumetric objects using tensor product;
- The computation of the volumetric signature is also a simple extension of the 2D solution.

The video query consists in considering a video sequence as a volumetric object (see figure below). Two possibilities for que input video query are possible: Use a video shot of the desired sequence (a time normalization should be considered). Use a paint system interface where user strokes, represents the motion path of an essential motion in the required video sequence.