Multiresolution Query by Image Content
Objective
The purpose of the project was to implement a wavelet-based system for querying image databases by content.
Description
The project was based on the article "Fast Multiresolution Image Querying" by Charles E. Jacobs, Adam Finkelstein, and David H. Salesin, published in the SIGGRAPH 95 Proceedings.
Alexandre Ferreira (alexgf@tecgraf.puc-rio.br)
Carlos Vitor Alencar (alencar@tecgraf.puc-rio.br)
Marcos Machado (mmachado@tecgraf.puc-rio.br)
Paulo Mattos (pmattos@tecgraf.puc-rio.br)
Rodrigo Toledo (rtoledo@tecgraf.puc-rio.br)
William Lira (william@tecgraf.puc-rio.br)
The program was implemented in C with the IUP (user interface), CD (graphics package), and IM (image file persistence) libraries. It runs on all major computing platforms: Windows 95/98/NT, Silicon Graphics, Sun, Linux, and IBM RISC System/6000.
The libraries are free for academic use. To obtain them, send a message to TeCGraf: tecgraf@tecgraf.puc-rio.br.
Documentation for the libraries is available at:
IUP: http://www.tecgraf.puc-rio.br/manuais/iup
CD: http://www.tecgraf.puc-rio.br/manuais/cd
IM: http://www.tecgraf.puc-rio.br/manuais/im
The key to the algorithm is an effective and efficient metric for computing the distance between a query image Q and a potential target image T. Wavelet decomposition proved to be a good foundation for this metric for several reasons: a few coefficients provide a good approximation of the original image while retaining information about its edges; the decomposition is relatively invariant to changes in resolution; it is fast to compute, running in time linear in the size of the image; and it localizes frequencies spatially. The chosen metric uses the YIQ color space and the Haar wavelet.
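The two building blocks named above, the YIQ conversion and the Haar decomposition, can be sketched as follows. This is a minimal illustration in Python rather than the project's C code; the function names are hypothetical, the YIQ coefficients are the commonly quoted NTSC values rounded to three decimals, and the transform uses plain averaging and differencing, which may not match the normalization used in the original implementation. Image sides are assumed to be powers of two.

```python
def rgb_to_yiq(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to YIQ (rounded NTSC matrix)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    return y, i, q


def haar_1d(values):
    """Full 1D Haar decomposition of a list whose length is a power of two.

    Assumption: unnormalized averaging/differencing form.
    The total work is n + n/2 + n/4 + ..., which is linear in n.
    """
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        averages = [(out[2 * k] + out[2 * k + 1]) / 2 for k in range(half)]
        details = [(out[2 * k] - out[2 * k + 1]) / 2 for k in range(half)]
        out[:n] = averages + details  # averages first, then detail coefficients
        n = half
    return out


def haar_2d(channel):
    """Standard 2D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in channel]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


if __name__ == "__main__":
    print(haar_1d([9, 7, 3, 5]))
    print(haar_2d([[4, 4], [4, 4]]))
```

With this form of the transform, the coefficient at position [0][0] is the average value of the channel, and a gray pixel maps to zero I and Q components.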
For each image we compute its Haar wavelet transform, then truncate and quantize its coefficients. The remaining coefficients form the image signature.
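Truncation and quantization can be sketched as below, assuming the coefficients of one color channel are already available as a 2D list. The function name and the representation of the signature (the channel average plus a dictionary from coefficient position to sign) are assumptions made for this illustration, not the project's actual data layout.

```python
def make_signature(coeffs, m):
    """Build a signature from one channel's 2D wavelet coefficients.

    Returns (average, kept), where kept maps (i, j) to +1 or -1 for the
    m coefficients of largest magnitude. The [0][0] coefficient (the overall
    average) is stored separately and is not quantized.
    """
    average = coeffs[0][0]
    candidates = [
        (abs(value), i, j)
        for i, row in enumerate(coeffs)
        for j, value in enumerate(row)
        if (i, j) != (0, 0) and value != 0
    ]
    candidates.sort(reverse=True)  # largest magnitude first
    kept = {
        (i, j): (1 if coeffs[i][j] > 0 else -1)
        for _, i, j in candidates[:m]  # truncation: keep only m entries
    }
    return average, kept


if __name__ == "__main__":
    example = [[5.0, -3.0], [0.5, 2.0]]
    print(make_signature(example, 2))
```

Positions that are not in the dictionary correspond to coefficients quantized to 0, so a signature with m = 60 stores only 60 positions and signs per channel, regardless of image size.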
Figure: a painting; its truncated and quantized wavelet decomposition with 2000 coefficients (Y color channel); the actual decomposition used, with 60 coefficients (Y color channel).
The metric for each color channel can be represented as:
$$\|Q, T\| = w_{0,0}\,\bigl|Q[0,0] - T[0,0]\bigr| + \sum_{i,j} w_{i,j}\,\bigl|Q'[i,j] - T'[i,j]\bigr|$$
where Q[0,0] and T[0,0] are the overall average intensities of that color channel in Q and T; Q'[i,j] and T'[i,j] are the [i,j]th wavelet coefficients of Q and T after truncation (only the m largest-magnitude coefficients are kept) and quantization (to -1, 0, or 1); and w_{i,j} is the weight of the [i,j]th coefficient.
Some simplifications can be applied:
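The resulting metric is:
$$\|Q, T\| = w[0]\,\bigl|Q[0,0] - T[0,0]\bigr| + \sum_{i,j\,:\,Q'[i,j] \neq 0} w[\mathrm{bin}(i,j)]\,\bigl(Q'[i,j] \neq T'[i,j]\bigr)$$
where the expression (Q'[i,j] ≠ T'[i,j]) counts as 1 when the two quantized coefficients differ and as 0 otherwise.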
The function bin(i,j) groups the coefficients into a small number of bins, with each bin b weighted by a constant w[b]. For a given set of bins, the best weights w[b] were found experimentally.
The application implements both the pre-processing phase and the query interface of the algorithm. The pre-processing phase consists of creating an image database that contains the signature of each image. The program offers two query options: the user supplies an existing image, or the user draws a sketch of the intended image on the canvas.
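In the query phase, the signature of the query is compared with every signature stored in the database. The sketch below scores a single color channel with a binned metric and ranks the database by that score. Several things here are assumptions for illustration only: the function names, the signature layout (average plus a dictionary of signs for the nonzero coefficients), the grouping bin(i,j) = min(max(i,j), 5), and the weight values, which are placeholders and not the experimentally determined ones. A complete score would add the contributions of the Y, I, and Q channels.

```python
def bin_index(i, j, num_bins=6):
    """Assumed grouping: coefficients share a bin by max(i, j), capped at the last bin."""
    return min(max(i, j), num_bins - 1)


def score(query_sig, target_sig, weights):
    """Distance between two signatures for one channel; lower means more similar.

    Each signature is (average, {(i, j): +1 or -1}). Only positions that are
    nonzero in the query are visited; a weight is added whenever the target
    has a different quantized value at that position.
    """
    q_avg, q_coeffs = query_sig
    t_avg, t_coeffs = target_sig
    total = weights[0] * abs(q_avg - t_avg)
    for (i, j), q_sign in q_coeffs.items():
        if t_coeffs.get((i, j)) != q_sign:  # missing entries count as 0
            total += weights[bin_index(i, j, len(weights))]
    return total


def rank(query_sig, database, weights):
    """Return the database keys ordered from best to worst match."""
    return sorted(database, key=lambda name: score(query_sig, database[name], weights))


if __name__ == "__main__":
    placeholder_weights = [4.0, 1.0, 2.0, 3.0, 1.0, 1.0]  # hypothetical values
    query = (0.5, {(0, 1): 1, (2, 3): -1})
    images = {
        "a": (0.25, {(0, 1): 1, (2, 3): 1}),
        "b": (0.5, {(0, 1): 1, (2, 3): -1}),
    }
    print(rank(query, images, placeholder_weights))
```

Because the loop visits only the query's nonzero coefficients, a rough sketch with little detail is not penalized for detail that appears only in the target image.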
The figure below shows the screen interface of the program:
Some improvements
In our implementation we made two basic changes relative to the original paper:
Some examples are given below:
Database: Animals (100 images):
Database: Van Gogh (500 images):
Database: Clipart (100 images):
The image query algorithm described above can be extended to volumetric graphical objects. The extension is straightforward:
The video query treats a video sequence as a volumetric object (see figure below). There are two possible forms of input for a video query: a video shot of the desired sequence (a time normalization should be considered), or a paint-system interface in which the user's strokes represent the path of an essential motion in the required video sequence.
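One simple way to realize the time normalization mentioned above is to resample every clip to a fixed number of frames before stacking the frames into a volume. The sketch below uses nearest-frame selection; the function name and the choice of resampling scheme are assumptions for illustration, and other schemes (for example, blending neighboring frames) would serve equally well.

```python
def normalize_length(frames, target_count):
    """Resample a clip to target_count frames spread evenly over its duration.

    frames is any sequence of per-frame data (for example, 2D pixel arrays).
    Frames are repeated when the clip is too short and skipped when it is
    too long, so every clip yields a volume of the same depth.
    """
    if not frames or target_count <= 0:
        raise ValueError("need at least one frame and a positive target count")
    n = len(frames)
    return [frames[min(n - 1, (k * n) // target_count)] for k in range(target_count)]


if __name__ == "__main__":
    print(normalize_length(list("abcdefgh"), 4))
    print(normalize_length(["a", "b"], 4))
```

After this step, a query clip and the stored sequences have the same extent along the time axis, so their volumetric signatures can be compared position by position.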