Feature Extraction and Photometric Normalization

Now we need to extract features from the frontal car image. In [14] the performance of interest point descriptors is evaluated. Many different descriptors have been proposed in the literature, but it is unclear which of them are more appropriate and how their performance depends on the interest point detector. A descriptor should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the point detector. The following descriptors are compared: SIFT descriptors [25], steerable filters [40], differential invariants [41], complex filters [42], moment invariants [43] and cross-correlation for different types of interest points. The evaluation in [14] uses the detection rate with respect to the false positive rate as a criterion and is carried out for different image transformations; this is the standard Receiver Operating Characteristic (ROC). Two points $\mathbf{a}$ and $\mathbf{b}$ are considered similar if the distance between their descriptors is below a threshold, $d_M(D_{\mathbf{a}}, D_{\mathbf{b}}) < t$; the value of $t$ is varied to obtain the ROC curves. Given two images of the same scene, the detection rate is the number of correctly matched points with respect to the number of possible matches (true positive rate):


\begin{displaymath}
p_{correct}=\frac{\#\,\mathrm{correct\ matches}}{\#\,\mathrm{possible\ matches}}
\end{displaymath}

The false positive rate is the probability of a false match in a database of descriptors. Each descriptor of the query image is compared with each descriptor of the database and the number of false matches is counted. The probability of a false positive is the total number of false matches with respect to the product of the number of database points and the number of query image points:

\begin{displaymath}
p_{false}=\frac{\#\,\mathrm{false\ matches}}{(\#\,\mathrm{database\ points})(\#\,\mathrm{query\ image\ points})}
\end{displaymath}
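To make the evaluation protocol concrete, the following sketch computes one $(p_{false}, p_{correct})$ point of the ROC curve for a given threshold $t$. The Euclidean distance between descriptors, the array names and the ground-truth set of correct correspondences are illustrative assumptions, not details taken from [14].

\begin{verbatim}
import numpy as np

def roc_point(query_desc, db_desc, ground_truth, t):
    """Compute (p_correct, p_false) for one distance threshold t.

    query_desc   : (n_q, d) array of descriptors from the query image
    db_desc      : (n_db, d) array of descriptors from the database
    ground_truth : set of (query_idx, db_idx) pairs that are true matches
    """
    # distance between every query/database descriptor pair
    dists = np.linalg.norm(query_desc[:, None, :] - db_desc[None, :, :],
                           axis=2)

    # two descriptors are declared a match if their distance is below t
    matches = np.argwhere(dists < t)

    correct = sum((int(q), int(d)) in ground_truth for q, d in matches)
    false = len(matches) - correct

    p_correct = correct / max(len(ground_truth), 1)        # detection rate
    p_false = false / (db_desc.shape[0] * query_desc.shape[0])
    return p_correct, p_false

# sweeping t yields the ROC curve:
# roc = [roc_point(q, db, gt, t) for t in np.linspace(0.0, 2.0, 50)]
\end{verbatim}

Sweeping $t$ over a range of values and plotting the resulting pairs produces curves such as the one in figure 3.4.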

Figure 3.4: ROC curves for the descriptors evaluated in [14], here for a rotated image
\includegraphics[width=80mm,height=60mm]{ROC-rot.eps}

The test covered several experiments: rotation, scale changes, affine transformations and illumination changes. The evaluation showed that the ranking of the descriptors does not depend on the point detector and that SIFT descriptors perform best in all tests except for the illumination change, where steerable filters performed better (both performed very well and comparably). Steerable filters come second overall and, given their low dimensionality, can be considered a good choice. Based on this article we chose SIFT. We modified SIFT into three types of representation, which will be described later. We simplified the SIFT feature extraction and we will show that the results are comparable. Although the simplest SIFT variant did not perform as well as the SIFT proposed in [25], the simplification led to faster feature extraction. SIFT with overlapping regions was also tried and it performed best. Features similar to those displayed in figure 3.5 on page [*] are obtained; they serve as input to the classification learning process. The image is normalized photometrically, since part of the SIFT feature extraction algorithm is the normalization of the descriptor vector to unit length.

Figure 3.5: Feature representation used for learning
\includegraphics[width=100mm,height=30mm]{featuredemo.eps}
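The unit-length normalization mentioned above can be illustrated with a minimal SIFT-like descriptor sketch. The $4\times4$ grid of 8-bin orientation histograms follows the general scheme of [25]; the function below is only an illustrative simplification under these assumptions, not the exact modified representations used in this work.

\begin{verbatim}
import numpy as np

def sift_like_descriptor(patch, grid=4, bins=8):
    """Build a grid x grid x bins orientation-histogram descriptor from
    a square grayscale patch around the interest point and normalize it
    to unit length (the photometric normalization mentioned above)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)   # orientation in [0, 2*pi)

    cell = patch.shape[0] // grid
    desc = np.zeros((grid, grid, bins))
    for i in range(grid):
        for j in range(grid):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            a = ang[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 2*np.pi),
                                   weights=m)
            desc[i, j] = hist

    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
\end{verbatim}

Because the descriptor is built from image gradients and then scaled to unit length, an affine change of image intensity (offset and gain) leaves it essentially unchanged, which is the photometric normalization effect referred to above.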
