Image classification with bags of local features
Many classification techniques expect class instances to be represented as feature vectors, i.e. points in a feature space. In computer vision classification problems, it is often possible to generate an informative feature-vector representation of an image, for example using global texture or shape descriptors. In other cases, however, it may be beneficial to treat an image as a variable-size unordered set, or bag, of features, in which each feature represents a localized salient image structure or patch. Such local features do not require segmentation and can be useful for object recognition in the presence of occlusion and clutter.
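The contrast between the two representations can be sketched in a few lines. This is only an illustration under stated assumptions: the functions `global_descriptor` and `local_features`, the intensity histogram, and the variance-based patch selection are hypothetical stand-ins, not the descriptors or detectors used in this dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)


def global_descriptor(image, bins=8):
    """Hypothetical global descriptor: a fixed-length intensity histogram.

    Every image maps to a single point in the same `bins`-dimensional space.
    """
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()


def local_features(image, patch=8, threshold=0.08):
    """Hypothetical local features: flattened patches with high variance.

    Returns an array of shape (n, patch * patch), where n depends on the
    image content and size. The row order carries no meaning.
    """
    feats = []
    height, width = image.shape
    for r in range(0, height - patch + 1, patch):
        for c in range(0, width - patch + 1, patch):
            p = image[r:r + patch, c:c + patch]
            if p.var() > threshold:  # assumed saliency criterion
                feats.append(p.ravel())
    return np.array(feats).reshape(-1, patch * patch)


# Two synthetic grayscale images of different sizes, values in [0, 1).
image_a = rng.random((32, 32))
image_b = rng.random((48, 64))

# Feature-vector view: both images become vectors of identical length.
print(global_descriptor(image_a).shape, global_descriptor(image_b).shape)

# Bag-of-features view: each image becomes a set whose size may differ.
print(local_features(image_a).shape, local_features(image_b).shape)
```

A classifier that expects fixed-length vectors can consume the first representation directly, whereas the second requires a method that accepts sets of varying size.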
Local features are often used to find point correspondences between images, which are then used for 3D reconstruction, object recognition, detection, or image retrieval. In many cases, however, exact correspondences are difficult or even impossible to compute. Moreover, point correspondences may be unnecessary unless one is interested in recovering the 3D shape of an object. When correspondences are not computed, the representation is simply an unordered set of local features.
In this dissertation, we present methods for object class recognition that use bags of features without relying on point correspondences. We also show that using bags of features together with a more traditional feature-vector representation of images can improve classification accuracy, and we propose and evaluate several methods of combining the two representations. The proposed techniques are applied to a challenging marine science domain.