Which descriptor to use is a trade-off between computation time and data set – it's a choice between:
The Lowe SIFT (Scale-Invariant Feature Transform) algorithm. This is non-free (patented in the US), which may restrict its application, but it is a good general-purpose algorithm and one for which GPU-accelerated implementations are available.
The SIFT algorithm supports a number of “standard” SIFT parameters (original vs. upscaled input, maximum octave count, scales per octave, maximum ratio of Hessian eigenvalues and minimum contrast), defined in the SiftParams structure.
The SIFT_Image_describer class also supports three presets (Normal, High and Ultra) which tune the minimum-contrast value, with the Ultra preset additionally enabling upscaling.
The SIFT algorithm is described at: https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
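To make the parameter set above concrete, here is a minimal sketch of what such a parameter bundle looks like. The struct name, field names and default values are illustrative assumptions, not openMVG's actual SiftParams:

```cpp
// Hypothetical sketch of the tunable SIFT parameters listed above; the field
// names and default values are illustrative, NOT openMVG's actual SiftParams.
struct SiftParamsSketch {
  bool  use_upscaled_image = false;  // original vs upscaled input image
  int   max_octaves        = 6;      // maximum octave count
  int   scales_per_octave  = 3;      // scales per octave
  float max_edge_ratio     = 10.0f;  // max ratio of Hessian eigenvalues
  float min_contrast       = 0.04f;  // minimum contrast threshold
};

// A preset (Normal/High/Ultra in the text) is just a bundle of such values:
// an "Ultra"-style preset might lower the contrast threshold and enable
// upscaling (the exact numbers here are assumptions, not openMVG's).
inline SiftParamsSketch make_ultra_preset_sketch() {
  SiftParamsSketch p;
  p.min_contrast = 0.01f;
  p.use_upscaled_image = true;
  return p;
}
```

The point is that a "preset" is nothing magical: it just pre-fills these values for you.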
The AKAZE algorithm (or A-KAZE, if you prefer), “a novel 2D feature detection and description method that operates completely in a nonlinear scale space”. This uses the AKAZEConfig configuration structure.
Here we can choose one of three variants:
- MSURF (the default “Modified Speeded Up Robust Features”)
- LIOP (“Local Intensity Order Pattern”)
- MLDB (“Modified-Local Difference Binary”)
There are also three preset thresholds (Normal, High & Ultra) which tune the parameters used by the feature recognition algorithm.
There's also AKAZE OCV: the AKAZE implementation supplied by OpenCV.
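A hedged sketch of how a variant-plus-preset configuration like this might be shaped; the type names and the threshold value are assumptions, not openMVG's actual AKAZEConfig:

```cpp
// Hypothetical sketch of an AKAZE-style configuration; names and values are
// illustrative, NOT openMVG's actual AKAZEConfig.
enum class AkazeVariantSketch { MSURF, LIOP, MLDB };
enum class PresetSketch { Normal, High, Ultra };

struct AkazeConfigSketch {
  // MSURF is the default variant, per the list above.
  AkazeVariantSketch variant = AkazeVariantSketch::MSURF;
  // The Normal/High/Ultra presets tune detection parameters such as a
  // response threshold (the value below is an assumption).
  float threshold = 0.001f;
};
```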
The variants are described in the KAZE paper.
The AKAZE paper http://www.robesafe.com/personal/pablo.alcantarilla/papers/Alcantarilla13bmvc.pdf has comparisons of AKAZE, SIFT and several other feature-extraction methods.
LIOP is explained in more detail at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.371.519
The gory details
You don't need to know any of this if you're simply using the stock methods, but...
Each region is a set of 2D features (the PointFeature class) describing a position, scale and orientation, stored in the “.feat” file. A descriptor is essentially a container of raw data values (the Descriptor class) which corresponds to the description of a single feature. Since the descriptors depend on the feature algorithm used, they're treated as opaque chunks of data at this point, so the descriptor is constructed with a “type, count” template.
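A minimal sketch of such a “type, count” container, in the spirit of (but not identical to) openMVG's Descriptor class, which additionally handles serialisation:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Minimal sketch of a "type, count" descriptor container. Because the
// algorithm-specific meaning of the values is opaque at this level, the
// container is simply N raw values of type T.
template <typename T, std::size_t N>
struct DescriptorSketch {
  std::array<T, N> data{};
  T&       operator[](std::size_t i)       { return data[i]; }
  const T& operator[](std::size_t i) const { return data[i]; }
  static constexpr std::size_t size() { return N; }
};

// For example, a SIFT descriptor is commonly 128 unsigned bytes:
using SiftDescriptorSketch = DescriptorSketch<std::uint8_t, 128>;
```

Instantiating the template with a different “type, count” pair covers float-valued or binary descriptors without the container knowing anything about the algorithm.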
For an example of the mapping, see the example file main_ComputeFeatures_OpenCV.cpp, which demonstrates this working with off-the-shelf descriptors from OpenCV. In this case we see the detectAndCompute() method of the AKAZE extractor, which produces a vector of keypoints and a descriptor output array. The keypoints are pushed onto the feature list, and each descriptor is serialised as raw data and pushed onto the descriptors list.
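The mapping can be sketched with stand-in types in place of the real OpenCV and openMVG classes (cv::KeyPoint, PointFeature, Descriptor); the stub names and fields below are assumptions for illustration only:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in types: NOT the real OpenCV/openMVG classes.
struct KeypointStub { float x, y, scale, angle; };        // what the detector yields
struct FeatureStub  { float x, y, scale, orientation; };  // what goes in the ".feat" file
using RawDescriptor = std::vector<std::uint8_t>;          // opaque chunk of raw data

// Push each keypoint onto the feature list and serialise the matching
// descriptor row as raw bytes, as described in the text.
void map_results(const std::vector<KeypointStub>& keypoints,
                 const std::uint8_t* desc_rows, std::size_t row_bytes,
                 std::vector<FeatureStub>& features,
                 std::vector<RawDescriptor>& descriptors) {
  for (std::size_t i = 0; i < keypoints.size(); ++i) {
    const KeypointStub& k = keypoints[i];
    features.push_back({k.x, k.y, k.scale, k.angle});
    RawDescriptor d(row_bytes);
    std::memcpy(d.data(), desc_rows + i * row_bytes, row_bytes);
    descriptors.push_back(std::move(d));
  }
}
```

Treating each descriptor row as raw bytes is what keeps the pipeline agnostic to which extraction algorithm produced it.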