Thursday, 11 December 2014

Automatic Feature Detection

We can use the feature detection algorithms in OpenCV to automatically align some of the image data; this isn't particularly effective on many pictures, but it can work in a few cases.

For this example we'll take three images of the same target taken through different filters and try to align them to produce a single full colour version:
  • W1484593934_1.IMG = Blue
  • W1484593810_1.IMG = Green
  • W1484593662_1.IMG = Red

These show the moon Rhea, and are from the coiss_2009 data set (and are actually the true colour filters).

We follow the standard decode and range optimisation, as in the previous example. However, in this case we also have to convert the images to 8 bit so the image alignment algorithms can operate on them. This is simply:
im16d[i].convertTo(ch8[i], CV_8U, 0.0039);

where im16d[i] is the source image and ch8[i] the destination. The 0.0039 factor scales the range back down from 16-bit to 8-bit values, i.e. 256/65536. Arguably we could have skipped this and just ranged directly to 8 bits, but always dealing with a 16-bit backend keeps our test code consistent.
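As a rough standalone sketch of what that convertTo() call does per pixel (scale, round, then saturate into the 8-bit range - the helper name here is ours, not OpenCV's):

```cpp
#include <cmath>
#include <cstdint>

// Roughly what cv::Mat::convertTo(dst, CV_8U, 0.0039) does to each pixel:
// multiply by the scale factor, round, and saturate into 0..255.
static uint8_t to8bit(uint16_t v, double scale = 0.0039)
{
    double scaled = std::round(v * scale);
    if (scaled < 0.0)   return 0;
    if (scaled > 255.0) return 255;   // 65535 * 0.0039 = 255.59, clamps to 255
    return static_cast<uint8_t>(scaled);
}
```

This is only illustrative; the real convertTo() does the same scale-and-saturate over the whole matrix in one call.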

Following this we align the images. OpenCV offers a standard workflow for this, and a number of default algorithms to work with. The basic set of steps is:
  1. Use a feature detector to get the "interesting" points in the image - i.e. features we can match between images. This gives us a list of KeyPoints.
  2. Use a descriptor extractor to get a descriptor for each of the KeyPoints in the image, which gives us a matrix of descriptors, one per keypoint.
  3. Use a matcher to calculate the matches between our two sets of descriptors. This gives us a set of DMatch matches, each pairing a point in one set with its nearest match in the other. These matches aren't necessarily all correct, just the best available.
  4. Cull the matches, keeping only the "strongest" available. This gives us a better chance of getting a consistently correct relation between the two images.
  5. Use the findHomography() call to work out the transform between the points of the match.
  6. Apply this transform to the image to be aligned using warpPerspective()
Then we will have a set of aligned images which we can merge together into a single colour version.

For this first attempt we can use the stock feature detector, extractor and matcher components of OpenCV on a pair of images. Wrapping the first five steps of the matching procedure into a single function that takes two images and returns the matrix from findHomography(), we have:
static cv::Mat GetAlignment(cv::Mat im0, cv::Mat im1)
Within this we have the three processing components:
cv::OrbFeatureDetector detector;
cv::OrbDescriptorExtractor extractor;

cv::BFMatcher matcher(cv::NORM_HAMMING, true);

The detector outputs KeyPoint vectors:
std::vector<cv::KeyPoint> keypoints_0, keypoints_1;
and we run this with
  detector.detect(im0, keypoints_0);
  detector.detect(im1, keypoints_1);
The extractor computes a descriptor matrix from those KeyPoints:
cv::Mat desc_0, desc_1;
  extractor.compute(im0, keypoints_0, desc_0);
  extractor.compute(im1, keypoints_1, desc_1);
And then the matcher is passed the two descriptor matrices, and produces a set of matches:
std::vector<cv::DMatch> matches;
  matcher.match(desc_0, desc_1, matches);
Following this we need to filter the matches to pick out the "best" ones. In this case we look at the "distance" field of each match, where a lower value is "better" (i.e. a closer match).
We scan the complete set for the minimum distance (and the maximum, if required), and then only allow through matches within a given distance range. Here the accepted range is tuned by hand: the magic numbers in the check below accept a match if its distance is under 3*min_dist, or under an absolute threshold of 10.
double min_dist = 9999;
  for (uint i = 0; i < matches.size(); i++) {
    double dist = matches[i].distance;
    if (dist < min_dist)
      min_dist = dist;
  }

  std::vector<cv::DMatch> good_match;
  for (uint i = 0; i < matches.size(); i++)
    if ((matches[i].distance < (3*min_dist)) || (matches[i].distance < 10))
      good_match.push_back(matches[i]);
At this point we can visually review the matches by passing the vector, along with the keypoints and source images to drawMatches(), using this piece of boilerplate code:
 cv::Mat imgMatch;
  cv::drawMatches(im0, keypoints_0, im1, keypoints_1, good_match, imgMatch,
    cv::Scalar::all(-1), cv::Scalar::all(-1),
    std::vector<char>(), cv::DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );
  cv::namedWindow("matches", CV_WINDOW_KEEPRATIO);
  cv::imshow("matches", imgMatch);


For example, between Red and Blue we see:
This shows that we have a few stray points. Annoyingly these count as "strong" matches as far as OpenCV is concerned, and we could be smarter and try to eliminate them. However, for this simple example the number of genuine matches gives us a good enough set to work with, so we can get a transform matrix from findHomography(), operating on the points in the match:
std::vector<cv::Point2f> pts1;
std::vector<cv::Point2f> pts2;
for (uint i = 0; i < good_match.size(); i++) {
  pts1.push_back(keypoints_0[good_match[i].queryIdx].pt);
  pts2.push_back(keypoints_1[good_match[i].trainIdx].pt);
}
return cv::findHomography(pts2, pts1, CV_RANSAC);
So using the above function we can calculate the transforms required to align the Blue and Green images to the Red image. If we have ch8[0] as Blue, ch8[1] as Green and ch8[2] as Red, we can obtain the transform matrices as:
cv::Mat HgB, HgG;
  HgB = GetAlignment(ch8[2], ch8[0]);
  HgG = GetAlignment(ch8[2], ch8[1]);
And then build an image list which contains the Red image, plus the Blue and Green images warped through warpPerspective() to align them with the Red channel image:
std::vector<cv::Mat> imlist(3);
  imlist[2] = im16d[2];
  warpPerspective(im16d[0], imlist[0], HgB, cv::Size(im16d[2].cols, im16d[2].rows));
  warpPerspective(im16d[1], imlist[1], HgG, cv::Size(im16d[2].cols, im16d[2].rows));
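warpPerspective() maps pixel coordinates through the 3x3 homography in homogeneous form. As a minimal sketch of that mapping, in plain C++ without OpenCV (applyHomography is our own illustrative helper, not an OpenCV call):

```cpp
#include <cmath>

// Apply a 3x3 homography H (row-major) to a point (x, y) in homogeneous
// coordinates: [x', y', w']^T = H * [x, y, 1]^T, then divide through by w'.
static void applyHomography(const double H[9], double x, double y,
                            double& ox, double& oy)
{
    double xp = H[0]*x + H[1]*y + H[2];
    double yp = H[3]*x + H[4]*y + H[5];
    double w  = H[6]*x + H[7]*y + H[8];
    ox = xp / w;
    oy = yp / w;
}
```

For our nearly-coplanar, distant target the fitted matrix should be close to a pure translation, i.e. the top-right entries carry most of the shift between filters.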
And combine this into a single output image as before:
cv::Mat out;
cv::merge(imlist, out);
cv::imshow("combined", out);

And... it's Grey. Well, at least we're done.

This isn't great - we have some obvious colour fringes in the output, although that is partly due to the time between exposures and the relative movement of the moon and the camera. We'd also ideally want to cull the outlying points. And, well, it's just grey.
But for an automatic alignment using the "stock" calls it's not that bad.... Next up we can look at being a little smarter.