Wednesday, 27 January 2016

More Rosetta

The Navcam, or “sometimes things are easy”

The Rosetta Navcam is the “Navigation Camera”. As we mentioned, its data isn't quite in the same format as the WAC & NAC data. However, it is a lot simpler to process.

The Navcam data archives pulled from the Small Bodies archive hold three files (under data/cam1) for each image:
  • a .fit file: a FITS (Flexible Image Transport System) file with a header and image data
  • a .lbl file: a PDS format label only file
  • a .img file: the raw image data only file

We ignore the FITS file, since there isn't anything in there we can't get from the .lbl and .img file pair.

In theory, for a given prefix the .lbl file and the .img file don't have to correspond directly (the .lbl file has an ^IMAGE field with an explicit target file name and offset), but in practice they always line up for the Navcam set.
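If we ever did want to honour the ^IMAGE field rather than rely on the matching prefix, a minimal sketch might look like the following. The struct and function names are hypothetical, and it assumes the value takes one of the common PDS3 pointer forms: a bare record number, a quoted file name, or a parenthesised (file name, record) pair. It does not handle the byte-offset form that carries a <BYTES> unit.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Result of reading a ^IMAGE value. file is empty when the pointer refers
// to the label's own file; record is the 1-based start record (default 1).
struct ImagePointer
{
  std::string file;
  long record = 1;
};

// Hypothetical helper: parse the text to the right of "^IMAGE =".
ImagePointer parseImagePointer(const std::string& value)
{
  ImagePointer p;
  std::size_t pos = 0;
  const std::size_t q1 = value.find('"');
  if (q1 != std::string::npos)
  {
    const std::size_t q2 = value.find('"', q1 + 1);
    if (q2 == std::string::npos)
      return p; // Unterminated quote: give up and return the defaults
    p.file = value.substr(q1 + 1, q2 - q1 - 1);
    pos = q2 + 1;
  }
  // Skip to the first digit after the file name, if there is one
  while (pos < value.size() &&
         !std::isdigit(static_cast<unsigned char>(value[pos])))
    pos++;
  if (pos < value.size())
    p.record = std::stol(value.substr(pos));
  return p;
}
```

For the Navcam set this is more than we need, which is why the hack of pairing files by prefix works.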

And the .img files are all 2MByte files containing a single 1024*1024 image at 16bpp. The SOURCE_SAMPLE_BITS field in the .lbl file indicates the samples really only span a 12bpp range, but we don't much care: we read them as 16 bit values and can scale the dynamic range of the output anyway.

So to parse this all we have to do is: for a given file, load the .img file directly, pack the 16bpp data up, move it all out into an OpenCV image structure, fiddle the gain to use the full 16bpp range, and we're done.
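The gain step can be sketched roughly as follows. This is an illustration rather than the code used for this post: the function name is hypothetical, it assumes the samples have already been unpacked into 16 bit values, and it uses a plain linear stretch so that the brightest sample lands on 65535.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: linearly rescale samples so the brightest one
// maps to 65535. A 12 bit source (max 4095) then fills the 16 bit range.
std::vector<uint16_t> stretchToFullRange(const std::vector<uint16_t>& in)
{
  std::vector<uint16_t> out(in.size(), 0);
  if (in.empty())
    return out;
  const uint32_t peak = *std::max_element(in.begin(), in.end());
  if (peak == 0)
    return out; // All-black image: nothing to scale
  for (std::size_t i = 0; i < in.size(); i++)
  {
    // 32 bit intermediate: 65535 * 65535 still fits in a uint32_t
    const uint32_t scaled = (static_cast<uint32_t>(in[i]) * 65535u) / peak;
    out[i] = static_cast<uint16_t>(scaled);
  }
  return out;
}
```

Stretching against the brightest pixel is the crudest possible choice; a single hot pixel will leave the rest of the image dark, so a percentile-based peak may work better in practice.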

The code is essentially identical to the OSIRIS processing, except we don't even have to work out the offset in the .img file: we always load the complete file from the start, and the images are all the same size.

We can pull any other metadata on the image from the corresponding .lbl file, but otherwise that's it.

Tuesday, 26 January 2016


The Rosetta mission made headlines last year, at the rendezvous with comet 67P/Churyumov-Gerasimenko, and the deployment of the Philae lander. Some of the raw data is becoming available from the ESA, so let's take a look at that.

Imaging Systems

There are three imaging systems on Rosetta:
  • Navcam: The navigation camera
  • OSIRIS Wide Angle Camera
  • OSIRIS Narrow Angle Camera

The Navcam is a 1024*1024 pixel 16bpp instrument. Initially we're going to ignore this, since the data is distributed in a slightly different format than the primary (OSIRIS) instruments.

The two main imaging systems are really a single system: the “Optical, Spectroscopic, and Infrared Remote Imaging System” (OSIRIS). The output of the OSIRIS imaging systems is a maximum resolution of 2048x2048 pixels at 16bpp.

OSIRIS actually has two front-ends: a Narrow Angle Camera Unit, and a Wide Angle Camera Unit. These two units have different optical designs and filter sets, but then feed into a common electronics back end. However from our perspective these are distributed as separate, if similar, record sets and we can treat them as independent instruments.

The Planetary Society has a good background article on the instrument:

The Data Sets

These are available from ESA via FTP, 

The main project page is:

And the FTP site is 

However the NASA PDS Small Bodies node has a more convenient archive, since it offers tar.gz archives of complete data sets – these contain the same data as the ESA site in an easily downloadable form.

The main site is 

And scroll down to the section "67P/C-G Prelanding (Orbiter only)" for the sets.

One minor gotcha is that the case of the files and paths has been transformed to lowercase, where the original references all use uppercase, but that's not a big deal.

The data sets have a common naming scheme similar to:

In this name “osinac” tells us this is the OSIRIS Narrow Angle Camera; for the Wide Angle Camera this would be “osiwac”. The “-2” indicates this is the basic un-calibrated data. A “-3” here indicates calibrated records, but these are more complex to process, so we'll stick with the "-2" sets for now.

In the PDS Small Bodies links the "-2" sets are referred to as the EDR sets, and the "-3" sets as the CDR sets (Experiment Data Record and Calibrated Data Record respectively).

Finally the remainder of the name tells us the target and data volume, and each volume covers a range of dates for data acquisition, with a summary of the details in the top level file AAREADME.TXT. In general the “later” the volume the more interesting it is (i.e. closer to the comet).

What's in them

For each data set the main data is held in the “data/” subdirectory, and is further grouped in the directory tree by date.

So for m07-v1.0 the data is held in:

These files are “.img” files – they're PDS3 data records, where the name encodes the date and type of file acquisition. The complete details are in the “catalog/” file.

A lot of the details of PDS3 we went over as part of the Venus Express parser, however in this case the PDS header itself is more important – we need to parse it correctly to understand the data layout of the images.

However – let's be hacky.

Quickly Picking Apart the Files

We can quickly extract the image data as follows:

The PDS header contains top level file information, but two key fields matter here:
  • RECORD_BYTES
  • ^IMAGE
The RECORD_BYTES value tells you how big each record is, and ^IMAGE tells you which record (counting from 1) contains the start of the image data. So the offset to the start of image data in the file is:

    offset = (^IMAGE - 1) * RECORD_BYTES

Since RECORD_BYTES is essentially always 512, you really only need the value of ^IMAGE.
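As a small C++ helper, the offset calculation (^IMAGE - 1) * RECORD_BYTES might look like this. The function name is hypothetical, and it assumes ^IMAGE is a bare 1-based record number rather than one of the other pointer forms PDS3 allows.

```cpp
#include <cstdint>

// Byte offset of the image data inside a PDS3 file.
// imagePointer is the ^IMAGE value (a 1-based record number) and
// recordBytes is the RECORD_BYTES value from the same header.
int64_t imageByteOffset(int64_t imagePointer, int64_t recordBytes)
{
  return (imagePointer - 1) * recordBytes;
}
```

Passing RECORD_BYTES in explicitly, rather than hard-coding 512, costs nothing and covers the odd file that breaks the pattern.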

However you still need a couple of additional pieces of information to extract the data: Particularly you need to know the resolution of the image within the file.

This information is held inside the section of the PDS header that runs between “OBJECT = IMAGE” and “END_OBJECT = IMAGE”.

And from this section of data we need to extract:
  • LINES: The number of vertical lines in the image data
  • LINE_SAMPLES: The horizontal resolution
  • SAMPLE_BITS: The bpp
  • SAMPLE_TYPE: how it's formatted.
  • FIRST_LINE: The first row (Y value) in the image data
  • FIRST_LINE_SAMPLE: The first X co-ordinate in the image data

However we can be really hacky here, and assume that everything starts in the top corner (starting X & Y of 1) and that the samples are all raw 16bpp unsigned LSB values. This will work for (almost) all of the images in the level 2 datasets.

This still leaves us needing to get LINES and LINE_SAMPLES from the IMAGE section of the header – we can pull these using our existing PDS parser, although we have to be careful to make sure we pull the IMAGE version of these numbers – some files contain additional objects which mean the header will have different LINES/LINE_SAMPLES values for the different objects.
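To show what "the IMAGE version of these numbers" means, here is a rough sketch of a scanner that only collects keys sitting directly inside a named object. It is not our existing PDS parser: the function names are hypothetical, it assumes the label text has already been split off from the binary data, and it only understands simple single-line KEY = VALUE entries.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Trim leading and trailing whitespace (including '\r' from CRLF labels).
static std::string trim(const std::string& s)
{
  const char* ws = " \t\r\n";
  const std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos)
    return "";
  const std::size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Collect KEY = VALUE pairs that sit directly inside OBJECT = <objectName>.
// Keys at the top level or inside other objects are ignored.
std::map<std::string, std::string> keysInObject(const std::string& header,
                                                const std::string& objectName)
{
  std::map<std::string, std::string> found;
  std::vector<std::string> stack; // Names of the currently open OBJECTs
  std::istringstream in(header);
  std::string line;
  while (std::getline(in, line))
  {
    const std::size_t eq = line.find('=');
    if (eq == std::string::npos)
    {
      if (trim(line) == "END")
        break; // End of the label
      continue;
    }
    const std::string key = trim(line.substr(0, eq));
    const std::string value = trim(line.substr(eq + 1));
    if (key == "OBJECT")
      stack.push_back(value);
    else if (key == "END_OBJECT")
    {
      if (!stack.empty())
        stack.pop_back();
    }
    else if (!stack.empty() && stack.back() == objectName)
      found[key] = value;
  }
  return found;
}
```

Calling keysInObject(header, "IMAGE") and reading the "LINES" and "LINE_SAMPLES" entries then avoids picking up the same keys from any other object in the header.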

However here we can be even hackier and sort the files based on size: 2048*2048 images will be about 8M, and 1024*1024 will be about 2M. We can ignore anything smaller, and assume the image sizes (and therefore data available) in those two groupings.
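The size-based guess could be sketched like this. The function name is hypothetical, and the thresholds assume 16bpp square images where the header only ever adds to the file size, so we compare against the raw pixel payload.

```cpp
#include <cstdint>

// Hypothetical helper: guess the (square) image edge length from the file
// size alone. Returns 0 for files too small to hold a 1024*1024 image.
int guessImageEdge(int64_t fileBytes)
{
  const int64_t full = 2048LL * 2048LL * 2LL; // 8388608 bytes of pixel data
  const int64_t half = 1024LL * 1024LL * 2LL; // 2097152 bytes of pixel data
  if (fileBytes >= full)
    return 2048;
  if (fileBytes >= half)
    return 1024;
  return 0; // Too small: skip this file
}
```

This will misclassify any file that holds an odd-sized image plus a lot of extra data, which is part of why it only counts as a hack.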

So, an example: The file: N20140806T062000104ID20F28.IMG

Parsing the header (eliminating white-space to make this easier to read) then we see
^IMAGE       = 42

Also we get the IMAGE section

OBJECT                       = IMAGE
    LINES                    = 2048
    LINE_SAMPLES             = 2048 
    FIRST_LINE               = 1 
    FIRST_LINE_SAMPLE        = 1 
    SAMPLE_BITS              = 16 
    BANDS                    = 1 
END_OBJECT                   = IMAGE  

So this has the image data start at (42-1)*512 = 20992 bytes. The image data size is 8388608 bytes (2048*2048*2), and our assumptions about the format (512 byte records, 16bpp LSB image data, starting at 1,1) are good.

Reading the file we simply seek to the correct location, and load the data with code like:

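  // Needs #include <QFile>. filename and offset are assumed to be set up already.
  QFile fi(filename);                                  // Path to the .img file
  QByteArray imd;
  if (fi.open(QIODevice::ReadOnly) && fi.seek(offset)) // offset = (^IMAGE - 1) * RECORD_BYTES
    imd = fi.read(width * height * 2);                 // 2 bytes per 16bpp sample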

And then just load up the 16bpp values from the file data with code like:

void RawImage::SetImageData(QByteArray im, int width, int height)
{
  _width = width;
  _height = height;
  int cursor = 0;
  for (int y=0; y < height; y++)
  {
    for (int x=0; x < width; x++)
    {
      if (cursor + 1 >= im.size())
        return; // Sanity bound check: ran out of file data
      int vl = (unsigned char)im.at(cursor);     // LSB first: low byte...
      int vh = (unsigned char)im.at(cursor + 1); // ...then high byte
      int v = (vh << 8) | vl;
      _data.append(v);
      cursor += 2;
    }
  }
}

Where “_data” is just a QList<int>.

From this we can drop the values into an OpenCV image structure:

void RawImage::RegenerateImage()
{
  int x,y;
  int cursor = 0;
  uint16_t val; // Unsigned, to match the CV_16UC1 image type
  _im16 = cv::Mat(_height, _width, CV_16UC1);
  for (y=0; y < _height; y++)
  {
    for (x=0; x < _width; x++)
    {
      if (cursor >= _data.length())
        return; // Sanity bound check
      val = (uint16_t)_data.at(cursor++);
      _im16.at<uint16_t>(cv::Point(x,y)) = val;
    }
  }
}

And at this point we can correct our levels, save out a tiff, etc as we've done with previous image sets.

So running this over the data we have and viewing the resulting .tiff files in an image browser (XFCE's ristretto in this screenshot) gets us:

And that's looking enough like the comet we expect that we can't have screwed up that badly...

(just scrolling through the image sets after processing and seeing the slow rotation of 67P here is actually sort of awesome...)

We can also do a similar processing sequence on Wide Angle images for similar results...

One major step we haven't taken is image rotation: the WAC images should be flipped vertically, and the NAC images both vertically and horizontally, to get the correct alignment. We could do this when building the OpenCV image by messing around with the X & Y stepping.
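As a rough sketch of what that stepping could look like, here is a flip on a plain row-major buffer. The function name is hypothetical and this is not code from the parser; with an OpenCV image already built, cv::flip (flip code 0 for vertical, 1 for horizontal, -1 for both) should do the same job.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: return a flipped copy of a row-major image.
// flipV mirrors top-to-bottom, flipH mirrors left-to-right.
std::vector<uint16_t> flipImage(const std::vector<uint16_t>& src,
                                int width, int height,
                                bool flipV, bool flipH)
{
  if (width <= 0 || height <= 0 ||
      src.size() != static_cast<std::size_t>(width) *
                    static_cast<std::size_t>(height))
    return src; // Size mismatch: hand the data back untouched
  std::vector<uint16_t> dst(src.size());
  for (int y = 0; y < height; y++)
  {
    const int sy = flipV ? (height - 1 - y) : y; // Source row
    for (int x = 0; x < width; x++)
    {
      const int sx = flipH ? (width - 1 - x) : x; // Source column
      dst[static_cast<std::size_t>(y) * width + x] =
          src[static_cast<std::size_t>(sy) * width + sx];
    }
  }
  return dst;
}
```

Under the alignment described above, a WAC image would use flipV only, and a NAC image would use both flipV and flipH.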

Also we're fairly clearly messing up the processing of smaller images, and we don't cope with all the options we could encounter in the PDS data set, so we will be messing up some content.
But we've got a whole list of missing features, and really need to build a “proper” PDS parser anyway, so given this is a ten minute hack on top of the code we already had for Venus Express then let's call it good enough for now.

We can keep up to date with new releases and data sets from the PDS Small Bodies node.