LizardTech.com

Archive for December, 2008

Only in the Age of Google

Friday, December 12th, 2008

One of our engineers was investigating compiler error C2766 (for a friend offsite, of course; we never get compiler errors ourselves, you understand). He did what anyone would do: he googled it. His monitor was suddenly awash in returns linking to pages about the assassination of JFK. Turns out the error number is the same as the serial number of the Italian rifle used in that doleful outing.

After we all had a good macabre snort about this, one of our guys commented that there are alarming juxtapositions that would only come to light in the Google Age.

Thought I: Touché! That’s worth a post.

No data, yes data: NODATA

Thursday, December 4th, 2008

This sort of email comes across our desks pretty regularly:

I’ve got a GeoTIFF image that has some transparent background areas. When I view it with the Acme GIS Viewer, those transparent areas look black, as they should. But when I compress the image with MrSID, those areas aren’t all black anymore – they look all dotty, spotty and mottled! What am I doing wrong?!

This is not a bug, and you’re not doing anything wrong either: it is, alas, a part of the compression process. Long ago when your mother told you that you would have some good days and some bad days, this is what she meant.

Allow me to explain.

Consider Exhibit A:

Exhibit A

We see an image that has been rotated several degrees and the exposed triangular areas in the corners are a solid color. These corner pixels are intended to be treated as transparent – the term of art here is “NODATA” – and it just so happens that I chose to present the image to you against a black background. If I chose to present the exact same image to you against a fuchsia background, you’d get Exhibit B:

Exhibit B

NODATA pixels are an essential part of the mosaicking process. Were we to construct a mosaic by placing the Exhibit A image on top of an adjoining second image that was similarly rotated, our transparent areas would allow the underlying (“real”) pixels from the second image to show through.

All pixels in an image have to be assigned numerical color values – even the NODATA ones. In Exhibit A, the guy who encoded the image happened to choose black, i.e. (0,0,0), which is pretty typical, but he could just as well have chosen mauve or puce. The NODATA value is encoded in the image’s header somewhere, so that when a program wants to display the image, it knows that any pixel with that special NODATA color should not be shown; instead, whatever pixel is “underneath” shines through. Nine times out of ten, that will be black – but it could be a fuchsia background or the pixels from another image tile.
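For the code-minded, here is a minimal sketch of that exact-match rule in Python. The `composite` function and the list-of-tuples image layout are invented for illustration; they are not the API of any particular viewer or of MrSID.

```python
# Hypothetical illustration: an image is a list of rows of (r, g, b) tuples.
NODATA = (0, 0, 0)       # the value the encoder recorded in the header
FUCHSIA = (255, 0, 255)  # whatever happens to be "underneath"

def composite(top, bottom, nodata=NODATA):
    """Overlay `top` on `bottom`, letting `bottom` show through
    wherever `top` holds exactly the NODATA color."""
    return [
        [b if t == nodata else t for t, b in zip(top_row, bottom_row)]
        for top_row, bottom_row in zip(top, bottom)
    ]

# A tiny 2x2 "rotated" image: two corner pixels are NODATA.
top = [[(0, 0, 0), (90, 120, 60)],
       [(80, 110, 50), (0, 0, 0)]]
background = [[FUCHSIA, FUCHSIA],
              [FUCHSIA, FUCHSIA]]

result = composite(top, background)
```

Swap `background` for the rows of a second image tile and the same function does the mosaicking described above.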

(There’s a smart guy in the back of the room mumbling about alpha channels and using shapefiles for masks and stuff like that. Thank you, yes, thank you, those are also ways to mosaic images, but we’re not using those methods today. Let’s move on, please, we have a lot to cover.)

Right. So far so good.

Now for the tricky bit.

Let’s say you compress Exhibit A to MrSID with a compression ratio of 30:1. The output image looks almost identical to the input image – great. But look at Exhibit C, in which that same output image is displayed against a white background:

Exhibit C

“Yikes!” you say. “What am I doing wrong?!” you ask.

Because we compressed the image at 30:1, we’re doing lossy compression, meaning that each pixel’s color values in the output image will be close to, but not exactly equal to, the corresponding pixel’s color values in the input image. The compression process will change some pixels by just a few units, such as changing a (0,0,0) pixel to a (1,0,2) pixel. We won’t get into the algorithmic details of how and why the compression does this; we’ll just assert that it does and tell you that it happens most often at the “edges” of images or when you’re looking at zoomed-out, reduced-resolution versions of images. Fortunately, this change is typically imperceptible to the casual observer.

Unfortunately, the replacer of NODATA pixels is not a casual observer. The replacer is rather tetchy, actually. Keen eye for detail. Type A, O/C, that sort of thing. Which means that it’s gonna treat only the (0,0,0) pixels as transparent and no others. Which means that those pixels that got nudged up to being (1,0,2) and (0,1,1) and such are gonna stay right where they are. Which means that our putative, erstwhile transparent layer is gonna look funny. Especially at the edges of the images or when you’re looking at zoomed-out, reduced-resolution versions of images.
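Here is the tetchy replacer at work, as a small Python sketch. The pixel values in `compressed_row` are made up to stand in for one row from a corner of the compressed image; the real compressor’s output will differ.

```python
NODATA = (0, 0, 0)
WHITE = (255, 255, 255)  # the background from Exhibit C

def is_transparent_exact(pixel, nodata=NODATA):
    """The strict test: only an exact NODATA match counts."""
    return pixel == nodata

# Made-up values: a few NODATA pixels got nudged by lossy compression,
# and the last pixel is real image data.
compressed_row = [(0, 0, 0), (1, 0, 2), (0, 0, 0), (0, 1, 1), (87, 112, 54)]

# What the viewer draws against a white background.
shown = [WHITE if is_transparent_exact(p) else p for p in compressed_row]

# The nudged pixels that survive as near-black dots: the speckles.
speckles = [p for p in compressed_row if p != NODATA and max(p) <= 2]
```

The two nudged pixels are not replaced, so they sit there as nearly black dots on white, which is exactly the mottling in Exhibit C.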

It’s not a bug in MrSID, and you’re not doing anything wrong: it is an unfortunate but necessary side-effect of the compression algorithms we use. Such side-effects are called “artifacts”, and we call this kind of artifact “speckling”.

All is not completely lost, however.

One way to alleviate the problem is to get the NODATA replacer to relax a little – loosen its tie, roll its shoulders a few times, take a couple deep cleansing breaths. Instead of only looking at the (0,0,0) NODATA pixels, it can use a “fuzzy” replacement algorithm, meaning that those pixel values relatively close to (0,0,0) will also be replaced, like (1,0,2) and (0,1,1). Our Express Server, for example, allows you to use this feature when mosaicking layers and even control the “degree” of fuzziness to use. (See the manual for details.) This works great, but keep in mind that you’re replacing more pixels now, and you might have pixels with value (0,1,1) that are supposed to be actual data, such as a dark shadow or black rooftop: these will get treated as NODATA and allow the underlying background data to show through.
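A relaxed replacer might look something like the Python sketch below. The per-channel `tolerance` is an assumption made for illustration; Express Server’s actual “degree” of fuzziness may well be defined differently, so check the manual rather than this code.

```python
NODATA = (0, 0, 0)
WHITE = (255, 255, 255)

def is_transparent_fuzzy(pixel, nodata=NODATA, tolerance=2):
    """True if every channel is within `tolerance` of the NODATA value.
    (Assumed metric; not taken from any product documentation.)"""
    return all(abs(c - n) <= tolerance for c, n in zip(pixel, nodata))

def composite_fuzzy(top_row, bottom_row, tolerance=2):
    """Let `bottom_row` show through wherever `top_row` is near NODATA."""
    return [b if is_transparent_fuzzy(t, tolerance=tolerance) else t
            for t, b in zip(top_row, bottom_row)]

row = [(0, 0, 0), (1, 0, 2), (0, 1, 1), (87, 112, 54)]
under = [WHITE] * len(row)

strict = composite_fuzzy(row, under, tolerance=0)   # exact match only
relaxed = composite_fuzzy(row, under, tolerance=2)  # speckles replaced too
```

Note that the test has no idea where a pixel came from: a genuine (0,1,1) rooftop pixel in the middle of the image passes `is_transparent_fuzzy` just as readily as a speckle in the corner.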

Another approach is to use the “despeckle tool” in GeoExpress 7, as described in an earlier blog post. This new feature allows you to handle the whole issue a little more cleanly at encode time, or to clean up existing imagery you’ve already encoded. It works by attempting to find the “real” edges of images like those in Exhibit A, or of more complex polygonal mosaics, and then inhibiting the compression algorithms from touching pixels outside of those edges. This is essentially what the propeller-head in the back of the room was going on about earlier. It is not always perfect, but it works for most cases. Again, see the manual for details.
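To give a feel for the “find the real edges” idea, here is a toy Python sketch. It is emphatically not the GeoExpress algorithm, which isn’t described here; it is a swapped-in stand-in that flood-fills inward from the image border across near-NODATA pixels and resets them to exact NODATA. The names `despeckle` and `near_nodata` and the `tolerance` parameter are hypothetical.

```python
from collections import deque

NODATA = (0, 0, 0)

def near_nodata(pixel, tolerance=2):
    """Assumed test: every channel within `tolerance` of NODATA."""
    return all(abs(c - n) <= tolerance for c, n in zip(pixel, NODATA))

def despeckle(image, tolerance=2):
    """Return a copy of `image` in which near-NODATA pixels that are
    connected to the image border are reset to exact NODATA."""
    height, width = len(image), len(image[0])
    out = [list(row) for row in image]
    seen, queue = set(), deque()
    # Seed the fill with every near-NODATA pixel on the border.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and near_nodata(image[y][x], tolerance):
                seen.add((y, x))
                queue.append((y, x))
    # Spread through 4-connected near-NODATA neighbours.
    while queue:
        y, x = queue.popleft()
        out[y][x] = NODATA
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and (ny, nx) not in seen
                    and near_nodata(image[ny][nx], tolerance)):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return out

D = (90, 120, 60)  # real image data
S = (0, 1, 1)      # a dark rooftop: real data that looks like a speckle
K = (1, 0, 2)      # a speckle
Z = NODATA
image = [
    [Z, K, D, D, D],
    [K, K, D, D, D],
    [D, D, S, D, D],
    [D, D, D, D, K],
    [D, D, D, K, Z],
]
cleaned = despeckle(image)
```

Because the fill only spreads from the border, the rooftop pixel `S` in the middle survives while the corner speckles are cleaned up. A dark shadow that touches the edge of the image would be wiped out too, which is one way a scheme like this can be “not always perfect”.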

Next time, perhaps, we’ll talk about some other cool artifacts like “ringing” and “grout”. Thanks for tuning in.

-mpg