First posted August 27, 2016 by Stacy Muller (camera guy).
A HUMAN Visual Zone System Approach to Digital Image Editing and Enhancement
(and why this is more useful and simpler than a "digital zone system" at the image capture stage
- even one where a problem of "tonal stop accuracy" in other such systems is solved).
aka
"The Brief Rise and Fall of the 'Tonally Stop-Accurate HDR Zone System'"
HERE WE EXPLORE AND EXPLOIT AN IMPORTANT UNDERSTANDING OF HOW A DIGITAL CAMERA RESPONDS TO EXPOSURE IN COMPARISON TO HOW THE HUMAN EYE DOES, AS WELL AS CONCEPTS OF "TONAL COMPRESSION" AND "TONAL EXPANSION"!
It's an odd aspect of the human visual system that we can at least somewhat sense the difference between a bright object that registers as 100% pure white (or nearly so, like some highly lit white powder) and another that goes "beyond" 100% - "superwhite" if you like, as with direct light sources and specular (i.e. mirror-reflected) highlights. Ideally in order to accurately recreate a scene you are looking at, the camera should capture it and the final display medium should be able to replicate in each pixel the same amount of light that entered your eyes in the equivalent parts of the scene - but current technology falls short of this ideal. Even if the mid-tones are accurately exposed in the camera and reasonably displayed, the display medium will likely not be able to show you specular highlights that are as bright as you actually saw on the scene and probably SHOULD not (even on a backlit monitor) in order to preserve your eye's comfort, but the result is that you might notice a loss of detail with highlights appearing to blow out more than they should - especially when the image is set up to try to be accurate with the amount of light it shows you for other tones. Similarly you might not see darks in the resulting image that are as dark as you actually saw on the scene, and so you might notice a loss of detail there too - although some of this might be necessary to preserve contrast in your image and give it a 3D-ish "pop", whereas overly blown out highlights are usually less forgivable.
As we are about to see in a diagram, the difference in lightness between the darkest tone and brightest tone - or the "dynamic range" or contrast - that exists in a scene might be so high that it exceeds the dynamic range that your camera can capture - and possibly even that which your eye can capture. So long as the camera is reasonably close to capturing the dynamic range that your eye sees of a scene, the resulting image should look good, but often the camera can NOT capture this much dynamic range. What's more, the dynamic range of the final display medium likely cannot cover what your sensor captures and commits to a RAW file in terms of actual "stops" (i.e. doublings or halvings of an amount of light), so some "tonal remapping" is done to make the image look good, either by the camera into a JPEG image or outside of the camera through some software-based processing of the RAW file. Your camera by its default settings tries to render an image in a reasonable way at least to its display monitor - which essentially previews what the resulting JPEG image might be. The following diagram recounts this process more visually:
You can see more of the actual recorded detail in both the highlights and darks than what the camera displays by its default settings if you electronically darken the highlights and lighten the darks post-capture, but too much of this will give you an unnaturally low contrast or "flat" looking image. Your camera by its default settings suggests a balance of sorts meant for both its own display and an accurately set-up computer monitor, but these settings may not result in what you feel is an accurate rendition of the scene, or a possibly "enhanced" rendition of the scene that you might desire. Nor might the resulting image be set up by default to print as you want - where the print medium has its own set of limitations. Still, such printed pictures or even backlit pictures on a computer monitor look more natural in most cases with at least some modification to the highlights - especially the specular ones that might normally register as a maximum "pure white" and other highlights that register nearly as such - such that they are electronically brought down in exposure so that they gently "roll off" in gradations instead of all looking starkly pure white. So, there is good reason why a camera has an exposure profile of sorts that deviates from showing you tones with light levels that exactly match what you see on the scene.
Theoretically the RAW file records values that are closer to the actual amount of light measured from the scene prior to the exposure profile changing those values. And far more than just theoretically, the RAW file offers a tremendous amount of latitude - especially in the highlights - by which you can adjust the lightness levels of tones for whatever purpose you might have in mind. If you are not happy with the highlight rolloff for instance, you could change this. It's likely not just the specular and near-specular highlights that get "compressed" by your camera's exposure profile in a deviation from the equivalent human vision lightness values. Other tones might be compressed as well, or expanded. You might be completely happy with that. Or, you might want the control to be able to change this either in the service of accuracy toward human vision, or toward "enhancing" parts of the picture with exposure adjustment in a way reminiscent of the Zone System for film - perhaps as a form of "altered reality". It may seem counterintuitive, but even if you are more concerned about preserving accuracy towards human vision, you will likely not be happier matching the tonal values that the camera records to what human vision might expect. Rather, you will likely prefer being able to see more detail in shadows and/or highlights by embracing and controlling the DIVERSION that the camera's exposure profile imposes.
So-called digital zone systems often have a drawback in that they do not respect the particular way that a specific camera responds to exposure and processes its RAW files based on its exposure profile. Some such systems establish exposure zones based on assumptions - without even testing the camera. We can not assume that all cameras work the same in this regard. But we can assume that most human eyes work the same - such that we can establish exposure zones for the typical human eye. Thus it can be instructive to establish a zone system of sorts for the human eye before we even attempt to develop one for a particular digital camera or even assume that we need one (which I would argue usually we do not) - noting that I am not challenging the original Zone System for film as the exposure response of film is different than that of a digital sensor and closer to that of the human eye. That said, both the human eye and the digital sensor are more sensitive to a "stop" difference of light in the non-specular highlights than in the shadows - a fact that as we'll see later can be exploited!
It is beneficial that we understand that the values recorded by a digital sensor must go through conversion dictated by at least one "curve" of sorts in order to better match the values we want for human vision.
In other words, it is good to first understand the difference between roughly the values that the sensor records and those, after conversion, that comprise tonal lightness values that are easier to understand (as measured on a greyscale) and are more friendly towards human vision when displayed as pixels. Furthermore, another curve may then be applied for the tonal compression and expansion I mentioned before - especially the tonal compression in the highlights - and even after that point we may want to impose yet more modification based on our own adjustment of a "tone curve". But let's start with the first curve - a conversion of the "linear" data recorded by the sensor (which is essentially a count of photons hitting a part of the sensor) to a "non-linear" set of lightness values that are more easily mapped to what appear to our (oddly non-linearly exposure responsive) eyes as a linear greyscale - simplified to include tones from pure black (0) to pure white (255) and for now no specular/superwhite highlights beyond that.
To further explain, we start with "reflectance" values that are essentially values taken from a camera's sensor that has a so-called "linear" response to light, and then account for how our eyes, which have a so-called "non-linear" response to light, would see these values, as translated to say tonal percentages in a greyscale or equivalent tones, as too dark ... UNLESS that is, these tones were also raised to a non-linear curve of LIGHTER tonal values especially where the mid-tones are. Allow me to illustrate this for you with this graph - replacing any actual sensor values or equivalent percentages from 0% to 100% with RGB tones ranging from 0 to 255. As well, the data points you see will be explained soon enough:
You might be familiar with this kind of graph that's called a tone curve, but it's ok if you are not. Basically the tone curve allows us to take an image with both colour and black and white tones equivalent to those in our greyscale at the bottom of the graph, and map those tones as we see fit into others - particularly a range of them. That's ANY range, into any DIFFERENT range.
This graph demonstrates our possible greyscale range of linear sensor data as linearly converted to RGB tones on the bottom greyscale, and the blue horizontal line represents a range of such linear tones - quite LITERALLY linearly, on that line. If we use this line to map our tones of the bottom greyscale to the other greyscale on the right, NOTHING is changed. That's ok. We're just starting. You'll see that I have a sampling of tones as points on the blue line. These represent linear tones derived from various half stops of light, or increments if you will that each differ by a half stop from each other. This progression actually starts at the top right of the graph at full bright white at RGB tone 255, and then gets progressively more compact especially as we get closer to the black RGB tone 0, which I stopped short of to avoid an even greater clutter in that area.
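For readers who like to see numbers, here is a minimal sketch of how such half-stop sample points on the linear line could be computed. It assumes only that each half stop down multiplies the light by one over the square root of two, starting from RGB tone 255. The function name and the choice of 14 half stops are my own for illustration and are not taken from the graph.

```python
def linear_half_stop_tones(num_half_stops=14, white=255.0):
    """Return (stops_below_white, linear_tone) pairs in half-stop steps.

    Each half stop down multiplies the amount of light by 1/sqrt(2),
    which is the same as multiplying by 2 ** -0.5.
    """
    return [(k / 2, white * 2 ** (-k / 2)) for k in range(num_half_stops + 1)]


tones = linear_half_stop_tones()

# Print each tone with the gap to the next half stop down, which keeps shrinking.
for (stops, tone), (_, next_tone) in zip(tones, tones[1:]):
    print(f"{stops:4.1f} stops below white: tone {tone:6.1f}, "
          f"gap to next half stop {tone - next_tone:5.1f}")
```

The shrinking gaps are the "progressively more compact" spacing toward black described above.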
I mentioned before that we want to literally raise this line into a curve to lighten up the tones, especially the mid-tones, and you can see that my sample points have been vertically raised to points on the green curve, and particularly dramatically around the middle of the range. That's our non-linear curve that we use to do the proper tone-mapping to create tones that are more friendly to our eyes. 100% reflectance as RGB tone 255 of course has nowhere to be raised and so is not, and although I omitted it, 0% reflectance as RGB tone 0 is not raised either.
Just to clarify, even though we MIGHT use the final results of the tone mapping to RGB tones in, say, a JPEG image, realistically we would not actually STORE RGB tone equivalents on the camera of the light it captured, nor would we necessarily translate our sensor's photon counts into actual reflectance percentages, but it's easier to work here with those kinds of simplified equivalents of the typically much larger numbers that our camera actually IS storing, and in a way that gives us results equivalent to what we want anyway. So this, then, is the linear-to-non-linear conversion we need, in a nutshell - or the curve for this, really. By the way, I mentioned my points represented half stop increments. So what would the tones at those points look like to us? The points on the green curve tell the tale. Just imagine the tones on the greyscale to the right of them, aligned with them. A slight alteration to our prior graph then gives us the half stop divisions:
...which will become VERY important in just a while.
The tone-raising conversion I've taken you through, by the way, has MUCH to do with a common question. Why is a reference grey card that we might use to calibrate our camera, and that LOOKS to be about 50% grey between a scale of 0% for black and 100% for white, called an 18% grey card instead of 50%? More specifically, it might be called an 18% REFLECTANCE card, and that's more than a clue to the answer. Some such cards are rated at a fractionally smaller reflectance to better match being closer to exactly 2.5 stops from a reference 100% reflectance for white. If you feel keen, try to locate this on the tone map above. But don't worry. We'll soon get into this a little more - deriving our conversion with a little math. But for now, it may suffice to say that much of the math is derived from using something called the "CIELAB" colour space! Details of that aside for now, it might be helpful to show another graph of curves that better demonstrate what I've been talking about in terms of half stops while also introducing where a "grey card" might come in, and for that matter, where a scale of exposure "zones" might come in with the grey card's reflectance value placed squarely in the middle of the classical "Zone V" middle (or at least middle grey) zone.
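As a hedged illustration of the grey card question, the sketch below uses the standard CIE 1976 lightness (L*) formula from the CIELAB colour space to turn a reflectance into a lightness percentage. Treating L* directly as a greyscale percentage, and scaling it linearly to a 0 to 255 tone, are simplifying assumptions of this sketch. The function names are mine.

```python
def reflectance_to_lightness(reflectance):
    """CIE 1976 lightness L* (0 to 100) from relative reflectance (0.0 to 1.0)."""
    epsilon = 216 / 24389   # (6/29) ** 3, below which the curve is a straight line
    kappa = 24389 / 27      # slope of that straight-line segment
    if reflectance > epsilon:
        return 116 * reflectance ** (1 / 3) - 16
    return kappa * reflectance


def lightness_to_tone(lightness):
    """Assumption for illustration: scale L* linearly onto a 0 to 255 greyscale."""
    return round(lightness / 100 * 255)


for label, reflectance in [("18% grey card", 0.18),
                           ("2.5 stops below white", 2 ** -2.5),
                           ("reference white", 1.0)]:
    lightness = reflectance_to_lightness(reflectance)
    print(f"{label:>22}: reflectance {reflectance * 100:6.2f}% "
          f"-> L* {lightness:5.1f} -> tone {lightness_to_tone(lightness)}")
```

Under these assumptions an 18% reflectance lands at roughly 49.5% lightness, and a reflectance exactly 2.5 stops below white (about 17.68%) lands at roughly 49.1%, which is why a card with such a low reflectance looks like a middle grey.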
If these zone divisions look different to you than what you've seen from any other zone systems (not that I'm expecting or requiring that you've looked at any) particularly with a progression of tonal compression getting into the darks, that's because here we are introducing zones that are more "tonally stop-accurate" than what other zone scales show at least based on how stops directly translate to human vision - PRIOR to further tonal mapping by the camera or suggested external RAW processing via the camera's profile for such processing which would alter the tonal compression further - most likely and particularly with tonal compression going from mid or light tones into lighter highlights (which we'll account for later):
I know the curves look different than the ones of the prior graphs but believe me they actually hold the same values and demonstrate the same conversion, but I changed the bottom axis to accommodate stops rather than a more linear spread of tones, which does tend to change a few things. But let's look back again at the greyscale divisions at the right, based on RGB values, which by the way could be adjusted to slightly different sRGB values for people who might need those instead, which I list in the upcoming table. Either way, you can see these divisions and the progressive tonal compression in the greyscale, and that should tell you how we might start to divide our greyscale into half-stop exposure zones - albeit with later adjustment to how your camera might deviate from this particularly in the range of mid-tones to highlights (while there is still tonal compression progression into the darker zones). The rest of the graph hints at how the divisions were derived mathematically - performing that eye-friendly linear-to-non-linear conversion I discussed earlier. It also indicates where reference grey cards come in to play. Here is that table of sRGB values - with values that again divide a greyscale through tonal compression albeit not likely as accurately as values that could be derived from testing more specifically how YOUR camera reacts to exposure:
I'm going to go into a little more detail here, but first I've got to admit something. I'd be too embarrassed to tell you how long I've been curious to find a viable translation between stops and a greyscale like I've shown you, and how long it took me to figure this out in lieu of easily finding anything, or for that matter how long it took me to figure out what I call my "master table", from which the zone divisions and much of the math come. If you're curious to have a look, here's my eyesight-centric master table:
Note the right side of the table for an example of actual camera f-stop settings relating to the various reflectance values. I wanted to understand this relationship, but at the same time please realize that adjusting your f-stop for various zones for the purpose of blending multiple exposures together is NOT a good idea, as it makes your depth of (in-focus) field inconsistent. Shutter speed and even ISO are better choices to go with. You'll notice the aperture area and the reflectance values are the same, both columns shaded in grey. Even such a simple relationship to tie in f-stops took me more than a while to figure out. The REAL goal though was to find out what IDEALLY the tones or brightness or lightness values for each half stop, including a reference white and reference grey, the camera should record ASSUMING its goal is to MIMIC human eyesight, which is so far what I have done and presented. This gives us a starting point or reference from which to proceed with analyzing our camera and establishing useable zone scales.
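The link between f-stops, aperture area and reflectance can be sketched in a few lines. This assumes, purely for illustration, that f/1 is paired with 100% reflectance, that each full stop multiplies the f-number by the square root of two, and that aperture area is proportional to one over the f-number squared. The actual f-stops in the master table may start elsewhere.

```python
def half_stop_apertures(num_half_stops=10):
    """Return (stops, f_number, relative_area, reflectance) rows in half-stop steps."""
    rows = []
    for k in range(num_half_stops + 1):
        stops = k / 2
        f_number = 2 ** (stops / 2)          # a full stop multiplies the f-number by sqrt(2)
        relative_area = 1 / f_number ** 2    # aperture area scales with 1 / N^2
        reflectance = 2 ** (-stops)          # each stop down halves the light
        rows.append((stops, f_number, relative_area, reflectance))
    return rows


rows = half_stop_apertures()
for stops, f_number, area, reflectance in rows:
    print(f"{stops:3.1f} stops: f/{f_number:4.2f}  "
          f"area {area * 100:6.2f}%  reflectance {reflectance * 100:6.2f}%")
```

With these assumptions the area and reflectance columns come out identical at every half stop, and 2.5 stops down from f/1 is about f/2.4 at roughly 17.68%.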
BUT WILL MY CAMERA REALLY RECORD HALF STOP TONES THIS "IDEAL" WAY, AND IS THAT WAY REALLY "IDEAL" FOR THE FINAL DISPLAY? (- LIKELY NO, and NO):
I've already mentioned how tonal compression of the highlights post-capture can create a nice highlight rolloff on the display medium, but sometimes there can be other tonal remapping/compression/expansion going on that, depending on what we want to do, may be to our advantage or detriment. It is to our great advantage that the camera sensor records tones linearly and that we have some control over how WE want to perform the tonal remapping. It is also to our great convenience that when we do post-capture exposure adjustment, any tonal compression or expansion required is taken into account if we are using proper (and not necessarily expensive) software. This can seem magical for instance when we can recover detail through tonal expansion in electronically darkening the exposure of otherwise tonally compressed highlights! Still, one can be skeptical that the sensor records tones as we might expect or even as linearly as we've heard it does when taking into account other factors, and I don't proclaim to know them all.
Without getting too technical, I think there is one important question to be asked - playing devil's advocate to my own work here. You or I might well ask: From camera to image, is what we are referring to as "stops" in terms of camera settings or readings off of the sensor actually translating on a resulting image into something that could be measured in that image as being completely ACCURATE as stops in terms of tones as our eye sees them (with reference to the above table)? Obviously NO, with any additional tonal remapping as with the highlight compression, but even without THAT, it's hard to say if a particular camera could create such an image without US meddling with it creating our OWN tonal remapping (or we'll say "human vision correction") curve. Likely it's not worth our bother to attempt this anyway, but it's worth understanding that there could be many diversions from human vision due to many factors not all of which are easily known. But at least we do have some control over this diversion - either allowing it or countering it, or even adding to it.
The data I've shown you so far assumes a camera that perfectly counts the photons hitting its sensor and, either through the camera or some RAW processing software, does a perfect conversion of linear data to non-linear tones for us - or as perfect as one can expect within a so-called "L.a.b." (/"Lab"/"CIELAB") colour space which, from what I've heard, is perfect enough. This would seem to be a simple, expected norm of how we might expect a camera to operate, but it is probably an unrealistic expectation of most cameras. This is partly because often the camera manufacturer will determine that a more "pleasing" image can be obtained by giving the camera a different "character" (as I've hinted with the compression of highlights); and also because often the photon counts, which are NOT completely accurate, are converted to tones most popularly for the so-called sRGB colour space using a slightly less accurate formula than Lab; and also because, even if we instead use the sRGB values I've worked out, those will not likely work as they are usually in a sense coloured by a so-called "s-curve". What's an s-curve? Well do you remember that last graph of curves I showed you?
You might imagine the green curve of that graph turned into an inverted S shape at its top, or otherwise flipped on the horizontal axis with the same treatment at the top - thus turning the curve into what we would call an s-curve. If nothing else, the s-curve bends the top of what we might call a "characteristic" or "exposure response" curve into a curve shoulder to complete the S shape, which then in that area presents greater tonal compression in the highlights - in much the same way as I've pointed out we find in the darks but using a much smaller part of the overall curve. In this small area that deviates from the usual curve, what we're calling a "stop", as it may be adjusted on the camera regulating how much light is hitting the sensor or even adjusting the ISO, is very much no longer acting as an accurate stop as it may be translated to our image - although typically this is more of a concern for JPEG files than RAW files, and a matter of what kind of RAW conversion outside of the camera we are using in our workflow. At the same time, even though the camera's sensor is linear, there CAN be a slight s-curve applied to the RAW data that can't be prevented or undone, and that is for reasons that have more to do with the hardware and electronics of the camera than any software or firmware.
But this effect, which is partly a tonal compression effect, DOES have the advantage of gentler highlight rolloff much like some film stocks - especially if, by the way, we add extra stops of dynamic range to record specular highlights, but then allow either the camera or our RAW conversion, preferably as a non-mandatory option mind you, to tone compress these specular highlights back down so they don't exceed the maximum 100% tone we can display at least when 49.1% tone is at zone V's centre. Yeah, I know that's quite a lot to chew on. I've also observed what I have to assume is the s-curve introducing at least a little tonal expansion and therefore higher contrast from the simple norm in the mid-tones and even to some extent into our darker tones (albeit without reversing the overall tendency of progressive tonal compression into the darker tones as I've discussed). This can give us an image that is more pleasing where higher contrast helps, while combining with the highlight rolloff and similar tonal compression left for the darker shadows that are also pleasing.
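To make the idea of a highlight shoulder a little more concrete, here is a rough sketch. The shoulder function is entirely hypothetical: the knee at 80% lightness, the exponential rolloff and the two extra stops of "superwhite" headroom are my own illustrative choices, and do not describe any real camera's curve or RAW converter. The lightness formula is the CIELAB one, extended beyond reference white as a further assumption.

```python
import math


def ideal_lightness(exposure):
    """CIELAB-style lightness for linear exposure (1.0 = reference white).

    Deliberately not clipped at 100 so that "superwhite" values can be shown.
    Only valid for the exposures used below, which are all above 0.008856.
    """
    return 116 * exposure ** (1 / 3) - 16


def shoulder(lightness, knee=80.0, ceiling=100.0):
    """Hypothetical shoulder: untouched below the knee, then a smooth
    exponential rolloff that approaches but never exceeds the ceiling."""
    if lightness <= knee:
        return lightness
    headroom = ceiling - knee
    return knee + headroom * (1 - math.exp(-(lightness - knee) / headroom))


previous = None
for k in range(-4, 5):
    stops = k / 2                      # stops relative to reference white
    ideal = ideal_lightness(2 ** stops)
    rolled = shoulder(ideal)
    step = "" if previous is None else f"  step {rolled - previous:4.1f}"
    print(f"{stops:+4.1f} stops: ideal {ideal:6.1f} -> with shoulder {rolled:5.1f}{step}")
    previous = rolled
```

In this toy model the half-stop steps shrink rapidly above the knee, so a "stop" near the top of the curve no longer produces a stop's worth of tonal change, while the extra highlight stops are folded back under the 100% ceiling.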
But all this again veers away from that simple norm or "ideal" of mimicking the mathematical model we have of how human eyesight works - although again for the purpose of displaying a pleasing image this diversion might actually be the greater "ideal", while a frame of reference in recorded RAW data that has less or no such diversion is an additional ideal. This all makes it difficult to create a one-size-fits-all scale of zones with clearly defined tones if this is something we want to do - but perhaps not so difficult that we can't come "close enough" at least for a general rule-of-thumb zone scale that's suitable for most digital cameras! We'd likely have to make a zone scale that is "safe" or "cautious" though to avoid highlight blowout - much like the histogram on a camera that doesn't touch the "hidden" highlights in the RAW file - whereas a more customized zone scale would better measure just how far you can push your camera towards highlights ... made farther than you might think thanks to a modern camera with greater dynamic range and an s-curve that deals with that.
To get a better idea of what the s-curve looks like, the green curve you saw above, then, is thus converted, after the bottom axis is flipped, to a different curve that's coming up next, with a similar explanation to what I just gave you embedded in it. But at this point that text is a bit of a digression, or maybe a "bonus" to you depending on your level of interest:
TOWARD "MORE CORRECT" ZONE SCALES:
For reasons I've already mentioned, a zone scale based on human eyesight likely does not show EXACTLY how YOUR particular camera handles exposure tonally, but it is a more educated GUESS as a one-size-nearly-fits-all zone scale. It takes into account eyesight mimicry, but frankly not any s-curve that would be applied for highlight rolloff as I've discussed or for comparative tonal expansion in the mid-tones and even in the darks to give the image some contrast or "pop". So in taking out the mysterious s-curve, our scale is made rather conservative in avoiding trouble with highlights while some cameras we use could likely push this further. I'm guessing that any digital camera out there that more closely resembles our eyesight model is likely an older one that is limited in recording specular highlights as anything other than clipped data, and hardly applies an s-curve. Our scale is also made conservatively safe for shadows - assuming as do most zone systems that there are few stops from our mid grey reference before the tones are black or nearly so with noise and grain and no detail, when some cameras might show at least slightly more detail or texture here - although the s-curve might actually make the tones darker here again with tonal expansion. Perhaps we are being too conservative here then for how our camera records detail - although perhaps not if we also want to account for what darker tones are "useable" in our final display medium. Either way, the scale is a starting point that need not be written in stone.
For those seeking absolute accuracy from camera to image editor though, as you should, I caution that this scale, despite being potentially more accurate than other generic scales you would see elsewhere (as I will soon enough demonstrate), could still be WAY off for your particular camera. But again it helps to have an "ideal" reference to start with and compare things to. So in that vein, make sure your web browser is set wide across your screen as I present TWO tables side-by-side here - the one I promised, and beside that, an example of a custom zone scale made for a specific camera after testing it, as I first introduced in my video. The tones were measured from multiple RAW exposures at different shutter speeds taken at a base ISO using ACR in a "Camera Neutral profile". You may see how the highlights are extended from the simple norm of the first table into a rolloff in the second table, and, looking at the difference between adjacent half-stop zones in that second table, you might see that there is some slight tonal expansion in some mid tones such as in zone V and in darker zones.
For instance, from the middle of zone IV where both scales share nearly the same tone, you might observe how there is some slight tonal expansion from the norm in darker mid tones albeit with less expansion the darker we get, while I noted that the camera could record more detail than the general conservative scale might have predicted. I would say though that the more general scale is not exactly WAY off for the camera I tested:
You might find it interesting to compare these two tables. One thing I found interesting other than how ("reasonably") well they compare is how in the second table, zone VII is no longer "king" for the best tonal separation. Zone VI is - although zone V is close enough that it might as well share the title. This is getting back to a traditional FILM-based notion that zone V is the best for tonal separation. It seems that the character of the camera makes it so, or nearly so. Where zone VII on the first table seems to either just miss the point where tonal compression occurs in the highlights or dangerously touch highlight rolloff, the same zone on the second table seems even more dangerous for its tonal compression, whereas zone VI seems to miss the tonal compression completely. You couldn't pick any other two half-stop ranges to comprise such a tonally wide full-stop range. But that's assuming that we are displaying an image much as we are recording it. For technical reasons I'll visit later that are beyond the charts above, the "best" zone on our particular test camera, which is digital, to record the greatest amount of tonal range is high in the highlights - zone IX most likely. This would allow us to adjust the exposure post-capture and expand all those tones tremendously - despite being able to only see them by "default" prior as being highly compressed! Not to confuse the issue, but we could be talking about using many zone scales in an image-capturing workflow between which tonal conversions may be necessary - including a human visual scale, one for our camera, and even one regarding our final display medium.
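For the first, more general table only, the claim about zone VII can be checked with a small sketch. It assumes that each zone is one full stop wide, that the centre of zone V sits 2.5 stops below reference white, that lightness follows the CIELAB L* formula, and that nothing brighter than reference white is recorded. It says nothing about the measured camera table, whose values are not reproduced here.

```python
def lightness(reflectance):
    """CIE 1976 lightness L* (0 to 100) from relative reflectance (0.0 to 1.0)."""
    if reflectance > 216 / 24389:
        return 116 * reflectance ** (1 / 3) - 16
    return 24389 / 27 * reflectance


ZONE_NUMERALS = {3: "III", 4: "IV", 5: "V", 6: "VI", 7: "VII", 8: "VIII"}


def zone_separation(zone, centre_of_zone_v=2.5):
    """Lightness spanned by one full-stop zone, clipping at reference white."""
    centre = centre_of_zone_v + (5 - zone)           # stops below white
    upper = min(1.0, 2 ** -(centre - 0.5))           # brighter edge of the zone
    lower = min(1.0, 2 ** -(centre + 0.5))           # darker edge of the zone
    return lightness(upper) - lightness(lower)


separations = {name: zone_separation(zone) for zone, name in ZONE_NUMERALS.items()}
for name, width in separations.items():
    print(f"zone {name:>4}: spans {width:4.1f}% of the lightness scale")
```

Under these assumptions zone VII covers the widest slice of the lightness scale, each darker zone covers less, and zone VIII collapses to nothing because it lies entirely above reference white.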
You will see later that, despite all my work here, I will dissuade you from using a zone system at least at the image acquisition phase (vs any post-capture image enhancement phase) in favour of a simpler system I have developed. But let's say for now that we did want to use or develop at least one zone scale for whatever reason, and maybe later we might find that we do - although I can tell you now that it is highly optional. But stick with me here, as at least pretending we need zone scales will allow us to learn some potentially important things - even if only why a traditional "digital" zone scale might not do for you, nor even the one I'm hinting here that I also claim is "better".
Notice in the left/first more general table that the dynamic range from the middle grey to white is limited to the point that zone VIII is "flat" with practically no more tonal dimension than one tonal step if even that much, and there are no more zones after that. One COULD however optionally extend the zones for a camera we might have that can capture specular highlights beyond our 100% reference white without the automatic s-curve tonal compression - much as I did on the right/second table through zone X. I would encourage that such a task is left up to US instead of assuming that my first table is written in stone. Some might find adding a specular zone overkill however as you can underexpose specular highlights to better see them on your camera's display and capture them and, if need be, add them back to a zone that would not exceed what's displayable (through exposure adjustment in editing). But the option still remains to expand the zones at least for temporarily holding specular highlights and darker shadows if this is desired. It's possible for instance that the zone scale for YOU might actually be somewhat of a hybrid between the two scales above.
Now, the more general scale you've just seen, discounting any strong s-curve-based deviation that a camera might have from it, is THEORETICALLY FINE for a camera that meters well, or averages what it sees, to 18% reflectance or nearly so. That is to say, your camera should expose your 18% grey card at your so-called ISO baseline, which is your lowest settable ISO, so that the card accurately translates to a 50% tone or very nearly so. But some cameras at the baseline might underexpose the grey card. Doing so in essence takes more of a risk of underexposing some scenes using auto-exposure, but it might actually better expose other scenes - with less of a risk of overexposing them while better protecting highlights since now we're giving more dynamic range between that lower reference reflectance and 100% white, and possibly with less noise and grain too. But for using the zone scale above, this inaccuracy of metering an 18% grey card is not going to work, unless that is you can accurately adjust your camera's exposure compensation to add enough extra exposure - likely by some small number of fractional stops. People often debate if, in a situation like this, the camera was intentionally manufactured to meter at a lower reference reflectance than 18% to allow those advantages I mentioned, or if in fact the metering system is still set for 18% but the manufacturer set the camera up with the wrong baseline ISO - with that level of amplification of sensor signal and noise somehow affecting how accurately the camera meters EXCEPT when you add the proper exposure compensation. Either way, the EFFECT is the same.
So if your camera is giving you that effect, and if you don't want to bother with exposure compensation, you're going to need a different grey card - one that is darker and perhaps less traditional than the 18% card. Some people claim that often a camera will meter better to 12.5% reflectance, amongst other numbers not so far from this, and that you're in business then if you can find a 12.5% grey card or even an equivalent patch you can spot meter to. This is a half stop darker reflectance than 18%, giving us that extra half stop in dynamic range from reference 100% white. So how many stops does that give us between our grey and white references? Let's see. 12.5 doubled is 25, then 50, then 100 (unless you prefer to start with 100 and halve) - so 3 stops, precisely. To get the reflectance values in-between any of these full stop increments, like our ideal 17.68% middle grey, you have to use the square root of two to fit our non-linear progression, but I digress. Let's show my zone scale modified for 12.5% reflectance then, but before you look too deeply at it, the thing I'd ask you to look out for is how, basically, there is VERY little difference between the two scales EXCEPT just a few things. First, there is the half-stop difference in where the zones are placed...
Secondly, there is the extra half-stop of dynamic range between the grey and white references. This sort of forces us to add a zone IX as a "flat" zone if we need such a reference zone - although this is a zone that might be considered flatter than even just one tonal step, without dimension if you will, as by our mathematical model it can only be reached through overexposure or the potential arguable overkill of specular highlights I mentioned - but on the other hand in reality it can be reached by cameras like the one I tested in the graph prior, which believe it or not has only a one-inch chip - a Nikon 1 V2 that does a GREAT job capturing highlights beyond reference white, but tested mind you under INDOOR lighting for my table. I was able to adjust the camera's shutter by third and full stops but not by half stops. However, ACR's Exposure Slider matched the resulting tones so well that I could use it to interpolate any tones I was missing.
So, I've left zone IX in the scale as kind of a non-zone, but also a flag of a zone that you should NOT assume you can use at least in terms of trying to capture this range with your camera accurately metering to 12.5% reflectance - UNLESS that is your camera has tested that it CAN use that zone, in which case you could alter the scale. You'll get a sense of whether it can even just by the highlight recovery that you can accomplish in ACR.
Otherwise, no great differences between the two "general" scales. Who said it had to be harder than this?
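For anyone who wants to check the stop arithmetic used for the two grey references, here is a minimal Python sketch. The function names stops_between and half_stop_ladder are my own for illustration (they are not from any library), and reference white is assumed to be exactly 100% reflectance.

```python
import math

def stops_between(grey_pct, white_pct=100.0):
    """Number of stops (doublings of light) from a grey reference up to white."""
    return math.log2(white_pct / grey_pct)

def half_stop_ladder(start_pct, steps):
    """Reflectances a half stop apart: each one is sqrt(2) times the previous."""
    return [start_pct * math.sqrt(2) ** i for i in range(steps)]

# Compare the traditional 18% card with the darker 12.5% card.
for grey in (18.0, 12.5):
    print(f"{grey}% grey to 100% white: {stops_between(grey):.2f} stops")

# Half-stop reflectances climbing from 12.5% up to reference white.
print([round(r, 2) for r in half_stop_ladder(12.5, 7)])
```

The 12.5% card comes out at exactly 3 stops below white and the 18% card at roughly 2.47, which is the extra half stop (near enough) discussed above. The second rung of the ladder is the 17.68% middle grey.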
Which zone scale you might use, if any, depends on which one your camera works best with, and that would be determined from testing your camera. And don't forget that, more in line with the chart prior to the last one, there's a THIRD option here to find and establish zone scale divisions in a fashion that is more customized and accurate to your camera - based on somewhat more extensive testing of your camera. There's really no math or graphs involved in that either, but still kind of a pain in the butt to test your camera under many varying conditions such as ISO values, lenses, lighting conditions, etc.
Without a better and simpler method as I intend to discuss, arriving at a customized zone scale like the prior one shown could be more useful in the acquisition phase of an image than in editing, and thus this scale might prove more important than any step wedge overlay that might be derived from it for use in editing. But at least some people with exacting standards will find use in such an overlay, so it's good to have that option. Here is one, for instance, derived from the prior example of a customized zone scale:
I've hinted now so much at a "tonally stop-accurate zone system" that I've practically given it away for free, but I've shown you all this work I've done in order to indicate that as good as it is, I still have something simpler and better to offer, but even so, I don't mind if people - especially fellow zone system aficionados - find some usefulness in what I've introduced here. Perhaps it can be used in combination with the simpler technique I have to offer, but likely only for specialized things like landscape and architecture photography - perhaps employing HDR imaging. In fact my original name for this system was ... well have a look at this graphic, for the e-book that never was (except for what part of it is essentially here for free)!:
I developed this based on trying to find a workable digital zone system for myself particularly for doing HDR work, and finding problems in "tonal accuracy" in those systems and then a solution of sorts for myself - well two really. So how is the first solution better or more "tonally stop-accurate" than other digital zone systems? You might have already figured it out. Certainly I am not criticizing the original Zone System for film, but rather how it has been adapted by various individuals independently of each other into "digital" systems. I am not saying that they are without their merits or uses either. But there ARE problems with them, at least for doing certain things, and all relate to a lack of tonal (stop) accuracy that prevents them from being as convenient and accurate as they could be and as I believe my system/solution is, as I'll explain.
The problem is that while most (but not all) of the systems offer what we might call "one-stop exposure zones" that make spot meter usage more convenient, all of the (other) systems so far as I know do NOT offer what I personally was looking for to decide on such a system for myself, and that is what I call "one-stop TONAL divisions" that might both accompany and truly define the zones, while taking into account the tonal compression and expansion between exposures I discussed that takes place after the linear sensor data is converted to non-linear tones with an additional s-curve treatment that is friendly towards human vision in whatever colour space we might choose. Nor do these systems offer even a simple "step wedge overlay" of full or half stop tonal divisions for Photoshop or any other image editing app.
These systems may offer zone scales with typically 10% (rather than progressively compressed or expanded) tonal spans from centre-to-adjacent-centre, and in some cases perhaps even step wedge overlays representative of those zone scales, but again these scales do NOT offer full or half-stop TONAL accuracy. Except now, that is, my system offers these things - to make not only spot metering more convenient but computer-based editing and exposure adjustment more convenient too! What can also help is employing a customized step wedge overlay much as I've shown.
Just to clarify what a NON-tonally-stop-accurate zone scale might look like with 10% tonal spans instead of tonally compressed or expanded progressions, which I've seen from more sources on the Internet than I can recall or count, take a look at this one and compare it if you like to the ones I've shown you prior:
Sometimes a digital zone system will show the zones with amounts of light as reflectance percentages representing the middles of those zones also being reported, or I should say MISreported, as I show here. At least the PROGRESSION of those percentages I show is fine as HALF STOPS. But they DON'T translate into any singular tonal % difference between either half stops OR, as their positioning here might suggest, full stops; and NONE translate to define any precisely 10% part of the greyscale for a full stop as is typically mistakenly shown, much as I am showing you on the table now as the 10% tonal increments. We'd be closer in accuracy, mind you, if we took the zones, which are indicated in the leftmost column of the table as supposedly one-stop range zones, and redeclared them to be half-stop zones in order to match the correct progression of half-stop reflectance values on the rightmost column, but that would be still no cigar as far as the tonal percentages we're using go and the RGB tones they translate into. We might still fluke some good results using a NON-stop-accurate system with frustration mind you, but that doesn't sound fun.
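To make the mismatch concrete, here is a rough Python sketch that walks down from reference white in half stops and reports the tone each reflectance lands on. Please note the big assumption: I'm using the standard CIE L* lightness formula as a stand-in for the "tonal %" on these scales, which is only an approximation of whatever tone curve your camera or colour space actually applies. The function name lightness is just for illustration.

```python
import math

def lightness(reflectance_pct):
    """CIE L* (0-100) for a reflectance given as a percentage of reference white.
    Assumption: L* stands in for the 'tonal %' discussed in the text."""
    y = reflectance_pct / 100.0
    if y > (6 / 29) ** 3:
        f = y ** (1 / 3)
    else:
        f = y / (3 * (6 / 29) ** 2) + 4 / 29
    return 116 * f - 16

# Half-stop ladder from reference white downward: each step divides by sqrt(2).
reflectances = [100.0 / math.sqrt(2) ** i for i in range(11)]
tones = [lightness(r) for r in reflectances]

print("reflectance %   tone (L*)   step from previous")
for i, (r, t) in enumerate(zip(reflectances, tones)):
    step = "" if i == 0 else f"{tones[i - 1] - t:5.1f}"
    print(f"{r:12.2f}   {t:9.1f}   {step}")
```

Under this assumption the tonal step between neighbouring half stops is largest near white and shrinks steadily towards the shadows, so no single percentage, 10% or otherwise, fits every half stop or every full stop.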
I think it's worth me going back to basics and saying a little bit more about the stop here, as well as tonal compression. On your camera, you can adjust the exposure of the scene you are trying to capture, resulting in a difference of brightness in the resulting image, and you can do this adjustment by, for instance, a full stop. A stop then is a doubling or halving of light - in this case, the amount of light you are allowing into your camera. A slightly less traditional stop is one that involves amplifying the signal and noise in the camera's sensor to give the EFFECT of greater exposure - in this case an effective stop added through doubling the ISO.
In ANY case, despite one stop brighter being twice as much light, it won't APPEAR twice as bright though but quite a bit less, and one stop darker won't appear half as bright or anywhere near as much - at least not to our "non-linear" eyes, whereas the linear sensor might "see" one stop brighter or darker as being respectively twice or half as much light. For now, it should suffice to say that the increments between stops can visually seem smaller than what a lot of people might expect, and about equal if you only witness a few increments, or, if you are really observant, you might see that in reality the increments are NOT equal. That gets into tonal compression. With human eyesight, we've shown that the compression gets progressively greater starting from a pure white. With a digital camera, the highlight rolloff means additional compression going the opposite way INTO the highlights. In fact, with human eyesight, there is a similar compression with specular "beyond white" highlights - a similar s-curve response in that regard actually, which I mention for the first time here. Cameras typically allow you to adjust exposure by a stop, or a half stop, or even a third stop which, personally, is a difference I often find difficult to impossible to perceive especially where the tonal increments get smaller. Some people would probably prefer that cameras could be adjusted by equal tonal increments instead - say 5% at a time - so we could easily map tones to a scale of zones where most span a 10% range, like in the zone scale above.
But the reality is, cameras don't work that way, nor SHOULD they. There's a lot of POWER in instead using the traditional full stops, and fractional stops, that will likely become more evident to you if it has not already, and if you stick with me to see what I'm proposing. Traditionally most zone systems include zones where most DO span a stop each. If you disregard the tonal shades and values above then, you have a serviceable and rather traditional zone system - with middle grey at zone V and zone V at the very centre of a practical dynamic range of 9 stops - or 7 if you include only the one-stop exposure zones that are considered to contain texture (via zones II to VIII). Although this is a good rule of thumb to follow that has been around for decades applicable at least to a monochrome film medium if not others, and not bad to follow for at least some digital cameras as well, we could debate whether there is some deviation to this with OTHER film stock and digital cameras, and by how much. Or more importantly, we could look at creating scales of exposure zones customized to those other various mediums, but then again what I am proposing here is that what may be even more important than THAT is understanding the zone scale for human eyesight FIRST, as we've reviewed.
Of course I've also been claiming inaccuracies in other digital zone systems in comparison to mine and trying to prove these. Obviously these other systems have been around for some time and HAVE proven useful for both editing one exposure and editing and blending multiple exposures, but I believe many people would have found the greater convenience of a stop-accurate tonal scale system preferable, and a different method entirely that I will soon get into even MORE preferable (although one at least has a wealth of options this way now). The authors of digital zone systems are perfectly honest that the majority of their zones are NOT a stop in range each at least in terms of the tones on the scale they show (nor, for at least one system with tonal divisions similar to the one in the chart above, are the majority of the zones a stop in range in any other way - with one author admitting the distance between zone V and zone IX centre-to-centre being NOT 4 stops that would be convenient for a spot meter, but closer to 2-1/3 if we are abiding by the tones). Such a lack of complete one-stop accuracy can be problematic for capturing and blending together more than one exposure. Even for editing just one exposure, my approach (or "both" approaches if you like) better accounts than others for the natural tonal compression and expansion between exposure levels.
Really, though, the most important thing I can say about all that, I can do so simply right here:
Stops simply do not differ from each other by the same tonal increment.
I'll concede that such a zone scale as above might be fine for defining one-stop exposure zones with modification - particularly where actual tonal lightness values are NOT reported and any tones of an accompanying greyscale are NOT to be taken literally. But I think it's much more useful to take into account the tonal values, as I have in the prior scales.
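Going the other way, from tones back to stops, shows roughly how far apart the zones of a 10%-per-zone scale really are in exposure terms. This Python sketch is only illustrative: it again assumes that the scale's tone percentages can be read as CIE L* lightness (your camera's actual tone curve will differ), it places zone V at 50% and zone IX at 90% purely as an example, and the function names are my own.

```python
import math

def reflectance_from_tone(tone_pct):
    """Invert CIE L* to relative luminance (0-1).
    Assumption: the scale's 'tone %' is read as L*."""
    f = (tone_pct + 16) / 116
    if f > 6 / 29:
        return f ** 3
    return 3 * (6 / 29) ** 2 * (f - 4 / 29)

def stops_apart(tone_a_pct, tone_b_pct):
    """Exposure difference in stops between two tones on the scale."""
    ratio = reflectance_from_tone(tone_b_pct) / reflectance_from_tone(tone_a_pct)
    return math.log2(ratio)

# Illustrative zone centres on a scale where each zone spans 10% of the greyscale.
print(f"zone V (50%) to zone IX (90%): {stops_apart(50, 90):.2f} stops")

# How many stops each 10% tonal step is actually worth.
for lo in range(10, 90, 10):
    print(f"{lo}% to {lo + 10}%: {stops_apart(lo, lo + 10):.2f} stops")
```

Under these assumptions the span from zone V to zone IX works out to about two stops, nowhere near the 4 that a spot meter user would want and in the same region as the 2-1/3 figure mentioned above (the exact number depends on the tone curve assumed). Each 10% step is also worth a different number of stops, well over a stop in the shadows and under half a stop near white.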
BEFORE YOU USE AND EXPAND UPON MY ZONE SYSTEM AS YOU ARE WELCOME TO DO SO, PERHAPS DON'T (ESPECIALLY FOR IMAGE ACQUISITION)!:
As I've said prior, I believe I have come up with a better, simpler method than all the digital zone systems including my own - at least for optimizing the "dynamic" and "tonal" range that your camera can capture in a RAW image file in either one exposure or more than one to blend together, and achieving an image that is reasonably accurate to what your eye sees. This is not a bad first step to take before image enhancement that veers into "altered reality"! I'll more fully explain my methods in a couple e-books I'm developing, and to some extent in the promotional materials for those. For image ENHANCEMENT however, where parts of an image are altered in exposure as might be done using a zone system, what I've presented here can only help...
Using the information here for image enhancement is first a matter of appreciating that the generic zone scales I've shown are not really for the default kind of imagery resulting out of digital cameras, but more for human eyesight and possibly "pre-s-curve sensor response" (albeit perhaps a bit too conservative in describing perceivable texture in darkness). Secondly we need to appreciate the differences made with the s-curve or whatever impositions a specific digital camera makes in exposure response, and then we need to use whatever control we have over this in RAW processing/development! All this will help to inform your decisions in changing the exposure of parts of your image post-capture including possibly full ranges of exposure (not unlike the one-stop exposure zones we've discussed). Tone curves (adjustments) are your friends here! And so is good software that allows exposure adjustment on RAW files with proper tonal compression and expansion. ACR (Adobe Camera Raw) is a great example! For doing HDR work, the rather inexpensive Photoshop Elements including the Photomerge software and ACR that are both packaged with it could work, but for true image enhancement and better exposure adjustment and highlight recovery as of this writing, Photoshop CC may be the better choice. Starting with a great image or series of exposures for blending is also important, and is discussed next!
As for the image acquisition method I'm introducing in my e-books, it's a specific application of the "ETTR" (Expose to the Right) method that, when necessary, is married to another specific application of the "HDR" (High Dynamic Range) method. The result is explained in the two e-books' shared title:
PERFECT (Grain-Reduced Blown-Highlights-Recovered) EXPOSURE Through Sensor Optimization:
E-Book 1 of 2 - ETTR and "OTTR" Techniques
E-Book 2 of 2 - HDR and "HHDR" Techniques
- Exploiting the Little-Known Benefits of Intentional Overexposure of Non-Highlight Areas
Combined with Post-Capture Exposure Correction Using Either One or More Exposures,
and Replacing the Need for a Digital Zone System at the Image Acquisition Phase
Do you remember a concept I conveyed earlier - basically that most of the tonal range that a camera sensor can record lies in the highlights (even seemingly blown highlights), and that those details can be recovered? This is what we are exploiting here, and the fact that this can be done with reduction in grain due to electronic noise and with other benefits of the extra latitude as well!:
[Triangle Diagram? or left to linked page]
For more information, please visit this linked web page, and please consider purchasing my e-book from here!:
[Temporary]:
http://www.tonallystopaccurate.com/index4.html
---
Obviously I can't give EVERYTHING away for free!
And by the way, in case you still need more proof that digital zone systems have problems, here is another source:
Despite my finding that link interesting, and the things I've said on this site that are similarly critical of digital zone systems, please note that all this is coming from a guy who was actually looking FORWARD to slowing down a little to spot meter everything with a zone system in mind! I'm an Ansel Adams fan who bought a film camera AFTER buying a number of digital cameras! So I'm hesitant to say that there may never be a need of a zone system anymore with "digital" acquisition as there once was with "film". But for most things or types of scenes, it would seem to me that there is not that need - not even with landscapes and architecture at least much of the time. And as with HDR captured through a camera on a tripod, the scene you might wish to acquire is not always appropriate for that method nor a zone system for acquisition, nor is there always enough time. My e-book gets more into these considerations. Its ultimate goal is to establish a method to help you - arguably the best method - for achieving decent exposure across ALL parts of an image! Knowing how to achieve such excellence I would argue is the first step to standing out. It also helps that it can make an older, cheaper camera or camera with a small sensor work like a better camera!
Thank you!
- Stacy Muller (camera guy).