At first blush, it seems like augmented reality should be easier than virtual reality. Whereas virtual reality involves the generation of full stereoscopic digital environments as well as interactive objects to place in those environments, augmented reality is simply adding digital content to our view of the real world. Virtual reality would seem to be doing more heavy lifting.
In actual fact, both technologies are creating illusions to fool the human eye and the human brain. In this effort, virtual reality has an easier task because it can shut out points of reference that would otherwise belie the illusion. Augmented reality experiences, by contrast, must contend with real world visual cues that draw attention to the false nature of the mixed reality content being added to a user’s field of view.
In this post, I will cover some of the additional challenges that make augmented reality much more difficult to get right. In the process, I hope to also provide clues as to why augmented reality HMDs like HoloLens and Magic Leap are taking much longer to bring to market than AR devices like the Oculus Rift, HTC Vive and Sony Project Morpheus.
But first, it is necessary to distinguish between two different kinds of augmented reality experience. One is informatics based and is supported by most smart phones with cameras. The ideal example of this type of AR is the Terminator-vision from James Cameron’s 1984 film “The Terminator.” It is relatively easy to to do and is the most common kind of AR people encounter today.
The second, and more interesting, kind of AR requires inserting illusory 3D digital objects (rather than informatics) into the world. The battle chess game from 1977’s “Star Wars” epitomizes this second category of augmented reality experience. This is extremely difficult to do.
The Microsoft HoloLens and Magic Leap (as well as any possible HMDs Apple and others might be working on) are attempts to bring both the easy type and the hard type of AR experience to consumers.
Here are a few things that make this difficult to get right. We’ll put aside stereoscopy which has already been solved effectively in all the VR devices we will see coming out in early 2016.
1. Occlusion The human brain is constantly picking up clues from the world in order to determine the relative positions of objects such as shading, relative size and perspective. Occlusion is one that is somewhat tricky to solve. Occlusion is an effect that is so obvious that it’s hard to realize it is a visual cue. When one body is in our line of sight and is positioned in front of another body, that other body is partially hidden from our view.
In the case where a real world object is in front of a digital object, we can clip the digital object with an outline of the object in front to prevent bleed through. When we try to create the illusion that a digital object is positioned in front of a real world object, however, we encounter a problem inherent to AR.
In a typical AR HMD we see the real world through a transparent screen upon which digital content is either projected or, alternatively, illuminated as with LED displays. An obvious characteristic of this is that digital objects on a transparent display are themselves semi-transparent. Getting around this issue would seem to require being able to make certain portions of the transparent display more opaque than others as needed in order to make sure our AR objects look substantial and not ghostly.
2. Accommodation It turns out that stereoscopy is not the only way our eyes recognize distance. The image above is from a scene in Orson Welles’s “Citizen Kane” in which a technique called “deep focus” is used extensively. Deep focus maintains clarity in the frame whether the actors and props are in the foreground, background or middle ground. Nothing is out of focus. The technique is startling both because it is counter to the way movies are generally shot but also because it is counter to how our eyes work.
If you cover one eye and use the other to look at one of your fingers, then move the finger toward and away from you, you should notice yourself refocusing on the finger as it moves while other objects around the finger become blurry. The shape of the cornea actually becomes more rounded when objects are close in order to cause light to refract more in order to reach the retina. For further away objects, the cornea flattens out because less refraction is needed. As we become older, the ability to bow the cornea lessens and we lose some of our ability to focus on near objects – for instance when we read. In AR, we are attempting to make a digital object that is really only centimeters from our eyes appear to be much further away.
Depending on how the light from the display passes through the eye, we may end up with the digital object appearing clear while the real world objects supposedly next to it and at the same distance appear blurred.
3. Vergence-Accommodation Mismatch The accommodation problem is one aspect of yet another VR/AR difficulty. The term vergence describes the convergence and divergence of the two eyes from one another as objects move closer or further away. An interesting aspect of stereoscopy – which is used both for virtual reality as well as augmented reality to create the illusion of depth – is that the distance at which the two eyes coordinate to see an object is generally different from the focal distance from the eyes to the display screen(s). This consequently sends two mismatched signals to the brain concerning how far away the digital object is supposed to be. Is it the focal length or the vergence length? Among other causes, vergence-accommodation mismatch is believed to be a contributing factor to VR sickness. Should the accommodation problem above be resolved for a given AR device, it is safe to assume that the vergence-accommodation mismatch will also be solved.
4. Tetherless Battery Life Smart phones have changed our lives among other reasons because they are tetherless devices. While the current slate of VR devices all leverage powerful computers to which they are attached, since VR experiences are all currently somewhat stationary (the HTC Vive being the odd bird), AR needs to be portable. This naturally puts a strain on the battery, which needs to be relatively light since it will be attached to the head-mounted-display, but also long-lived as it will be powering occasionally intensive graphics, especially for games.
5. Tetherless GPU Another strain on the system is the capability of the GPU. Virtual reality devices can be fairly intense since they require the user to purchase a reasonably powerful and somewhat expensive graphics card. AR devices can be expected to have similar graphics requirements as VR with much less to work with since the GPU needs to be onboard. We can probably expect a streamlined graphics pipeline dedicated to and optimized for AR experiences will help offset lower GPU capabilities.
6. Applications Not even talking about killer apps, here. Just apps. Microsoft has released videos of several impressive demos including Minecraft for HoloLens. Magic Leap up to this point has only shown post-prod, heavily produced illustrative videos. The truth is that everyone is still trying to get their heads around designing for AR. There aren’t really any guidelines for how to do it or even what interactions will work. Other than the most trivial experiences (e.g. weather and clock widgets projected on a wall) this will take a while as we develop best practices while also learning from our mistakes.
With the exception of V-AM, these are all problems that VR does not have to deal with. Is it any wonder, then, that while we are being led to believe that consumer models of the Oculus Rift, HTC Vive and Sony Project Morpheus will come to market in the first quarter of 2016, news about HoloLens and Magic Leap has been much more muted. There is simply much more to get right before a general rollout. One can hope, however, that dev units will start going out soon from the major AR players in order to mitigate challenge #6 while further tuning continues, if needed, on challenges #1-#5.