HoloLens Occlusion vs Field of View


[Note: this post is entirely my own opinion and purely conjectural.]

Best current guesses are that the HoloLens field of view is somewhere between 32 degrees and 40 degrees diagonal. Is this a problem?

We’d definitely all like to be able to work with a larger field of view. That’s how we’ve come to imagine augmented reality working. It’s how we’ve been told it should work, from Vernor Vinge’s Rainbows End to Ridley Scott’s Prometheus to the Iron Man trilogy – in fact going back as far as Star Wars in the late ’70s. We want and expect a 160-180 degree FOV.

So is the HoloLens’ field of view (FOV) a problem? Yes it is. But keep in mind that the current FOV is an artifact of the waveguide technology being used.

What’s often lost in the discussions about the HoloLens field of view – in fact the question never asked by the hundreds of online journalists who have covered it – is what sort of trade-off was made so that we have the current FOV.

A common internet rumor – likely inspired by a video from tech evangelist Bruce Harris taken a few months ago – is that it has to do with the cost and consistency of production. The argument is borrowed from chip manufacturing and, while there might be some truth in it, it is mostly a red herring. An amazingly comprehensive blog post by Oliver Kreylos in August of last year went over the evidence as well as the related patents and argued persuasively that while a more expensive waveguide material might improve the FOV marginally, the cost would be prohibitive for what amounts to a marginal gain. At the end of the day, the FOV of the HoloLens developer unit is a physical limitation, not a manufacturing limitation or a power limitation.


But don’t other AR headset manufacturers promise a much larger FOV? Yes. The Meta 2 has a 90 degree field of view. The way that technology works, however, involves two LED screens that are viewed through plastic positioned at 45 degrees to the screens (technically known as a beam splitter, informally known as a piece of glass), which reflects the image into the user’s eyes at approximately half its original brightness while also letting in the real world in front of the user (though half of that light is likewise lost). This is basically the same technique used to create ghostly images in the Haunted Mansion at Disneyland.


The downside of this increased FOV is that you are losing a lot of brightness through the beam splitter. Light is also lost over the distance it has to travel through the plastic to reach your eyes. The result is a see-through “hologram”.


But is this what we want? See-through holograms? The visual design team for Iron Man decided that this is indeed what they wanted for their movies. The translucent holograms provide a cool, ghostly effect, even in a dark room.


The Princess Leia hologram from the original Star Wars, on the other hand, is mostly opaque. That visual design team went in a different direction. Why?


My best guess is that it has to do with the use of color. While the Iron Man hologram has a very limited color palette, the Princess Leia hologram uses a broad range of facial tones to capture her expression – and also so that, dramatically, Luke Skywalker can remark on how beautiful she is (a remark Return of the Jedi obviously complicates). Making her transparent would simply wash out the colors and destroy much of the emotional content of the scene.


The idea that opacity is a prerequisite for color holograms is reinforced by the Star Wars chess scene on the Millennium Falcon. Again, there is just enough transparency to indicate that the chess pieces are holograms and not real objects (digital rather than physical).


So what kind of holograms does the HoloLens provide, transparent or near-opaque? This is something that is hard to describe unless you actually see it for yourself, but the HoloLens “holograms” will occlude physical objects when the holograms are placed in front of them. I’ve had the opportunity to experience this several times over the last year. This is possible because these digital images use a very large color palette and, more importantly, are extremely intense. In fact, because the HoloLens display technology is currently additive, this occlusion effect actually works best with bright colors. As areas of the screen become darker, they actually appear more transparent.
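
In compositing terms (my gloss, not Microsoft’s), an additive display can only add light to the light already arriving from the world:

$$C_{\text{perceived}} = C_{\text{display}} + C_{\text{world}}$$

A pixel rendered black adds nothing and is therefore perfectly transparent, while a pixel bright enough to dominate the sum reads as solid – which is exactly why the occlusion trick works best with intense, bright colors.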

Bigger field of view = more transparent, duller holograms. Smaller field of view = more opaque, brighter holograms.

I believe Microsoft made the bet that, in order to start designing the AR experiences of the future, we actually want to work with colorful, opaque holograms. The trade-off the technology seems to make in order to achieve this is a more limited field of view in the HoloLens development kits.

At the end of the day, we really want both, though. Fortunately we are currently only working with the Development Kit and not with a consumer device. This is the unit developers and designers will use to experiment and discover what we can do with HoloLens. With all the new attention and money being spent on waveguide displays, we can optimistically expect to see AR headsets with much larger fields of view in the future. Ideally, they’ll also keep the high light intensity and broad color palette that we are coming to expect from the current generation of HoloLens technology.

How HoloLens Sensors Work


[Hardware specs were released this week. This post is now updated to reflect the final specs.]

In addition to a sophisticated AR display, the Microsoft HoloLens contains a wide array of sensors that constantly collect data about the user’s external and internal environments. These sensors are used to synchronize the augmented reality world with the real world as well as to respond to commands. The HoloLens’s sensor technology can be thought of as a combination of two streams of research: one from the evolution of the Microsoft Kinect and the other from developments in virtual reality positioning technology. While what follows is almost entirely well-informed guesswork, we can have a fair degree of confidence in these guesses based on what is already known publicly about the tech behind the Kinect and well-documented VR gear like the Oculus Rift.

While this article will provide a broad survey of the HoloLens sensor hardware, the reader can go deeper into this topic on her own through resources like the book Beginning Kinect Programming by James Ashley and Jarrett Webb, Oliver Kreylos’s brilliant Doc-OK blog, and the perpetually enlightening Oculus blog.

Let’s begin with a list of the sensors believed to be housed in the HoloLens HMD:

  1. Gyroscope
  2. Magnetometer
  3. Accelerometer
  4. Internal facing eye tracking cameras (?)
  5. Ambient Light Detector (?)
  6. Microphone Array (4 (?) mics)
  7. Grayscale environment-tracking cameras (4)
  8. RGB camera (1)
  9. Depth sensor (1)

The first three make up an Inertial Measurement Unit often found in head-mounted displays for AR as well as VR. The eye tracker is technology that became commercialized by 3rd parties like Eye Tribe following the release of the Kinect but not previously used in Microsoft hardware – though it isn’t completely clear that there is any sort of eye tracking being used. There is a small sensor at the front that some people assume is an ambient light detector. The last three are similar to technology found in the Kinect.

[Image: microphone array. Copyright Adobe Stock]

I want to highlight the microphone array first because it has always been the least understood and most overlooked feature of the Kinect. The microphone array is extremely useful for speech recognition because it can distinguish between vocal commands from the user and ambient noise. Ideally, it should also be able to amplify speech from the user so commands can be heard even in a noisy room. Speech commands will likely be enabled by integrating the mic array with Microsoft’s cloud-based Cortana speech recognition technology rather than something like the Microsoft Speech SDK. Depending on how the array is oriented, it may also be able to identify the direction of external sounds. In future iterations of HoloLens, we may be able to marry the microphone array’s directional capabilities with the RGB camera and face recognition to amplify speech from our friends through the binaural audio speakers built into HoloLens.
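
To make the directionality idea concrete, here is a minimal delay-and-sum beamformer sketch in C# – my own illustration, not anything from the HoloLens or Kinect SDKs, and the spacing and sample-rate values are assumptions. With two microphones a known distance apart, sound arriving from an angle reaches one mic slightly later than the other; shifting one channel to re-align the wavefronts and summing boosts sound from that direction while averaging away off-axis noise:

```csharp
using System;

// A toy two-microphone delay-and-sum beamformer. SpeedOfSound,
// MicSpacing and SampleRate are illustrative values, not HoloLens specs.
static class Beamformer
{
    const double SpeedOfSound = 343.0; // meters per second
    const double MicSpacing = 0.05;    // assumed 5 cm between the two mics
    const int SampleRate = 16000;      // assumed sample rate in Hz

    // Boosts sound arriving from 'angleRadians' (0 = straight ahead).
    // The wavefront reaches one mic earlier than the other; shifting the
    // lagging channel by the corresponding number of samples re-aligns
    // the two copies of the target signal so they add coherently, while
    // sound from other directions stays misaligned and partially cancels.
    public static double[] SteerToward(double[] left, double[] right, double angleRadians)
    {
        double delaySeconds = MicSpacing * Math.Sin(angleRadians) / SpeedOfSound;
        int shift = (int)Math.Round(delaySeconds * SampleRate);
        var output = new double[left.Length];
        for (int i = 0; i < output.Length; i++)
        {
            int j = i - shift; // shifted index into the other channel
            double r = (j >= 0 && j < right.Length) ? right[j] : 0.0;
            output[i] = 0.5 * (left[i] + r);
        }
        return output;
    }
}
```

Scanning over candidate angles and keeping the one with the most output energy is one crude way to localize a speaker; a four-mic array does the same thing with more discrimination.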

[Image: HoloLens menu. Copyright Microsoft]

Eye tracking cameras are part of a complex mechanism allowing the human gaze to be used to manipulate augmented reality menus. When presented with an AR menu, the user can gaze at buttons in the menu in order to highlight them. Selection then occurs either by maintaining the gaze or by introducing an alternative selection mechanism like a hand press – which would in turn use the depth camera combined with hand tracking algorithms. Besides being extremely cool, eye tracking is a NUI solution to a problem many of us have likely encountered with the Kinect on devices like the Xbox. As responsive as hand tracking can be using a depth camera, it still has a lag and jitteriness that make manipulation of graphical user interface menus tricky. There’s certainly an underlying problem in trying to transpose one interaction paradigm, menu manipulation, into another paradigm based on gestures. Similar issues occur when we try to put interaction paradigms like a keyboard on a touch screen – it can be made to work, but it isn’t easy. Eye tracking is a way to remove friction when using menus in augmented reality.

It’s fascinating, however, to imagine what else we could use it for in future HoloLens iterations. It could be used to store images and environmental data whenever our gaze dwells for a threshold amount of time on external objects. When we want to recall something we saw during the day, the HoloLens could bring it back to us: that book in the book store, that outfit the guy in the coffee shop was wearing, the name of the street we passed on the way to lunch. As we sleep each night, perhaps these images could be analyzed in the cloud to discover patterns in our daily lives of which we were previously unaware.
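
Here is what dwell-based gaze selection might look like as a Unity C# behavior – a minimal sketch of the interaction described above, not actual HoloLens SDK code (the OnGazeSelect message name is invented for illustration):

```csharp
using UnityEngine;

// Attach to the camera. Casts a ray along the view direction as a
// stand-in for gaze, and "clicks" whatever the gaze rests on for
// dwellSeconds. Purely illustrative; not a HoloLens API.
public class GazeDwellSelector : MonoBehaviour
{
    public float dwellSeconds = 2f;

    GameObject current; // object currently under the gaze
    float dwell;        // how long the gaze has rested on it

    void Update()
    {
        RaycastHit hit;
        if (Physics.Raycast(transform.position, transform.forward, out hit))
        {
            if (hit.collider.gameObject == current)
            {
                dwell += Time.deltaTime;
                if (dwell >= dwellSeconds)
                {
                    // Hypothetical selection message; a real menu button
                    // would implement whatever contract the SDK defines.
                    current.SendMessage("OnGazeSelect",
                        SendMessageOptions.DontRequireReceiver);
                    current = null; // fire once per dwell
                    dwell = 0f;
                }
            }
            else { current = hit.collider.gameObject; dwell = 0f; }
        }
        else { current = null; dwell = 0f; }
    }
}
```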

Kinect has a feature called coordinate mapping which allows you to compare pixels from the depth camera and pixels from the color camera. Because the depth camera stream contains information about which pixels belong to human beings and which do not, the coordinate mapper can be used to identify people in the RGB image. The RGB image in turn can be manipulated to do interesting things with the human-only pixels, such as background subtraction and selective application of shaders, so that these effects appear to follow the player around. HoloLens must do something similar but on a vastly grander scale. The HoloLens must map virtual content onto 3D coordinates in the world and make it persist in those locations even as the user twists and turns his head, jumps up and down, and moves freely around the virtual objects that have been placed in the world. Not only must these objects persist, but in order to maintain the illusion of persistence there can be no perceivable lag between user movements and the redrawing of the virtual objects on the HoloLens’s two stereoscopic displays – perhaps no more than 20 ms of delay.
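
For readers who haven’t used the Kinect’s coordinate mapping, a condensed sketch of the background-subtraction trick with the Kinect v2 SDK looks something like this (frame acquisition boilerplate omitted; assume the raw arrays have already been copied out of their frames):

```csharp
using Microsoft.Kinect;

static class GreenScreen
{
    // Keeps only the color pixels that belong to tracked people, using
    // the coordinate mapper to relate color pixels to depth/body-index
    // pixels. A sketch of the classic Kinect "green screen" sample.
    public static void MaskToPlayers(
        KinectSensor sensor,
        ushort[] depthData,   // raw depth frame
        byte[] bodyIndexData, // raw body index frame; 255 = no player
        byte[] colorPixels,   // BGRA color frame pixels
        int depthWidth, int depthHeight)
    {
        // One depth-space coordinate for every color pixel.
        var depthPoints = new DepthSpacePoint[colorPixels.Length / 4];
        sensor.CoordinateMapper.MapColorFrameToDepthSpace(depthData, depthPoints);

        for (int i = 0; i < depthPoints.Length; i++)
        {
            DepthSpacePoint p = depthPoints[i];
            // Unmappable pixels come back as negative infinity.
            bool isPlayer = !float.IsNegativeInfinity(p.X) &&
                            p.X >= 0 && p.X < depthWidth &&
                            p.Y >= 0 && p.Y < depthHeight &&
                            bodyIndexData[(int)p.Y * depthWidth + (int)p.X] != 255;

            if (!isPlayer)
                colorPixels[i * 4 + 3] = 0; // zero alpha hides non-player pixels
        }
    }
}
```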

This is a major problem for both augmented and virtual reality systems. The problem can be broken up into two related issues: orientation tracking and position tracking. Orientation tracking determines where we are looking when wearing a HMD. Position tracking determines where we are located with respect to the external world.

[Image: head orientation tracking. Copyright Adobe Stock: Sergey Niven]

Orientation tracking is accomplished through a device known as an Inertial Measurement Unit, which is made up of a gyroscope, magnetometer and accelerometer. The inertial unit of measure for an Inertial Measurement Unit (see what I did there?) is radians per second (rad/s), which expresses the angular velocity of any head movements. Steven LaValle provides an excellent primer on how the data from these sensors are fused together on the Oculus blog. I’ll just provide a digest here as a way to explain how HoloLens is doing roughly the same thing.

The gyroscope is the core head orientation tracking device. It measures angular velocity. Starting from a known orientation for the head at rest, we can repeatedly sample the gyroscope to see whether the head has moved and in which direction. By integrating the measured angular velocity over the time that has passed, we can determine how the head is currently oriented relative to its previous orientation. In fact the Oculus does this one thousand times per second, and we can assume that HoloLens is collecting data at a similarly furious rate.

Over time, unfortunately, small errors in the gyroscope’s measurements accumulate as they are integrated – this is known as “drift.” The two remaining orientation trackers are used to correct for this drift. The accelerometer performs an unexpected function here by measuring the acceleration due to the force of gravity. The accelerometer provides the true direction of “up” (gravity pulls down, so the acceleration we feel is actually upward, as in a rocket ship flying straight up), which can be used to correct the gyroscope’s misconstrued impression of the real direction of up. “Up,” unfortunately, doesn’t provide all the correction we need. If you turn your head right and left to make the gesture for “no,” you’ll notice immediately that knowing up in this case tells us nothing about the direction in which your head is facing. In this case, knowing the direction of magnetic north provides the additional data needed to correct for yaw error – which is why a magnetometer is also a necessary sensor in HoloLens.
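
The simplest member of this sensor-fusion family is a complementary filter, sketched below in plain C# for a single axis. This is only an illustration of the principle – HoloLens surely uses something far more sophisticated, such as a Kalman filter fusing all three sensors – but it shows how fast gyroscope integration and a slow, drift-free gravity reference get blended:

```csharp
// One-axis complementary filter: integrate the gyro for responsiveness,
// then nudge the estimate toward the accelerometer's gravity-derived
// angle to bleed off accumulated drift. Alpha close to 1 means
// "mostly trust the gyro from moment to moment."
class OrientationFilter
{
    double pitch;             // current estimate, in radians
    const double Alpha = 0.98;

    // gyroRate: angular velocity around the pitch axis (rad/s)
    // accelPitch: pitch angle implied by the gravity vector (rad)
    // dt: seconds since the last sample (0.001 at a 1000 Hz rate)
    public double Update(double gyroRate, double accelPitch, double dt)
    {
        double integrated = pitch + gyroRate * dt;             // fast but drifty
        pitch = Alpha * integrated + (1 - Alpha) * accelPitch; // slow correction
        return pitch;
    }
}
```

For yaw, where gravity tells us nothing, the magnetometer’s north reading would play the accelerometer’s role.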

[Image: position tracking. Copyright Adobe Stock]

Even though the IMU, made up of a gyroscope, magnetometer and accelerometer, is great for determining the deltas in head orientation from moment to moment, it doesn’t work so well for determining diffs in head position. For a beautiful demonstration of why this is the case, you can view Oliver Kreylos’s video Pure IMU-Based Positional Tracking is a No-Go. For a very detailed explanation, you should read Head Tracking for the Oculus Rift by Steven LaValle and his colleagues at Oculus.

The Oculus Rift DK2 introduced a secondary camera for positional tracking that sits a few feet from the VR user and detects IR markers on the Oculus HMD. This is known as outside-in positional tracking because the external camera determines the location of the goggles and passes it back to the Oculus software. This works well for the Oculus mainly because the Rift is a tethered device. The user sits or stands near the computer that runs the experience and cannot stray far from it.

There are some alternative approaches to positional tracking which allow for greater freedom of movement. The HTC Vive virtual reality system, for instance, uses two stationary devices in a setup called Lighthouse. Instead of stationary cameras like the ones the Oculus Rift uses, these Lighthouse boxes are stationary emitters of infrared light that the Vive uses to determine its position in a room with respect to them. This is sometimes called an inside-out positional tracking solution because the HMD determines its location relative to known external fixed positions.

Google’s Project Tango is another example of inside-out positional tracking; it uses the sensors built into handheld devices (smart phones and tablets) in order to add AR and VR functionality to applications. Because the IMUs packed into these devices are prone to lag and drift, Project Tango compensates by using data from the onboard device cameras to determine the orientation of the room around the device. These reconstructions are constantly compared against previous reconstructions in order to determine both the device’s position as well as its orientation with respect to the room surfaces around it.

It is widely assumed that HoloLens uses a similar technique to correct for positional drift from the Inertial Measurement Unit. After all, HoloLens has four grayscale environment-tracking cameras built into it. The IMU, in this supposition, would provide fast but drifty positional data, while the combination of data from the four grayscale cameras and the RGB camera provides possibly slower (we’re talking in milliseconds, after all) but much more accurate positional data. Together, this configuration provides inside-out positional tracking that is truly tetherless. This is, in all honesty, a simply amazing feat and almost entirely overlooked in most overviews of the HoloLens.

The secret sauce that integrates camera data into an accurate and fast reconstruction of the world to be used, among other things, for position tracking is called the Holographic Processing Unit – a chip the Microsoft HoloLens team is designing itself. I’ve heard from reliable sources that fragments from Stonehenge are embedded in each chip to make this magic work.


On top of this, the depth sensor, grayscale cameras, and RGB camera will likely be accessible as independent data streams that can be used for the same sorts of functionality they have provided in Kinect applications over the past four years: art, research, diagnostics, medicine, architecture, and gaming. Though it has not been discussed publicly, I would hope that complex functionality we have become familiar with from Kinect development, like skeleton tracking and raw hand tracking, will also be made available to HoloLens developers.

Such a continuity of capabilities and APIs between Kinect and HoloLens, if present, would make it easy to port the thousands of Kinect experiences the creative and software communities have developed over the years leading up to HoloLens. This sort of continuity was, after all, responsible for the explosion of online hacking videos that originally made the Kinect such an object of fascination. The Kinect hardware used a standard USB connector that developers were able to quickly hack and then pass on to – for the most part – pre-existing creative applications that had used less well known, less available and non-standard depth and RGB cameras. The Kinect connected all these different worlds of enthusiasts by using common parts and common paradigms.

It is my hope and expectation that HoloLens is set on a similar path.

[This post has been updated 11/07/15 following opportunities to make a closer inspection of the hardware while in Redmond, WA, for the MVP Global Summit. Big thanks to the MPC and HoloLens groups as well as the Emerging Experiences MVP program for making this possible.]

[This post has been updated again 3/3/16 following release of the final specs.]

How HoloLens Displays Work

[Image: disassembled HoloLens]

There’s been a lot of debate concerning how the HoloLens display technology works. Some of the best discussions have been on reddit/hololens but really great discussions can be found all over the web. The hardest problem in combing through all this information is that people come to the question at different levels of detail. A second problem is that there is a lot of guessing involved and the amount of guessing going on isn’t always explained. I’d like to correct that by providing a layered explanation of how the HoloLens displays work and by being very up front that this is all guesswork. I am a Microsoft MVP in the Kinect for Windows program but do not really have any insider information about HoloLens I can share and do not in any way speak for Microsoft or the HoloLens team. My guesses are really about as good as the next guy’s.

High Level Explanation


The HoloLens display is basically a set of transparent screens placed just in front of the eyes. Each eyepiece or screen lets light through and also shows digital content the way your monitor does. Each screen shows a slightly different image to create a stereoscopic illusion like the View Master toy does or 3D glasses do at 3D movies.

A few years ago I worked with transparent screens created by Samsung that were basically just LCD screens with their backings removed. LCDs work by suspending liquid crystals between layers of glass. There are two factors that make them bad candidates for augmented reality head mounts. First, they require soft backlighting in order to be reasonably useful. Second, and more importantly, they are too thick.

At this level of granularity, we can say that HoloLens works by using a light-weight material that displays color images while at the same time letting light through the displays. For fun, let’s call this sort of display an augmented reality combiner, since it combines the light from digital images with the light from the real world passing through it.


Intermediate Level Explanation

Light from the real world passes through two transparent pieces of plastic. That part is pretty easy to understand. But how does the digital content get onto those pieces of plastic?


The magic concept here is that the displays are waveguides. Optical fiber is an instance of a waveguide we are all familiar with. Optical fiber is a great method for transferring data over long distances because it is nearly lossless, bouncing light back and forth between its reflective internal surfaces.
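
The geometry behind this is worth a one-line digression. Light stays trapped inside a guide as long as it strikes the internal surface at more than the critical angle, which for a guide with a refractive index of around 1.5 (typical of optical plastics) surrounded by air works out to:

$$\sin\theta_c = \frac{n_{\text{air}}}{n_{\text{guide}}} = \frac{1.0}{1.5} \approx 0.667 \qquad\Rightarrow\qquad \theta_c \approx 41.8^\circ$$

Rays hitting the surface at more than roughly 42 degrees from the normal reflect totally, over and over, which is what lets an image travel the length of the guide with almost no loss.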


The two HoloLens eye screens are basically flat optical fibers or planar waveguides. Some sort of image source at one end of these screens sends out RGB data along the length of the transparent displays. We’ll call this the image former. This light bounces around the internal front and back of each display and in this manner traverses down its length. These light rays eventually get extracted from the displays and make their way to your pupils. If you examine the image of the disassembled HoloLens at the top, it should be apparent that the image former is somewhere above where the bridge of your nose would go.


Low Level Explanation

The lowest level is where much of the controversy comes in. In fact, it’s such a low level that many people don’t realize it’s there. And when I think about it, I pretty much feel like I’m repeating dialog from a Star Trek episode about dilithium crystals and quantum phase converters. I don’t really understand this stuff. I just think I do.

In the field of augmented reality, there are two main techniques for extracting light from a waveguide: holographic extraction and diffractive extraction. A holographic optical element has holograms inside the waveguide which route light into and out of it. Two holograms can be used, one at either end of the microdisplay: the first turns the originating image 90 degrees from the source and sends it down the length of the waveguide; the second intercepts the light rays and turns them another 90 degrees toward the wearer’s pupils.

A company called TruLife Optics produces these types of displays and has a great FAQ to explain how they work. Many people, including Oliver Kreylos who has written quite a bit on the subject, believe that this is how the HoloLens microdisplays work. One reason for this is Microsoft’s emphasis on the terms “hologram” and “holographic” to describe their technology.

On the other hand, diffractive extraction is a technique pioneered by researchers at Nokia – for which Microsoft currently owns the patents and research. For a variety of reasons, this technique falls under the semantic umbrella of a related technology called Exit Pupil Expansion. EPE literally means making an image bigger (expanding it) so that it covers as much of the exit pupil as possible – your pupil plus every position your pupil might move to as you rotate your eyeball to take in your field of view (roughly a 10mm x 8mm rectangle, or eye box). This, in turn, is probably why measuring the interpupillary distance is a large aspect of fitting people for the HoloLens.

[Image: surface relief grating diagram from holographix.com]

Nanometer-scale structures, or gratings, are placed on the surface of the waveguide at the location where we want to extract an image. The grating effectively creates an interference pattern that diffracts the light out and even enlarges the image. This is known as surface relief grating, or SRG, as shown in the above image from holographix.com.
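
The physics here is the standard diffraction grating equation. For grating period $d$, wavelength $\lambda$, incidence angle $\theta_i$ and diffraction order $m$:

$$d\,(\sin\theta_m - \sin\theta_i) = m\lambda$$

Since visible wavelengths are a few hundred nanometers, getting useful exit angles $\theta_m$ requires a grating period on the same scale – hence the nanometer-scale structures.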

Reasons for believing HoloLens uses SRG as its way of doing EPE include the Nokia connection as well as a post from Jonathan Lewis, the CEO of TruLife, in which Lewis states, following the original HoloLens announcement, that it isn’t the holographic technology he’s familiar with and is probably EPE. There’s also the second edition of Woodrow Barfield’s Wearable Computers and Augmented Reality, in which Barfield seems pretty adamant that diffractive extraction is used in HoloLens. Being a professor at the University of Washington, which has a very good technology program as well as close ties to Microsoft, he may know something about it.

On the other hand, neither technique is favored or disfavored in a Microsoft patent clearly talking about HoloLens, which ends up discussing both volume holograms (VH) and surface relief gratings (SRG). I think HoloLens is more likely to be using diffractive extraction rather than holographic extraction, but it’s by no means a sure thing.


Impact on Field of View

An important aspect of these two technologies is that they both involve a limited field of view based on the ways we are bouncing and bending light in order to extract it from the waveguides. As Oliver Kreylos has eloquently pointed out, “the current FoV is a physical (or, rather, optical) limitation instead of a performance one.” In other words, any augmented reality head mounted display (HMD) or near eye display (NED) is going to suffer from a small field of view when compared to virtual reality devices. This is equally true of the currently announced devices like HoloLens and Magic Leap, the currently available AR devices like those by Vuzix and DigiLens, and the expected but unannounced devices from Google, Facebook and Amazon.  Let’s call this the keyhole problem (KP).


The limitations posed by KP are a direct result of the need to use transparent displays that are actually wearable. Given this, I think it is going to be a waste of time to lament the fact that AR FOVs are smaller than we have been led to expect from the movies we watch. I know Iron Man has already had much better AR for several years with a 360 degree field of view but hey, he’s a superhero and he lives in a comic book world and the physical limitations of our world don’t apply to him.

Instead of worrying that tech companies for some reason are refusing to give us better augmented reality, it probably makes more sense to simply embrace the laws of physics and recognize that, as we’ve been told repeatedly, hard AR is still several years away and there are many technological breakthroughs still needed to get us there (let’s say five years or even “in the Windows 10 timeframe”).

In the meantime, we are being treated to first generation AR devices with all that the term “first generation” entails. This is really just as well because it’s going to take us a lot of time to figure out what we want to do with AR gear, when we get beyond the initial romantic phase, and a longer amount of time to figure out how to do these experiences well. After all, that’s where the real fun comes in. We get to take the next couple of years to plan out what kinds of experiences we are going to create for our brave new augmented world.

HoloLens Fashionista


While Google Glass certainly had its problems as an augmented reality device – among other things, not really being an augmented reality device, as GA Tech professor Blair MacIntyre pointed out – it did demonstrate two remarkable things. First, that people are willing to shell out $1500 for new technology. In the debates over the next year concerning the correct price point for VR and AR head mounted displays, this number will play a large role. Second, it demonstrated the importance of a sense of style when designing technology. Google Glass, for many reasons, was a brilliant fashion accessory.

If a lesson can be drawn from these two data points, it might be that new – even Project Glass-level iffy – technology can command a high price if it manages to be fashionable as well as functional.

When you look at the actual HoloLens device, you may, like me, be thinking “I don’t know if I’d wear that out in public.” In that regard, I’d like to nudge your intuitions a bit.

Obviously there is time to do some tweaking with the HL design. I recently found some nostalgic pictures online that made me start to think that with modifications, I could rock this look.

It all revolves around one of the first animes imported to the United States in the ’70s, called Battle of the Planets. It sounded like this:

Battle of the Planets! G-Force! Princess! Tiny! Keyop! Mark! Jason! And watching over them from Center Neptune, their computerized coordinator, 7-Zark-7! Watching, warning against surprise attacks by alien galaxies beyond space. G-Force! Fearless young orphans, protecting Earth’s entire galaxy. Always five, acting as one. Dedicated! Inseparable! Invincible!


And it looked AMAZING. I think this look could work for HoloLens. I think I could pull it off. The capes and tights, of course, are purely optional.

[Image gallery: Battle of the Planets / Gatchaman stills and fan art alongside HoloLens imagery]

Why Augmented Reality is harder than Virtual Reality


At first blush, it seems like augmented reality should be easier than virtual reality. Whereas virtual reality involves the generation of full stereoscopic digital environments as well as interactive objects to place in those environments, augmented reality is simply adding digital content to our view of the real world. Virtual reality would seem to be doing more heavy lifting.


In actual fact, both technologies are creating illusions to fool the human eye and the human brain. In this effort, virtual reality has an easier task because it can shut out points of reference that would otherwise belie the illusion. Augmented reality experiences, by contrast, must contend with real world visual cues that draw attention to the false nature of the mixed reality content being added to a user’s field of view.

In this post, I will cover some of the additional challenges that make augmented reality much more difficult to get right. In the process, I hope to also provide clues as to why augmented reality HMDs like HoloLens and Magic Leap are taking much longer to bring to market than VR devices like the Oculus Rift, HTC Vive and Sony Project Morpheus.


But first, it is necessary to distinguish between two different kinds of augmented reality experience. One is informatics based and is supported by most smart phones with cameras. The ideal example of this type of AR is the Terminator-vision from James Cameron’s 1984 film “The Terminator.” It is relatively easy to do and is the most common kind of AR people encounter today.


The second, and more interesting, kind of AR requires inserting illusory 3D digital objects (rather than informatics) into the world. The battle chess game from 1977’s “Star Wars” epitomizes this second category of augmented reality experience. This is extremely difficult to do.

The Microsoft HoloLens and Magic Leap (as well as any possible HMDs Apple and others might be working on) are attempts to bring both the easy type and the hard type of AR experience to consumers.

Here are a few things that make this difficult to get right. We’ll put aside stereoscopy which has already been solved effectively in all the VR devices we will see coming out in early 2016.


1. Occlusion The human brain is constantly picking up clues from the world – such as shading, relative size and perspective – in order to determine the relative positions of objects. Occlusion is one such cue, and it is somewhat tricky to solve. Occlusion is an effect so obvious that it’s hard to realize it is a visual cue: when one body is in our line of sight and is positioned in front of another body, that other body is partially hidden from our view.

In the case where a real world object is in front of a digital object, we can clip the digital object with an outline of the object in front to prevent bleed through. When we try to create the illusion that a digital object is positioned in front of a real world object, however, we encounter a problem inherent to AR.

In a typical AR HMD we see the real world through a transparent screen upon which digital content is either projected or, alternatively, illuminated as with LED displays. An obvious characteristic of this is that digital objects on a transparent display are themselves semi-transparent. Getting around this issue would seem to require being able to make certain portions of the transparent display more opaque than others as needed in order to make sure our AR objects look substantial and not ghostly.

[Image: deep-focus scene from Citizen Kane]

2. Accommodation It turns out that stereoscopy is not the only way our eyes recognize distance. The image above is from a scene in Orson Welles’s “Citizen Kane” in which a technique called “deep focus” is used extensively. Deep focus maintains clarity in the frame whether the actors and props are in the foreground, background or middle ground. Nothing is out of focus. The technique is startling both because it is counter to the way movies are generally shot and because it is counter to how our eyes work.

If you cover one eye and use the other to look at one of your fingers, then move the finger toward and away from you, you should notice yourself refocusing on the finger as it moves while other objects around the finger become blurry. The lens of the eye actually becomes more rounded when objects are close, causing light to refract more in order to reach the retina. For further away objects, the lens flattens out because less refraction is needed. As we become older, the ability to bow the lens lessens and we lose some of our ability to focus on near objects – for instance when we read. In AR, we are attempting to make a digital object that is really only centimeters from our eyes appear to be much further away.

Depending on how the light from the display passes through the eye, we may end up with the digital object appearing clear while the real world objects supposedly next to it and at the same distance appear blurred.


3. Vergence-Accommodation Mismatch The accommodation problem is one aspect of yet another VR/AR difficulty. The term vergence describes the convergence and divergence of the two eyes from one another as objects move closer or further away. An interesting aspect of stereoscopy – which is used both for virtual reality as well as augmented reality to create the illusion of depth – is that the distance at which the two eyes coordinate to see an object is generally different from the focal distance from the eyes to the display screen(s). This consequently sends two mismatched signals to the brain concerning how far away the digital object is supposed to be. Is it the focal length or the vergence length? Among other causes, vergence-accommodation mismatch is believed to be a contributing factor to VR sickness. Should the accommodation problem above be resolved for a given AR device, it is safe to assume that the vergence-accommodation mismatch will also be solved.
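
To put rough numbers on the mismatch (assuming a typical interpupillary distance of 63 mm): the vergence angle for an object at distance $D$ is

$$\theta = 2\arctan\!\left(\frac{\text{IPD}}{2D}\right)$$

so a hologram placed half a meter away asks the eyes to converge about 7.2 degrees, while display optics with a fixed focal distance of, say, two meters ask them to focus as if the vergence were only about 1.8 degrees. The eyes aim at one distance and focus at another, and the brain notices the discrepancy.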

4. Tetherless Battery Life Smart phones have changed our lives in part because they are tetherless devices. The current slate of VR devices all leverage powerful computers to which they are attached, and VR experiences are all currently somewhat stationary (the HTC Vive being the odd bird); AR, by contrast, needs to be portable. This naturally puts a strain on the battery, which needs to be relatively light since it will be attached to the head-mounted display, but also long-lived, as it will be powering occasionally intensive graphics, especially for games.

5. Tetherless GPU Another strain on the system is the capability of the GPU. Virtual reality graphics demands can be fairly intense, which is why VR devices require the user to purchase a reasonably powerful and somewhat expensive graphics card. AR devices can be expected to have similar graphics requirements with much less to work with, since the GPU needs to be onboard. We can probably expect that a streamlined graphics pipeline dedicated to and optimized for AR experiences will help offset lower GPU capabilities.

6. Applications Not even talking about killer apps, here. Just apps. Microsoft has released videos of several impressive demos including Minecraft for HoloLens. Magic Leap up to this point has only shown post-prod, heavily produced illustrative videos. The truth is that everyone is still trying to get their heads around designing for AR. There aren’t really any guidelines for how to do it or even what interactions will work. Other than the most trivial experiences (e.g. weather and clock widgets projected on a wall) this will take a while as we develop best practices while also learning from our mistakes.

Conclusion

With the exception of the vergence-accommodation mismatch, these are all problems that VR does not have to deal with. Is it any wonder, then, that while we are being led to believe that consumer models of the Oculus Rift, HTC Vive and Sony Project Morpheus will come to market in the first quarter of 2016, news about HoloLens and Magic Leap has been much more muted? There is simply much more to get right before a general rollout. One can hope, however, that dev units will start going out soon from the major AR players in order to mitigate challenge #6 while further tuning continues, if needed, on challenges #1-#5.

The HoloCoder’s Resume


In an ideal world, the resume is an advertisement for our capabilities and the interview process is an audit of those claims. Many factors have contributed to complicating what should be a simple process.



The first is the rise of professional IT recruiters and the automation of the resume process. Recruiters bring a lot to the game, offering a wider selection of IT job candidates to hiring companies, on the one hand, and a wider selection of jobs to job hunters, on the other. Automation requires standardization, however, and this has led to an overuse of key search terms when matching candidates to positions. The process begins with job specs from the hiring company – which, parenthetically, often have little to do with the actual job itself, highlighting the frequent disconnect between IT departments and HR departments. A naive job hunter will try to describe her actual experience, which typically will not match the job spec as written by HR. At this point the recruiter helps the job hunter modify the details of her resume to match the template provided by the hiring company, injecting prioritized buzzwords into the resume. “I’m sorry, but Lolita, Inc. will never hire you unless you have synesthesia listed in your job history. You do have experience with synesthesia, don’t you?”



All of this gerrymandering is required in order to get to the next step, the job interview. Unfortunately, the people conducting the job interview have little confidence in the resume as a vehicle for accurately describing a candidate’s actual abilities. First of all, they know that recruiters have already gone over it to eliminate useful information and replace it with keywords. Next, the interviewers typically haven’t actually seen the HR job specs and do not understand what kind of role they are hiring for. Finally, none of the interviewers have any particular training in conducting job interviews or any particular skill in ascertaining what a candidate knows. In short, the interviewer doesn’t know what he’s looking for and wouldn’t know how to get it if he did.


A savvy interviewer will probably realize that he is looking for the sort of generalist that Joel Spolsky describes as “smart and gets things done,” but how do you interview for that? The tools the interviewer is provided with are not generic but instead highly specific technology skills. At some point, this impedance mismatch between technology-specific interview questions on the one hand and a desire to hire generalists on the other (technology, after all, simply changes too quickly to look for only one skillset) led to an increased reliance on behavioral questions and eventually Google-style language games. Neither of these, it turns out, particularly helps in hiring good candidates.


Once we historically severed any attempt to match interview questions to actual skills, the IT interview process was allowed to become a free-floating hermeneutic exercise. Abstruse but non-specific questions involving principles and design patterns have taken over the process. This has led to two strange outcomes. On the one hand, job applicants are now required to be fluent in technical information they will never actually use in their jobs. Literary awareness of ten-year-old blog posts by Martin Fowler is more important than actually knowing how to get things done. And if the job interviewer exhibits any self-awareness when he turns down a candidate for not being clear on the justified uses of the CQRS pattern (there are none), it will not be because the candidate didn’t know something important for the job but rather because the candidate was unwilling to play the software architecture language game, and anyone unwilling to play the game is likely going to be a poor cultural fit.

The other consequence of an increased reliance on abstruse and non-essential IT knowledge has been the rise of the Architect across the industry. The IT industry has created a class of software developers who cannot actually develop software but instead specialize in telling other people what is wrong with their code. The architect is a specialization that probably indicates a deviant phase in the software industry – but at the same time it is a natural outcome of our IT job spec – resume – interview process. The skills of a modern software architect – knowledge of abstruse information and jargon, often combined with an inability to get things done – are what we currently look for through our IT cargo cult hiring rituals.


This distinction between the ritual of IT hiring and the actual goals of IT hiring becomes most apparent when we look for specific as opposed to generalist skills. We hire generalists to be on staff over a long period. We hire specialists to perform difficult but real tasks that can eventually be handed over to our generalists – when we need to get something specific done.

Which gets us to the point of this post. What are the skills we should look for when hiring for a HoloLens developer? And what are the skills a HoloLens developer should be highlighting on her resume?

At this point in time, when there is still no SDK generally available for the HoloLens and all HoloLens coders are working for Microsoft under various NDAs, it is hard to say. Fortunately, important clues have been provided by the recent announcement of the first consulting agency dedicated to the HoloLens, co-founded by someone who has been working on HoloLens applications for Microsoft over the past year. The company, Object Theory, was just started by Michael Hoffman and Raven Zachary, and they have put up a website to advertise this new venture.

Among the tasks involved in creating this sort of extremely specialized website is explaining what capabilities you offer. First, they offer experience since Hoffman has worked on several of the demos that Microsoft has been exhibiting at conferences and in promotional videos. But is this enough of a differentiator? What skills do they have to offer to a company looking to build a HoloLens application?

This is part of the fascination of their “Work” page. It cannot describe any actual work, since the company has just started and hasn’t technically done any technical work. Instead, it provides a list of capabilities that look amazingly like resume keywords – but different from any keywords you may have come across:


          • Entirely new Natural User Interfaces (NUI)
          • Surface reconstruction and object persistence
          • 3D Spatial HRTF audio
          • Mesh reduction, culling and optimization
          • Baked shadows and ambient occlusion
          • UV mapping
          • Optimized render shaders
          • Efficient WiFi connectivity to back-end services
          • Unity and DirectX
          • Windows 10 APIs


These, in fact, are probably the sorts of skills you should be putting on your resume – or learning about in order to put on your resume – if getting a job programming HoloLens is your goal.

The verso side of this coin is that the list can also be turned into a great set of interview questions for someone thinking of hiring for HoloLens development, for instance:

Explain the concept of NUI to me.

Tell me about your experience with surface reconstruction and object persistence.

What is 3D spatial HRTF audio and why is it important for engineering HoloLens apps?

What are mesh reduction, mesh culling and mesh optimization?

Do you know anything about baked shadows and ambient occlusion?

Describe how you would go about performing UV mapping.

What are optimized render shaders and when would you need them?

How does the HoloLens communicate with external services such as a database?

What are the advantages and disadvantages of developing in Unity vs DirectX?

Describe the Windows 10 APIs that are used in HoloLens application development.


Then again, maybe these questions are a bit too abstruse?

HoloLens App Development with Unity

A few months ago I wrote a speculative piece about how HoloLens might work with XAML frameworks based on the sample applications Microsoft had been showing.

Even though Microsoft has still released scant information about integration with 3D platforms, I believe I can provide a fairly accurate walkthrough of how HoloLens development will occur for Unity3D. In fact, assuming I am correct, you can begin developing games and applications today and be in a position to release a HoloLens experience shortly after the hardware becomes available.

To be clear, though, this is just speculative and I have no insider information about the final product that I can talk about. This is just what makes sense based on publicly available information regarding HoloLens.

Unity3D integration with third party tools such as Kinect and Oculus Rift occurs through plugins. The Kinect 2 plugin can be somewhat complex as it introduces components that are unique to the Kinect’s capabilities.

The eventual HoloLens plugin, on the other hand, will likely be relatively simple, since it will almost certainly be based on a pre-existing component called the FPSController (found in Unity 5.1, which is currently the latest version).

To prepare for HoloLens, you should start by building your experience with Unity 5.1 and the FPSController component. Here’s a quick rundown of how to do this.

Start by installing the totally free Unity 5.1 tools: http://unity3d.com/get-unity/download?ref=personal


Next, create a new project and select 3D for the project type.


Click the button for adding asset packages and select Characters. This will give you access to the FPSController. Click done and continue. The IDE will now open with a practically empty project.


At this point, a good Unity3D tutorial will typically show you how to create an environment. We’re going to take a shortcut, however, and just get a free one from the Asset Store. Hit Ctrl+9 to open the Asset Store from inside your IDE. You may need to sign in with your Unity account. Select the 3D Models | Environments menu option on the right and pick a pre-built environment to download. There are plenty of great free ones to choose from. For this walkthrough, I’m going to use the Japanese Otaku City by Zenrin Co, Ltd.


After downloading is complete, you will be presented with an import dialog box. By default, all assets are selected. Click on Import.


Now that the environment you selected has been imported, go to the scenes folder in your project window and select a sample scene from the downloaded environment. This will open up the city or dungeon or forest or whatever environment you chose. It will also make all the different assets and components associated with the scene show up in your Hierarchy window. At this point, we want to add the first-person shooter controller into the scene. You do this by selecting the FPSController from the project window under Assets/Standard Assets/Characters/FirstPersonCharacter/Prefabs and dragging it into your Hierarchy pane.


This puts a visual representation of the FPS controller into your scene. Select the controller with your mouse and hit “F” to zoom in on it. You can see from the visual representation that the FPS controller is basically a collision field that can be moved with a keyboard or gamepad, with a directional camera component and a sound component attached. The direction the camera faces ultimately becomes the view that players see when you start the game.

[Screenshot: dungeon scene in the Unity IDE]

Here is another scene that uses the Decrepit Dungeon environment package by Prodigious Creations and the FPS controller. The top pane shows a design view while the bottom pane shows the gamer’s first-person view.


You can even start walking through the scene inside the IDE by simply selecting the blue play button at the top center of the IDE.

The way I imagine the HoloLens integration to work is that another version of FPS controller will be provided that replaces mouse controller input with gyroscope/magnetometer input as the player rotates her head. Additionally, the single camera view will be replaced with a two camera rig that sends two different, side-by-side feeds back to the HoloLens device. Finally, you should be able to see how all of this works directly in the IDE like so:

[Screenshot: stereoscopic side-by-side view in the Unity IDE]

There is very good evidence that the HoloLens plugin will work something like I have outlined and will be approximately this easy. The training sessions at the Holographic Academy during /Build pretty much demonstrated this sort of toolchain. Moreover, this is how Unity3D currently integrates with virtual reality devices like Gear VR and Oculus Rift. In fact, the screen cap of the Unity IDE above is from an Oculus game I’ve been working on.
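
As a thought experiment, here is roughly what such a drop-in replacement might look like as a Unity C# behavior – entirely speculative, using only stock Unity 5 APIs, and not an official HoloLens component (the handedness conversion for the gyroscope quaternion may need adjusting for real hardware):

```csharp
using UnityEngine;

// Speculative stand-in for an imagined HoloLens FPS controller:
// head rotation driven by the gyroscope instead of the mouse, and two
// cameras offset by the interpupillary distance rendering side-by-side
// views. Disable the FPSController's own camera before using this.
public class StereoHeadRig : MonoBehaviour
{
    public float ipdMeters = 0.063f;

    void Start()
    {
        Input.gyro.enabled = true;

        // Left and right eye cameras, each drawing half of the screen.
        for (int eye = 0; eye < 2; eye++)
        {
            float side = eye == 0 ? -1f : 1f;
            var go = new GameObject(eye == 0 ? "LeftEye" : "RightEye");
            go.transform.SetParent(transform, false);
            go.transform.localPosition = new Vector3(side * ipdMeters / 2f, 0f, 0f);
            var cam = go.AddComponent<Camera>();
            cam.rect = new Rect(eye == 0 ? 0f : 0.5f, 0f, 0.5f, 1f);
        }
    }

    void Update()
    {
        // Swap mouse-look for the device's reported head attitude. The
        // axis flips convert the sensor's right-handed frame to Unity's
        // left-handed one; a further fixed rotation may be needed
        // depending on how the sensor is mounted.
        Quaternion a = Input.gyro.attitude;
        transform.localRotation = new Quaternion(a.x, a.y, -a.z, -a.w);
    }
}
```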

So what are you waiting for? You pretty much have everything you already need to start building complex HoloLens experiences. The integration itself, when it is ready, should be fairly trivial and much of the difficult programming will be taken care of for you.

I’m looking forward to seeing all the amazing experiences people are building for the HoloLens launch day. Together, we’ll change the future of personal computing!

Minecraft in Virtual Reality and Augmented Reality


Microsoft recently delivered possibly the best stage demo they have ever done, at E3. Microsoft employees played Minecraft in a way no one has ever seen it before: on a table top, as if it were a set of Legos. Many people speculated on social media that this may be the killer app that HoloLens has been looking for.

What is particularly exciting about the way the demo captured people’s imaginations is that people can now start envisioning what AR might actually be used for. They are even getting a firm grip on the differences between virtual reality, which creates an immersive experience, and augmented reality, which creates a mixed experience overlapping digital objects with real-world objects.

Nevertheless, there is still a tendency to see virtual reality exemplified by the Oculus Rift and augmented reality exemplified by HoloLens and Magic Leap as competing solutions. In fact they are complementary solutions. They don’t compete with one another any more than your mouse and your keyboard do.

Bill Buxton has famously said that everything is best for something and worst for something else. By contrasting the Minecraft experience for Oculus and HoloLens, we can better see what each technology is best at.



The Virtual Reality experience for Oculus is made possible by a free hacking effort called Minecrift. It highlights the core UX flavor of almost all VR experiences – they are first person, with the player fully present in a 3D virtual world. VR is great for playing Minecraft in adventure or survival mode.


Adventure mode with HoloLens is roughly equivalent to the adventure mode we get today on a PC or Xbox console, with the added benefit that the display can be projected on any wall. It isn’t actually 3D, though, as far as we can tell from the demo, despite the HoloLens’s capability of displaying stereoscopic scenes.

What does work well, however, is Minecraft in creative mode. This is basically the god view we have become familiar with from various strategy and resource games over the years.


God View vs First Person View

In a fairly straightforward way, it makes sense to say that AR is best for a god-centric view while VR is best for a first-person view. For instance, if we wanted to create a simulation that allows users to fly a drone or manipulate an undersea robot, virtual reality seems like the best tool for the job. When we need to create a synoptic view of a building or even a city, on the other hand, then augmented reality may be the best UX. Would it be fair to say that all new UX experiences fall into one of these two categories?


Most of our metaphors for building software, building businesses and building every other kind of buildable thing, after all, are based on the Lego building block and its precursors, the Lincoln Log and Erector sets. We play games as children in order, in part, to prepare ourselves for thinking as adults. Minecraft was similarly built on the idea of creating a simulation of a Lego block world that we could not only build but also virtually play in on the computer.


The playful world of Lego blocks is built on two things: the blocks themselves, formed into buildings and scenes, and the characters we identify with who live inside the world of blocks. In other words, the god-view and the first-person view.


It should come as no surprise, then, that these two core modes of our imaginative lives should stay with us through our childhoods and into our adult approaches to the world. We have both an interpersonal side and an abstract, calculating side. The best leaders have a bit of both.


The god-view in business tends to be the synoptic view demanded by corporate executives and provided in the form of dashboards or Crystal Reports. It would be a shame if AR ended up falling into that use-case when it can provide so much more, and in more interesting ways. As both VR and AR mature over the next five years, we all have a responsibility to keep them anchored in the games of our childhood and avoid letting them take on the faults and misdemeanors of the corporate adult world.

Update 6/20

A recent Ars Technica article indicates that the wall-projected HoloLens version of Minecraft in adventure mode can be played in true 3D:

One other impressive feature of the HoloLens-powered virtual screen was the ability to activate a three-dimensional image, so that the scene seemed to recede into the wall like a window box. Unlike a standard 3D monitor, this 3D image actually changed perspective based on the viewing angle. If I went up near the wall and looked at the screen from the left, I could see parts of the world that would usually be behind the right side of the wall, as if the screen was simply a window into another world.

Why are the best Augmented Reality Experiences inside of Virtual Reality Experiences?

elite_cockpit

I’ve been playing the Kickstarted space simulation game Elite: Dangerous for the past several weeks with the Oculus Rift DK2. Totally work related, of course.

Basically I’ve had the DK2 since Christmas and had been looking for a really good game to go with my device (rather than the other way around). After shelling out $350 for the goggles, $60 more for a game didn’t seem like such a big deal.

In fact, playing Elite: Dangerous with the Oculus and an Xbox One gamepad has been one of the best gaming experiences I have ever had in my life – and I’m someone who played E.T. on the Atari 2600 when it first came out so I know what I’m talking about, yo. It is a fully realized Virtual Reality environment which allows me to fly through a full simulation of the galaxy based on current astronomical data. When I am in the simulation, I objectively know that I am playing a game. However, all of my peripheral awareness and background reactions seem to treat the simulation as if it is real. My sense of space changes and my awareness expands into the virtual space of the simulation. If I don’t mistake the VR experience for reality, I nevertheless do experience a strong suspension of disbelief when I am inside of it.

elite_cockpit2

One of the things I’ve found fascinating about this Virtual Reality simulation is that it is full of Augmented Reality objects. For instance, the two menu bars at the top of the screencap above, to the top left and the top right, are full holograms. When I move my head around, parallax effects demonstrate that their positions are anchored to the cockpit rather than to my personal perspective. If the VR goggles allowed it, I could even lean forward and look at the backs of those menus. Interestingly, when the game is played in normal 3D first-person mode rather than in VR with the Oculus, those menus are rendered as head-up displays and are anchored to my point of view as I use the mouse to look around the cockpit – in much the same way that Google Glass anchored menus to the viewer instead of the viewed.

The navigation objects on the dashboard in front of me are also AR holograms. Their locations are anchored to the cockpit rather than to me, and when I move around I can see them at different angles. At the same time, they exhibit a combination of glow and transparency that isn’t common to real-world objects and that we have come to recognize, from sci-fi movies, as the inherent characteristics of holograms.
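
To make the distinction concrete, here is a minimal sketch of the two anchoring strategies – not Elite: Dangerous’s actual code, just the underlying transform math in Python/NumPy with made-up positions. A world-anchored hologram is stored in world coordinates and re-projected into view space every frame (which is what produces the parallax), while a head-locked HUD element is simply a constant offset in view space:

```python
import numpy as np

def yaw_matrix(yaw):
    """Rotation about the vertical (y) axis, angle in radians."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

# A menu anchored to the cockpit: its pose is fixed in world space, so
# head movement produces parallax and lets you view it from other angles.
MENU_WORLD_POS = np.array([0.4, 1.6, 2.0])   # hypothetical position, meters

def world_anchored_menu(head_pos, head_yaw):
    # Re-express the fixed world position in head-relative (view) coordinates.
    return yaw_matrix(-head_yaw) @ (MENU_WORLD_POS - head_pos)

# A head-up display: the element rides along with the viewer, so it sits at
# the same view-space offset no matter where the head points.
HUD_OFFSET = np.array([0.4, 0.1, 1.0])       # hypothetical view-space offset

def head_locked_menu():
    return HUD_OFFSET   # constant in view coordinates by construction

# Turning the head 30 degrees slides the world-anchored menu across the
# view (parallax); the head-locked element does not budge.
print(world_anchored_menu(np.zeros(3), 0.0))
print(world_anchored_menu(np.zeros(3), np.radians(30)))
print(head_locked_menu())
```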

I realized at about the 60-hour mark into my gameplay / research that one of the current opportunities, as well as problems, with AR devices like the Magic Leap and HoloLens is that not many people know how to develop UX for them. This was actually one of the points of a panel discussion concerning HoloLens at the recent BUILD conference. The field is wide open. At the same time, UX research is clearly already being done inside VR experiences like Elite: Dangerous. The hologram-based control panel at the front of the cockpit is a working example of how to design navigation tools using augmented reality.

elite_cockpit3

One of the HoloLens’s most remarkable features is the use of gaze as an input vector for human-computer interaction. Elite: Dangerous, however, has already implemented it. When the player looks at certain areas of the cockpit, complex menus like the one shown in the screencap above pop into existence. When one removes one’s direct gaze, the menu vanishes. If this were a usability test for gaze-based UI, Elite: Dangerous would already have collected hours of excellent data from thousands of players to verify whether this is an effective new interaction (in my experience, it totally is, btw). This is also exactly the sort of testing that we know will need to be done over the next few years in order to firm up and conventionalize AR interactions. By happenstance, VR designers are already doing this for AR before AR is even really on the market.
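
The interaction itself is easy to sketch. The following is only a guess at the pattern the game uses – a dwell timer plus an angular cone, with both thresholds invented for illustration:

```python
import numpy as np

DWELL_SECONDS = 0.5    # hypothetical: how long a gaze must rest to open the menu
CONE_DEGREES = 10.0    # hypothetical: angular tolerance around the target

def gazing_at(gaze_dir, target_dir, cone_degrees=CONE_DEGREES):
    """True if the gaze direction falls within a small cone around the target."""
    cos_angle = np.dot(gaze_dir, target_dir) / (
        np.linalg.norm(gaze_dir) * np.linalg.norm(target_dir))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))) < cone_degrees

class GazeMenu:
    """Pops open after a sustained dwell and vanishes the moment gaze leaves."""
    def __init__(self, anchor_dir):
        self.anchor_dir = anchor_dir
        self.dwell = 0.0
        self.open = False

    def update(self, gaze_dir, dt):
        if gazing_at(gaze_dir, self.anchor_dir):
            self.dwell += dt
            if self.dwell >= DWELL_SECONDS:
                self.open = True
        else:
            self.dwell = 0.0
            self.open = False    # remove direct gaze and the menu disappears

# Forty simulated frames at 60 fps of steady gaze: enough dwell to open it.
menu = GazeMenu(anchor_dir=np.array([0.0, -0.2, -1.0]))
for _ in range(40):
    menu.update(np.array([0.0, -0.2, -1.0]), dt=1.0 / 60.0)
print(menu.open)   # True
```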

sao1

The other place augmented reality interaction design research is being carried out is in Japanese anime. The image above is from a series called Sword Art Online. When I think of VR movies, I think of The Matrix. When I put my children into my Oculus, however, they immediately connected it to SAO. SAO is about a group of beta testers who become trapped inside a new virtual-reality MMORPG through the evil machinations of one of the game developers. While the setting of the VR world is medieval, players still interact with in-game AR control panels.

sao2

Consider why this makes sense when we ask the hologram-versus-head-up-display question. If the menu is anchored to our POV, it becomes difficult to actually touch menu items: they will move around and jitter as the player looks around. In this case, a hologram anchored to the world rather than to the player makes a lot more sense. The player can register the consistent position of the menu and anticipate where she needs to place her fingers in order to interact with it. Sword Art Online effectively provides what Bill Buxton describes as a UX sketch for interactions of this sort.

On an intellectual level, consider how many overlapping interaction metaphors are involved in the above sketch. We have a 1) GUI-based menu system transposed to 2) touch (no right clicking) interactions, then expressed as 3) an augmented reality experience placed inside of 4) a virtual reality experience (and communicated inside a cartoon).

Why is all of this possible? Why are the best augmented reality experiences inside of virtual reality experiences and cartoons? I think it has to do with cost of execution. Illustrating an augmented reality experience in an anime is not really any more difficult than illustrating a field of grass or a cute yellow gerbil-like character. The labor costs are the same. The difficulty is only in the conceptualization.

Similarly, throwing a hologram into a virtual reality experience is not going to be any more difficult than throwing a tree or a statue into the VR world. You just add some shaders to create the right transparency-glowy-pulsing effect and you have a hologram. No additional work has to be done to marry the stereoscopic convergence of hologram objects to the focal position of real-world locations, as is required for really good AR. In the VR world, these two things – the hologram world and the VR world – are collapsed into one thing.
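
For what it’s worth, the shader work really is modest. Stripped of rendering-engine details, the stock sci-fi hologram look reduces to a time-varying alpha and emission term, something like the following sketch (all constants invented for illustration):

```python
import math

def hologram_tint(t, base_color=(0.35, 0.85, 1.0)):
    """Per-frame RGBA tint for the stock sci-fi hologram look: a translucent
    base color whose alpha and glow throb slowly over time."""
    pulse = 0.5 + 0.5 * math.sin(2.0 * math.pi * 0.8 * t)   # 0.8 Hz throb
    alpha = 0.35 + 0.25 * pulse      # never fully opaque, never invisible
    emission = 1.0 + 0.5 * pulse     # brightens past 1.0 for a bloom-style glow
    r, g, b = (c * emission for c in base_color)
    return (r, g, b, alpha)

# Sample the tint across one second of animation.
for frame in range(0, 60, 15):
    print(hologram_tint(frame / 60.0))
```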

There has been a tendency to see virtual reality and mixed reality as opposed technologies. What I have learned from playing with both, however, is that they are actually complementary. While we wait for AR devices to be released by Microsoft, Magic Leap, etc., it makes sense to jump into VR as a way to start understanding how humans will interact with digital objects and how we must design for these interactions. Additionally, because of the simplification involved in creating AR for VR rather than AR for reality, it is likely that VR will continue to hold a place in the design workflow for prototyping AR experiences even years from now, when AR becomes not only a reality but an integral thread in the fabric of reality.

On The Gaze as an input device for HoloLens

microsoft-hololens-build-anatomy

The Kinect sensor and other NUI devices have introduced an array of newish interaction patterns between humans and computers: tap, touch, speech, finger tracking, body gestures. The HoloLens provides us with a new method of interaction that hasn’t been covered extensively from a UX perspective before: The Gaze. The HoloLens tracks the position and orientation of the user’s head to derive a gaze vector, which the wearer aims at objects in order to target and activate menus.
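
In practice that gaze vector is simply a ray cast from the head pose and tested against targets in the scene. A minimal sketch – the conventions and hit test here are illustrative, not the actual HoloLens API:

```python
import numpy as np

def head_gaze_ray(head_position, head_rotation):
    """Derive the gaze ray from the head pose: origin at the head, direction
    along the head's forward axis (-z here, a common graphics convention)."""
    forward = head_rotation @ np.array([0.0, 0.0, -1.0])
    return head_position, forward / np.linalg.norm(forward)

def gaze_hits_sphere(origin, direction, center, radius):
    """Cheap hit test: does the gaze ray pass within `radius` of `center`?"""
    to_center = center - origin
    along = np.dot(to_center, direction)     # projection onto the ray
    if along < 0.0:
        return False                         # target is behind the user
    closest = to_center - along * direction  # perpendicular offset from the ray
    return np.linalg.norm(closest) <= radius

# With an identity head rotation the user gazes straight down -z,
# hitting a menu target two meters ahead.
origin, direction = head_gaze_ray(np.zeros(3), np.eye(3))
print(gaze_hits_sphere(origin, direction, np.array([0.0, 0.0, -2.0]), 0.3))  # True
```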

Questions immediately arise as to the role this will play in surveillance culture, and even more in the surveillance of surveillance culture. While sensors track our gaze, will they similarly inform us about the gaze of others? Will we one day receive notifications that someone is checking us out? Quis custodiet ipsos custodes? To the eternal question who watches the watchers, we finally have an answer. HoloLens does.

lacan gaze

Even though The Gaze has not been analyzed deeply from a UX perspective, it has been the object of profound study from a phenomenological and a structuralist point of view. In this post I want to introduce you to five philosophical treatments of The Gaze covering the psychological, the social, the cinematic, the ethical and the romantic. To start, the diagram above is not from an HCI book as one might reasonably assume but rather from a monograph by the psychoanalyst Jacques Lacan.

the_ambassadors

A distinction is often drawn between Lacan’s early studies of The Gaze and his later conclusions about it. The early work relates it to a “mirror stage” of self-awareness and concerns the gaze when directed to ourselves rather than to others:

“This event can take place … from the age of six months on; its repetition has often given me pause to reflect upon the striking spectacle of a nursling in front of a mirror who has not yet mastered walking or even standing, but who … overcomes, in a flutter of jubilant activity, the constraints of his prop in order to adopt a slightly leaning-forward position and take in an instantaneous view of the image in order to fix it in his mind.”

This notion flowered in the later work The Split Between the Eye and the Gaze into a theory of narcissism in which the subject sees himself/herself as an objet petit a (a technical term for an object of desire) through the distancing effect of the gaze. Through this distancing, the subject also becomes alienated from itself. What is probably essential for us in this work – as students of emerging technologies – is the notion that the human gaze is emotionally distancing. This observation was later taken up in post-colonial theory as the “Imperial Gaze” and in feminist theory as “objectification.”

eye-tracking

Michel Foucault achieved fame as a champion of the constructivist interpretation of truth, but it is often forgotten that he was also an historian of science. A major theme in his work is the way in which the gaze of the other affects and shapes us – in particular the “scientific gaze.” Being watched causes us discomfort, and we change our behavior – sometimes even our perception of who we are – in response to it. The grand image Foucault raises to encapsulate this notion is Jeremy Bentham’s Panopticon, a circular prison designed so that every inmate can be observed at any moment from a central tower without ever knowing whether he is actually being watched.

THE HOUSEHOLD OF PHILIP IV, or LAS MENINAS, by Juan Bautista Martínez del Mazo (c. 1612–1667) after Diego Velázquez (1599–1660), at Kingston Lacy, Dorset

Where Lacan concentrates on the self-gaze and Foucault on the way the gaze makes us feel, Slavoj Zizek is concerned with the appearance of The Gaze when we gaze back at it. He writes in an essay called “Why Does the Phallus Appear?” from the collection Enjoy Your Symptom!:

Shane Walsh (Jon Bernthal) in The Walking Dead, Season 2, Episode 12 (photo: Gene Page/AMC)

“Let us take the ‘phantom of the opera,’ undoubtedly mass culture’s most renowned specter, which has kept the popular imagination occupied from Gaston Leroux’s novel at the turn of the century through a series of movie and television versions up to the recent triumphant musical: in what consists, on a closer look, the repulsive horror of his face? The features which define it are four:

“1) the eyes: ‘his eyes are so deep that you can hardly see the fixed pupils. All you can see is two big black holes, as in a dead man’s skull.’ To a connoisseur of Alfred Hitchcock, this image instantly recalls The Birds, namely the corpse with the pecked-out eyes upon which Mitch’s mother (Jessica Tandy) stumbles in a lonely farmhouse, its sight causing her to emit a silent scream. When, occasionally, we do catch the sparkle of these eyes, they seem like two candles lit deep within the head, perceivable only in the dark: these two lights somehow at odds with the head’s surface, like lanterns burning at night in a lonely, abandoned house, are responsible for the uncanny effect of the ‘living dead.’”

Obviously whatever Zizek says about the phantom of the opera applies equally well to The Walking Dead. What ultimately distinguishes vampires, zombies, demons and ghosts lies in the way they gaze at us.

While Zizek finds in the eyes a locus for inhumanity, the ethicist Emmanuel Levinas believes this is where our humanity resides. These two notions actually complement each other, since what Zizek identifies as disturbing is the inability to project a human mind behind a vacant stare. As Levinas says, in a difficult and metaphysical way, in his masterpiece Totality and Infinity:

“The presentation of the face, expression, does not disclose an inward world previously closed, adding thus a new region to comprehend or to take over. On the contrary, it calls to me above and beyond the given that speech already puts in common among us…. The third party looks at me in the eyes of the Other – language is justice. It is not that there first would be the face, and then the being it manifests or expresses would concern himself with justice; the epiphany of the face qua face opens humanity…. Like a shunt every social relation leads back to the presentation of the other to the same without the intermediary of any image or sign, solely by the expression of the face.”

The face and the gaze of the other imply a demand upon us. For Levinas, unlike Foucault, this demand isn’t simply a demand to behave according to norms; more broadly, it posits an existential command. The face of the other asks us implicitly to do the right thing: it demands justice.

new_optics_06

The final aspect of the gaze to be discussed – though probably the first aspect to occur to the reader – is the gaze of love, i.e. love at first sight. This was a particular interest of the scholar Ioan P. Couliano. In his book Eros and Magic in the Renaissance, Couliano examines old medical theories about falling in love and cures for infatuation and obsession. He relates this to Renaissance theories about pneuma [spiritus, phantasma], which was believed to be a pervasive fluid that allowed objects to be sensed through apparently empty air and their images to be transmitted to the brain and the heart. In this regard, Couliano raises a question that would only make sense to a true Renaissance man: “How does a woman, who is so big, penetrate the eyes, which are so small?” He quotes the thirteenth-century physician Bernard of Gordon:

Leopold_von_Sacher-Masoch_with_Fannie

“The illness called ‘hereos’ is melancholy anguish caused by love for a woman. The ‘cause’ of this affliction lies in the corruption of the faculty to evaluate, due to a figure and a face that have made a very strong impression. When a man is in love with a woman, he thinks exaggeratedly of her figure, her face, her behavior, believing her to be the most beautiful, the most worthy of respect, the most extraordinary with the best build in body and soul, that there can be. This is why he desires her passionately, forgetting all sense of proportion and common sense, and thinks that, if he could satisfy his desire, he would be happy. To so great an extent is his judgment distorted that he constantly thinks of the woman’s figure and abandons all his activities so that, if someone speaks to him, he hardly hears him.”

And here is Couliano’s gloss of Bernard’s text:

RokebyVenus

“If we closely examine Bernard of Gordon’s long description of ‘amor hereos,’ we observe that it deals with a phantasmic infection finding expression in the subject’s melancholic wasting away, except for the eyes. Why are the eyes excepted? Because the very image of the woman has entered the spirit through the eyes and, through the optic nerve, has been transmitted to the sensory spirit that forms common sense…. If the eyes do not partake of the organism’s general decay, it is because the spirit uses those corporeal apertures to try to reestablish contact with the object that was converted into the obsessing phantasm: the woman.”

 

[As an apology and a warning, I want to draw your attention to the use of ocular vocabulary such as “perspective,” “point of view,” “in this regard,” etc. Ocular phrases are so pervasive in the English language that it is nearly impossible to avoid them and it would be more awkward to try to do so than it is to use them without comment. If you intend to speak about visual imagery, take my advice and pun proudly and without apology – for you will see that you have no real choice in the matter.]