Tag Archives: HoloLens

Microsoft’s convergence of chatbots and mixed reality

One of the biggest trends in mixed reality this year is the arrival of chatbots on platforms like HoloLens. Speech commands are a common input for many XR devices. Adding conversational AI to extend these native speech recognition capabilities is a natural next step toward a future in which personalized virtual assistants backed by powerful AI accompany us in hologram form. They may be relegated to providing us with shopping suggestions, but perhaps, instead, they’ll become powerful custom tools that help make us sharper, give honest feedback, and assist in achieving our personal goals.

If you have followed the development of sci-fi artificial intelligence in television and movies over the years, the move from voice to full holograms will seem natural. In early sci-fi, such as HAL from the movie 2001: A Space Odyssey or the computer from the original Star Trek, computer intelligence was generally represented as a disembodied voice. In more recent incarnations of the virtual assistant, such as Star Trek: Voyager and Blade Runner 2049, these voices are finally personified as full holograms: the Emergency Medical Hologram and Joi.

In a similar way, Cortana, Alexa, and Siri are slowly moving from our smartphones, Echos, and Invoke devices to our holographic headsets. These are still early days, but the technology is already in place and the future incarnation of our virtual assistants is relatively clear.

The rise of the chatbot

For Microsoft’s personal digital assistant Cortana, who started her life as a hologram in the Halo video games for Xbox, the move to holographic headsets is a bit of a homecoming. It seems natural, then, that when Microsoft HoloLens was first released in 2016, Cortana was already built into the onboard holographic operating system.

Then, in a 2017 article on the Windows Apps Team blog, Building the Terminator Vision HUD in HoloLens, Microsoft showed people how to integrate Azure Cognitive Services into their holographic head-mounted display in order to provide smart object recognition and even translation services as a Terminator-like HUD overlay.

The only thing left to do to get to a smart virtual assistant was to tie together the HoloLens’s built-in Cortana speech capabilities with some AI to create an interactive experience. Not surprisingly, Microsoft was able to fill this gap with the Bot Framework.
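
On the HoloLens side, the speech piece is already exposed to Unity developers. Here is a minimal sketch of capturing spoken phrases with Unity’s KeywordRecognizer so they can be handed off to a bot; the keyword list is illustrative, and the BotClient helper is hypothetical, standing in for whatever transport you use to reach your bot:

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

public class AssistantListener : MonoBehaviour
{
    // Phrases the recognizer should listen for (illustrative examples).
    private readonly string[] keywords = { "hello assistant", "what am I looking at" };
    private KeywordRecognizer recognizer;

    void Start()
    {
        recognizer = new KeywordRecognizer(keywords);
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Heard: " + args.text);
        // Hand the utterance off to conversational AI, e.g.:
        // BotClient.SendToBotAsync(args.text);  // hypothetical helper
    }

    void OnDestroy()
    {
        if (recognizer != null && recognizer.IsRunning) recognizer.Stop();
        recognizer?.Dispose();
    }
}
```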

Virtual assistants and Microsoft Bot Framework

Microsoft Bot Framework combines AI backed by Azure Cognitive Services with natural-language capabilities. It includes a set of open source SDKs and tools that enable developers to build, test, and connect bots that interact naturally with users. With the Microsoft Bot Framework, you can easily create a bot that can speak, listen, understand, and even learn from your users over time. This chatbot technology is sometimes referred to as conversational AI.

There are several chatbot tools available. I am most familiar with the Bot Framework, so I will be talking about that. Right now, chatbots built with the Bot Framework can be adapted for speech interactions or for text interactions like UPS’s virtual assistant. They are relatively easy to build and customize using prepared templates and web-based dialogs.

One of my favorite ways to build a chatbot is by using QnA Maker, which lets you simply point to an online FAQ page or upload product documentation to use as the knowledge base for your bot service. QnA Maker then walks you through applying a chatbot personality to your knowledge base and deploying it, usually with no custom coding. What I love about this is that you can get a sophisticated chatbot rolled out in about half a day.
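
Once deployed, the knowledge base is just an HTTP endpoint. Here is a rough sketch of querying it; the endpoint host, knowledge base ID, and key below are placeholders you would copy from the Azure portal after publishing:

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class QnaDemo
{
    // Placeholder values -- substitute the ones shown when you publish your KB.
    const string Endpoint = "https://your-qna-service.azurewebsites.net";
    const string KbId = "your-knowledge-base-id";
    const string EndpointKey = "your-endpoint-key";

    static async Task Main()
    {
        using var client = new HttpClient();
        var uri = $"{Endpoint}/qnamaker/knowledgebases/{KbId}/generateAnswer";
        using var request = new HttpRequestMessage(HttpMethod.Post, uri);
        request.Headers.Add("Authorization", $"EndpointKey {EndpointKey}");
        request.Content = new StringContent(
            "{\"question\":\"How do I reset my password?\"}",
            Encoding.UTF8, "application/json");

        // The response JSON contains ranked answers drawn from your FAQ content.
        var response = await client.SendAsync(request);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```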

Using the Microsoft Bot Framework, you also have the ability to take full control of the creation process to customize your bot in code. Bot apps can be created in C#, JavaScript, Python or Java. You can extend the capabilities of the Bot Framework with middleware that you either create yourself or bring into your code from third parties. There are even advanced capabilities available for managing complex conversation flows with branches and loops.
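
As a concrete sketch of what “customize your bot in code” looks like in C#, here is a minimal bot built on the v4 SDK’s ActivityHandler base class; the class name and canned replies are my own illustration:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Bot.Builder;
using Microsoft.Bot.Schema;

// A minimal Bot Framework v4 bot: greets new users and echoes messages.
public class HoloAssistantBot : ActivityHandler
{
    protected override async Task OnMessageActivityAsync(
        ITurnContext<IMessageActivity> turnContext, CancellationToken cancellationToken)
    {
        var reply = $"You said: {turnContext.Activity.Text}";
        // The second argument is the text a speech-enabled channel should read aloud.
        await turnContext.SendActivityAsync(MessageFactory.Text(reply, reply), cancellationToken);
    }

    protected override async Task OnMembersAddedAsync(
        IList<ChannelAccount> membersAdded,
        ITurnContext<IConversationUpdateActivity> turnContext,
        CancellationToken cancellationToken)
    {
        await turnContext.SendActivityAsync(
            MessageFactory.Text("Hello! Ask me anything."), cancellationToken);
    }
}
```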

Ethical chatbots

Having introduced the idea above of building a Terminator HUD using Cognitive Services, it’s important to also raise awareness about fostering an environment of ethical AI and ethical thinking around AI. To borrow from the book The Future Computed, AI systems should be fair, reliable and safe, private and secure, inclusive, transparent, and accountable. As we build all forms of chatbots and virtual assistants, we should always consider what we intend our intelligent systems to do, as well as concern ourselves with what they might do unintentionally.

The ultimate convergence of AI and mixed reality

Today, chatbots are geared toward integrating skills for commerce like finding directions, locating restaurants, and providing help with a company’s products through virtual assistants. One of the chief research goals driving better chatbots is to personalize the chatbot experience. Achieving a high level of personalization will require extending current chatbots with more AI capabilities. Fortunately, this isn’t a far-future thing. As shown in the Terminator HUD tutorial above, adding Cognitive Services to your chatbots and devices is easy to do.

Because holographic headsets have many external sensors, AI will also be useful for analyzing all this visual and location data and turning it into useful information through the chatbot and Cognitive Services. For instance, cameras can be used to help translate street signs if you are in a foreign city or to identify products when you are shopping and provide helpful reviews.
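
As a rough sketch of the kind of call involved (not the exact code from the Terminator HUD tutorial), here is how a captured camera frame might be pushed through the Computer Vision OCR endpoint as the first step of a sign-translation pipeline; the endpoint and key are placeholders for your own Azure resource:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class SignReader
{
    // Placeholder endpoint and key for a Computer Vision resource.
    const string VisionEndpoint = "https://your-resource.cognitiveservices.azure.com";
    const string VisionKey = "your-computer-vision-key";

    static async Task Main()
    {
        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", VisionKey);

        // A saved camera frame standing in for the headset's photo/video camera.
        var imageBytes = File.ReadAllBytes("street-sign.jpg");
        using var content = new ByteArrayContent(imageBytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        // OCR with auto-detected language; the JSON result lists regions, lines, and
        // words, which could then be sent on to the Translator service.
        var response = await client.PostAsync(
            $"{VisionEndpoint}/vision/v3.2/ocr?language=unk&detectOrientation=true", content);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```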

Finally, AI will be needed to create realistic 3D model representations of your chatbot and overcome the uncanny valley that is currently holding back VR, AR, and MR. When all three elements are in place to augment your chatbot — personalization, computer vision, and humanized 3D modeling — we’ll be that much closer to what we’ve always hoped for — personalized AI that looks out for us as individuals.

Here is some additional reading on the convergence of chatbots and MR you will find helpful:

The Fork in Mixed Reality

Yogi Berra gnomically said, “When you come to a fork in the road, take it.” On the evening of Friday, February 1st, 2019, at approximately 9 PM EST, that’s exactly what happened to Mixed Reality.

The Mixed Reality Toolkit open source project, which grew out of the earlier HoloLens Toolkit on GitHub, was forked into the Microsoft MRTK and a cross-platform XRTK (read the announcement). While the MRTK will continue to target primarily Microsoft headsets like the HoloLens and WMR, XRTK will feature a common framework for HoloLens, Magic Leap, VR headsets, and mobile AR – as well as HoloLens 2 and any other MR devices that eventually come on the market.

So why did this happen? The short of it is that open source projects can sometimes serve multiple divergent interests and sometimes they cannot. Microsoft was visionary in engineering and releasing the original HoloLens MR headset. They took an equally profound and positive step back in 2016 by choosing to open source the developer SDK/framework/toolkit (your choice) that allows developers to build Unity apps for the HoloLens. This was the original HoloLens Toolkit (HLTK).

While the HLTK started as a primarily Microsoft engineering effort, members of the community quickly jumped in and began contributing more and more code to the point that the Microsoft contributions became a minority of overall contributions. This, it should be noted, goes against the common trend of a single company paying their own engineers to keep an open source project going. The HLTK was an open source success story.

In this regard, it is worth calling out two developers in particular, Stephen Hodgson and Simon Jackson, for the massive amounts of code and thought leadership they have contributed to the MR community. Unsung heroes barely captures what they have done.

In 2017, Microsoft started helping several hardware vendors build occluded WinMR (virtually the same as VR) devices, and it made sense to create something that supported more than just the HoloLens. This is how the MRTK came to be. It served the same purpose as the HLTK – accelerating development with Unity scripts and components – but with a larger perspective about who should be served.

In turn, this gave birth to something that is generally known as MRTK vNext, an ambitious project to support not just Microsoft devices but also platforms from other vendors. And what’s even more amazing, this was again driven by the community rather than by Microsoft itself. Microsoft was truly embracing the open source mindset and not just paying lip service to it as many naysayers were claiming.

But when Magic Leap, the other major MR headset vendor, finally released its product in fall 2018, things began to change. Unlike Microsoft, Magic Leap developed its SDK in-house and threw massive resources at it. Meanwhile, Microsoft finally started throwing its engineers at the MRTK again after a long hiatus. This may have been in response to the Magic Leap release, or equally could have been because the team was setting the stage for a HoloLens 2 announcement in early 2019.

And this was the genesis of the MR fork in the road: For Microsoft, it did not make sense to devote engineering dollars toward creating a platform that supported their competitors’ devices. In turn, it probably didn’t make sense for engineers from Google, Magic Leap, Apple, Amazon, Facebook, etc. to devote their time toward a project that was widely seen as a vehicle for Microsoft HMDs.

And so a philosophical split needed to occur. It was necessary to fork MRTK vNext. The new XRTK (which is also pronounced “Mixed Reality Toolkit”) is a cross-platform framework for HoloLens as well as Magic Leap (Lumin SDK support is in fact already working in XRTK and is getting even more love over the weekend even as I write).

But XRTK will also be a platform that supports developing for Oculus Rift, Oculus Go, HTC Vive, Apple ARKit, Google ARCore, the new HoloLens 2 which may or may not be announced at MWC 2019, and whatever comes next in the Mixed Reality Continuum.

So does this mean it is time to stick a fork in the Microsoft MRTK? Absolutely not. Microsoft’s MRTK will continue to do what people have long expected of it, supporting both HoloLens and Occluded WinMR devices (that is such a wicked mouthful — I hope someone will eventually give it a decent name like “Windows Surface Kinect for Azure Core DotNet Silverlight Services” or something similarly delightful).

In the meantime, while Microsoft is paying its engineers to work on the MRTK, XRTK needs fresh developers to help contribute. If you work for a player in the MR/VR/AR/XR space, please consider contributing to the project.

Or to word it in even stronger terms: if you give half a fork about the future of mixed reality, go check out XRTK and start making a difference today.

10 Questions with Jasper Brekelmans

This is the first in a series of interviews intended to help people get to know the movers & shakers as well as the drones & technicians (sometimes the same person is all four) who are making mixed reality … um … a reality.  I’ve borrowed the format from Vox but added some new questions.

Though not widely known outside of certain circles, when you ask experienced HoloLens developers who they most admire, Jasper’s name usually comes up. Jasper is the creator of the Brekel Toolset, an affordable tool for doing motion capture with the Kinect sensor. He also works with HoloLens, Oculus, and the Vive, and his innovative projects have been featured on RoadToVR and other venues. His work on collaboration between multiple Vive headsets was mind-blowing—but then again, so was his HoloLens motion capture demo with a live dancer, his HoloLens integration with Autodesk MotionBuilder, and his recent release of the OpenVR Recorder.

Without further ado, here are Jasper’s answers to 10 questions:

What movie has left the most lasting impression on you?
“Spring, Summer, Fall, Winter… and Spring”, “A Clockwork Orange”, “The Evil Dead”, “The Wrestler”, “The Straight Story”, “Hidden Figures”… too many to choose 🙂

What is the earliest video game you remember playing?
Pac-Man (arcade) and Donkey Kong (handheld).

Who is the person who has most influenced the way you think?
A work mentor and some close personal friends.

When was the last time you changed your mind about something?
Probably on a weekly basis on something or other.

What’s a programming skill people assume you have but that you are terrible at?
Heavily math-based algorithms and/or coding for mobile platforms.

What inspires you to learn?
The goal of having new possibilities with freshly learned skills.

What do you need to believe in order to get through the day?
That what I do matters to others.

What’s a view that you hold but can’t defend?
That humanity will be better off once next generations have grown up with true AR glasses/lenses technology, have played with virtual galaxies and value virtual objects similarly to physical objects for certain purposes.

What will the future killer Mixed Reality app do?
Empower users in their daily lives without them realizing it, while at the same time instantly letting new users realize what they’re missing.

What book have you recommended the most?
Ready Player One.

Pokémon Go as An Illustration of AR Belief Circles

Recent rumors circulating around Pokémon Go suggest that Niantic will delay its next major update until next year. It was previously believed that the update would include additional game elements, creatures, and levels beyond level 40 sometime in December.

A large gap between releases like this would seem to leave the door open for copycat games to move into the opening Niantic is providing. And maybe this wouldn’t be such a bad thing. While World of Warcraft is the most successful MMORPG, for instance, it certainly wasn’t the first. Dark Age of Camelot, EverQuest, Asheron’s Call and Ultima Online all preceded it. What WoW did was perhaps to collect the best features of all these games while also riding the right graphics card cycle to success.

A similar student-becomes-the-master trope could play out for other franchise owners, since the only things that seem to be required to get a Pokémon-like game going are a pre-existing storyline (like WoW had) and 3D assets either available or easily created to go into the game. With Azure and AWS cloud computing readily available, even infrastructure isn’t the challenge it was when the early MMORPGs were starting. Possible franchise holders that could make the leap into geographically aware augmented reality games include Disney, WoW itself, Yu-Gi-Oh!, Magic: The Gathering, and Star Wars.

Imagine going to the park one day and asking someone else staring face-down at their phone whether they know where the Bulbasaur showing up on the nearby tracker is, only to have them not know what you’re talking about because they’re looking for Captain Hook or a Jawa on theirs.

This sort of experience is exemplary of what Vernor Vinge calls belief circles in his book about augmented reality, Rainbows End. Belief circles describe groups of people who share a collaborative AR experience. Because they also share a common real-life world with others, their belief circles may conflict with other people’s belief circles. What’s even more peculiar is that members of different belief circles do not have access to each other’s augmented worlds – a peculiar twist on the problem of other minds. So while a person in H.P. Lovecraft’s belief circle can encounter someone in Terry Pratchett’s Discworld belief circle at a Starbucks, it isn’t at all clear how they will ultimately interact with one another. Starbucks itself may provide virtual assets that can be incorporated into either belief circle in order to attract customers from different worlds and backgrounds – basically multi-tier marketing of the future. Will different things be emphasized in the store based on our self-selected belief circles? Will our drinks have different names and ingredients? How will trademark and copyright laws impact the ability to incorporate franchises into the multi-aspect branding of coffee houses, restaurants and other mall stores?

But most of all, how will people talk to each other? One of the great pleasures of playing Pokémon Go today is encountering and chatting with people I otherwise wouldn’t meet, and having a common set of interests that trumps our political and social differences. Belief circles in the AR future of five to ten years may simply encourage the opposite trend of community Balkanization into interest zones. Will high-concept belief circles based on art, literature and genre fiction simply devolve into Democrat and Republican belief circles at some point?

HoloLens Occlusion vs Field of View

[Note: this post is entirely my own opinion and purely conjectural.]

Best current guesses are that the HoloLens field of view is somewhere between 32 degrees and 40 degrees diagonal. Is this a problem?

We’d definitely all like to be able to work with a larger field of view. That’s how we’ve come to imagine augmented reality working. It’s how we’ve been told it should work, from Vernor Vinge’s Rainbows End to Ridley Scott’s Prometheus to the Iron Man trilogy – in fact going back as far as Star Wars in the late ’70s. We want and expect a 160–180 degree FOV.

So is the HoloLens’ field of view (FOV) a problem? Yes it is. But keep in mind that the current FOV is an artifact of the waveguide technology being used.

What’s often lost in the discussions about the HoloLens field of view – in fact the question never asked by the hundreds of online journalists who have covered it – is what sort of trade-off was made so that we have the current FOV.

A common internet rumor – likely inspired by a video by tech evangelist Bruce Harris taken a few months ago – is that it has to do with the cost and consistency of production. The argument is borrowed from chip manufacturing and, while there might be some truth in it, it is mostly a red herring. An amazingly comprehensive blog post by Oliver Kreylos in August of last year went over the evidence as well as related patents and argued persuasively that while more expensive waveguide material could improve the FOV marginally, the price difference would be prohibitive and the trade-off ultimately nonsensical. At the end of the day, the FOV of the HoloLens developer unit is a physical limitation, not a manufacturing limitation or a power limitation.

But don’t other AR headset manufacturers promise a much larger FOV? Yes. The Meta 2 has a 90 degree field of view. The way the technology works, however, involves two LED screens that are viewed through plastic positioned at 45 degrees to the screens (technically known as a beam splitter, informally known as a piece of glass) that reflects the image into the user’s eyes at approximately half the original brightness while also letting in the real world in front of the user (though half of that light is also scattered). This is basically the same technique used to create ghostly images in the Haunted Mansion at Disneyland.

The downside of this increased FOV is that you are losing a lot of brightness through the beam splitter. You are also losing light over the distance it takes the light to pass through the plastic and reach your eyes. The result is a see-through “hologram”.

But is this what we want? See-through holograms? The visual design team for Iron Man decided that this is indeed what they wanted for their movies. The translucent holograms provide a cool ghostly effect, even in a dark room.

The Princess Leia hologram from the original Star Wars, on the other hand, is mostly opaque. That visual design team went in a different direction. Why?

My best guess is that it has to do with the use of color. While the Iron Man hologram has a very limited color palette, the Princess Leia hologram uses a broad range of facial tones to capture her expression – and also so that, dramatically, Luke Skywalker can remark on how beautiful she is (which obviously gets messed up by Return of the Jedi). Making her transparent would simply wash out the colors and destroy much of the emotional content of the scene.

The idea that opacity is a prerequisite for color holograms is confirmed by the Star Wars chess scene on the Millennium Falcon. Again, there is just enough transparency to indicate that the chess pieces are holograms and not real objects (digital rather than physical).

So what kind of holograms does the HoloLens provide, transparent or near-opaque? This is something that is hard to describe unless you actually see it for yourself, but HoloLens “holograms” will occlude physical objects when they are placed in front of them. I’ve had the opportunity to experience this several times over the last year. This is possible because these digital images use a very large color palette and, more importantly, are extremely intense. In fact, because the HoloLens display technology is currently additive, this occlusion effect actually works best with bright colors. As areas of the screen become darker, they actually appear more transparent.

Bigger field of view = more transparent, duller holograms. Smaller field of view = more opaque, brighter holograms.

I believe Microsoft made the bet that, in order to start designing the AR experiences of the future, we actually want to work with colorful, opaque holograms. The trade-off the technology seems to make in order to achieve this is a more limited field of view in the HoloLens development kits.

At the end of the day, we really want both, though. Fortunately we are currently only working with the Development Kit and not with a consumer device. This is the unit developers and designers will use to experiment and discover what we can do with HoloLens. With all the new attention and money being spent on waveguide displays, we can optimistically expect to see AR headsets with much larger fields of view in the future. Ideally, they’ll also keep the high light intensity and broad color palette that we are coming to expect from the current generation of HoloLens technology.

HoloLens Hardware Specs

Microsoft is releasing an avalanche of information about HoloLens this week. Within that heap of gold is, finally, clearer information on the actual hardware in the HoloLens headset.

I’ve updated my earlier post on How HoloLens Sensors Work to reflect the updated spec list. Here’s what I got wrong:

1. Definitely no internal eye-tracking camera. I originally thought this was what the “gaze” gesture used. Then I thought it might be used for calibration of interpupillary distance. I was wrong on both counts.

2. There aren’t four depth sensors, only one. I had originally thought these cameras would be used for spatial mapping. Instead, just the one depth camera is, and it maps a 75-degree cone out in front of the headset, with a range of 0.8 m to 3.1 m.

3. The four cameras I saw are probably just grayscale cameras – and it’s these cameras, along with cool algorithms, that are used to do inside-out position tracking together with the IMU.

Here are the final sensor specs:

  • 1 IMU
  • 4 environment understanding cameras
  • 1 depth camera
  • 1 2MP photo / HD video camera
  • Mixed reality capture
  • 4 microphones
  • 1 ambient light sensor

The mixed reality capture is basically a stream that combines digital objects with the video stream coming through the HD video camera. It is different from the on-stage rigs we’ve seen, which can calculate the mixed-reality scene from multiple points of view; the mixed reality capture is from the user’s point of view only. It can be used for streaming to additional devices like your phone or TV.

Here are the final display specs:

  • See-through holographic lenses (waveguides)
  • 2 HD 16:9 light engines
  • Automatic pupillary distance calibration
  • Holographic Resolution: 2.3M total light points
  • Holographic Density: >2.5k radiants (light points per radian)

I’ll try to explain “light points” in a later post – if I can ever figure it out.
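
In the meantime, here is a hedged back-of-envelope reading of the numbers above, assuming a “radiant” means light points per radian and that each HD 16:9 light engine is roughly 1280 points wide (both assumptions mine, not Microsoft’s):

```latex
\frac{2500\ \text{light points}}{1\ \text{radian}} \times \frac{1\ \text{radian}}{57.3^\circ}
  \approx 43.6\ \text{light points per degree}
\qquad
\text{implied horizontal FOV} \approx \frac{1280}{43.6} \approx 29^\circ
```

If those assumptions hold, the published density is at least consistent with the roughly 30–40 degree field-of-view estimates floating around.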

Hitchhiking the Backroads to Augmented Reality

To date, Microsoft has been resistant to sharing information about the HoloLens technology. Instead, they have relied on shock-and-awe demos to impress people with the overall experience rather than getting mired down in the nitty-gritty of the software and hardware engineering. Even something as simple as the field of view is never described in mundane numbers but rather in circumlocutions about TV screens X distance from the viewer. It definitely builds up mystery around the product.

Given the lack of concrete information, lots of people have attempted to fill in the gaps with varying degrees of success which, in their own way, make it difficult to navigate the technological true true. In an effort to simplify the research one typically has to do on one’s own in order to understand HoloLens and AR, I’ve made a sort of map for those interested in making their way. Here are some of the best resources I’ve found.

1. You should start with the Oculus blog, which is obviously about the Oculus and not about HoloLens. Nevertheless, the core technology that makes the Oculus Rift work is also in the HoloLens in some form. Moreover, the Oculus blog is a wonderful example of sharing and successfully explaining complicated concepts to the layman. Master these posts about how the Rift works and you are halfway to understanding how HoloLens works:

2. Next, you should really read Oliver Kreylos’s (Doc OK) brilliant posts about the HoloLens field of view and waveguide display technology. Many disagreements around HoloLens would evaporate if people would simply invest half an hour into reading Doc OK’s insights:

3. If you’ve gone through these, then you are ready for Dr. Michael J. Gourlay’s YouTube discussion of surface reconstruction, occlusion, tracking and mapping. Sadly, the audio drops out at key moments and the video drops out for the entire Q&A, but there’s lots of gold for everyone in this mine. Also check out his audio interview at Georgia Tech:

4. There have been lots of first-impression blog posts concerning the HoloLens, but Jasper Brekelmans provides far and away the best of these by following a clear just-the-facts-ma’am approach:

5. HoloLens isn’t only about learning new technology but also about discovering a new design language. Mike Alger’s video provides a great introduction to the problems, as well as some solutions, for AR/VR interface and usability design:

6. Oculus, Leap Motion and others who have been designing VR experiences provide additional useful tips about what they have discovered along the way in articles like the now famous “Swayze Effect” (yes, that Swayze):

7. Finally, here are some video parodies and inspirational videos of VR and AR from the TV show Community and others:

I know I’ve left a lot of good material out, but these have been some of the highlights for me over the past year while hitchhiking on the backroads leading to Augmented Reality. Drop them in your mental knapsack, stick out your thumb and wait for the future to pick you up.

HoloLens Fashionista

While Google Glass certainly had its problems as an augmented reality device – among other things, not really being an augmented reality device, as GA Tech professor Blair MacIntyre pointed out – it did demonstrate two remarkable things. First, that people are willing to shell out $1500 for new technology. In the debates over the next year concerning the correct price point for VR and AR head-mounted displays, this number will play a large role. Second, it demonstrated the importance of a sense of style when designing technology. Google Glass, for many reasons, was a brilliant fashion accessory.

If a lesson can be drawn from these two data points, it might be that new — even Project Glass-level iffy — technology can command a high price if it manages to be fashionable as well as functional.

When you look at the actual HoloLens device, you may, like me, be thinking “I don’t know if I’d wear that out in public.” In that regard, I’d like to nudge your intuitions a bit.

Obviously there is time to do some tweaking with the HL design. I recently found some nostalgic pictures online that made me start to think that with modifications, I could rock this look.

It all revolves around one of the first animes imported to the United States in the ’70s, called Battle of the Planets. It sounded like this:

Battle of the Planets! G-Force! Princess! Tiny! Keyop! Mark! Jason! And watching over them from Center Neptune, their computerized coordinator, 7-Zark-7! Watching, warning against surprise attacks by alien galaxies beyond space. G-Force! Fearless young orphans, protecting Earth’s entire galaxy. Always five, acting as one. Dedicated! Inseparable! Invincible!

And it looked AMAZING. I think this look could work for HoloLens. I think I could pull it off. The capes and tights, of course, are purely optional.

Why Augmented Reality is harder than Virtual Reality

At first blush, it seems like augmented reality should be easier than virtual reality. Whereas virtual reality involves the generation of full stereoscopic digital environments as well as interactive objects to place in those environments, augmented reality is simply adding digital content to our view of the real world. Virtual reality would seem to be doing more heavy lifting.

In actual fact, both technologies are creating illusions to fool the human eye and the human brain. In this effort, virtual reality has an easier task because it can shut out points of reference that would otherwise belie the illusion. Augmented reality experiences, by contrast, must contend with real world visual cues that draw attention to the false nature of the mixed reality content being added to a user’s field of view.

In this post, I will cover some of the additional challenges that make augmented reality much more difficult to get right. In the process, I hope to also provide clues as to why augmented reality HMDs like HoloLens and Magic Leap are taking much longer to bring to market than VR devices like the Oculus Rift, HTC Vive and Sony Project Morpheus.

But first, it is necessary to distinguish between two different kinds of augmented reality experience. One is informatics-based and is supported by most smartphones with cameras. The ideal example of this type of AR is the Terminator vision from James Cameron’s 1984 film “The Terminator.” It is relatively easy to do and is the most common kind of AR people encounter today.

The second, and more interesting, kind of AR requires inserting illusory 3D digital objects (rather than informatics) into the world. The battle chess game from 1977’s “Star Wars” epitomizes this second category of augmented reality experience. This is extremely difficult to do.

The Microsoft HoloLens and Magic Leap (as well as any possible HMDs Apple and others might be working on) are attempts to bring both the easy type and the hard type of AR experience to consumers.

Here are a few things that make this difficult to get right. We’ll put aside stereoscopy, which has already been solved effectively in all the VR devices we will see coming out in early 2016.

1. Occlusion The human brain is constantly picking up clues from the world – such as shading, relative size and perspective – in order to determine the relative positions of objects. Occlusion is one cue that is somewhat tricky to solve for. It is an effect so obvious that it’s hard to realize it is a visual cue: when one body is in our line of sight and is positioned in front of another body, that other body is partially hidden from our view.

In the case where a real world object is in front of a digital object, we can clip the digital object with an outline of the object in front to prevent bleed through. When we try to create the illusion that a digital object is positioned in front of a real world object, however, we encounter a problem inherent to AR.

In a typical AR HMD we see the real world through a transparent screen upon which digital content is either projected or, alternatively, illuminated as with LED displays. An obvious characteristic of this is that digital objects on a transparent display are themselves semi-transparent. Getting around this issue would seem to require being able to make certain portions of the transparent display more opaque than others as needed in order to make sure our AR objects look substantial and not ghostly.
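
For the first case, where a real object sits in front of a hologram, Unity’s HoloLens tooling can handle the clipping by drawing the scanned room mesh into the depth buffer without drawing any visible pixels. A minimal sketch, assuming Unity’s SpatialMappingRenderer component from the Windows Mixed Reality integration:

```csharp
using UnityEngine;
using UnityEngine.XR.WSA;

// Renders the spatial-mapping mesh into the depth buffer only, so any hologram
// positioned behind a real-world surface is clipped by it (no bleed-through).
public class RealWorldOccluder : MonoBehaviour
{
    void Start()
    {
        var smr = gameObject.AddComponent<SpatialMappingRenderer>();
        smr.renderState = SpatialMappingRenderer.RenderState.Occlusion;
    }
}
```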

2. Accommodation It turns out that stereoscopy is not the only way our eyes recognize distance. There is a famous scene in Orson Welles’s “Citizen Kane” in which a technique called “deep focus” is used extensively. Deep focus maintains clarity in the frame whether the actors and props are in the foreground, background or middle ground. Nothing is out of focus. The technique is startling both because it is counter to the way movies are generally shot and because it is counter to how our eyes work.

If you cover one eye and use the other to look at one of your fingers, then move the finger toward and away from you, you should notice yourself refocusing on the finger as it moves while other objects around the finger become blurry. The lens of the eye actually becomes more rounded when objects are close, in order to refract light more strongly onto the retina. For further-away objects, the lens flattens out because less refraction is needed. As we become older, the ability to bow the lens lessens and we lose some of our ability to focus on near objects – for instance, when we read. In AR, we are attempting to make a digital object that is really only centimeters from our eyes appear to be much further away.

Depending on how the light from the display passes through the eye, we may end up with the digital object appearing clear while the real world objects supposedly next to it and at the same distance appear blurred.

3. Vergence-Accommodation Mismatch The accommodation problem is one aspect of yet another VR/AR difficulty. The term vergence describes the convergence and divergence of the two eyes from one another as objects move closer or further away. An interesting aspect of stereoscopy – which is used in both virtual reality and augmented reality to create the illusion of depth – is that the distance at which the two eyes converge on an object is generally different from the focal distance from the eyes to the display screen(s). This consequently sends two mismatched signals to the brain concerning how far away the digital object is supposed to be: is it the focal length or the vergence length? Among other causes, vergence-accommodation mismatch is believed to be a contributing factor to VR sickness. Should the accommodation problem above be resolved for a given AR device, it is safe to assume that the vergence-accommodation mismatch will also be solved.
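
To make the mismatch concrete, here is a worked example with my own illustrative numbers, assuming a typical 64 mm interpupillary distance; the vergence angle for an object at distance d is

```latex
\theta(d) = 2\arctan\!\left(\frac{\mathrm{IPD}/2}{d}\right),
\qquad
\theta(0.5\ \text{m}) \approx 7.3^\circ,
\quad
\theta(2\ \text{m}) \approx 1.8^\circ
```

So if a headset’s focal plane sits at a fixed 2 m, a hologram rendered at 0.5 m asks the eyes to converge as if for 0.5 m while focusing at 2 m; that disparity is the mismatch.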

4. Tetherless Battery Life Smartphones have changed our lives, among other reasons, because they are tetherless devices. While the current slate of VR devices all leverage powerful computers to which they are attached – since VR experiences are all currently somewhat stationary (the HTC Vive being the odd bird) – AR needs to be portable. This naturally puts a strain on the battery, which needs to be relatively light, since it will be attached to the head-mounted display, but also long-lived, as it will be powering occasionally intensive graphics, especially for games.

5. Tetherless GPU Another strain on the system is the capability of the GPU. Virtual reality graphics requirements can be fairly intense, which is why current VR devices require the user to own a reasonably powerful and somewhat expensive graphics card. AR devices can be expected to have similar graphics requirements with much less to work with, since the GPU needs to be onboard. We can probably expect that a streamlined graphics pipeline, dedicated to and optimized for AR experiences, will help offset lower GPU capabilities.

6. Applications Not even talking about killer apps here. Just apps. Microsoft has released videos of several impressive demos, including Minecraft for HoloLens. Magic Leap up to this point has only shown heavily produced, post-production illustrative videos. The truth is that everyone is still trying to get their heads around designing for AR. There aren’t really any guidelines for how to do it or even for what interactions will work. Other than the most trivial experiences (e.g. weather and clock widgets projected on a wall), this will take a while as we develop best practices while also learning from our mistakes.

Conclusion

With the exception of vergence-accommodation mismatch, these are all problems that VR does not have to deal with. Is it any wonder, then, that while we are being led to believe that consumer models of the Oculus Rift, HTC Vive and Sony Project Morpheus will come to market in the first quarter of 2016, news about HoloLens and Magic Leap has been much more muted? There is simply much more to get right before a general rollout. One can hope, however, that dev units will start going out soon from the major AR players in order to mitigate challenge #6 while further tuning continues, if needed, on challenges #1–#5.