To date, Microsoft has been resistant to sharing information about the HoloLens technology. Instead, they have relied on shock and awe demos to impress people with the overall experience rather than getting mired down in the nitty-gritty of the software and hardware engineering. Even something as simple as the field-of-view is never described in mundane numbers but rather in circumlocutions about tv screens X distance from the viewer. It definitely builds up mystery around the product.
Given the lack of concrete information, lots of people have attempted to fill in the gaps with varying degrees of success which, in their own way, make it difficult to navigate the technological true true. In an effort to simplify the research one typically has to do on one’s own in order to understand HoloLens and AR, I’ve made a sort of map for those interested in making their way. Here are some of the best resources I’ve found.
1. You should start with the Oculus blog, which is obviously about the Oculus and not about HoloLens. Nevertheless, the core technology that makes the Oculus Rift work is also in the HoloLens in some form. Moreover, the Oculus blog is a wonderful example of sharing and successfully explaining complicated concepts to the layman. Master these posts about how the Rift works and you are halfway to understanding how HoloLens works:
- Latency https://www.oculus.com/en-us/blog/building-a-sensor-for-low-latency-vr/
- Sensor Fusion https://developer.oculus.com/blog/sensor-fusion-keeping-it-simple/
2. Next, you should really read Oliver Kreylos’s (Doc OK) brilliant posts about the HoloLens field of view and waveguide display technology. Many disagreements around HoloLens would evaporate if people would simply invest half an hour into reading OK’s insights:
- HoloLens and Field of View http://doc-ok.org/?p=1274
- HoloLens and Holograms http://doc-ok.org/?p=1329
3. If you’ve gone through these, then you are ready for Dr. Michael J. Gourlay’s YouTube discussion of surface reconstruction, occlusion, tracking and mapping. Sadly the audio drops out at key moments and the video drops out for the entire Q & A, but there’s lots of gold for everyone in this mine. Also check out his audio interview at Georgia Tech:
- HoloLens for Gaming: https://www.youtube.com/watch?v=0DlSpk5CfOk
- Interview with Dr. Gourlay: http://gvu.gatech.edu/michael-gourlay-audio-interview
4. There have been lots of first-impression blog posts concerning the HoloLens, but Jasper Brekelmans provides far-and-away the best of these by following a clear just-the-facts-ma’am approach:
- just the facts, ma’am: http://brekel.com/my-hololens-experience-and-technical-insights/
5. HoloLens isn’t only about learning new technology but also discovering a new design language. Mike Alger’s video provides a great introduction into the problems as well as some solutions for AR/VR interface and usability design:
- Mike Alger: https://vimeo.com/141330081
6. Oculus, Leap Motion and others who have been designing VR experiences provide additional useful tips about what they have discovered along the way in articles like the now famous “Swayze Effect” (yes, that Swayze):
- The Swayze Effect https://storystudio.oculus.com/en-us/blog/the-swayze-effect/
- Lessons Learned https://storystudio.oculus.com/en-us/blog/5-lessons-learned-while-making-lost/
- VR Design Tips https://medium.com/google-design/from-product-design-to-virtual-reality-be46fa793e9b#.i5hpuu54q
- How Fictional Designs Affect RL Motion Controls http://blog.leapmotion.com/fictional-uis-influence-todays-motion-controls/
- VR Best Practices (PDF) https://developer.leapmotion.com/assets/Leap%20Motion%20VR%20Best%20Practices%20Guidelines.pdf
- Sci Fi’s Influence on Gestural Interfaces https://www.smashingmagazine.com/2013/03/sci-fi-interaction-designers-gestural-interfaces/
7. Finally, here are some video parodies and inspirational videos of VR and AR from the tv show Community and others:
- Community (season 6 episode 2) https://www.yahoo.com/tv/video/lawnmower-maintenance-postnatal-care-070001912.html
- Glen Keane (tiltbrush demo) https://vimeo.com/138790270
- hyper reality (concept) https://vimeo.com/8569187
- Sight (short film) https://vimeo.com/46304267
I know I’ve left a lot of good material out, but these have been some of the highlights for me over the past year while hitchhiking on the backroads leading to Augmented Reality. Drop them in your mental knapsack, stick out your thumb and wait for the future to pick you up.
This is the picture of the homemade clock Ahmed Mohamed brought to his Irving, Texas high school. Apparently no one ever mistook it for a bomb, but they did suspect that it was made to look like a bomb and so they dragged the hapless boy off in handcuffs and suspended him for three days.
This is a strange case of perception versus reality in which the virtual bomb was never mistaken for a real bomb. Instead, what was identified was the fact that it was, in fact, only a bomb virtually and, as with all things virtual, therefore required some sort of explanation.
The common sympathetic explanation is that this isn’t a picture of a virtual bomb at all but rather a picture of a homemade clock. Ahmed recounts that he made the clock, in maker fashion, in order to show an engineering teacher because he had done robotics in middle school and wanted to get into a similar program in high school. Homemade clocks, of course, don’t require an explanation since they aren’t virtually anything other than themselves.
It turns out, however, that the picture at the top does not show a homemade maker clock. Various engineering types have examined the images and determined that it is in fact a disassembled clock from the 80’s.
The telling aspect is the DC power cord, which doesn’t actually get used in homemade projects. Instead, anyone working with Arduino projects typically (pretty much always) uses AA batteries. The clock components have also been tracked back to their original source, so the evidence seems pretty solid.
The photo at the top shows not a virtual bomb nor a homemade clock but, in fact, a virtual homemade clock. That is, it was made to look like a homemade clock but was mistakenly believed to be something made to look like a homemade bomb.
[As a disclaimer about intentions, which is necessary because getting on the wrong side of this gets people in trouble, I don’t know Ahmed’s intentions and while I’m a fan of free speech I can’t say I actually believe in free speech having worked in marketing and I think Ahmed Mohamed looks absolutely adorable in his NASA t-shirt and I have no desire to be placed in company with those other assholes who have shown that this is not a real homemade clock but rather a reassembled 80’s clock and therefore question Ahmed’s motives whereas I refuse to try to get into a high schooler’s head, having two of my own and knowing what a scary place that can be … something, something, something … and while I can’t wholeheartedly support every tweet made by Richard Dawkins and have at times even felt in mild disagreement with things he and others have tweeted on twitter I will say that I find his book The Selfish Gene a really good read … etc, etc, … and for good measure fuck you FoxNews.]
The salient thing for me is that we all implicitly know that a real bomb isn’t supposed to look like a bomb. The authorities at Ahmed’s high school knew that immediately. Bombs are supposed to look like shoes or harmless tourist knickknacks. If you think it looks like a bomb, it obviously isn’t. So what does it mean to look like a bomb (to be virtually a bomb) but not be an actual bomb?
I covered similar territory once before in a virtual exhibit called les fruits dangereux and at the time concluded that virtual objects, like post-modern novels, involve bricolage and the combining of disparate elements in unexpected ways. For instance combining phones, electrical tape and fruit or combining clock parts and pencil cases. Disrupting categorical thinking at a very basic level makes people – especially authority people – suspicious and unhappy.
Which gets us back to racism which is apparently what has happened to Ahmed Mohamed who was led out of school in handcuffs in front of his peers – and we’re talking high school! and he wasn’t asking to be called “McLovin.” It’s pretty cruel stuff. The fear of racial mixing (socially or biologically) always raises its head and comes from the same desire to categorize people and things into bento box compartments. The great fear is that we start to acknowledge that we live in a continuum of types rather than distinct categories of people, races and objects. In the modern age, mass production makes all consumer objects uniform in a way that artisanal objects never were while census forms do the same for people.
Virtual reality will start by copying real world objects in a safe way. As with digital design, it will start with isomorphism to make people feel safe and comfortable. As people become comfortable, bricolage will take hold simply because, in a digital world rather than a commoditized/commodified world, mashups are easy. Irony and a bit of subversiveness will lead to bricolage with purpose as we find people’s fantasies lead them to combine digital elements in new and unexpected ways.
We can all predict augmented and virtual ways to press a digital button or flick through a digital menu projected in front of us in order to get a virtual weather forecast. Those are the sorts of experiences that just make people bored with augmented reality vision statements.
The true promise of virtual reality and augmented reality is that they will break down our racial, social and commodity thinking. Mixed-reality has the potential to drastically change our social reality. How do social experiences change when the color of a person’s avatar tells you nothing real about them, when our social affordances no longer provide clues or shortcuts to understanding other people? In a virtual world, accents and the shoes people wear no longer tell us anything about their educational background or social status. Instead of a hierarchical system of discrete social values, we’ll live in a digital continuum.
That’s the sort of augmented reality future I’m looking forward to.
The important point in the Ahmed Mohamed case, of course, is that you shouldn’t arrest a teenager for not making a bomb.
At first blush, it seems like augmented reality should be easier than virtual reality. Whereas virtual reality involves the generation of full stereoscopic digital environments as well as interactive objects to place in those environments, augmented reality is simply adding digital content to our view of the real world. Virtual reality would seem to be doing more heavy lifting.
In actual fact, both technologies are creating illusions to fool the human eye and the human brain. In this effort, virtual reality has an easier task because it can shut out points of reference that would otherwise belie the illusion. Augmented reality experiences, by contrast, must contend with real world visual cues that draw attention to the false nature of the mixed reality content being added to a user’s field of view.
In this post, I will cover some of the additional challenges that make augmented reality much more difficult to get right. In the process, I hope to also provide clues as to why augmented reality HMDs like HoloLens and Magic Leap are taking much longer to bring to market than VR devices like the Oculus Rift, HTC Vive and Sony Project Morpheus.
But first, it is necessary to distinguish between two different kinds of augmented reality experience. One is informatics based and is supported by most smart phones with cameras. The ideal example of this type of AR is the Terminator-vision from James Cameron’s 1984 film “The Terminator.” It is relatively easy to do and is the most common kind of AR people encounter today.
The second, and more interesting, kind of AR requires inserting illusory 3D digital objects (rather than informatics) into the world. The battle chess game from 1977’s “Star Wars” epitomizes this second category of augmented reality experience. This is extremely difficult to do.
The Microsoft HoloLens and Magic Leap (as well as any possible HMDs Apple and others might be working on) are attempts to bring both the easy type and the hard type of AR experience to consumers.
Here are a few things that make this difficult to get right. We’ll put aside stereoscopy which has already been solved effectively in all the VR devices we will see coming out in early 2016.
1. Occlusion The human brain is constantly picking up clues from the world, such as shading, relative size and perspective, in order to determine the relative positions of objects. Occlusion is one that is somewhat tricky to solve. Occlusion is an effect that is so obvious that it’s hard to realize it is a visual cue. When one body is in our line of sight and is positioned in front of another body, that other body is partially hidden from our view.
In the case where a real world object is in front of a digital object, we can clip the digital object with an outline of the object in front to prevent bleed through. When we try to create the illusion that a digital object is positioned in front of a real world object, however, we encounter a problem inherent to AR.
In a typical AR HMD we see the real world through a transparent screen upon which digital content is either projected or, alternatively, illuminated as with LED displays. An obvious characteristic of this is that digital objects on a transparent display are themselves semi-transparent. Getting around this issue would seem to require being able to make certain portions of the transparent display more opaque than others as needed in order to make sure our AR objects look substantial and not ghostly.
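To make the two occlusion cases above concrete, here is a minimal sketch in Python with NumPy. It is not how HoloLens or any other device actually works; it simply assumes we somehow have a per-pixel depth for the real scene and for the hologram, and that the display can only add light to what the eye already sees. The function name `composite_additive` and all of the numbers are made up for illustration.

```python
import numpy as np

def composite_additive(real_rgb, holo_rgb, real_depth, holo_depth):
    """Approximate what reaches the eye through a see-through, light-adding display.

    real_rgb, holo_rgb: float arrays of shape (H, W, 3) with values in [0, 1].
    real_depth, holo_depth: float arrays of shape (H, W), distances in meters.
    """
    # Case 1: a real object is in front, so the hologram pixel is clipped.
    visible = holo_depth < real_depth
    emitted = np.where(visible[..., None], holo_rgb, 0.0)
    # Case 2: the hologram is in front, but the display can only add light,
    # so the real world behind it still shows through.
    return np.clip(real_rgb + emitted, 0.0, 1.0)

# Hypothetical 1x2 image: a grey wall and a dim red hologram placed 2 m away.
real_rgb = np.full((1, 2, 3), 0.5)
holo_rgb = np.zeros((1, 2, 3))
holo_rgb[..., 0] = 0.2
real_depth = np.array([[1.0, 3.0]])  # wall is nearer on the left, farther on the right
holo_depth = np.array([[2.0, 2.0]])

seen = composite_additive(real_rgb, holo_rgb, real_depth, holo_depth)
print(seen)
```

In the left pixel the wall wins and the hologram disappears cleanly. In the right pixel the hologram is nominally in front, yet the grey of the wall is still part of the result, which is the ghostly look described above. Note also that a pure black hologram pixel adds nothing at all under this assumption.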
2. Accommodation It turns out that stereoscopy is not the only way our eyes recognize distance. The image above is from a scene in Orson Welles’s “Citizen Kane” in which a technique called “deep focus” is used extensively. Deep focus maintains clarity in the frame whether the actors and props are in the foreground, background or middle ground. Nothing is out of focus. The technique is startling both because it is counter to the way movies are generally shot and because it is counter to how our eyes work.
If you cover one eye and use the other to look at one of your fingers, then move the finger toward and away from you, you should notice yourself refocusing on the finger as it moves while other objects around the finger become blurry. The lens of the eye actually becomes more rounded when objects are close in order to cause light to refract more in order to reach the retina. For further away objects, the lens flattens out because less refraction is needed. As we become older, the ability to bow the lens lessens and we lose some of our ability to focus on near objects – for instance when we read. In AR, we are attempting to make a digital object that is really only centimeters from our eyes appear to be much further away.
Depending on how the light from the display passes through the eye, we may end up with the digital object appearing clear while the real world objects supposedly next to it and at the same distance appear blurred.
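One rough way to put numbers on this is the optics convention of measuring focus in diopters, which is just one divided by the distance in meters. The sketch below assumes a display whose optics put the digital image at a single fixed focal distance; the 2 meter figure and the function names are hypothetical and are not taken from any device spec.

```python
def accommodation_diopters(distance_m: float) -> float:
    """Focusing power, in diopters, the eye needs for an object at distance_m."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_m

def defocus_diopters(focus_distance_m: float, object_distance_m: float) -> float:
    """How far out of focus an object is when the eye is focused elsewhere."""
    return abs(accommodation_diopters(focus_distance_m)
               - accommodation_diopters(object_distance_m))

# Hypothetical setup: the display's image is focused at 2 m, while the real
# tabletop the hologram is supposed to be sitting on is 0.5 m away.
display_focus_m = 2.0
tabletop_m = 0.5
print(defocus_diopters(display_focus_m, tabletop_m))
```

The larger the result, the blurrier the real tabletop looks while the eye is focused on the digital object, and vice versa. Because diopters fall off as one over distance, the mismatch matters most for content placed close to the viewer.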
3. Vergence-Accommodation Mismatch The accommodation problem is one aspect of yet another VR/AR difficulty. The term vergence describes the convergence and divergence of the two eyes from one another as objects move closer or further away. An interesting aspect of stereoscopy – which is used both for virtual reality as well as augmented reality to create the illusion of depth – is that the distance at which the two eyes coordinate to see an object is generally different from the focal distance from the eyes to the display screen(s). This consequently sends two mismatched signals to the brain concerning how far away the digital object is supposed to be. Is it the focal length or the vergence length? Among other causes, vergence-accommodation mismatch is believed to be a contributing factor to VR sickness. Should the accommodation problem above be resolved for a given AR device, it is safe to assume that the vergence-accommodation mismatch will also be solved.
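The two mismatched signals can be illustrated with a little trigonometry. This sketch assumes the object sits straight ahead and uses an interpupillary distance of 63 mm, a commonly quoted adult figure that I am plugging in as an assumption; the function names and distances are likewise invented for the example.

```python
import math

ASSUMED_IPD_M = 0.063  # assumption: a commonly quoted adult interpupillary distance

def vergence_angle_rad(distance_m: float, ipd_m: float = ASSUMED_IPD_M) -> float:
    """Angle between the two lines of sight when both eyes fixate a point straight ahead."""
    return 2.0 * math.atan((ipd_m / 2.0) / distance_m)

def distance_from_vergence_m(angle_rad: float, ipd_m: float = ASSUMED_IPD_M) -> float:
    """Invert the relationship: the distance a given vergence angle implies."""
    return (ipd_m / 2.0) / math.tan(angle_rad / 2.0)

# Hypothetical numbers: stereo rendering places the object 0.5 m away,
# while the display optics keep the image in focus at 2 m.
virtual_object_m = 0.5
display_focus_m = 2.0

angle = vergence_angle_rad(virtual_object_m)
print(f"vergence angle: {math.degrees(angle):.2f} degrees")
print(f"vergence implies {distance_from_vergence_m(angle):.2f} m, "
      f"focus implies {display_focus_m:.2f} m")
```

The eyes converge as if the object were half a meter away while focusing as if it were two meters away, and the brain receives both answers at once.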
4. Tetherless Battery Life Smart phones have changed our lives among other reasons because they are tetherless devices. While the current slate of VR devices all leverage powerful computers to which they are attached, since VR experiences are all currently somewhat stationary (the HTC Vive being the odd bird), AR needs to be portable. This naturally puts a strain on the battery, which needs to be relatively light since it will be attached to the head-mounted-display, but also long-lived as it will be powering occasionally intensive graphics, especially for games.
5. Tetherless GPU Another strain on the system is the capability of the GPU. Virtual reality devices can be fairly intense since they require the user to purchase a reasonably powerful and somewhat expensive graphics card. AR devices can be expected to have similar graphics requirements as VR with much less to work with since the GPU needs to be onboard. We can probably expect that a streamlined graphics pipeline dedicated to and optimized for AR experiences will help offset lower GPU capabilities.
6. Applications Not even talking about killer apps, here. Just apps. Microsoft has released videos of several impressive demos including Minecraft for HoloLens. Magic Leap up to this point has only shown post-prod, heavily produced illustrative videos. The truth is that everyone is still trying to get their heads around designing for AR. There aren’t really any guidelines for how to do it or even what interactions will work. Other than the most trivial experiences (e.g. weather and clock widgets projected on a wall) this will take a while as we develop best practices while also learning from our mistakes.
With the exception of the vergence-accommodation mismatch (V-AM), these are all problems that VR does not have to deal with. Is it any wonder, then, that while we are being led to believe that consumer models of the Oculus Rift, HTC Vive and Sony Project Morpheus will come to market in the first quarter of 2016, news about HoloLens and Magic Leap has been much more muted? There is simply much more to get right before a general rollout. One can hope, however, that dev units will start going out soon from the major AR players in order to mitigate challenge #6 while further tuning continues, if needed, on challenges #1-#5.
A few months ago I wrote a speculative piece about how HoloLens might work with XAML frameworks based on the sample applications Microsoft had been showing.
Even though Microsoft has still released scant information about integration with 3D platforms, I believe I can provide a fairly accurate walkthrough of how HoloLens development will occur for Unity3D. In fact, assuming I am correct, you can begin developing games and applications today and be in a position to release a HoloLens experience shortly after the hardware becomes available.
To be clear, though, this is just speculative and I have no insider information about the final product that I can talk about. This is just what makes sense based on publicly available information regarding HoloLens.
Unity3D integration with third party tools such as Kinect and Oculus Rift occurs through plugins. The Kinect 2 plugin can be somewhat complex as it introduces components that are unique to the Kinect’s capabilities.
The eventual HoloLens plugin, on the other hand, will likely be relatively simple since it will almost certainly be based on a pre-existing component called the FPSController (in Unity 5.1 which is currently the latest).
To prepare for HoloLens, you should start by building your experience with Unity 5.1 and the FPSController component. Here’s a quick rundown of how to do this.
Start by installing the totally free Unity 5.1 tools: http://unity3d.com/get-unity/download?ref=personal
Next, create a new project and select 3D for the project type.
Click the button for adding asset packages and select Characters. This will give you access to the FPSController. Click done and continue. The IDE will now open with a practically empty project.
At this point, a good Unity3D tutorial will typically show you how to create an environment. We’re going to take a shortcut, however, and just get a free one from the Asset Store. Hit Ctrl+9 to open the Asset Store from inside your IDE. You may need to sign in with your Unity account. Select the 3D Models | Environments menu option on the right and pick a pre-built environment to download. There are plenty of great free ones to choose from. For this walkthrough, I’m going to use the Japanese Otaku City by Zenrin Co, Ltd.
After downloading is complete, you will be presented with an import dialog box. By default, all assets are selected. Click on Import.
Now that the environment you selected has been imported, go to the scenes folder in your project window and select a sample scene from the downloaded environment. This will open up the city or dungeon or forest or whatever environment you chose. It will also make all the different assets and components associated with the scene show up in your Hierarchy window. At this point, we want to add the first-person shooter controller into the scene. You do this by selecting the FPSController from the project window under Assets/Standard Assets/Characters/FirstPersonCharacter/Prefabs and dragging the FPSController into your Hierarchy pane.
This puts a visual representation of the FPS controller into your scene. Select the controller with your mouse and hit “F” to zoom in on it. You can see from the visual representation that the FPS controller is basically a collision field that can be moved with a keyboard or gamepad that additionally has a directional camera component and a sound component attached. The direction the camera faces ultimately becomes the view that players see when you start the game.
Here is another scene that uses the Decrepit Dungeon environment package by Prodigious Creations and the FPS controller. The top pane shows a design view while the bottom pane shows the gamer’s first-person view.
You can even start walking through the scene inside the IDE by simply selecting the blue play button at the top center of the IDE.
The way I imagine the HoloLens integration to work is that another version of FPS controller will be provided that replaces mouse controller input with gyroscope/magnetometer input as the player rotates her head. Additionally, the single camera view will be replaced with a two camera rig that sends two different, side-by-side feeds back to the HoloLens device. Finally, you should be able to see how all of this works directly in the IDE like so:
There is very good evidence that the HoloLens plugin will work something like I have outlined and will be approximately this easy. The training sessions at the Holographic Academy during /Build pretty much demonstrated this sort of toolchain. Moreover, this is how Unity3D currently integrates with virtual reality devices like Gear VR and Oculus Rift. In fact, the screen cap of the Unity IDE above is from an Oculus game I’ve been working on.
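The idea of swapping mouse look for head rotation and one camera for two can be sketched outside of Unity. The Python below is purely illustrative: it is neither Unity’s API nor anything from a HoloLens SDK, and the class, the function names and the 63 mm eye separation are all my own assumptions. It uses a Y-up convention in which a yaw of zero faces +Z with +X to the right.

```python
import math
from dataclasses import dataclass

@dataclass
class HeadPose:
    x: float
    y: float
    z: float
    yaw_rad: float  # rotation about the vertical (Y) axis

def integrate_yaw(pose: HeadPose, yaw_rate_rad_s: float, dt_s: float) -> HeadPose:
    """Stand-in for sensor input: turn the head by a measured rate over one frame."""
    return HeadPose(pose.x, pose.y, pose.z, pose.yaw_rad + yaw_rate_rad_s * dt_s)

def eye_positions(pose: HeadPose, ipd_m: float = 0.063):
    """Place a left and a right camera half the eye separation to either side of the head."""
    # Unit vector pointing to the wearer's right for the current yaw.
    right_x = math.cos(pose.yaw_rad)
    right_z = -math.sin(pose.yaw_rad)
    half = ipd_m / 2.0
    left_eye = (pose.x - right_x * half, pose.y, pose.z - right_z * half)
    right_eye = (pose.x + right_x * half, pose.y, pose.z + right_z * half)
    return left_eye, right_eye

# Hypothetical frame: the head is 1.7 m off the ground and turning at
# 90 degrees per second, sampled after one second.
pose = HeadPose(0.0, 1.7, 0.0, 0.0)
pose = integrate_yaw(pose, math.radians(90.0), 1.0)
print(eye_positions(pose))
```

Each eye position would then get its own camera rendering the same scene, which is the side-by-side feed described above. In a real plugin all of this would presumably be hidden inside the controller component.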
So what are you waiting for? You pretty much have everything you already need to start building complex HoloLens experiences. The integration itself, when it is ready, should be fairly trivial and much of the difficult programming will be taken care of for you.
I’m looking forward to seeing all the amazing experiences people are building for the HoloLens launch day. Together, we’ll change the future of personal computing!
While slumming on the internet looking for new content about digital media I came across this promising article entitled Virtual Reality, Augmented Reality and Application Development. I was feeling hopeful about it until I came across this peculiar statement:
“Of the two technologies, augmented reality has so far been seen as the more viable choice…”
What a strange thing to write. Would we ever ask whether the keyboard or the mouse is the more viable choice? The knife or the fork? Paper or plastic? It should be clear by now that this is a false choice and not a case of having your cake or eating it, too. We all know that the cake is a lie.
But this corporate blog post was admittedly not unique in creating a false choice between virtual reality and augmented reality. I’ve come across this before and it occurred to me that this might be an instance of a category mistake. A category mistake is itself a category of philosophical error identified by the philosopher Gilbert Ryle to tackle the sticky problem of Cartesian dualism. He pointed out that even though it is generally accepted in the modern world that mind is not truly a separate substance from body but is in fact a formation that emerges in some fashion out of the structure of our brains, we nevertheless continue to divide the things of the world, almost as if by accident, into two categories: mental stuff and material stuff.
There are certainly cases of competing technologies where one eventually dies off. The most commonly cited example is Betamax and VHS. Of course, they both ultimately died off and it is meaningless today to claim that either one really succeeded. There are many, many more examples of apparent technological duels in which neither party ultimately falls or concedes defeat. PC versus Mac. IE vs Chrome. NHibernate vs EF. etc.
The rare case is when one technology completely dominates a product category. The few cases where this has happened, however, have so captured our imaginations that we forget it is the exception and not the rule. This is the case with category busters like the iPhone and the iPad – brands that are so powerful it has taken years for competitors to even come up with viable alternatives.
What this highlights is that, typically, technology is not a zero sum game. The norm in technology is that competition is good and leads to improvements across the board. Competition can grow an entire product category. The underlying lie, however, is perhaps that each competitor tells itself that they are in a fight to the death and that they are the next iPhone. This is rarely the case. The lie beneath that lie is that each competitor is hoping to be bought out by another larger company for billions of dollars and has to look competitive up until that happens. A case of having your cake and eating it, too.
There is, however, a category in which one set of products regularly displaces another set of products. This happens in the fashion world.
Each season, each year, we change out our cuts, our colors and accessories. We put away last year’s fashions and wouldn’t be caught dead in them. We don’t understand how these fashion changes occur or what rules they obey but the fashion houses all seem to conform to these unwritten rules of the season and bring us similar new things at the proper time.
This is the category mistake that people make when they ask things such as which is more viable: augmented reality or virtual reality? Such questions belong to the category of fashion (which is in season: earth tones or pastels?) and not to technology. In the few unusual cases where this does happen, then the category mistake is clearly in the opposite direction. The iPhone and iPad are not technologies: they are fashion statements.
Virtual reality and augmented reality are not fashion statements. They aren’t even technologies in the way we commonly talk about technology today – they are not software platforms (though they require SDKs), they are not hardware (though they are useless without hardware), they are not development tools (you need 3D modeling tools and game engines for this). In fact, they have more in common with books, radio, movies and television than they do with software. They are new media.
A medium, etymologically speaking, is the thing in the middle. It is a conduit from a source to a receiver – from one world to another. A medium lets us see or hear things we would otherwise not have access to. Books allow us to hear the words of people long dead. Radio transmits words over vast distances. Movies and television let us see things that other people want us to see and we pay for the right to see those things. Augmented reality and virtual reality, similarly, are conduits for new content. They allow us to see and hear things in ways we haven’t experienced content before.
The moment we cross over from talking about technology and realize we are talking about media, we automatically invoke the spirit of Marshall McLuhan, the author of Understanding Media: The Extensions of Man. McLuhan thought deeply about the function of media in culture and many of his ideas and aphorisms, such as “the medium is the message,” have become mainstays of contemporary discourse. Other concepts that were central to McLuhan’s thought still elude us and continue to be debated. Among these are his two media categories: hot and cold.
McLuhan claimed that any media is either hot or cold, warm or cool. Cool mostly means what we think it means metaphorically; for instance, James Dean is cool in exactly the way McLuhan meant. Hot media, in turn, is in most ways what you would think it is: kinetic with a tendency to overwhelm the senses. To illustrate what he meant by hot and cold, McLuhan often provides contrasting examples. Movies are a hot medium. Television is a cold medium. Jazz is a hot medium. The twist is a cool medium. Cool media leave gaps that the observer must fill in. It is highly participatory. Hot media is a wall of sensation that does not require any filling in: McLuhan characterizes it as “high definition.”
I think it is pretty clear, between virtual reality and augmented reality, which falls into the category of a cool medium and which a hot one.
To help you come to your own conclusions about how to categorize augmented reality glasses and the virtual reality goggles, though, I’ll provide a few clues from Understanding Media:
“In terms of the theme of media hot and cold, backward countries are cool, and we are hot. The ‘city slicker’ is hot, and the rustic is cool. But in terms of the reversal of procedures and values in the electric age, the past mechanical time was hot, and we of the TV age are cool. The waltz was hot, fast mechanical dance suited to the industrial time in its moods of pomp and circumstance.”
“Any hot medium allows of less participation than a cool one, as a lecture makes for less participation than a seminar, and a book for less than dialogue. With print many earlier forms were excluded from life and art, and many were given strange new intensity. But our own time is crowded with examples of the principle that the hot form excludes, and the cool one includes.”
“The principle that distinguishes hot and cold media is perfectly embodied in the folk wisdom: ‘Men seldom make passes at girls who wear glasses.’ Glasses intensify the outward-going vision, and fill in the feminine image exceedingly, Marion the Librarian notwithstanding. Dark glasses, on the other hand, create the inscrutable and inaccessible image that invites a great deal of participation and completion.”
Microsoft recently created possibly the best demo they have ever done on stage for E3. Microsoft employees played Minecraft in a way no one has ever seen it before, on a table top as if it was a set of legos. Many people speculated on social media that this may be the killer app that HoloLens has been looking for.
What is particularly exciting about the way the demo captured people’s imaginations is that they can start envisioning what AR might actually be used for. People are even getting a firm grip on the differences between virtual reality, which creates an immersive experience, and augmented reality, which creates a mixed experience overlapping digital objects with real world objects.
Nevertheless, there is still a tendency to see virtual reality exemplified by the Oculus Rift and augmented reality exemplified by HoloLens and Magic Leap as competing solutions. In fact they are complementary solutions. They don’t compete with one another any more than your mouse and your keyboard do.
Bill Buxton has famously said that everything is best for something and worst for something else. By contrasting the Minecraft experience for Oculus and HoloLens, we can better see what each technology is best at.
The Virtual Reality experience for Oculus is made possible by a free hacking effort called Minecrift. It highlights the core UX flavor of almost all VR experiences – they are first person, with the player fully present in a 3D virtual world. VR is great for playing Minecraft in adventure or survival mode.
Adventure mode with HoloLens is roughly equivalent to the adventure mode we get today on a PC or Xbox console with the added benefit that the display can be projected on any wall. It isn’t actually 3D, though, as far as we can tell from the demo, despite the capability of displaying stereoscopic scenes with HoloLens.
What does work well, however, is Minecraft in creation mode. This is basically the god view we have become familiar with from various strategy and resource games over the years.
God View vs First Person View
In a fairly straightforward way, it makes sense to say that AR is best for a god-centric view while VR is best for a first-person view. For instance, if we wanted to create a simulation that allows users to fly a drone or manipulate an undersea robot, virtual reality seems like the best tool for the job. When we need to create a synoptic view of a building or even a city, on the other hand, then augmented reality may be the best UX. Would it be fair to say that all new UX experiences fall into one of these two categories?
Most of our metaphors for building software, building businesses and building every other kind of buildable thing, after all, are based on the Lego building block and its precursors, the Lincoln Log and the Erector Set. We play games as children in order, in part, to prepare ourselves for thinking as adults. Minecraft was built similarly on the idea of creating a simulation of a Lego block world that we could not only build but also virtually play in on the computer.
The playful world of Lego blocks is built on two things: the blocks themselves formed into buildings and scenes and the characters that we identify with who live inside the world of blocks. In other words the god-view and the first-person view.
It should come as no surprise, then, that these two core modes of our imaginative lives should stay with us through our childhoods and into our adult approaches to the world. We have both an interpersonal side and an abstract, calculating side. The best leaders have a bit of both.
You apparently didn’t put one of the new coversheets on your TPS report
The god-view in business tends to be the synoptic view demanded by corporate executives and provided in the form of dashboards or crystal reports. It would be a shame if AR ended up falling into that use-case when it can provide so much more and in more interesting ways. As both VR and AR mature over the next five years, we all have a responsibility to keep them anchored in the games of our childhood and avoid letting them become the faults and misdemeanors of the corporate adult world.
A recent arstechnica article indicates that the wall-projected HoloLens version of Minecraft in adventure mode can be played in true 3D:
One other impressive feature of the HoloLens-powered virtual screen was the ability to activate a three-dimensional image, so that the scene seemed to recede into the wall like a window box. Unlike a standard 3D monitor, this 3D image actually changed perspective based on the viewing angle. If I went up near the wall and looked at the screen from the left, I could see parts of the world that would usually be behind the right side of the wall, as if the screen was simply a window into another world.
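The window effect described in that quote is what graphics programmers usually call head-coupled perspective: the view frustum is recomputed from the viewer's eye position relative to the virtual screen. I have no idea how HoloLens actually implements it, so take the following as a minimal sketch under my own assumptions: the screen sits in the z = 0 plane centered on the origin, the eye is somewhere in front of it, and `window_frustum` and `Frustum` are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Frustum:
    left: float
    right: float
    bottom: float
    top: float
    near: float


def window_frustum(eye_x, eye_y, eye_z, width, height, near=0.1):
    """Asymmetric view frustum for a virtual screen treated as a window.

    Assumes the screen lies in the z = 0 plane, centered on the origin,
    and the viewer's eye sits at (eye_x, eye_y, eye_z) with eye_z > 0.
    """
    if eye_z <= 0:
        raise ValueError("the eye must be in front of the screen")
    # Similar triangles: project the screen edges onto the near plane.
    scale = near / eye_z
    return Frustum(
        left=(-width / 2 - eye_x) * scale,
        right=(width / 2 - eye_x) * scale,
        bottom=(-height / 2 - eye_y) * scale,
        top=(height / 2 - eye_y) * scale,
        near=near,
    )


# Viewer standing dead center, then stepping one meter to the left.
centered = window_frustum(0.0, 0.0, 2.0, width=2.0, height=1.0)
from_left = window_frustum(-1.0, 0.0, 2.0, width=2.0, height=1.0)
```

When the viewer is centered the frustum is symmetric. Step to the left and the right-hand bound grows while the left-hand bound shrinks, which is the "I could see parts of the world that would usually be behind the right side of the wall" behavior from the article.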
Basically I’ve had the DK2 since Christmas and had been looking for a really good game to go with my device (rather than the other way around). After shelling out $350 for the goggles, $60 more for a game didn’t seem like such a big deal.
In fact, playing Elite: Dangerous with the Oculus and an XBox One gamepad has been one of the best gaming experiences I have ever had in my life – and I’m someone who played E.T. on the Atari 2600 when it first came out so I know what I’m talking about, yo. It is a fully realized Virtual Reality environment which allows me to fly through a full simulation of the galaxy based on current astronomical data. When I am in the simulation, I objectively know that I am playing a game. However, all of my peripheral awareness and background reactions seem to treat the simulation as if it is real. My sense of space changes and my awareness expands into the virtual space of the simulation. If I don’t mistake the VR experience for reality, I nevertheless do experience a strong suspension of disbelief when I am inside of it.
One of the things I’ve found fascinating about this Virtual Reality simulation is that it is full of Augmented Reality objects. For instance, the two menu bars at the top of the screencap above, to the top left and the top right, are full holograms. When I move my head around, parallax effects demonstrate that their positions are anchored to the cockpit rather than to my personal perspective. If the VR goggles allowed me to do it, I would be able to even lean forward and look at the backside of those menus. Interestingly, when the game is played in normal 3D first person mode rather than VR with the Oculus, those menus are rendered as head-up displays and are anchored to my point of view as I use the mouse to look around the cockpit – in much the same way that Google Glass anchored menus to the viewer instead of the viewed.
The navigation objects on the dashboard in front of me are also AR holograms. Their locations are anchored to the cockpit rather than to me, and when I move around I can see them at different angles. At the same time, they exhibit a combination of glow and transparency that isn’t common to real-world objects and that we have come to recognize, from sci fi movies, as the inherent characteristics of holograms.
I realized at about the 60 hour mark into my gameplay / research that one of the current opportunities as well as problems with AR devices like the Magic Leap and HoloLens is that not many people know how to develop UX for them. This was actually one of the points of a panel discussion concerning HoloLens at the recent BUILD conference. The field is wide open. At the same time, UX research is clearly already being done inside VR experiences like Elite: Dangerous. The hologram-based control panel at the front of the cockpit is a working example of how to design navigation tools using augmented reality.
Another remarkable feature of the HoloLens is the use of gaze as an input vector for human-computer interactions. Elite: Dangerous, however, has already implemented it. When the player looks at certain areas of the cockpit, complex menus like the one shown in the screencap above pop into existence. When one removes one’s direct gaze, the menu vanishes. If this were a usability test for gaze-based UI, Elite: Dangerous would have already collected hours of excellent data from thousands of players to verify whether this is an effective new interaction (in my experience, it totally is, btw). This is also the exact sort of testing that we know will need to be done over the next few years in order to firm up and conventionalize AR interactions. By happenstance, VR designers are already doing this for AR before AR is even really on the market.
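For the curious, a gaze trigger like this can be approximated with very little math. I don't know how Elite: Dangerous or HoloLens actually do it, so this is only a rough sketch under my own assumptions: the headset gives us a head position and a gaze direction, and a menu opens when its anchor point falls inside a narrow cone around that direction. The function names and the 10 degree threshold are mine, not anything from either product.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / length for c in v)


def is_gazed_at(head_pos, gaze_dir, target_pos, max_angle_deg=10.0):
    """True when the target lies within a cone around the gaze direction."""
    to_target = normalize(tuple(t - h for t, h in zip(target_pos, head_pos)))
    gaze = normalize(gaze_dir)
    # The dot product of two unit vectors is the cosine of the angle between them.
    cos_angle = sum(g * t for g, t in zip(gaze, to_target))
    return cos_angle >= math.cos(math.radians(max_angle_deg))


# Hypothetical cockpit: the pilot looks straight ahead (down -z),
# and a side panel sits 45 degrees off to the right.
head = (0.0, 0.0, 0.0)
side_panel = (1.0, 0.0, -1.0)

menu_open_looking_ahead = is_gazed_at(head, (0.0, 0.0, -1.0), side_panel)
menu_open_looking_at_panel = is_gazed_at(head, (1.0, 0.0, -1.0), side_panel)
```

A real implementation would probably add a short dwell time before opening the menu and a slightly wider cone for closing it, so that the menu doesn't flicker when the player's gaze wanders along the edge.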
The other place augmented reality interaction design research is being carried out is in Japanese anime. The image above is from a series called Sword Art Online. When I think of VR movies, I think of The Matrix. When I put my children into my Oculus, however, they immediately connected it to SAO. SAO is about a group of beta testers for a new MMORPG that requires virtual reality goggles who become trapped inside the MMORPG due to the evil machinations of one of the game developers. While the setting of the VR world is medieval, players still interact with in-game AR control panels.
Consider why this makes sense when we ask the hologram versus head-up display question. If the menu is anchored to our POV, it becomes difficult to actually touch menu items. They will move around and jitter as the player looks around. In this case, a hologram anchored to the world rather than to the player makes a lot more sense. The player can process the consistent position of the menu and anticipate where she needs to place her fingers in order to interact with it. Sword Art Online effectively provides what Bill Buxton describes as a UX sketch for interactions of this sort.
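The difference between the two anchoring strategies is easy to see with a little arithmetic. Here is a minimal sketch, assuming a top-down 2D view (x and z only), a head that only turns left and right, and a menu one meter in front of the player. The function name and the numbers are made up for illustration.

```python
import math


def head_locked_world_pos(head_pos, yaw_deg, offset):
    """World position of a menu fixed at `offset` in the head's own frame.

    Top-down 2D view: positions are (x, z) pairs and the head rotates
    only about the vertical axis by `yaw_deg`.
    """
    yaw = math.radians(yaw_deg)
    ox, oz = offset
    wx = head_pos[0] + ox * math.cos(yaw) + oz * math.sin(yaw)
    wz = head_pos[1] - ox * math.sin(yaw) + oz * math.cos(yaw)
    return (wx, wz)


# A world-anchored menu keeps the same world position whatever the head does.
WORLD_MENU = (0.0, 1.0)

for yaw in (0, 15, 30):
    hud = head_locked_world_pos((0.0, 0.0), yaw, (0.0, 1.0))
    drift = math.dist(hud, WORLD_MENU)
    print(f"yaw={yaw:>2} deg  HUD menu at ({hud[0]:.2f}, {hud[1]:.2f})  "
          f"drift from world-anchored menu: {drift:.2f} m")
```

The head-up display version slides away from where it started every time the player turns her head, which is exactly why a finger aimed at it tends to miss. The world-anchored version stays put, so the hand can learn where it is.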
On an intellectual level, consider how many overlapping interaction metaphors are involved in the above sketch. We have a 1) GUI-based menu system transposed to 2) touch (no right clicking) interactions, then expressed as 3) an augmented reality experience placed inside of 4) a virtual reality experience (and communicated inside a cartoon).
Why is all of this possible? Why are the best augmented reality experiences inside of virtual reality experiences and cartoons? I think it has to do with cost of execution. Illustrating an augmented reality experience in an anime is not really any more difficult than illustrating a field of grass or a cute yellow gerbil-like character. The labor costs are the same. The difficulty is only in the conceptualization.
Similarly, throwing a hologram into a virtual reality experience is not going to be any more difficult than throwing a tree or a statue into the VR world. You just add some shaders to create the right transparency-glowy-pulsing effect and you have a hologram. No additional work has to be done to marry the stereoscopic convergence of hologram objects and the focal position of real world locations as is required for really good AR. In the VR world, these two things – the hologram world and the VR world – are collapsed into one thing.
There has been a tendency to see virtual reality and mixed reality as opposed technologies. What I have learned from playing with both, however, is that they are actually complementary technologies. While we wait for AR devices to be released by Microsoft, Magic Leap, etc. it makes sense to jump into VR as a way to start understanding how humans will interact with digital objects and how we must design for these interactions. Additionally, because of the simplification involved in creating AR for VR rather than AR for reality, it is likely that VR will continue to hold a place in the design workflow for prototyping our AR experiences even years from now when AR becomes not only a reality but an integral thread in the fabric of reality.
Professions are held together by touchstones such as a common jargon that both excludes outsiders and reinforces the sense of inclusion among insiders based on mastery of the jargon. On this level, software development has managed to surpass more traditional practices such as medicine, law or business in its ability to generate new vocabulary and maintain a sense that those who lack competence in using the jargon simply lack competence. Perhaps it is part and parcel with new fields such as software development that even practitioners of the common jargon do not always understand each other or agree on what the terms of their profession mean. Stack Overflow, in many cases, serves merely as a giant professional dictionary in progress as developers argue over what they mean by de-coupling, separation of concerns, pragmatism, architecture, elegance, and code smell.
Cultures, unlike professions, are held together not only by jargon but also by shared ideas and philosophies that delineate what is important to the tribe and what is not. Between a profession and a culture, the members of a professional culture, in turn, share a common imaginative world that allows them to discuss shared concepts in the same way that other people might discuss their favorite TV shows.
This post is an experiment to see what the shared library of augmented reality and virtual reality developers might one day look like. Digital reality development is a profession that currently does not really exist but which is already being predicted to be a multi-billion dollar industry by 2020.
HoloCoding, in other words, is a profession that exists only virtually for now. As a profession, it will envelop concerns much greater than those considered by today’s software developers. Whereas contemporary software development is mostly about collecting data, reporting on data and moving data from point A to points B and C, spatial software development will be more concerned with environments and will have to draw on complex mathematics as well as design and experiential psychology. The bookshelf of a holocoder will look remarkably different from that of a modern data coder. Here are a few ideas regarding what I would expect to find on a future developer’s bookshelf in five to ten years.
1. Understanding Media by Marshall McLuhan – written in the 60’s and responsible for concepts such as ‘the global village’ and hot versus cool media, McLuhan pioneered the field of media theory. Because AR and VR are essentially new media, this book is required reading for understanding how these technologies stand side-by-side with or perhaps will supplant older media.
2. Illuminations by Walter Benjamin – while the whole work is great, the essay ‘The Work of Art in the Age of Mechanical Reproduction’ is a must read for discussing how traditional notions about creativity fit into the modern world of print and now digital reproduction (which Benjamin did not even know about). It also deals at an advanced level with how human interactions work on stage versus film and the strange effect this creates.
3. Sketching User Experiences by Bill Buxton – this classic was quickly adopted by web designers when it came out. What is sometimes forgotten is that the book largely covers the design of products and not websites or print media – products like those that can be built with HoloLens, Magic Leap and Oculus Rift. Full of insights, Buxton helps his readers to see the importance of lived experience when we design and build technology.
4. Bergsonism by Gilles Deleuze – though Deleuze is probably most famous for his collaborations with Felix Guattari, this work on the philosophical meaning of the term ‘virtual reality’, not as a technology but rather as a way of approaching the world, is a gem.
5. Passwords by Jean Baudrillard – what Deleuze does for virtual reality, Baudrillard does for other artifacts of technological language in order to show their place in our mental cosmology. He also discusses virtual reality along the way, though not as thoroughly.
6. Mathematics for 3D Game Programming and Computer Graphics by Eric Lengyel – this is hardcore math. You will need this. You can buy it used online for about $6. Go do that now.
7. Linear Algebra and Matrix Theory by Robert Stoll – this is a really hard book. Read the Lengyel before trying this. This book will hurt you, by the way. After struggling with a page of this book, some people end up buying the Manga Guide to Matrix Theory thinking that there is a fun way to learn matrix math. Unfortunately, there isn’t and they always come back to this one.
8. Phenomenology of Perception by Maurice Merleau-Ponty – when it first came out, this work was often seen as an imitation of Heidegger’s Being and Time. It may be the case that it can only be truly appreciated today when it has become much clearer, thanks to years of psychological research, that the mind reconstructs not only the visual world for us but even the physical world and our perception of 3D spaces. Merleau-Ponty pointed this out decades ago and moreover provides a phenomenology of our physical relationship to the world around us that will become vitally important to anyone trying to understand what happens when more and more of our external world becomes digitized through virtual and augmented reality technologies.
9. Philosophers Explore the Matrix – just as The Matrix is essential viewing for anyone in this field, this collection of essays is essential reading. This is the best treatment available of a pop theme being explored by real philosophers – actually most of the top American philosophers working on theories of consciousness in the 90s. Did you ever think to yourself that The Matrix raised important questions about reality, identity and consciousness? These professional philosophers agree with you.
10. Snow Crash by Neal Stephenson – sometimes to understand a technology, we must extrapolate and imagine how that technology would affect society if it were culturally pervasive and physically ubiquitous. Fortunately Neal Stephenson did that for virtual reality in this amazing book that combines cultural history, computer theory and a fast paced adventure.
Over the past few years we’ve seen the rapid release of innovative consumer technologies that are all loosely related by their ability to scan 3D spaces, interact with 3D spaces or synthesize 3D spaces. These include the Kinect sensor, Leap Motion, Intel Perceptual Computing, Oculus Rift, Google Glass, Magic Leap and HoloLens. Additional related general technologies include projection mapping and 3D printing. Additional related tools include Unity 3D and the Unreal Engine.
Despite a clear family resemblance between all of these technologies, it has been difficult to clearly define what that relationship is. There has been a tendency to categorize all of them as simply being “bleeding edge”, “emerging” or “future”. The problem with these descriptors is that they are ultimately relative to the time at which a technology is released and are not particularly helpful in defining what holds these technologies together in a common gravitational pull.
I basically want to address this problem by engaging in a bit of word magic. Word magic is a sub-category of magical thinking and is based on a form of psychological manipulation. If you have ever gone out to Martin Fowler’s Bliki then you’ve seen the practice at work. One of the great difficulties of software development is anticipating the unknown: the unknown involved in requirements, the unknown related to timelines, and the unknown concerned with the correct tactics to accomplish tasks. In a field with a limited history and a tendency not to learn from other related fields, the fear of the unknown can utterly cripple projects.
Martin Fowler’s endless enumeration of “patterns” on his bliki takes this on directly by giving names to the unknown. If one reads his blog carefully, however, it quickly becomes clear that most, though not all, of these patterns are illusory: they are written at such an abstract level that they fail to provide any prescriptive advice on how to solve the problems they are intended to address. What they do provide, however, is a sense of relief that there is a “name” that can be used to plug up the hole opened up in time by the fear of the unknown. Solutions architects can return to their teams (or their managers) and pronounce proudly that they have found a pattern to solve the outstanding problem that is hanging over everyone – all that remains is to determine what each “name” actually means.
In this sense, the whole world of software architecture – which Glassdoor ranked as the 11th best job of 2015 – is a modern priesthood devoted to prophetic interpretations of “design patterns”.
I similarly want to use word magic to define the sort of person that works with the sorts of technology I listed at the top of this article. I think I can even do it quite simply with familiar imagery.
A holocoder is someone who works with technologies that are inspired by and/or anticipate the Star Trek Holodeck.
The part of the definition that states “inspired by and/or anticipate” may seem strange but it is actually quite essential. It is based on a specific temporal-cybernetic theory concerning the dissemination of ideas which I will attempt to describe but which is purely optional with respect to the definition.
But first: how can a theory be both essential and optional? This is an issue that Niels Bohr, one of the fathers of quantum mechanics, tackled frequently. In the early 30’s Bohr was travelling through eastern Europe on a lecture tour. During part of the tour, a former student met him at his inn and noticed him nailing a horse shoe over the door of his room. “Professor Bohr”, he asked, “what are you doing?” Niels Bohr replied, “The Inn Keeper informed me that a horse shoe over the door will bring me luck.” The student was scandalized by this. “But Herr Professor,” the student objected, “surely a physicist and intellectual such as yourself does not believe in these silly superstitions.” “Of course not,” Bohr answered. “But the Inn Keeper reassured me that the horse shoe will bring me luck whether I believe in it or not.”
Here is the optional theory of the Holodeck. Certain technologies, it seems to me, can have such an influence that they shape the way we think about the world. We have seen many examples of this in our past such as the printing press, the automobile, the personal computer and the cell phone. Furthermore we anticipate the advent of similar major technologies in our future. These technologies have what is called a “psychic resonance” and change the very metaphors we use to describe our world. To give a simple example, whereas we originally used mental metaphors to explain computers in terms of “memory”, “processing” and even “computing”, today we use computer metaphors to help explain how the brain works. The arrival of the personal computer caused a shift and a reversal in what semioticians call the relationship between the explanans and the explanandum.
Psychic impact is transmitted over carriers called “memes”. Memes are basically theoretical constructs that are phenomenally identical to what we call “ideas” but behave like viruses. Memes travel through air as speech and along light waves as images in order to spread themselves from host to host. Traditionally the psychic impact of a meme is measured by the meme’s density over a given space. Besides density, the psychic impact can also be measured based on the total volume of space it is able to infect. Finally, the effectiveness of a meme can also be measured based on its ability to spread into the future. For instance, works of literature and cultural artifacts such as religions and even famous sayings are examples of memes that have effectively infected the future despite a distance of thousands of years between the point of origin of the infection and the temporal location of the target.
While the natural habitat of bacteria like E. coli is in the gastrointestinal tract, the natural habitat of memes is in the brain and this leads to a fascinating third form of memetic transmission. At the level of microtubules in the brain where memes happen to live, we enter the Planck scale in which classical physics does not apply in the way that it does at the macro level. At this scale, effects like quantum entanglement create spooky behaviors such as quantum communication. While theoretically people still cannot communicate with each other in time since that level of semiotics is still governed by classical physics, there is an opening for memetic viruses to actually be transmitted backwards in time as if they were entering a transporter in one brain and rematerialized in another brain in the past. This allows for a third manner of memetic spread: in space, forward in time, and finally backwards in time.
As an aside, and as I said above, this is an _optional_ theory of psychic impact through time. A common and totally valid criticism is that it appeals to quantum mystery which tends to be misused to justify anything from ghosts to religious cults. The problem with appeals to “quantum mystery” is that this simply provides a name for a problem rather than prescribing actual ways to make predictions or anticipate behavior. In other words, like Martin Fowler’s bliki, it is word magic that provides interpretations of things but not actual solutions. Against such criticisms, however, it should be pointed out that I am explicitly engaged in an exercise in word magic, in which case using certain techniques of word magic – such as quantum mystery – is perfectly legitimate and even natural.
Through quantum entanglement acting on memes at the microtubule level, a technology from our possible future which resembles the Star Trek holodeck has such a large psychic impact that it resonates backwards in time until it reaches and inhabits the brains of the writers of a futuristic science fiction show in the late 80’s and is introduced into the show as the Holodeck. Through television transmissions, the holodeck meme is then broadcast to millions of teenagers who eventually enter the tech industry, become leaders in the tech industry, and eventually decide to implement various aspects of the holodeck by creating better and better 3D sensors, 3D simulation tools and 3D visualization technologies – both augmented and virtual. In other words, the Holodeck reaches backwards in time to inspire others in order to effectively give birth to itself, ex nihilo. Those that have been touched by the transmission are what I am calling holocoders.
Alternatively, this theory of where holocoders come from can be taken as a metaphor only. In this case, holocoders are not people being pulled toward a common future but instead people being pushed forward from a common past. Holocoders are people inspired directly or indirectly by a television show from the late 80’s that involved a large room filled with holograms that could be used for entertainment as well as research. Holocoders work on any or all of the wide variety of technologies that could potentially be combined to recreate that imagined experience.
Anyways, that’s my theory and I’m sticking to it. More importantly, these technologies are deeply entangled and deserve a good name, whether you want to go with holocoding or something else (though the holodeck people from the future highly encourage you to use the terms “holocoder”, “holocoding” and “holodeck”).
There are two other important instances of environment simulators which for whatever reason do not have the same impact as the Star Trek holodeck but are nevertheless worth mentioning.
The first is the X-Men Danger Room which is an elaborate obstacle course involving holograms as well as robots used to train the X-Men. While the Danger Room goes back to the 60’s, the inclusion of holograms actually didn’t happen until the early 90’s, and so actually comes after the Star Trek environment simulator.
Clifford D. Simak published Way Station in 1963 (and won a Hugo award for it). It actually anticipates two Star Trek technologies – transporters as well as an environment simulator. Enoch Wallace, the hero of the story, works the earth relay station for intergalactic aliens who transport travelers over vast distances by sending them over shorter hops between the way stations of the title. Because he is so isolated in his job, the aliens support him by allowing him to pursue a hobby. Because Wallace enjoys hunting, the aliens build for him an environment simulator that lets him do big game hunting for dinosaurs.