The Imaginative Universal

Top 21 HoloLens Ideas

January 26, 2015 James Ashley

The image above is a best guess at the underlying technology being used in Microsoft’s new HoloLens headset. It’s not even that great a guess since the technology appears to still be in the prototype stage. On the other hand, the product is tied to the Windows 10 release date, so we may be seeing a consumer version – or at the very least a dev version – sometime in the fall.

Here are some things we can surmise about HoloLens:

a) the name may change – HoloLens is a good product name but isn’t quite where we might like it to be, in a league with Kinect, Silverlight or Surface for branding genius. In fact, Surface was such a good name, it was taken from one product group and simply given to another in a strange twist on the build vs buy vs borrow quandary. On the other hand, HoloLens sounds more official than the internal code name, Baraboo — isn’t that a party hippies throw themselves in the desert?

johnny mnemonic

b) this is augmented reality rather than virtual reality. Facebook’s Oculus Rift, which is an immersive fully digital experience, is an example of virtual reality. Other fictional examples include the Oasis from Ernest Cline’s Ready Player One, the Metaverse from Neal Stephenson’s Snow Crash, William Gibson’s Cyberspace and the VR simulation from The Lawnmower Man. Augmented reality involves overlaying digital experience on top of the real world. This can be accomplished using holography, transparent displays, or projectors. A great example of projector-based AR is the RoomAlive project by Hrvoje Benko, Eyal Ofek and Andy Wilson at Microsoft Research. HoloLens uses glasses or a head-rig – depending on how generous you feel – to implement AR. Magic Leap – with heavy investment from Google – appears to be doing the same thing. The now dormant Google Glass was neither AR nor VR, but was instead a heads-up display.

kgirl

c) HoloLens uses Kinect technology under the plastic covers. While the depth sensor in the Kinect v2 has a field of view of 70 degrees by about 60 degrees, the depth capability in HoloLens is reported to include a field of view of 120 degrees by 120 degrees. This indicates that HoloLens will be using the Time-of-Flight technology used in Kinect v2 rather than the structured light from Kinect v1. This setup requires both an IR emitter and a depth camera, combined with sophisticated timing and phase technology, to calculate depth efficiently and relatively inexpensively.
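To make the "timing and phase" idea concrete, here is a minimal sketch of how a continuous-wave Time-of-Flight sensor could turn a measured phase shift into a distance. This is one common ToF scheme, not a description of what HoloLens or the Kinect v2 actually does internally, and the function names and the 50 MHz modulation frequency are assumptions chosen purely for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def phase_to_depth(phase_shift_rad, modulation_hz):
    """Distance implied by the phase shift of a modulated IR signal.

    The light travels to the surface and back, so the round trip
    covers twice the distance; hence 4*pi instead of 2*pi.
    """
    return SPEED_OF_LIGHT * phase_shift_rad / (4.0 * math.pi * modulation_hz)


def unambiguous_range(modulation_hz):
    """Farthest distance before the phase wraps around past 2*pi."""
    return SPEED_OF_LIGHT / (2.0 * modulation_hz)


# Hypothetical 50 MHz modulation: half a cycle of phase shift
# corresponds to roughly 1.5 m, and the phase wraps at roughly 3 m.
half_cycle_depth = phase_to_depth(math.pi, 50e6)
wrap_distance = unambiguous_range(50e6)
```

The wrap-around distance is the reason real sensors tend to combine several modulation frequencies: a single frequency cannot tell a surface at 1 m from one at 1 m plus a full wrap.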

hands

d) the depth camera is being used for multiple purposes. The first is for gesture detection. One of the issues that faced both Oculus and Google Glass was that they were primarily display technologies. But a computer monitor is useless without a keyboard or mouse. Similarly, Oculus and Glass needed decent interaction metaphors. Glass relied primarily on speech commands and tapping and clicking. Oculus had nothing until their recent acquisition of NimbleVR. NimbleVR provides a depth camera optimized for hand and finger detection over a small range. This can be mounted in front of the Oculus headset. Conceptually, this allows people to use hand gestures and finger manipulations in front of the device. A virtual hand can be created as an affordance in the virtual world of the Oculus display, allowing users to interact with virtual objects and virtual interactive menus in virtuo.

The depth sensor in HoloLens would work in a similar way except that instead of a virtual hand as affordance, it’s just your hand. You will use your hand to manipulate digital objects displayed on the AR lenses or to interact with AR menus using gestures.

An interesting question is how many IR sensors are going to be on the HoloLens device. From the pictures that have been released, it looks like we will have a color camera and a depth sensor for each eye, for a total of two depth cameras and two RGB cameras located near the joint between the lenses and the headband.

holo_minecraft

e) HoloLens is also using depth data for 3D reconstruction of real world surfaces. These surfaces are then used as virtual projection surfaces for digital textures. Finally, the blitted image is displayed on the transparent lenses.

ra1

ra_2

This sort of reconstruction is a common problem in projection mapping scenarios. A great example of applying this sort of reconstruction can be found in the Halloween edition of Microsoft Research’s RoomAlive project. In the first image above, you are seeing the experience from the correct perspective. In the second image, the image is captured from a different perspective than the one that is being projected. From the incorrect perspective, it can be seen that the image is actually being projected on multiple surfaces – the various planes of the chair as well as the wall behind it – but foreshortened digitally and even color corrected to make the image appear cohesive to a viewer sitting at the correct position. One or more Kinects must be used to calibrate the projections appropriately against these multiple surfaces. If you watch the full video, you’ll see that Kinect sensors are used to track the viewer as she moves through the room and the foreshortening / skewing occurs dynamically to adjust to her changing position.

The Minecraft AR experience being used to show the capabilities of HoloLens requires similar techniques. The depth sensor is required not only to calibrate and synchronize the digital display to line up correctly with the table and other planes in the room, but also to constantly adjust the display as the player moves around the room.

eye-tracking

f) are the display lenses stereoscopic or holographic? At this point no one is completely sure, though indications are that this is something more than the stereoscopic display technique used in the Oculus Rift. While a stereoscopic display will create the illusion of depth and parallax by creating a different image for each lens, something holographic would actually be creating multiple images per lens and smoothly shifting through them based on the location of each pupil staring through its respective lens and the orientation and position of the player’s head.

One way of achieving this sort of holographic display is to have multiple layers of lenses pressed against each other and to use interference to shift the light projected into each pupil as the pupil moves. It turns out that the average person’s pupils typically move around rapidly in saccades, mapping and reconstructing images for the brain, even though we do not realize this motion is occurring. Accurately capturing these motions and shifting digital projections appropriately to compensate would create a highly realistic experience typically missing from stereoscopic reconstructions. It is rumored in the industry that Magic Leap is pursuing this type of digital holography.

On the other hand, it has also been reported that HoloLens is equipped with eye-tracking cameras on the inside of the frames, apparently to aid with gestural interactions. It would be extremely interesting if Microsoft’s route to achieving true holographic displays involved eye-tracking combined with a high display refresh rate rather than coherent light interference display technology as many people assume. Or, then again, it could just be stereoscopic displays after all.
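For comparison, the plain stereoscopic case is easy to sketch. The snippet below assumes a simple pinhole model in which each eye sits half an interpupillary distance to either side of the head's center line; the 64 mm eye spacing, the 1000-pixel focal length and the function names are all illustrative assumptions, not anything known about HoloLens.

```python
def project_for_eye(point_x_m, point_z_m, eye_x_m, focal_px):
    """Horizontal image coordinate of a point as seen from one eye."""
    return focal_px * (point_x_m - eye_x_m) / point_z_m


def stereo_pair(point_x_m, point_z_m, ipd_m=0.064, focal_px=1000.0):
    """Return (left, right) image coordinates for the same 3D point."""
    half = ipd_m / 2.0
    left = project_for_eye(point_x_m, point_z_m, -half, focal_px)
    right = project_for_eye(point_x_m, point_z_m, +half, focal_px)
    return left, right


def disparity(point_z_m, ipd_m=0.064, focal_px=1000.0):
    """Left-minus-right offset; it shrinks as the point moves away."""
    left, right = stereo_pair(0.0, point_z_m, ipd_m, focal_px)
    return left - right


near = disparity(1.0)  # point one metre away
far = disparity(2.0)   # same point pushed out to two metres
```

Doubling the distance halves the offset between the two images, which is the depth cue a stereoscopic display relies on. A display that also tracked each pupil would, in effect, be updating the eye position fed into this projection continuously rather than treating it as fixed.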

occlusion

g) occlusion is generally considered a problem for interactive experiences. For augmented reality experiences, however, it is a feature. Consider a physical-to-digital interaction in which you use your finger/hand to manipulate a holographic menu. The illusion we want to see is of the hand coming between the player’s eyes and the digital menu. The player’s hand should block and obscure portions of the menu as he interacts with it.

The difficulty with creating this illusion is that the player’s hand isn’t really between the menu and the eyes. Really, the player’s hand is on the far side of the menu, and the menu is being displayed on the HoloLens between the player’s eyes and his hand. Visually, the hologram of the menu will bleed through and appear on top of the hand.

In order to re-establish the illusion of the menu being located on the far side of the hand, we need depth-sensors to accurately map an outline of the hand and arm and then cut a hand and arm shape out of the menu where the hand should be occluding it. This process has to be repeated as the hand moves in real-time and it’s kind of a hard problem.
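The cut-out step can be sketched as a per-pixel depth comparison. This is a toy illustration under stated assumptions: the hologram and the depth map are small grids that already line up pixel for pixel, the menu sits at a single fixed depth, and the `occlude` helper is a hypothetical name. A real system would also have to align the sensor with the display and clean up noisy depth edges.

```python
def occlude(hologram, hologram_depth_m, depth_map_m):
    """Blank hologram pixels wherever a real surface is nearer to the eye.

    hologram         -- 2D list of pixel values for the virtual menu
    hologram_depth_m -- distance at which the menu should appear
    depth_map_m      -- 2D list of sensed real-world depths, same shape
    """
    result = []
    for holo_row, depth_row in zip(hologram, depth_map_m):
        result.append([
            # None means "draw nothing here", so the real hand shows through.
            None if real_depth < hologram_depth_m else pixel
            for pixel, real_depth in zip(holo_row, depth_row)
        ])
    return result


# A menu meant to float 1 m away, with a hand at 0.4 m covering part of it.
menu = [["M", "M", "M"],
        ["M", "M", "M"]]
depth = [[2.0, 0.4, 2.0],
         [2.0, 0.4, 0.4]]
masked = occlude(menu, 1.0, depth)
```

Running this every frame against a fresh depth map is what makes the problem hard in practice: the mask has to follow the hand with very little lag or the illusion breaks.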

borg

h) misc sensors : best guess is that in addition to depth sensors, color cameras and eye-tracking cameras, we’ll also get a directional microphone, gyroscope, accelerometer and magnetometer. Some sort of 3D sound has been announced, so it makes sense that there is a directional microphone or microphone array to complement it. This is something that is available on both the Kinect v1 and Kinect v2. The gyroscope, accelerometer and magnetometer are also guesses – but the Oculus hardware has them to track quick head movements, head position and head orientation. It makes sense that HoloLens will need them also.

bono

i) the current form factor looks a little big – bigger than the Magic Leap is supposed to be but smaller than the current Oculus dev units. The goal – really everyone’s goal, from Microsoft to Facebook to Google – is to continue to bring down the size of sensors so we can eventually have heavy glasses rather than light-weight head gear.

j) vampires, infrared sensors and transparent displays are all sensitive to direct sunlight. This consideration can affect the viability of some AR scenarios.

k) like all innovative technologies, the success of HoloLens will depend primarily on what people use it for. The myth of the killer app is probably not very useful anymore, but the notion that you need an app store to sell a device is a generally accepted universal constant. The success of the HoloLens will depend on what developers build for it and what consumers can imagine doing with it.

Top 21 Ideas

Many of these ideas are borrowed from other VR and AR technology. In most cases, HoloLens will simply provide a better way to implement these notions. These ideas come from movies, from art installations, and from many years working at an innovative marketing agency where we prototyped these ideas day in and day out.

1. Shopping

Amazon made one click shopping make sense. Shopping and the psychology of shopping changes when we make it more convenient, effectively turning instant gratification into a marketing strategy. Using HoloLens AR, we can remodel a room with virtual furniture and then purchase all the pieces on an interactive menu floating in the air in front of us when we find the configuration we want. We can try and buy virtual clothes. With a wave of the hand we can stock our pantry, stock our refrigerator … wait, come to think of it, with decent AR, do we even need furniture or clothes anymore?

2. Gaming

IllumiRoom was a Microsoft project that never quite made it to product but was a huge hit on the web. The notion was to extend the XBox One console with projections that reacted to what was occurring in the game but could also extend the visuals of the game into the entire living room. IllumiRoom (which I was fortunate enough to see live the last time I was in Redmond) also uses a Kinect sensor to scan the room in order to calibrate projection mapping onto surfaces like bookshelves, tables and potted plants. As you can guess, this is the same team that came up with RoomAlive. A setup that includes a $1,500 projector and a Kinect is a bit complicated, especially when a similar effect can now be created using a single unit HoloLens.

hud

The HoloLens device could also be used for in-game Heads-Up notifications or even as a second screen. It would make a lot of sense if XBox integration is on the roadmap and would set XBox apart as the clear leader in the console wars.

3. Communication

‘nuff said.

4. Home Automation

clapper

Home automation has come a long way and you can now easily turn lights on and off with your smart phone from miles away. Turning your lights on and off from inside your own house may still involve actually touching a light switch. Devices like the Kinect have the limitation that they can only sense a portion of a room at a time. Many ideas have been thrown out to create better gesture recognition sensors for the home, including using wifi signals that go through walls to detect gestures in other rooms. If you were actually wearing a gestural device around with you, this would no longer be a problem. Point at a bulb, make a fist, “put out the light, and then put out the light” to quote the Bard.

5. Education

Microsoft-future-vision

While cool visuals will make education more interesting, the biggest benefit of HoloLens for education is simple access. Children in rural areas in the US have to travel long distances to achieve a decent education. Around the world, the problem of rural education is even worse. What if educators could be brought to the children instead? This is one of the stated goals of Facebook’s purchase of Oculus Rift and HoloLens can do the same job just as well and probably better.

6. Medical Care

Technology can be used for interesting diagnostic and rehabilitation functions. The depth sensors that come with HoloLens will no doubt be used in these ways eventually. But like education, one of the great problems in medical care right now is access. If we can’t bring the patient to the doctor, let’s bring the GP to the patient and do regular check ups.

7. Holodeck

matrix-i-know-kung-fu

The RoomAlive project points the way toward building a Holodeck. All we have to do is replace Kinect sensors with HoloLens sensors, projectors with holographic displays, and then try not to break the HoloLens strapped to our heads as we learn Kung Fu.

8. Windows

window

Have you ever wished you could look out your window and be somewhere else? HoloLens can make that happen. You’ll have to block out natural light by replacing your windows with sheetrock, but after that HoloLens can give you any view you want.

But why stop at windows? You can digitize all your walls if you want, and HoloLens’ depth technology will take care of the rest.

9. Movies and Television

vr-cinema-3d

Oculus Rift and Samsung Gear VR have apps that let you watch movies in your own virtual theater. But wouldn’t it be more fun to watch a movie with your entire family? With HoloLens we can all be together on the couch but watch different things. They can watch Barney on the flatscreen while I watch an overlay of Meet the Press superimposed on the screen. Then again, with HoloLens maybe I could replace my expensive 60” plasma TV with a piece of cardboard and just watch that instead.

10. Therapy

It’s commonly accepted that white noise and muted colors relax us. Controlling our environment helps us to regulate our inner states. Behavioral psychology is based on such ideas and the father of behavioral psychology, B. F. Skinner, even created the Skinner box to research these ideas – though I personally prefer Wilhelm Reich’s Orgone box. With 3D audio and lenses that extend over most of your field of view, HoloLens can recreate just the right experience to block out the world after a busy day and just relax. shhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.

11. Concerts

burning-man-festival-nevada

Once a year in the Nevada desert a magical music festival is held called Baraboo. Or, I don’t know, maybe it’s held in Tennessee. In any case, getting to festivals is really hard and usually involves being around people who aren’t wearing enough deodorant, large crowds, and buying plastic bottles of water for $20. Wouldn’t it be great to have an immersive festival experience without all the things that get in the way? Of course, there are those who believe that all that other stuff is essential to the experience. They can still go and be part of the background for me.

12. Avatars

Gamification is a huge buzzword at digital marketing agencies. Undergirding the hype is the realization that our digital and RL experiences overlap and that it sometimes can be hard to find the seams. Vernor Vinge’s 2001 novella Fast Times at Fairmont High draws out the implications of this merging of physical and digital realities and the potential for the constant self reinvention we are used to on the internet bleeding over into the real world. Why continue with the name your parents gave you when you can live your AR life as ByteMst3r9999? Why be constrained by your biological appearance when you can project your inner self through a fun and bespoke avatar representation? AR can ensure that other people only see you the way that you want them to.

13. Blocking Other People’s Avatars

BlackBlock

The flip side of an AR society invested in an avatar culture is the ability to block people who are griefing us. Parents can call a time out and block their children for ten-minute periods. Husbands can block their wives. We could all start blocking our co-workers on occasion. For serious offenses, people face permanent blocking as a legal sanction for bad behavior by the game masters of our augmented reality world. The concept was brilliantly played out in the recent Black Mirror Christmas special starring Jon Hamm. If you haven’t been keeping up with Black Mirror, go check it out. I’ll wait for you to finish.

14. Augmented Media

fiducial

Augmented reality today typically involves a smart phone or tablet and a fiducial marker. The fiducial is a tag or bar code that indicates to the app on your phone where an AR experience should be placed. Typically you’ll find the fiducial in a magazine ad that encourages you to download an app to see the hidden augmented content. It’s novel and fun. The problem involves having to hold up your tablet or phone for a period of time just to see what is sometimes a disappointing experience. It would be much more interesting to have these augmented media experiences always available. HoloLens can be always on and searching for these types of augmented experiences as you read the latest New Yorker or Wired. They needn’t be confined to ads, either. Why can’t the whole magazine be filled with AR content? And why stop at magazines? Comic books with additional AR content would change the genre in fascinating ways (Marvel’s online version already offers something like this, though rudimentary). And then imagine opening a popup book where all the popups are augmented, a children’s book where all the illustrations are animated, or a textbook that changes on the fly and updates itself every year with the latest relevant information – a Kindle on steroids. You can read about that possibility in Neal Stephenson’s Diamond Age – only available in non-augmented formats for now.

15. Terminator Vision

robocop

This is what we thought Google Glass was supposed to provide – but then it didn’t. That’s okay. With vision recognition software and the two RGB cameras on HoloLens, you’ll never forget a name again. Instant information will appear telling you about your surroundings. Maps and directions will appear when you gesture for them. Shopping associates will no longer have to wing it when engaging with customers. Instead, HoloLens will provide them with conversation cues and decision trees that will help the associate close the sale efficiently and effectively. Dates will be more interesting as you pull up the publicly available medical, education and legal histories of anyone who is with you at dinner. And of course, with the heartbeat monitor and ability to detect small fluctuations in skin tone, no one will ever be able to lie to you again, making salary negotiations and buying a car a snap.

16. Wealth Management

With instant tracking of the DOW, S&P and NASDAQ along with a gestural interface that goes wherever you go, you can become a day trader extraordinaire. Lose and gain thousands of dollars with a flick of your finger.

17. Clippit

clippit

Call him Jarvis if it helps. Some sort of AI personal assistant has always been in the cards. Immersive AR will make it a reality.

18. Impossible UIs

minority_report

phone

3dtouch

cloud atlas floating computer

I don’t watch movies the way other people do. Whenever I go to see a futuristic movie, I try to figure out how to recreate the fantasy user experiences portrayed in it. Minority Report is an easy one – it’s a large area display, possibly projection, with Kinect-like gestural sensors. The communication device from the Total Recall reboot is a transparent screen and either capacitive touch or more likely a color camera doing blob recognition. The 3D touchscreen from Pacific Rim has always had me stumped. Possibly some sort of leap motion device attached to a Pepper’s Ghost display? The one fantasy UX I could never figure out until I saw HoloLens is the “Orison” computer made up of floating disks in Cloud Atlas. The Orison screens are clearly digital devices in a physical space – beautiful, elegant, and the sort of intuitive UX for which we should strive. Until now, they would have been impossible to recreate. Now, I’m just waiting to get my hands on a dev device to try to make working Orison displays.

19. Wiki World

wikipedia

Wiki World is a simple extension of Terminator Vision. Facts floating before your eyes, always available, always on. No one will ever have to look up the correct spelling for a word again or strain his memory for a sports statistic. What movie was that actor in? Is grouper ethical to eat? Is JavaScript an object-oriented language? Wiki World will make memorization obsolete and obviate all arguments – well, except for edit wars between Wikipedia editors, of course.

20. Belief Circles

wwc

Belief circles are a concept from Vernor Vinge’s Hugo award winning novel Rainbows End. Augmented reality lends itself to self-organizing communal affiliations that will create inter-subjective realities that are shared. Some people will share sci-fi themes. Others might go the MMO route and share a fantasy setting with a fictional history, origin story, guilds and factions. Others will prefer football. Some will share a common religion or political vision. All of these belief circles will overlap and interpenetrate. Taking advantage of these self-generating belief circles for content creation and marketing will open up new opportunities for freelance creatives and entrepreneurs over the next ten years.

21. Theater of Memory

Giulio Camillo’s memory theater belongs to a long tradition of mnemonic technology going back to Roman times and used by orators and lawyers to memorize long speeches. The scholar Frances Yates argued that it also belonged to another Renaissance tradition of neoplatonic magic that has since been superseded by science in the same way that memory technology has been superseded by books, magazines and computers. What Frances Yates – and after her Ioan Couliano – tried to show, however, was that in dismissing these obsolete modes of understanding the world, we also lose access to a deeper, metaphoric and humanistic way of seeing the world and are the poorer for it. The theater of memory is like Feng Shui – also discredited – in that it assumes that the way we construct our surroundings also affects our inner lives and that there is a sympathetic relationship between the macrocosm of our environment and the microcosm of our emotional lives. I’m sounding too new agey so I’ll just stop now. I will be creating my own digital theater of memory as soon as I can, though, as a personal project just for me.

upgraded blog engine

January 25, 2015 James Ashley

I just finished upgrading my blogengine.net from version 1.6 to version 3.1.1. Long overdue and about half a day’s work. BlogEngine.net is definitely a tool for developers rather than consumers. Now that it’s up, I’m pretty happy, though. Thanks also to the OrcsWeb support folks (OrcsWeb hosts my blog) who helped me when I kept locking myself out of the system because I can’t remember my ftp password.

I started this out with dasBlog back in 2006. I’m glad that I’ve only had to do a few upgrades in the years between.

$5 eBooks from Packt

December 27, 2014 James Ashley

The technical publisher Packt is offering eBooks for $5 through January 6th, 2015 as a holiday promotion. I encourage you to look very carefully through their selection and see what appeals. If you have time to read on, however, I’d like to explain in greater detail my mixed feelings about Packt (this was probably not the marketing department’s intention when they sent me an email asking me to publicize the promotion but I think it will ultimately be helpful to them).

Packt Publishing has always been hit or miss for me. They are typically much more adventurous regarding computer book topics than other publishers like Apress or O’Reilly (Apress is my publisher, by the way, and are pretty fantastic to work with and very professional). At the same time, I have the impression that Packt’s bar for accepting authors tends to be lower than other publishers’, which allows them to be prolific in their offerings but at the same time entails that they produce, quite honestly, some clunkers.

A specific example of one of their clunkers would be the Packt book Unity iOS Game Development Beginner’s Guide by Greg Pierce. The topic sounds great (at least it did to me) but it turns out the book mostly just copies from publicly available documentation.

To quote from one of the Amazon reviews from 2012 by C Toussieng:

“This book is unbelievably bad. What specifically? All of it. It takes information which can be easily garnered from the Unity and/or Apple websites, distills it down to a minimally useful amount, then charges you for it.”

And this one from 2012 by JasonR:

“The book basically covers a few pages of the Unity docs, then goes into 3rd party plugins they recommend, each plugin gets a couple of pages. Frankly, a simple search on Google will give you more insight.”

This is a shame since, even as more and more learning material – much of it free – appears on the Internet and displaces the traditional place of technical books in the software ecosystem, there is still an important role for print books (and their digital equivalent, the eBook). While online material can be thrown out quickly, often covering about a fifth to a tenth of a chapter of a book that goes through the print publishing industry, it tends to lack the cohesiveness that is only possible in a work that has taken months to write and rewrite. A 300-page software book is a distillation of experience which has undergone multiple revisions and fact checking. A really good software book tries to tell a story.

The flip side, of course, is that modern technical books quickly become outdated while technical blog posts simply disappear. All in all, though, I find that sitting down with a book that tries to explain the broader impact of a given technology serves a different and more important purpose than a web tutorial that only shows how to perform streamlined – and often ideal – tasks.

A propos of the thesis that good software books are distillations of years of experience – we could even say distillations of 10,000 hours of experience – I’d like to point you to some of the gems I’ve discovered through Packt Publishing over the years.

All of the Packt OpenCV books are interesting. I’m particularly fond of Mastering OpenCV with Practical Computer Vision Projects by Daniel Lélis Baggio, but I think all of them – at least the ones I’ve read – are pretty good. Daniel’s bio says that he “…started his works in computer vision through medical image processing at InCor (Instituto do Coração – Heart Institute) in São Paulo, Brazil.”

Another great one is Mastering openFrameworks: Creative Coding Demystified by Denis Perevalov. According to his bio, Denis is a computer vision research scientist at the Ural Branch of the Russian Academy of Sciences and co-author of two Russian patents of robotics.

One I really like simply because the topic is so specific is Kenny Lammers’ Unity Shaders and Effects Cookbook. His bio states that Kenny has been in the game industry for 13 years working for companies like “… Microsoft, Activision, and the late Surreal Software.”

I hope a theme is emerging here. The people who write these books actually have a lot of experience and are trying to pass their knowledge on to you in something more than easily digestible exercises. Best of all – ignoring the example from above – the material is typically highly original. It isn’t copy and pasted from 20 other websites covering the same material. Instead, the reader gets an opinionated and distinct take on the technology covered in each of these books.

What I especially appreciate about the $5 promotions Packt occasionally surfaces is that, for five dollars, you aren’t really obligated to try to read the entire book to get your money’s worth. I’ve taken advantage of similar deals in the past to simply read very specific chapters that are of interest to me such as Basic heads-up-display with custom GUI from Dr. Sebastian Koenig’s Unity for Architectural Visualization or Lighting and Rendering from Jen Rizzo’s Cinema 4D Beginner’s Guide. It’s also a great price when all I want to do is to skim a book on a topic I know pretty well in order to find out if there are any holes in my knowledge. Mastering Leap Motion by Brandon Sanders was extremely helpful for this and, indeed, there were holes in my knowledge.

According to his biography, by the way, Brandon is “… an 18-year-old roboticist who spends much of his time designing, building, and programming new and innovative systems, including simulators, autonomous coffee makers, and robots for competition. At present, he attends Gilbert Finn Polytechnic (which is a homeschool) as he prepares for college. He is the founder and owner of Mechakana Systems, a website and company devoted to robotic systems and solutions.”

Ash Thorp Shout Out

December 20, 2014 James Ashley

Ash Thorp is another alumnus of Atlanta’s ReMIX conference for code and design who has made it big (or, to be accurate, big-ger). Like Chris Twigg and 3Gear, who were profiled in the last post, we were fortunate to have Ash Thorp present at ReMIX a few years ago when it was still possible to get him.

Ash was recently named to the Verge 50, squeezed somewhere between Tim Cook and Matthew McConaughey. The Verge website’s Fifty of 2014 is their list of the “most important people at the intersection of technology, art, science and culture.”

Ash Thorp is a visual designer for movies, designing both the intros for movies as well as the overall look and feel of a film. His specialty is sci fi and superhero movies and you’ve seen his work everywhere from Prometheus to Ender’s Game and beyond. He first came to our attention because of his website where he lifted the curtain a bit and showed how film design is actually done. From this, we could start piecing together the similarities between what he did and the more standard graphic design typically done for digital and print.

Ash Thorp is also the host of the Collective Podcast, which is a series of open-ended and meandering conversations about design, life and the universe. It is an earnest attempt by creative professionals to connect the world with their work and to use their work as designers as a prism for understanding the world. There is nothing quite like it in my field, the software development world, and we are all the poorer for it.

He’s also done lots of other cool projects, like his homage to Ghost in the Shell, that are all about being creative and sharing inspiration without an underlying profit motive. He is constantly trying to share and change and mold and give which is as much a testament to his boundless energy as it is to his essentially giving spirit.

Here is the brilliant presentation Ash gave at the ReMIX Conference a few years ago, revealing his approach to … well … work, life and the universe.

The Future of Interface Technology – Ash Thorp from ReMIX South on Vimeo.

Congrats to NimbleVR

December 12, 2014 James Ashley

I had the opportunity to meet Rob Wang, Chris Twigg and Kenrick Kin of 3Gear several years ago when I was in San Francisco demoing retail experiences using the Microsoft Kinect and Surface Table at the 2011 Oracle OpenWorld conference. I had been following their work with stereoscopic finger and hand tracking with dual Kinects and sent them what was basically a fan letter and they were kind enough to send me an invitation to their headquarters.

At the time, 3Gear was co-sharing office space with several other companies in a large warehouse space. Their finger tracking technology blew me away and I came away with the impression that these were some of the smartest people I had ever met working with computer vision and the Kinect. After all, they’re basically all PhDs with backgrounds at companies like Industrial Light and Magic and Pixar.

I’ve written about them several times on this blog and nominated them for the Kinect v2 preview program. I was extremely excited when Chris agreed to present at the ReMIX conference some friends and I organized in Atlanta a few years ago for designers and developers. Here is a video of Chris’s amazing talk.

Bringing ‘Minority Report’ to your Desk: Gestural Control Using the Microsoft Kinect – Chris Twigg from ReMIX South on Vimeo.

Since then, 3Gear have worked on the problem of finger and hand tracking on various commercial devices in multiple configurations. In October of 2014 the guys at 3Gear initiated a Kickstarter project for a sensor they had developed called Nimble Sense. Nimble Sense is a depth sensor built from commodity components that is intended to be mounted on the front of an Oculus Rift headset. It handles the difficult problem of providing a good input device for the VR system which has the obvious side-effect of preventing you from seeing your own hands.

The solution, of course, is to represent the interaction controller – in this case the user’s hands – in the virtual world itself. Leap Motion, which produces another cool finger tracking device, also is working on creating a solution for this. The advantage the 3Gear people have, of course, is that they have been working on this particular problem with particular expertise in gesture tracking – rather than merely finger tracking – as well as visualization.

After exceeding their original goal in pledges, 3Gear abruptly cancelled their Kickstarter on December 11th and the official 3Gear.com website I have been going to for news updates about the company was replaced.

This is actually all good news. Nimble VR, a rebranding of 3Gear for the Nimble Sense project, has been purchased by Oculus (which in turn, you’ll recall, was purchased by Facebook several months ago for around $2 billion).

For me this is a Cinderella story. 3Gear / Nimble VR is an extremely small team of extremely smart people who have passed on much more lucrative job opportunities in order to pursue their dreams. And now they’ve achieved their much deserved big payday.

Congratulations Rob, Chris and Kenrick!

Projecting Augmented Reality Worlds

November 7, 2014 James Ashley

In my last post, I discussed the incredible work being done with augmented reality by Magic Leap. This week I want to talk about implementing augmented reality with projection rather than with glasses.

To be more accurate, varieties of AR experiences are often projection based. The technical differences depend on which surface is being projected on. Google Glass projects on a surface centimeters from the eye. Magic Leap is reported to project directly on the retina (virtual retinal display technology).

AR experiences being developed at Microsoft Research, which I had the pleasure of visiting this past week during the MVP Summit, are projected onto pre-existing rooms without the need to rearrange the room itself. Using fairly common projection mapping techniques combined with very cool technology such as the Kinect and Kinect v2, the room is scanned and appropriate distortions are created to make projected objects look “correct” to the observer.

An important thing to bear in mind as you look through the AR examples below is that they are not built using esoteric research technology. These experiences are all built using consumer-grade projectors, Kinect sensors and Unity 3D. If you are focused and have a sufficiently strong desire to create magic, these experiences are within your reach.

The most recent work created by this group (led by Andy Wilson and Hrvoje Benko) is a special version of RoomAlive they created for Halloween called The Other Resident. Just to prove I was actually there, here are some pictures of the lab along with the Kinect MVPs amazed that we were being allowed to film everything given that most of the MVP Summit involves NDA content we are not allowed to repeat or comment on.

IllumiRoom is a precursor to the more recent RoomAlive project. The basic concept is to extend the visual experience on the gaming display or television with extended content that responds dynamically to what is seen onscreen. If you think it looks cool in the video, please know that it is even cooler in person. And if you like it and want it in your living room, then comment on this thread or on the YouTube video itself to let them know it is definitely an M viable product for the Xbox One, as the big catz say.

The RoomAlive experience is the crown jewel at the moment, however. RoomAlive uses multiple projectors and Kinect sensors to scan a room and then use it as a projection surface for interactive, procedural games: in other words, augmented reality.

A fascinating aspect of the RoomAlive experience is how it handles appearance preserving point-of-view dependent visualizations: the way objects need to be distorted in order to appear correct to the observer. In the Halloween experience at the top, you’ll notice that the animation of the old crone looks like it is positioned in front of the chair she is sitting on even though the projection surface is actually partially extended in front of the chair back and at the same time extended several feet behind the chair back for the shoulders and head. In the RoomAlive video just above you’ll see the view dependent visualization distortion occurring with the running soldier changing planes at about 2:32.
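To make the idea of a view dependent distortion a little more concrete, here is a minimal sketch of the underlying geometry, assuming a single flat projection surface and a known eye position. It is not the RoomAlive code; the function name `project_to_surface` and the coordinates are made up for illustration. The point is simply that the spot you paint on the wall is where the line of sight from the observer's eye through the virtual object hits the surface, so the spot has to move when the observer moves.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def project_to_surface(eye: Vec3, virtual_point: Vec3,
                       plane_point: Vec3, plane_normal: Vec3) -> Vec3:
    """Find where a virtual point must be drawn on a flat surface so that
    it lines up with the virtual point when seen from `eye`.

    This is a plain ray-plane intersection: the ray starts at the eye and
    passes through the virtual point.
    """
    direction = _sub(virtual_point, eye)
    denom = _dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        raise ValueError("line of sight is parallel to the surface")
    t = _dot(_sub(plane_point, eye), plane_normal) / denom
    if t <= 0:
        raise ValueError("surface is behind the observer")
    return (eye[0] + t * direction[0],
            eye[1] + t * direction[1],
            eye[2] + t * direction[2])


# A wall at z = 0 and a virtual object floating one meter in front of it.
wall_point, wall_normal = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
ghost = (0.0, 1.0, 1.0)

# Seen head-on, the ghost is painted directly behind its virtual position.
head_on = project_to_surface((0.0, 1.0, 3.0), ghost, wall_point, wall_normal)

# If the observer steps two meters to the right, the paint has to shift left.
from_right = project_to_surface((2.0, 1.0, 3.0), ghost, wall_point, wall_normal)
```

A real room is not one flat wall, which is why the scan from the Kinect matters: the same ray has to be intersected with whatever surface it actually hits, chair backs included.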

You would think that these appearance preserving point-of-view dependent techniques will fall apart anytime you have more than one person in the room. To address this problem, Hrvoje and Andy worked on another project that plays with perception and physical interactions to integrate two overlapping experiences in a Wizard Battle scenario called Mano-a-Mano or, more technically, Dyadic Projected Spatial Augmented Reality. The globe visualization at 2:46 is particularly impressive.

My head is actually still spinning following these demos and I’m still in a bit of a fugue state. I’ve had the opportunity to see lots of cool 3D modeling, scanning, virtual experiences, and augmented reality experiences over the past several years and felt like I was on top of it, but what MSR is doing took me by surprise, especially when it was laid out sequentially as it was for us. A tenth of the work they have been doing over the past two years could easily be the seed of an idea for any number of tech startups.

In the middle of the demos, I leaned over to one of the other MVPs and whispered in his ear that I felt like Steve Jobs at Xerox PARC seeing the graphical user interface and mouse for the first time. He just stroked his beard and nodded. It was a magic moment.

Why Magic Leap is Important

October 27, 2014 James Ashley

This past weekend a neighbor invited our entire subdivision to celebrate an Indian holiday called Diwali with them – The Festival of Lights. Like many traditions that immigrant families carry to the New World in their luggage, it had become an amalgamation of old and new. The hosts and other Indians from the neighborhood wore traditional South Asian formalwear. I was painfully underdressed in an old oxford, chinos and flip-flops. Others came in the formalwear of their native countries. Some just put on jackets and ties. We organized this Diwali as a pot-luck and had an interesting mix of biryanis, spaghetti, enchiladas, pancakes with syrup, borscht, tomato korma, Vietnamese spring rolls and puri.

The most important part of the celebration was the lighting of fireworks. For about two solid hours, children ran through a smoky cul-de-sac waving sparklers while firecrackers went off around them. Towards the end of this celebration, one of our hosts pulled out her iPhone in order to FaceTime with her father in India and show him the children playing in the background just as they would have back home, forming a line of continuity between continents using a 1500-year-old ritual and an international cellular system. Diwali is called the Festival of Lights, according to Wikipedia, because it celebrates the spiritual victory of light over darkness and ignorance.

When I got home I did some quick calculations. Getting to that Apple moment our host had with her father – we no longer have Hallmark moments but only Apple moments today – took approximately seven years. This is the amount of time it takes for a technology to go from seeming fantastic and impractical – because we don’t believe it can be done and can’t imagine how we would use it in everyday life if it were – to being unexceptional.

Video conferencing has been a staple of science fiction for ages, from 2001: A Space Odyssey to Star Trek. It was only in 2010, however, that Apple announced the FaceTime app making it generally available to anyone who could afford an iPhone. I’m basing the seven years from fantasy to facticity, though, on length of time since the initial release of the iPhone in 2007.

Magic Leap, the digital reality technology that has just received half a billion dollars of funding from companies like Google, is important because it points the way to what can happen in the next seven years. I will paint a picture for you of what a world with this kind of digital reality technology will look like and it’s perfectly okay if you feel it is too out there. In fact, if you end up thinking what I’m describing is plausible, then I haven’t done a good enough job of portraying that future.

Magic Leap is creating a wearable product which may or may not be called Dragonstone glasses and which may or may not be a combination of light field technology – like that used in the Lytro camera – and depth detection – like the Kinect sensor. They are very secretive about what they are doing exactly. When Magic Leap CEO Rony Abovitz talks about his product, however, he uses code to indicate what it is and what it isn’t.

In an interview with David Lidsky, Abovitz let slip that Dragonstone is “not holography, it’s not stereoscopic 3-D. You don’t need a giant robot to hold it over your head, you don’t need to be at home to use it. It’s not made from off-the-shelf parts. It’s not a cellphone in a View-Master.” At first reading, this seems like a quick swipe at Oculus Rift, the non-mobile, stereoscopic virtual reality solution built from consumer parts by Oculus VR and, secondarily, Samsung Gear VR, the mobile add-on to Samsung’s Galaxy Note 4 that turns it into a virtual reality device with stereoscopic audio. Dig a little deeper, however, and it’s apparent that his grand sweep of dismissal takes in a long list of digital reality plays over the years.

Let’s start with holography. Actually, let’s start with a very specific hologram.

let the wookie win

The 1977 holographic chess game from Star Wars is the precursor to both virtual and augmented reality as we think of them – for convenience, I am including them all under the “digital reality” rubric. No child saw this and didn’t want it. From George Lucas’s imaginative leap, we already see an essential aspect of the digital experience we crave that differentiates it from the actual technology we have. Actual holography involves a frame that we view the virtual image through. In Lucas’s vision, however, the holograms take up space and have a location.

harryhausen

What’s intriguing about the Star Wars scene is that as a piece of film magic, the technology behind the chess game wasn’t particularly innovative. It’s pretty much just the same claymation techniques Ray Harryhausen and others had been using since the 50’s and involves superimposing an animated scene over a live scene. The difference comes in how George Lucas incorporates it into the story. Whereas all the earlier films that mixed live and animated sequences sought to create the illusion that the monsters were real, in the battle chess scene, it is clear that they are not – for instance because they are semi-transparent. Because the elements of the chess game are explicitly not real within the movie narrative – unlike Wookiees, Hutts, and Tauntauns – they are suddenly much more interesting. They are something we can potentially recreate.

The difference between virtual reality and augmented reality is similarly one of context. Which is which depends on how we, as the observer, are related to the digital experience. In the case of augmented reality, the context is the real world into which digital objects are inserted. An example of this occurs in Empire Strikes Back [1980], where the binoculars on Hoth provide additional information presented as an overlay on the real world.

The popular conception of virtual reality, as opposed to the technical accomplishment, probably dates to the publication of William Gibson’s Neuromancer in 1984. Gibson’s “cyberspace” is a fully digital immersive world. Unlike augmented reality where the context is our reality, in cyberspace the context is a digital space into which we, as observers and participants, are superimposed.

titan

To schematize the difference, in augmented reality, reality is the background and digital content is in the foreground; in virtual reality, the background that we perceive is digital while the foreground is a combination of digital and actual objects. I find this to be a clean way of distinguishing the two and preferable to the tendency to distinguish them based on different degrees of immersion. To the extent that contemporary VR is built around improving the video game experience, we see that POV games have, as a goal, the creation of increasingly realistic worlds – but what is more realistic than the real world? On the other side, augmented reality, when done right, has the potential to be incredibly immersive.

We can subdivide augmented reality even further. We’ll actually need to in order to elucidate why AR in Magic Leap is different from AR in Google Glass. Overlaying digital content on top of reality can take several forms and tends to fall along two axes. An AR experience is either POV or non-POV. It can also be either informational or interactive.

terminator_view

Augmented Reality in the POV-Informatics quadrant is often called Terminator Vision after the 1984 sci-fi Austrian body-builder augmented film. I’m not sure why a computer, the Terminator, would need a display to present data to itself, but in terms of the narrative it does wonders for the audience. It gives a completely false sense of what it must be like to think like a computer.

google glass

Experiences in the non-POV-Informatics quadrant are typically called Heads-Up-Displays or HUD. They have their source in military applications but are probably best known from first-person-shooters where the view-point is tied to objects like windshields or gun-sights rather than to the point-of-view of the player. They also don’t take up the entire view and consequently we can look away from them – unlike Terminator Vision. Google Glass is actually an example of a HUD – though it is sometimes mistaken for Terminator Vision – since the display only fills up the right corner of the visual field.

fiducial

Non-POV interactive can be either magic mirror experiences or hand-held games and advertisements involving fiducials. This is a common way of creating augmented reality experiences for the iPad and smartphones. The device camera is pointed toward a fiducial, such as a picture in a catalog, and a 3-D model is layered over the video returned by the camera. Interestingly, Qualcomm, one of the backers in Magic Leap’s recent round of funding, is also a leader in developing tools for this type of AR experience.

hope

POV interactive, the final quadrant, is where Magic Leap falls. I don’t need to describe it because its exemplar is the sort of experience that Rony Abovitz says Dragonstone is not – the hologram from Star Wars. The difference is that where Abovitz is referring to the sort of holography we can do in actual reality, Magic Leap’s technology is the kind of holography that, so far, we have only been able to do in the movies.
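For readers who think better in code, the two axes and four quadrants described above can be written down as a tiny lookup table. This is only a restatement of the taxonomy in this post; the names `Viewpoint`, `Mode`, `QUADRANTS` and `classify` are my own labels, not anything from an AR toolkit.

```python
from enum import Enum


class Viewpoint(Enum):
    POV = "POV"
    NON_POV = "non-POV"


class Mode(Enum):
    INFORMATIONAL = "informational"
    INTERACTIVE = "interactive"


# One exemplar per quadrant, taken from the descriptions above.
QUADRANTS = {
    (Viewpoint.POV, Mode.INFORMATIONAL): "Terminator Vision",
    (Viewpoint.NON_POV, Mode.INFORMATIONAL): "heads-up display (Google Glass)",
    (Viewpoint.NON_POV, Mode.INTERACTIVE): "magic mirrors and fiducial-based hand-held AR",
    (Viewpoint.POV, Mode.INTERACTIVE): "Star Wars style holograms (Magic Leap)",
}


def classify(viewpoint: Viewpoint, mode: Mode) -> str:
    """Return the exemplar for an AR experience on the two axes."""
    return QUADRANTS[(viewpoint, mode)]


magic_leap = classify(Viewpoint.POV, Mode.INTERACTIVE)
google_glass = classify(Viewpoint.NON_POV, Mode.INFORMATIONAL)
```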

If you examine the two images I’ve included from Star Wars IV, you’ll notice that the holograms are seen not from a single point of view but from multiple points of view. This is a feature of persistent augmented reality. The digital AR objects virtually exist in a real-world location and exist that way for multiple people. Even though Luke and Ben have different photons shooting at their eyes displaying the image of Leia from different perspectives, they are nevertheless looking at the same virtual Princess.

This kind of persistence, and the sort of additional technology required to make it work, helps to explain part of the reason Google is interested in it. Google, as we know, already has its own augmented reality play. Where Google brings something new to a POV interactive AR experience is in its expertise in geolocation, without which persistent AR entities would be much harder to create.

This sort of AR experience does not necessarily imply the use of glasses. We don’t know what sort of pseudo-technology is used in the Star Wars universe, but there are indications that it is some sort of projection. In Vernor Vinge’s sci-fi novel Rainbows End [2006], persistent augmented reality is projected on microscopic filaments that people experience without wearables.

Because Magic Leap is creating the experience inside a wearable close-range display, i.e. glasses, additional tricks are required. In addition to geolocation – which is only a guess at this point – it will also require some sort of depth sensor to determine if real-world objects are located between the viewer and the object’s location. If there are, then the occlusion of the virtual entity has to be simulated in the visualization – basically, a chunk has to be cut out of the image.
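The "cut a chunk out of the image" step can be sketched as a per-pixel depth comparison. This is a guess at the general technique, not a description of what Magic Leap actually does; the function name `cut_out_occluded` and the tiny one-row "image" are invented for illustration. I'm assuming the depth sensor gives a distance to the nearest real surface for each pixel, and that the renderer knows how far away each virtual pixel is supposed to be.

```python
from typing import List, Optional

Pixel = Optional[str]          # None means "draw nothing, the real world shows through"
Depth = Optional[float]        # meters from the viewer; None where there is no virtual content


def cut_out_occluded(virtual_pixels: List[List[Pixel]],
                     virtual_depth: List[List[Depth]],
                     real_depth: List[List[float]]) -> List[List[Pixel]]:
    """Keep a virtual pixel only where it is closer to the viewer than the
    real surface measured by the depth sensor at the same pixel."""
    result = []
    for pix_row, vd_row, rd_row in zip(virtual_pixels, virtual_depth, real_depth):
        out_row = []
        for pixel, vd, rd in zip(pix_row, vd_row, rd_row):
            visible = pixel is not None and vd is not None and vd < rd
            out_row.append(pixel if visible else None)
        result.append(out_row)
    return result


# A virtual whale ("W") two meters away, covering the first two pixels.
whale_pixels = [["W", "W", None]]
whale_depth = [[2.0, 2.0, None]]

# The real room: a far wall at three meters, with a lamp one meter away in the middle.
room_depth = [[3.0, 1.0, 3.0]]

visible_whale = cut_out_occluded(whale_pixels, whale_depth, room_depth)
```

In this toy example the lamp sits in front of the whale, so the middle pixel is dropped and the real lamp shows through.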

magic-leap-whale

If I have described the Magic Leap technology correctly – and there’s a good chance I have not given the secretiveness around it – then what we are looking at seven years out is a world in which everything we see is constantly being photoshopped in real-time. At a basic level, this fulfills the Magic Leap promise to re-enchant the world with digital entities and also makes sense of their promotional materials.

There are also some interesting side-effects. For one, an augmented world would effectively turn everything and everyone into a potential billboard. Given Google’s participation, this even seems likely. As with the web, advertisements will pay for the content that populates an augmented reality world. Like the web and mobile devices, the same geolocation that makes targeted content possible may also be used to track our behavior.

magic

There are additional social consequences. Many strange aspects of online behavior may make their way into our world. Pseudo-anonymity, which can encourage bad behavior in good people, can become a larger aspect of our world. Instead of appearing as themselves, people may prefer enhanced versions of themselves or even avatars.

jedi_council

In seven years, it may become normal to sit across a conference table from a giant rabbit and Master Chief discussing business strategies. Constant self-reinvention, which is a hallmark of the online experience, may become even more prevalent. In turn, reputation systems may also become more common as a way to curb the problems associated with anonymity. Liking someone I pass in the street may become much more literal.

Jedi

There is also, however, the cool stuff. Technology, despite all the frequent articles to the contrary, has the power to bring people together. Imagine one day being able to share an indigenous festival with loved ones who live thousands of miles away. My eleven-year-old daughter has grown up with friends from around the world whom she has met online. Technology allows her not only to chat with them with texts, but also to speak with them while she is performing chores or walking around the house. Yet she has never met any of them. In seven years, we may live in a world where physical distance no longer implies emotional distance and where sitting around chatting face-to-face with someone you have never actually met face-to-face does not seem at all strange.

For me, Magic Leap points to a future where physical limitations are no longer limitations in reality.

Kinect SDK 2.0 Live!

October 22, 2014 James Ashley

Today the Kinect SDK 2.0 – the development kit for the new, improved Kinect version 2 – went live. You can download it immediately.

Kinect for Windows v2 is now out of its beta and pre-release phase.

Additionally, the Windows Store will now accept apps developed for Kinect. If you have a Kinect for Windows v2 sensor and are running Windows 8, you will be able to use it to run apps you’ve downloaded from the Windows Store.

And if you don’t have a Kinect for Windows v2? In that case, you can use the Kinect sensor from your Xbox One and – with a $50 adapter that Microsoft just released – turn it into a sensor you can use with your Windows 8 computer.

You basically now have a choice of purchasing a Kinect for Windows v2 kit for $200, or a separate Kinect for Xbox One for $150 and an adapter for $50.

Alternatively, if you already have the sensor that came with your Xbox One, Microsoft has effectively lowered the entry bar to $50 so you can start trying the new Kinect:

1. Buy the Kinect v2 adapter.

2. Download the SDK to your 64-bit Windows 8 machine.

3. Detach the Kinect from your Xbox One and plug it into your computer.

Code Camp, MVP, etc.

October 14, 2014 James Ashley

It has been a busy two weeks. On the first of the month I was renewed for the Microsoft MVP program. I started out as a Client App Dev MVP many years ago and am currently an MVP in the Kinect for Windows program. I’m very grateful to the Kinect for Windows team for re-upping me again this year. It’s a magnificent program and the team is incredibly supportive and helpful. It’s also an honor to be associated with the other K4W MVPs who are all amazing in their own right and, to be honest, somewhat intimidating. But they politely laugh at my jokes in group calls and rarely call me out when I say something stupid. For all this, I am very grateful.

I’m often asked how one gets into the MVP program. There are, of course, midnight rituals and secret nominations as with any similar association of people. In general, however, the MVP is given out for participating in community activities like message boards (yes, you should be answering questions on the MSDN forums and passing your knowledge on to others!) as well as Code Camps like the one I attended this past Saturday.

My talk at the 2014 Code Camp Atlanta was on the Kinect for Windows v2. It was appropriately called “Handwaving with the Kinect for Windows v2” since the projector in the room didn’t work for the first twenty minutes or so of the presentation. I was delighted to find out that I actually knew enough to talk through the features of the new Kinect without notes, slides, or a way to show my Kinect demos and still remain relatively entertaining and informative.

Once the nameless but wonderful tech guy finished installing a second projector in the room as I was going through my patter, I was able to start navigating through my slides using hand gestures and this gesture mapper tool I built last year: http://channel9.msdn.com/coding4fun/kinect/Kinect-PowerPoint-Mapper-A-fresh-look-at-Kinecting-to-PowerPoint

Anyways, I wanted to express my appreciation for the early morning attendees who sat through my hand-waving exercise and I hope it got you interested enough to download the SDK and start trying your hand at Kinect development.

MSR Mountain View and Kinect

September 21, 2014 James Ashley

Just before the start of the weekend, Mary Jo Foley broke the story that the Mountain View lab of Microsoft Research was being closed. Ideally, most of the researchers will be redistributed to other locations and not be casualties of the most recent round of layoffs.

The Kinect sensor is one of the great examples of Microsoft Research working well with a product team to bring something to market. Researchers from around the world worked on Project Natal (the code-name for Kinect). An extremely important contribution to the machine learning required to make skeleton tracking work on the Kinect was made in Mountain View.

Machine learning works best when you are dealing with lots of data. In the case of skeleton tracking, millions of images had been gathered. But how do you find the hardware to process that many images?

Fortunately, the Mountain View group specialized in distributed computing. One researcher in particular, Mihai Budiu, worked on a project that he believed would help the Project Natal team to solve one of its biggest hurdles. The project was called DryadLINQ and could be used to coordinate parallel processing over a large server cluster. The problem it solved was recognizing body parts for people of various sizes and shapes – a preliminary step to generating the skeleton view.
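To give a feel for what coordinating parallel processing means here, below is a toy, single-machine sketch of the general pattern: split the data into partitions, process each partition independently, then merge the partial results. It is not DryadLINQ and does not use its API. A thread pool stands in for the server cluster, and the sample data and the names `partition`, `count_labels` and `parallel_label_counts` are invented for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Sample = Tuple[str, str]  # (image id, body-part label), purely illustrative


def partition(items: List[Sample], n_parts: int) -> List[List[Sample]]:
    """Split the work into n_parts roughly equal slices."""
    return [items[i::n_parts] for i in range(n_parts)]


def count_labels(chunk: List[Sample]) -> Counter:
    """The per-partition step: each worker only sees its own slice."""
    return Counter(label for _, label in chunk)


def parallel_label_counts(samples: List[Sample], n_workers: int = 4) -> Counter:
    """Run the per-partition step in parallel, then merge the partial results."""
    chunks = partition(samples, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_labels, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


samples = [
    ("img0", "hand"), ("img1", "head"), ("img2", "hand"),
    ("img3", "torso"), ("img4", "head"), ("img5", "hand"),
]
counts = parallel_label_counts(samples, n_workers=3)
```

The real problem involved millions of images and actual machine learning rather than counting, but the shape of the solution (partition, process in parallel, merge) is the same.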

The research lab at Mountain View was an essential part of the Kinect story. It will be missed.