Extending Chatbots with Azure Cognitive Services

Microsoft Bot Framework is an open source SDK and set of tools for developing chatbots. One of the advantages of building chatbots with the Bot Framework is that you can easily integrate your bot service with the powerful AI algorithms available through Azure Cognitive Services. This is a quick and easy way to give your chatbot superpowers when you need them.

Microsoft Cognitive Services is an ever-growing collection of algorithms developed by experts in the fields of computer vision, speech, natural-language processing, decision assistance, and web search. The services simplify a variety of common AI-based tasks and are easily consumable through web APIs. The APIs are also constantly being improved, and some can even teach themselves to become smarter based on the information you feed them.

Here is a quick highlight reel of some of the current Cognitive Services available to chatbot creators:

Language

People have a natural ability to say the same thing in many ways. Intelligent bots need to be just as flexible in understanding what human beings want. The Cognitive Services Language APIs provide language models to determine intent, so your bots can respond with the appropriate action.

The Language Understanding Service (LUIS) easily integrates with Azure Bot Service to provide natural language capabilities for your chatbot. Using LUIS, you can classify a speaker’s intents and perform entity extraction. For instance, if someone tells your bot that they want to buy tickets to Amsterdam, LUIS can help identify that the speaker intends to book a flight and that Amsterdam is a location entity for this utterance.

While LUIS offers prebuilt language models to help with natural language understanding, you can also customize these models for particular language domains that are pertinent to your needs. LUIS also supports active learning, allowing your models to get progressively better as more people communicate with your bot.
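
To give a feel for what this looks like in practice, here is a minimal sketch of calling a LUIS prediction endpoint from C#. The resource endpoint, app ID, and key are placeholders, and the exact URL shape depends on the LUIS version and region you provision.

```csharp
// Hypothetical sketch: querying a LUIS prediction endpoint over REST.
// The endpoint, app ID, and key below are placeholders, not real values.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class LuisSketch
{
    static async Task Main()
    {
        var endpoint = "https://YOUR-RESOURCE.cognitiveservices.azure.com"; // placeholder
        var appId = "YOUR-LUIS-APP-ID";                                     // placeholder
        var key = "YOUR-PREDICTION-KEY";                                    // placeholder
        var query = Uri.EscapeDataString("I want to buy tickets to Amsterdam");

        using var http = new HttpClient();
        // LUIS v3 prediction call; the slot name and path may differ for your setup.
        var url = $"{endpoint}/luis/prediction/v3.0/apps/{appId}/slots/production/predict" +
                  $"?subscription-key={key}&query={query}";
        var json = await http.GetStringAsync(url);

        // The response contains a top-scoring intent (for example, "BookFlight")
        // and the extracted entities (for example, "Amsterdam" as a location).
        Console.WriteLine(json);
    }
}
```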

Decision assist services

Cognitive Services has knowledge APIs that extend your bot’s ability to make judgments. Where the language understanding service helps your chatbot determine a speaker’s intention, the decision services help your chatbot figure out the best way to respond. Personalizer, currently in preview, uses machine learning to provide the best results for your users. For instance, Personalizer can make recommendations or rank a chatbot’s candidate responses to select the best one. Additionally, the Content Moderator service helps identify offensive language, images, and video, filtering profanity and adult content.

Speech recognition and conversion

The Speech APIs in Cognitive Services can give your bot advanced speech skills that leverage industry-leading algorithms for speech-to-text and text-to-speech conversion, as well as Speaker Recognition, a service that lets people use their voice for verification. The Speech APIs use built-in language models that cover a wide range of scenarios with high accuracy.

For applications that require further customization, you can use the Custom Recognition Intelligent Service (CRIS). This allows you to calibrate the language and acoustic models of the speech recognizer by tailoring it to the vocabulary of the application and to the speaking style of your bot’s users. This service allows your chatbot to overcome common challenges to communication such as dialects, slang and even background noise. If you’ve ever wondered how to create a bot that understands the latest lingo, CRIS is the bot enhancement you’ve been looking for.
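
For a sense of the developer experience, here is a minimal speech-to-text sketch using the Microsoft.CognitiveServices.Speech SDK from C#. The key and region are placeholders, and the commented-out EndpointId line only hints at how a custom (CRIS-style) model would be selected, assuming you have trained and deployed one.

```csharp
// Minimal sketch of calling the Speech service with the Speech SDK.
// The key and region are placeholders, not real values.
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class SpeechSketch
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("YOUR-SPEECH-KEY", "YOUR-REGION"); // placeholders
        // config.EndpointId = "YOUR-CUSTOM-MODEL-DEPLOYMENT-ID"; // assumed hook for a custom model

        using var recognizer = new SpeechRecognizer(config);
        var result = await recognizer.RecognizeOnceAsync(); // listens on the default microphone

        if (result.Reason == ResultReason.RecognizedSpeech)
            Console.WriteLine($"Recognized: {result.Text}");
        else
            Console.WriteLine($"Recognition failed: {result.Reason}");
    }
}
```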

Web search

The Bing Search APIs add intelligent web search capabilities to your chatbots, effectively putting the internet’s vast knowledge at your bot’s fingertips. Your bot can access billions of:

  • webpages
  • images
  • videos
  • news articles
  • local businesses
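
As a rough sketch, a single REST call with a subscription key is enough to put that knowledge within your bot's reach. The endpoint below is one common form; the exact URL and key depend on how your Bing Search resource is provisioned.

```csharp
// Rough sketch of a Bing Web Search REST call from a bot's backend.
// The key is a placeholder, and the endpoint may differ for your resource.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class BingSearchSketch
{
    static async Task Main()
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR-BING-KEY"); // placeholder

        var query = Uri.EscapeDataString("current weather in Amsterdam");
        var json = await http.GetStringAsync(
            $"https://api.bing.microsoft.com/v7.0/search?q={query}");

        // The JSON response includes webPages, images, videos, and news results
        // that a bot can summarize or relay back to the user.
        Console.WriteLine(json);
    }
}
```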

Image and video understanding

The Vision APIs bring advanced computer vision algorithms for both images and video to your bots. For example, you can use them to recognize objects and people’s faces, estimate age and gender, or even detect emotions.

The Vision APIs support a variety of image-understanding features. They can categorize the content of images, determining if the setting is at the beach or at a wedding. They can perform optical character recognition on your photo, picking out road signs and other text. The Vision APIs also support several image and video-processing capabilities, such as intelligently generating image or video thumbnails, or stabilizing the output of a video for you.
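
To make that concrete, here is a minimal sketch of the Computer Vision analyze call over REST. The resource name, API version, visual features, and image URL are all placeholders, so check what your own Cognitive Services resource exposes.

```csharp
// Hypothetical sketch of the Computer Vision "analyze" REST call.
// Resource name, API version, key, and image URL are placeholders.
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class VisionSketch
{
    static async Task Main()
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR-VISION-KEY"); // placeholder

        // Ask for categories, a natural-language description, and detected faces.
        var url = "https://YOUR-RESOURCE.cognitiveservices.azure.com/vision/v3.2/analyze" +
                  "?visualFeatures=Categories,Description,Faces";
        var body = new StringContent("{\"url\":\"https://example.com/beach-wedding.jpg\"}",
                                     Encoding.UTF8, "application/json");

        var response = await http.PostAsync(url, body);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```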

Summary

While chatbots are already an amazing way to help people interact with complex data in a human-centric manner, extending them with web-based AI is a clear opportunity to make them even better assistants for people. Easy-to-use AI algorithms like the ones in Microsoft Cognitive Services remove language friction and give your chatbots superpowers.

Creating a Chatbot with Microsoft Azure QnA Maker and Alexa

QnA Maker is Microsoft’s easy-to-use, cloud-based API for turning a public-facing FAQ page, product manuals, and support documents into a natural-language bot service. Because it takes in pre-vetted data to use as its “smarts,” it’s one of the easiest ways to build a powerful bot for your company.

Alexa, of course, is the world’s most pervasive host for conversational bots. It’s found in homes, corporate boardrooms, and anywhere else people want easy access to web-based information.

In this article, I will show you how to attach the plumbing to push the Q&A information your company wants users to know onto the conversational devices they use most frequently.

Part 1: Creating a bot service with QnA Maker

To get started, I first created a free Azure account to play with. I then went to the QnA Maker portal page and clicked the Create a knowledge base tab at the top to set up the knowledge base for my bot. I then clicked the blue Create a QnA service button to make a new QnA service with my free Azure account.

I followed the prompts throughout the process, which made it easy to figure out what I needed to do at each step.

In step 2, I selected my Azure tenant, Azure subscription name, and Azure resource name associated with the QnA Maker service. I also chose the Azure QnA Maker service I’d just created in the previous step to host the knowledge base.

I then entered a name for my knowledge base and the URL of my company’s FAQ to use as the brains for my knowledge base. If you just want to test this part out, you can even use the FAQ for QnA Maker itself.

QnA Maker has an optional feature called Chit-chat that let me give my bot service a personality. I decided to go with “The Professional” for this, but definitely would like to try out “The Comic” at some point to see what that’s like.

The next step was just clicking the Create your KB button and waiting patiently for my data to be extracted and my knowledge base to be created.

Once that was done, I opened the Publish page in the QnA Maker portal, published my knowledge base, and hit the Create Bot button.

After filling out additional configuration information for Azure that was specific to my account, I had a bot deployed with zero coding on Microsoft Bot Framework v4. I could even chat with it using the built-in “Test in Web Chat” feature. You can find more details in this cognitive services tutorial.
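
As an aside, you can also query the published knowledge base directly over REST, which is roughly what the generated bot does for you behind the scenes. Here is a minimal sketch; the host, knowledge base ID, and endpoint key are placeholders you can read off the Publish page in the QnA Maker portal.

```csharp
// Sketch of querying a published QnA Maker knowledge base directly.
// Host, knowledge base ID, and endpoint key are placeholders from the Publish page.
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class QnaSketch
{
    static async Task Main()
    {
        const string host = "https://YOUR-QNA-SERVICE.azurewebsites.net"; // placeholder
        const string kbId = "YOUR-KNOWLEDGE-BASE-ID";                     // placeholder
        const string endpointKey = "YOUR-ENDPOINT-KEY";                   // placeholder

        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Authorization", $"EndpointKey {endpointKey}");

        var body = new StringContent("{\"question\":\"How do I create a knowledge base?\"}",
                                     Encoding.UTF8, "application/json");
        var response = await http.PostAsync(
            $"{host}/qnamaker/knowledgebases/{kbId}/generateAnswer", body);

        // The response JSON contains ranked answers with confidence scores.
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```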

Part 2: Making your bot service work on Alexa

To get the bot service I created above working with Alexa, I had to use an open-source middleware adapter created by the botbuilder community. Fortunately, the Alexa Middleware Adapter was available as a NuGet package for Visual Studio.

I went to the Azure portal and selected the bot I created in the previous section. This gave me the option to “Download Bot source code.” I downloaded my bot source code as a zip file, extracted it into a working directory, and opened it up in Visual Studio 2017.

When the bot is automatically generated, it’s created with references to the Microsoft.AspNetCore.App NuGet package and the Microsoft.AspNetCore.App SDK. Unfortunately, these had compatibility issues with the middleware package. To fix this, I right-clicked the Microsoft.AspNetCore.App NuGet package in the Solution Explorer window and removed it, which also automatically removed the equivalent SDK reference. To get back all the DLLs I needed, I used NuGet Package Manager to install the Microsoft.AspNetCore.All (2.0.9) package instead. Be sure to install this specific version of the package to ensure compatibility.

After making those adjustments to the solution, I went to the Visual Studio menu bar and selected Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution. I searched for Adapters.Alexa and installed the Bot.Builder.Community.Adapters.Alexa package.

If your downloaded app is missing its Program.cs or Startup.cs file, you will need to create these for your project in order to build and publish. In my case, I created a new Microsoft Bot Builder v4 project and copied these two files from there. In the constructor of the Startup class I created a ConfigurationBuilder to gather my app settings.

Then, in the ConfigureServices and Configure methods, I added calls to services.AddAlexaBot and app.UseAlexa in order to enable the Alexa middleware and set up a special endpoint for calls from Alexa.
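
Putting those pieces together, here is a rough sketch of how my modified Startup class ended up looking. The AddAlexaBot and UseAlexa calls come from the community adapter, but namespaces, option names, and overloads can differ between package versions, so treat this as an outline rather than drop-in code; MyQnABot stands in for the bot class generated by the template.

```csharp
// Rough sketch of the Startup changes described above, assuming the
// Bot.Builder.Community.Adapters.Alexa package. Treat as an outline:
// exact namespaces and options may vary by package version.
using Bot.Builder.Community.Adapters.Alexa.Integration.AspNet.Core; // assumed namespace
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public IConfiguration Configuration { get; }

    public Startup(IHostingEnvironment env)
    {
        // Gather app settings (QnA endpoint, keys, and so on) from configuration.
        var builder = new ConfigurationBuilder()
            .SetBasePath(env.ContentRootPath)
            .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
            .AddEnvironmentVariables();
        Configuration = builder.Build();
    }

    public void ConfigureServices(IServiceCollection services)
    {
        // The bot registrations generated by the QnA Maker template stay as they were;
        // this adds the Alexa adapter on top of them. MyQnABot is a placeholder.
        services.AddAlexaBot<MyQnABot>(options =>
        {
            options.AlexaOptions.ShouldEndSessionByDefault = false; // option name may vary by version
        });
    }

    public void Configure(IApplicationBuilder app)
    {
        // The original template middleware (the standard bot endpoint) stays here.
        // UseAlexa exposes the extra /api/skillrequests endpoint that Alexa calls.
        app.UseAlexa();
    }
}
```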

Following these code changes, I published the Web App Bot back to my Azure account. The original QnA Bot Service now has an additional channel endpoint for Alexa. The Alexa address is the original Web App Bot root address with /api/skillrequests added to the end.

At this point, I was ready to go to my Amazon account and create a new Alexa skill. I went to https://developer.amazon.com/alexa and signed in. (If you don’t already have a developer account, you will need to enter your information and agree to the developer EULA.) Next, I tapped the Alexa menu item at the top of the developer page and selected Alexa Skills Kit. This took me to https://developer.amazon.com/alexa/console/ask, where I clicked the Create Skill button.

I wrote in a unique name for my skill, selected Custom for the model, and clicked Create skill. On the following screen, I selected Start from Scratch for my template.

I selected JSON Editor.

Next, I opened another web browser window, went to this source code repository, and copied the example JSON found in the README.md file.

I returned to the web browser that had the Amazon Alexa portal open and pasted the JSON into the box. I changed the invocationName to the name of my skill, clicked Save Model, and finally clicked Build Model.

After waiting patiently for the build to complete, I selected Endpoint in the left navigation window and clicked HTTPS. I then entered the address of the Azure App Service URL and added /api/skillrequests to the end.

To distribute my Alexa skill so people can use it on their own Amazon devices, I clicked the Distribution link in the Alexa developer console and followed the instructions from there.

And before I knew it, I was able to have a conversation with my company’s FAQ page, using the QnA Maker’s professional chit-chat personality, from my living room.

Microsoft’s convergence of chatbots and mixed reality

One of the biggest trends in mixed reality this year is the arrival of chatbots on platforms like HoloLens. Speech commands are a common input for many XR devices. Adding conversational AI to extend these native speech recognition capabilities is a natural next step toward a future in which personalized virtual assistants backed by powerful AI accompany us in hologram form. They may be relegated to providing us with shopping suggestions, but perhaps, instead, they’ll become powerful custom tools that help make us sharper, give honest feedback, and assist in achieving our personal goals.

If you have followed the development of sci-fi artificial intelligence in television and movies over the years, the move from voice to full holograms will seem natural. In early sci-fi, such as HAL from the movie 2001: A Space Odyssey or the computer from the original Star Trek, computer intelligence was generally represented as a disembodied voice. In more recent incarnations of virtual assistance, such as Star Trek Voyager and Blade Runner 2049, these voices are finally personified by full holograms of the Emergency Medical Hologram and Joi.

In a similar way, Cortana, Alexa, and Siri are slowly moving from our smartphones, Echos, and Invoke devices to our holographic headsets. These are still early days, but the technology is already in place and the future incarnation of our virtual assistants is relatively clear.

The rise of the chatbot

For Microsoft’s personal digital assistant Cortana, who started her life as a hologram in the Halo video games for Xbox, the move to holographic headsets is a bit of a homecoming. It seems natural, then, that when Microsoft HoloLens was first released in 2016, Cortana was already built into the onboard holographic operating system.

Then, in a 2017 article on the Windows Apps Team blog, Building the Terminator Vision HUD in HoloLens, Microsoft showed people how to integrate Azure Cognitive Services into their holographic head-mounted display in order to provide smart object recognition and even translation services as a Terminator-like HUD overlay.

The only thing left to do to get to a smart virtual assistant was to tie together the HoloLens’s built-in Cortana speech capabilities with some AI to create an interactive experience. Not surprisingly, Microsoft was able to fill this gap with the Bot Framework.

Virtual assistants and Microsoft Bot Framework

Microsoft Bot Framework combines AI backed by Azure Cognitive Services with natural-language capabilities. It includes a set of open source SDKs and tools that enable developers to build, test, and connect bots that interact naturally with users. With the Microsoft Bot Framework, it is easy to create a bot that can speak, listen, understand, and even learn from your users over time with Azure Cognitive Services. This chatbot technology is sometimes referred to as conversational AI.

There are several chatbot tools available. I am most familiar with the Bot Framework, so I will be talking about that. Right now, chatbots built with the Bot Framework can be adapted for speech interactions or for text interactions like the UPS virtual assistant discussed below. They are relatively easy to build and customize using prepared templates and web-based dialogs.

One of my favorite ways to build a chatbot is by using QnA Maker, which lets you simply point to an online FAQ page or upload product documentation to use as the knowledge base for your bot service. QnA Maker then walks you through applying a chatbot personality to your knowledge base and deploying it, usually with no custom coding. What I love about this is that you can get a sophisticated chatbot rolled out in about half a day.

Using the Microsoft Bot Framework, you also have the ability to take full control of the creation process to customize your bot in code. Bot apps can be created in C#, JavaScript, Python or Java. You can extend the capabilities of the Bot Framework with middleware that you either create yourself or bring into your code from third parties. There are even advanced capabilities available for managing complex conversation flows with branches and loops.
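
For a sense of what that code-first route looks like, here is a minimal sketch of a custom bot written in C# against the Bot Framework SDK v4. The class name and echo behavior are my own illustration rather than a particular Microsoft sample.

```csharp
// A minimal sketch of a hand-coded Bot Framework v4 bot in C#.
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Bot.Builder;
using Microsoft.Bot.Schema;

public class EchoBot : ActivityHandler
{
    // Called for every message the user sends on any connected channel.
    protected override async Task OnMessageActivityAsync(
        ITurnContext<IMessageActivity> turnContext,
        CancellationToken cancellationToken)
    {
        // Echo the user's text back; a real bot would call LUIS, QnA Maker,
        // or custom dialogs here instead.
        var reply = MessageFactory.Text($"You said: {turnContext.Activity.Text}");
        await turnContext.SendActivityAsync(reply, cancellationToken);
    }
}
```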

Ethical chatbots

Having introduced the idea above of building a Terminator HUD using Cognitive Services, it’s important to also raise awareness about fostering an environment of ethical AI and ethical thinking around AI. To borrow from the book The Future Computed, AI systems should be fair, reliable and safe, private and secure, inclusive, transparent, and accountable. As we build all forms of chatbots and virtual assistants, we should always consider what we intend our intelligent systems to do, as well as concern ourselves with what they might do unintentionally.

The ultimate convergence of AI and mixed reality

Today, chatbots are geared toward integrating skills for commerce like finding directions, locating restaurants, and providing help with a company’s products through virtual assistants. One of the chief research goals driving better chatbots is to personalize the chatbot experience. Achieving a high level of personalization will require extending current chatbots with more AI capabilities. Fortunately, this isn’t a far-future thing. As shown in the Terminator HUD tutorial above, adding Cognitive Services to your chatbots and devices is easy to do.

Because holographic headsets have many external sensors, AI will also be useful for analyzing all this visual and location data and turning it into useful information through the chatbot and Cognitive Services. For instance, cameras can be used to help translate street signs if you are in a foreign city or to identify products when you are shopping and provide helpful reviews.

Finally, AI will be needed to create realistic 3D model representations of your chatbot and overcome the uncanny valley that is currently holding back VR, AR, and MR. When all three elements are in place to augment your chatbot — personalization, computer vision, and humanized 3D modeling — we’ll be that much closer to what we’ve always hoped for — personalized AI that looks out for us as individuals.


Increasing Business Reach with Azure Bot Service Channels

Where do bots live? It’s a common misconception that bots live on your Echo Dot, on Twitter, or on Facebook. To the extent bots call anywhere their home, it’s the cloud. Objects and apps like your iPhone and Skype are the “channels” through which people communicate with your bot.

Azure Bot Service Channels

Out of the box, Azure Bot Service supports the following channels (though the list is always growing):

  • Cortana
  • Email
  • Facebook
  • GroupMe
  • Kik
  • LINE
  • Microsoft Teams
  • Skype
  • Skype for Business
  • Slack
  • Telegram

Through middleware created by the Bot Builder Community, your business’s bots can reach additional channels like Alexa and Google Assistant.

With Direct Line, your developers can also establish communications between your bots and your business’s custom apps on the web and on devices.
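
As a sketch of what that plumbing looks like, the Direct Line REST API lets a custom app start a conversation and post messages to it with nothing more than an HTTP client. The secret and conversation ID below are placeholders.

```csharp
// Minimal sketch of a custom app talking to a bot over Direct Line.
// The Direct Line secret and conversation ID are placeholders.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class DirectLineSketch
{
    static async Task Main()
    {
        const string secret = "YOUR-DIRECT-LINE-SECRET"; // placeholder from the Azure portal
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", secret);

        // 1. Start a conversation; the response JSON includes a conversationId.
        var start = await http.PostAsync(
            "https://directline.botframework.com/v3/directline/conversations",
            new StringContent(""));
        Console.WriteLine(await start.Content.ReadAsStringAsync());

        // 2. Send a message activity to the bot (substitute the real conversationId).
        var conversationId = "CONVERSATION-ID-FROM-STEP-1"; // placeholder
        var activity = "{\"type\":\"message\",\"from\":{\"id\":\"user1\"},\"text\":\"hello\"}";
        await http.PostAsync(
            $"https://directline.botframework.com/v3/directline/conversations/{conversationId}/activities",
            new StringContent(activity, Encoding.UTF8, "application/json"));
    }
}
```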

Companies like Dixons Carphone, BMW, Vodafone, UEI, LaLiga, and UPS are already using Microsoft Azure Bot Service support for multiple channels to extend their Bot reach.

UPS Chatbot, for instance, delivers shipping information and answers customer questions through voice and text on Skype and Facebook Messenger. UPS, which invests more than $1 billion a year in technology, developed its chatbot in-house and plans to continue to update its functionality, including integration with the UPS My Choice® platform using Direct Line. In just the first eight months, UPS Bot has already had more than 200,000 conversations over its various channels.

LaLiga, the Spanish football league, is also reaching its huge and devoted fan base through multiple channels with Azure Bot Service. It is estimated that LaLiga touches 1.6 billion fans worldwide on social media.

Using an architecture that combines Azure Bot Service, Microsoft Bot Framework, and multiple Azure Cognitive Services such as LUIS and Text Analytics, LaLiga maintains bots on Skype, Alexa, and Google Assistant that use natural language processing. NLP allows their chatbots to understand both English and Spanish, their regional dialects, and even the soccer slang particular to each dialect. They are even able to use a tool called Azure Monitor anomaly detection to identify new player nicknames created by fans and then match them to the correct person. In this and similar ways, LaLiga’s chatbots are always learning and adapting over time. LaLiga plans to deploy its chatbots to almost a dozen additional channels in the near future.

Conclusion

Because social media endpoints are always changing, developing for a single delivery platform is simply not cost-effective. Channels provide businesses with a way to develop a bot once but deploy it to new social media platforms as they appear on the market and gain influence. At the same time, your core bot features can constantly be improved, and these improvements will automatically benefit the pre-existing channels people use to communicate with you.

Early Retirement

Hoa with little James

A few weeks ago, I was removed from the Microsoft MVP program. This occurred as a direct result of a presentation I gave about DIY Deep Fakes to other MVPs during the MVP Summit in Redmond, Washington on March 21st. I was told on the following Monday that someone had found a slide in my presentation offensive, that this constituted a violation of the MVP Code of Conduct, and so it was time for the MVP program and me to part ways immediately. All my MVP benefits would be revoked. All my email access to the MVP program would be cancelled.


Normally people lose their MVPs at renewal time for not doing enough for their technical communities, whatever that means, or for revealing NDA secrets of one sort or another. All MVPs live under all manner of non-disclosure agreements with the understanding that from time to time they will receive previews or roadmaps of upcoming Microsoft technology in exchange for honest feedback. For the most part, this is a good arrangement, except that over time the scope of the NDA has grown to include trivial things that have nothing to do with Microsoft technology and a lot to do with Microsoft’s self-marketing, while at the same time fewer and fewer “secrets” of any real value are shown to MVPs. But more on this later.

The customary way to accept losing one’s MVP is to go to social media, express regret over not being in contact with all the great friends one has made, state that the MVP program is Microsoft’s to run any way they want, and that you don’t really care one way or the other. And then you go cry in a fetal position for about a week and never really get over the trauma and humiliation of losing your MVP. Life goes on.

My earlier DIY Deep Fakes talk in Utrecht

Except in my case  1) I really loved being in the MVP program. I loved being able to sit next to someone I admire in the program and we both could take off our public masks at the same time and just shoot the breeze about tech stuff. I enjoyed meeting with the indie consultants who eke out a living doing technical presentations at conferences and converting them into short term development gigs. They are the high plains drifters of the new economy. And I loved moving with the same group of international experts over the past eight years from the Kinect MVP program to the Emerging Experiences MVP program to the Windows Dev  (HoloLens) MVP program and sharing their dreams for these technologies, exchanging tips and advice, developing a shorthand around discussing these tools and always helping each other out. And while I know I’ll continue to know them, it also won’t quite be the same anymore. I was lucky to have this privilege for the time that I had it and know I have lost something meaningful now that it is gone.

2) I didn’t violate the MVP Code of Conduct. I look at the thing and it says you shouldn’t show pictures of child pornography, incest, bestiality – which definitely sets the bar pretty high. And when I’ve asked about this I’ve been told that if anyone is offended by an image then the image is offensive and it is a Code of Conduct violation.

Which I don’t think is right. It isn’t right because it is arbitrary. If someone is offended by a picture of inter-racial kissing, is it then offensive? Really? It also isn’t right because it turns out the human race and the legal tradition aren’t new to changing standards of offense, which is why obscenity and offensiveness are traditionally understood to mean “offensive to prevailing standards in the adult community” rather than whatever someone reports as offensive.

And if I offended one person in the audience, I would really like to speak to them and understand why. Short of that, I would still like to understand how I harmed someone, because that is mortifying and not something I would ever want to do.

Instead, in an invitation marked “we have some feedback about your presentation,” I was told to call into a Microsoft Teams meeting on a Friday afternoon and then, as I was about to call in, had the call rescheduled for the following Monday. On the following Monday I talked to three men (and a fourth in the background who didn’t introduce himself). The meeting started off with a man high up in the MVP organization expressing regret that we hadn’t ever had a chance to meet in person, after which he quickly went into a “let’s cut to the chase” moment and told me I was being retired from the program for my presentation.

I asked for what and was told there was a pornographic image in my presentation – in the section of my Deep Fakes talk about the dangers of Deep Fake technology. I pointed out that the image was pixelated and was told that pixelated pornography was still pornography. I pointed out that the image was part of a video clip I had excerpted from a Wall Street Journal piece but was told that Microsoft has different standards of offensiveness than the Wall Street Journal does. I then pointed out that I had already done a longer version of the same presentation the week before in the Netherlands and had gotten an incredibly positive reaction. I was told that the standards in the Netherlands don’t really matter in this case because someone was offended in Redmond. Finally I pointed out that I had no intention of offending anyone and, following the talk I gave in the Netherlands, I didn’t expect that anyone would. I was told that my intentions didn’t matter. The lead indicated that it was time to end the call by saying that this was really hard on all of them, too. So I thanked them for their time and hung up. It was rotten and seemed like a bureaucratic thing, but there was nothing I could do at the time and I needed to move on with my day.

And I did okay with that until, toward the evening, I received about 20 emails asking if I had broken the CoC at the summit. It turned out that while I was told that I was kicked out for showing an offensive image, 400 Windows Development MVPs were told that an unacceptable violation of the Code of Conduct had occurred during the summit involving the exploitation of women and pornography—without mentioning that I had been kicked out for it. This was followed by instructions on how to handle sexual harassment situations, everyone’s obligation to report sexual harassment, what to do when victims of sexual harassment were too afraid to speak up for themselves, and finally guidance on how to recognize potentially offensive material in your presentations, which boiled down to: you really can’t.

In case you are wondering why so many people were able to identify me as the person being accused of sexual harassment in the email, there were only 8 speakers at the event in question, the schedules had been distributed previously, and my talk was the only one not involving databases, devops or frameworks.

I was devastated.

Before continuing, let me tell you a little more about me. I am mixed race and I grew up in Vietnam. My mother is Vietnamese and my father is a U.S. native. Vietnamese was my first language, and when we got to the United States I begged my family not to speak it at home because I wanted to be American so kids at school wouldn’t make fun of me for being different. When I was a kid, my family used to host boat people who escaped from the poverty and authoritarian government in Vietnam. We used to visit with people who had been tortured in re-education camps and had no fingernails. My mother spent a lot of time depressed, having been violently exiled from her homeland, her village, her extended family and her ancestors.

For me, being mixed race means never quite being American enough but also not being able to go back to being Vietnamese like I was as a child. I don’t fully have either identity. I also normally don’t have to think about it unless other people force me to. I’m just me most of the time.

I ended up getting a classical education at a small four-year college in Maryland and then entered a PhD program in Philosophy at Emory University. I did three years of course work that covered logic, feminist theory, philosophy of consciousness, ethics, critical theory, Renaissance philosophy and phenomenology – and I even taught ethics to the undergraduates on occasion. At about the time I was getting ready to write a dissertation, my wife and I were also getting ready to have our first baby and I realized I needed to go find a real job, which is how I discovered computer programming.

I bring up my background and my education not as a way to say I don’t make mistakes in judgment. I do all the time.

I bring it up to share with you how strange it felt, on that Monday morning, to have three white men who run the MVP program instruct me over the phone on inclusivity and diversity as they were kicking me out. Because feeling excluded without explanation or justification is what it feels like when you are a small mixed race kid with broken English getting picked on at school. Because your intentions don’t matter. Only the intentions of the people with power over you matter.

In retrospect, I shouldn’t find humor in the MVP leads expressing how hard it was for them to kick me out of the program. Their jobs were probably on the line. As I later came to understand, someone escalated the talk to an executive at Microsoft who, despite not being at the presentation, determined that the leads had done something horrible and they in turn had to demonstrate that they took harassment seriously by taking severe and immediate action. I’ve worked in places where employees are motivated by fear of losing their jobs. It isn’t pleasant and it is very hard to think clearly under those circumstances, much less act like yourself.

I do want to point out, though, that as a result of these Code of Conduct reinterpretations and warnings, a lot of MVPs are infected with the same fear that drove those MVP leads. They feel like anything might be interpreted as a form of sexual harassment or that they could similarly be expelled from the program without justification, discussion, or a right of appeal. It’s difficult to express how deeply wrong this is, or how bizarre it is to watch adults become stressed over what has essentially become a part-time job working for Microsoft and following Microsoft’s HR policies rather than the MVP CoC. No one should be afraid of losing the MVP arbitrarily.

I also want to clarify what I was taught about diversity and inclusivity in my ethics and women’s studies classes at Emory, because I think it will be helpful. The goals of diversity and inclusivity are to encourage people to exercise their empathy more fully, to increase understanding of those different from you, and to develop the ability to question our own preconceptions. A system that results in increased fear—or even ends up threatening the jobs of good employees—is not going to achieve any of these goals.

There’s another piece of the puzzle that may provide a deeper context for the events described above. On March 20th, an email chain was started by women who had suffered harassment and career marginalization inside Microsoft over the years. They shared their stories of institutional sexism at work, complaints ritually ignored by HR, and a culture that routinely discouraged them from telling their stories. I gave my talk on March 21st, the day after the #metoo movement finally came knocking on Microsoft’s door.

Microsoft has agreed to start taking complaints more seriously and has promised to investigate reports of misconduct more thoroughly. This is a great thing. But as Microsoft embraces these new policies, I hope they also take a look at revamping the process used to retire MVPs to actually include a formal review process. I should not have been humiliated the way I was and would hate to see any future MVP go through anything like it again.

Digital Heroism in a DeepFake World


I recently did a talk on deepfake machine learning which included a long intro about the dangers of deepfakes. If you don’t know what deepfakes are, just think of using Photoshop to swap people’s faces, except applied to movies instead of photos, and using AI instead of a mouse and keyboard. The presentation ended with a short video clip of Rutger Hauer’s “tears in rain” speech from Blade Runner, but replacing Hauer’s face with Famke Janssen’s.

But back to the intro – besides being used to make frivolous videos that insert Nicolas Cage into movies he was never in (you can search for them on YouTube), deepfake technology is also used to create fake celebrity pornography and, worst of all, to create what is known as “revenge porn” or just malicious digital face swaps to humiliate women.

Noelle Martin has, in her words, become the face of the movement against image-based abuse of women. After years of having her identity taken away, digitally altered, and then distributed against her will on pornography websites since she was 17 years old, she decided to regain her own narrative by speaking out publicly about the issue and increasing awareness of it. She was immediately attacked on social media for bringing attention to the issue, and yet she persisted and eventually helped to criminalize image-based sexual abuse in New South Wales, Australia, with a provision specifically about altered images.

Criminalization of these acts followed at the commonwealth level in Australia. She is now working to increase global awareness of the issue – especially given that the webservers that publish non-consensual altered images can be anywhere in the world. She was also a finalist in the 2019 Young Australian of the Year award for her activism against revenge porn and for raising awareness of the way modern altered image technology is being used to humiliate women.


I did a poor job of telling her story in my presentation this week.  Beyond that, because of the nature of the wrong against her, there’s the open question of whether it is appropriate even to try to tell her story – after all, it is her story to tell and not mine.

Fortunately, Noelle has already established her own narrative loudly and forcefully. Please hear her story in her own words at TEDx Perth.

Once you’ve done that, please watch this Wall Street Journal story about deepfake technology in which she is featured.

When you’ve heard her story, please follow her twitter account @NoelleMartin94 and help amplify her voice and raise awareness about the dark side of AI technology. As much as machine learning is in many ways wonderful and has the power to make our lives easier, it also has the ability to feed the worst impulses in us. Because ML shortens the distance between thought and act, as it is intended to do, it also easily erases the consciousness that is meant to mediate our actions: our very selves.

By speaking out, Ms. Martin took control of her own narrative. Please help her spread both the warning and the cure by amplifying her story to others.

HoloLens 2 announcement Quick Recap

Basic sensors look the same between HL1 and HL2, which is nice. Still 4 monochrome cameras for SLAM, 1 TOF for spatial mapping. For me personally, knowing the size of the battery and the refresh rate of the TOF with more consistent power is really huge. Also curious about burn-in from the battery bun. Doesn’t it get hot? And the Snapdragon 850 is just an overclocked 845. That’s going to be a bit hotter too, right? Also curious why MEMS? It’s not cheaper or lighter or much smaller than LCOS, so it must, I think, be the reason for the FOV improvement (right?), but Karl Guttag complains it will make for blurrier digital images (complain? who, Karl?).

Piano playing stage demo was cool but there was noticeable awkwardness in the way the hand gestures were performed (exaggerated movements) — which is actually really familiar from working with magic leap’s hand tracking. Either that needs to get better by the summer or there’s going to be some serious disappointment down the road.

Anyone seen any performance comparisons between the NVidia Tegra X2 (Magic Leap’s SOC) and Qualcomm 850 (HoloLens 2’s SOC)? I know the 845 (basically same as 850) was regarded as better than X1, but most assumed X2 would leapfrog. Haven’t found anything conclusive, though.

Doubling of FOV between HL1 and HL2 might have been miscommunicated and meant something different from what most people thought it meant. It turns out the FOV of the HL2 is actually very close to the FOV of the Magic Leap One, both of which are noticeably bigger than the original HoloLens but still significantly smaller than what everyone says they want.

My friend and VR visionary Jasper Brekelmans calculated out the HL2 FOV in degrees to be 43.57 x 29.04 with a diagonal of 52.36. Magic Leap is 40 x 30, with a diagonal of 50 (thank you magic leap for supporting whole numbers).
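
Those diagonals look like the simple right-triangle approximation, treating the horizontal and vertical FOV as the legs; a couple of lines of C# reproduce the numbers.

```csharp
// Sketch: diagonal FOV via the right-triangle approximation.
using System;

class FovDiagonal
{
    static double Diagonal(double h, double v) => Math.Sqrt(h * h + v * v);

    static void Main()
    {
        Console.WriteLine(Diagonal(43.57, 29.04)); // ~52.36 (HoloLens 2)
        Console.WriteLine(Diagonal(40.0, 30.0));   // 50     (Magic Leap One)
    }
}
```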

From now on, whenever someone talks about the field of view in AR/VR/MR/XR, we’ll all have to ask if that is being measured in degrees or steradians. Oh wells.

Internet bad boy Robert Scoble turns out to probably have one of the most interesting takes on the HoloLens 2 announcement. I hope his rehabilitation continues to go well. On the other hand it was a really bad week for tech journalists in general.

Unity even made an announcement about HoloLens 2 and even though they have been working on their own in-house core tools for XR development, way down deep in the fine print they are saying that you will need to use the Mixed Reality Toolkit v2 to develop for HoloLens – which is very eeeeenteeresting.

The coolest thing for me, outside of the device itself, was the Azure Spatial Anchors announcement. No one is really paying attention to Azure Spatial Anchors yet, but this is a game changer. It means implementing Vernor Vinge’s belief circles. It means anyone can build their own Pokemon Go app. And it works on ARKit, ARCore, and HoloLens, so the future of XR is cross-platform.

Mike Taulty, who I can’t say enough great things about, as usual has dived in first and written up a tour of the new service.

Crap, my boss just came back from lunch. Gotta work now.

The Fork in Mixed Reality


Yogi Berra gnomically said, “when you come to a fork in the road, take it.”  On the evening of Friday, February 1st, 2019 at approximately 9 PM EST, that’s exactly what happened to Mixed Reality.

The Mixed Reality Toolkit open source project, which grew out of the earlier HoloLens Toolkit on GitHub, was forked into the Microsoft MRTK and a cross-platform XRTK (read the announcement). While the MRTK will continue to target primarily Microsoft headsets like the HoloLens and WMR, XRTK will feature a common framework for HoloLens, Magic Leap, VR headsets, and mobile AR – as well as HoloLens 2 and any other MR devices that eventually come on the market.

So why did this happen? The short of it is that open source projects can sometimes serve multiple divergent interests and sometimes they cannot. Microsoft was visionary in engineering and releasing the original HoloLens MR Headset. They made an equally profound and positive step back in 2016 by choosing to open source the developer SDK/Framework/Toolkit (your choice) that allows developers to build Unity apps for the HoloLens. This was the original HoloLens Toolkit (HLTK).

While the HLTK started as a primarily Microsoft engineering effort, members of the community quickly jumped in and began contributing more and more code to the point that the Microsoft contributions became a minority of overall contributions. This, it should be noted, goes against the common trend of a single company paying their own engineers to keep an open source project going. The HLTK was an open source success story.

In this regard, it is worth calling out two developers in particular, Stephen Hodgson and Simon Jackson, for the massive amounts of code and thought leadership they have contributed to the MR community. Unsung heroes barely captures what they have done.

In 2017 Microsoft started helping to build occluded WinMR (virtually the same as VR) devices with several hardware vendors and it made sense to create something that supported more than just the HoloLens. This is how the MRTK came to be. It served the same purpose as the HLTK, to accelerate development with Unity scripts and components, but now with a larger perspective about who should be served.

In turn, this gave birth to something that is generally known as MRTK vNext, an ambitious project to support not just Microsoft devices but also platforms from other vendors. And what’s even more amazing, this was again driven by the community rather than by Microsoft itself. Microsoft was truly embracing the open source mindset and not just paying lip service to it as many naysayers were claiming.

But as Magic Leap, the other major MR headset vendor, finally released their product in fall 2018, things began to change. Unlike Microsoft, Magic Leap developed their SDK in-house and threw massive resources at it. Meanwhile, Microsoft finally started throwing their engineers at the MRTK again after taking a long hiatus. This may have been in response to the Magic Leap announcement or could equally have been because the team was setting the stage for a HoloLens 2 announcement in early 2019.

And this was the genesis of the MR fork in the road: For Microsoft, it did not make sense to devote engineering dollars toward creating a platform that supported their competitors’ devices. In turn, it probably didn’t make sense for engineers from Google, Magic Leap, Apple, Amazon, Facebook, etc. to devote their time toward a project that was widely seen as  a vehicle for Microsoft HMDs.

And so a philosophical split needed to occur. It was necessary to fork MRTK vNext. The new XRTK (which is also pronounced “Mixed Reality Toolkit”) is a cross-platform framework for HoloLens as well as Magic Leap (Lumin SDK support is in fact already working in XRTK and is getting even more love over the weekend even as I write).

But XRTK will also be a platform that supports developing for Oculus Rift, Oculus Go, HTC Vive, Apple ARKit, Google ARCore, the new HoloLens 2, which may or may not be announced at MWC 2019, and whatever comes next in the Mixed Reality Continuum.

So does this mean it is time to stick a fork in the Microsoft MRTK? Absolutely not. Microsoft’s MRTK will continue to do what people have long expected of it, supporting both HoloLens and Occluded WinMR devices (that is such a wicked mouthful — I hope someone will eventually give it a decent name like “Windows Surface Kinect for Azure Core DotNet Silverlight Services” or something similarly delightful).

In the meantime, while Microsoft is paying its engineers to work on the MRTK, XRTK needs fresh developers to help contribute. If you work for a player in the MR/VR/AR/XR space, please consider contributing to the project.

Or to word it in even stronger terms, if you give half a fork about the future of mixed reality, go check out  XRTK  and start making a difference today.

Why you should watch “2047: Virtual Revolution”

In the wake of Apple’s successes over the past decade, agencies and technical schools like the Savannah College of Art and Design have been pumping out web designers and a career ecosystem that supports them as well as a particularly disciplined, minimalist, flat aesthetic that can be traced back to Steve Jobs. One of the peculiarities of the rise of 3D games and VR/AR/MR/XR platforms is that these children of the Jobs revolution have little interest in working with depth. The standards of excellence in the web-based design world — much less the print-based design world it grew out of – are too different. To work in 3D feels too much like slumming.


But design is still king in 3D as well as on the web. When the graduates of SCAD and similar schools did not embrace 3D design, film FX designers like Ash Thorp and Greg Borenstein jumped into the space they left vacant. For a period after the release of Steven Spielberg’s Minority Report in 2002, there was a competition among FX artists to try to outdo the UIs that were created for that film. From about 2010, however, that competitive trend has mellowed out and the goal of fantasy UX in sci-fi has changed into one of working out the ergonomics of near-future tech in a way that makes it feel natural instead of theatrical. By carefully watching the development of CGI artifacts in the 21st century – as an archeologist might sift through pottery shards from the ancient past – we can see the development of a consensus around what the future is supposed to look like. Think of it as a pragmatic futurism.


The 2016 French film 2047: Virtual Revolution, written and directed by Guy-Roger Duvert and starring Mike Dopud, is a great moment in the development of cinematic language in the way it takes for granted certain sci-fi visual cliches. Even more interesting is that what appears to be a relatively low-budget film is able to pull off the CGI it does, indicating a general drop in the price of these effects. What used to be hard and expensive is now within reach and part of the moviemaking vernacular.


The story is about a future society that spends all its time playing online RPGs while the corporations that run these games have taken over the remnant of the world left behind. But what I found interesting was the radial interface used by players inside their RPGs.


It bears a passing similarity to the magic leap OS navigation menu.


It also gives a nod to popular gaming genres like Gundam battle suits …


Fantasy UX (FUX) from films like Blade Runner and The Matrix …


And World of Warcraft.

Play the movie in the background while you are working if you don’t have time to get into the plot about a dystopic future in which people are willingly enslaved to VR … blah, blah, blah. But look up from time to time in order to see how the FX designers who will one day shape our futures are playing with the grammar of the VR/AR/MR/XR visual language.

The effects for 2047 appear to have been done by an FX and VR company in France called Backlight. The movie is currently free to Amazon Prime members.

For some more innovative FUX work, please take a look at 2013’s Ender’s Game or the Minority Report TV series from 2015.

Things of Note 01-02-2019

Mike Taulty has a great series on Project Prague: https://mtaulty.com/category/projectprague/

Christopher Diggins has listed out some undocumented APIs in the 3DS Max .NET SDK that nobody knows about: https://cdiggins.github.io/blog/undocumented-3dsmax-dotnet-assemblies.html. Très cool.

Magic Leap received approximately 6,000 submissions for its Independent Creator program. I think I’m associated with at least 20 of those. Considering that cash is Magic Leap’s biggest asset, this is a great way to be using its muscle. Building up an app ecosystem for mixed reality is a great thing and will help other vendors like Microsoft and Apple down the road. And who knows. Maybe one of those 6000 proposals is the killer MR app.

There are rumors of a HoloLens v2 announcement in January, but don’t hold your breath. There are also rumors of a K4A announcement in January, which seems more likely. In the past, we’ve seen one tech announcement (Windows MR [really Windows VR]) substitute for silence regarding HoloLens, so this may be another instance of that.

Many people are asking what’s happening with the Microsoft MRTK vNext branch. It’s still out there but … who knows?

The Lumin SDK 0.19 started rolling out a couple of weeks ago and the LuminOS is now on version 0.94.0. HoloLens always got dinged for iterating too slowly while Magic Leap gets dinged for changing too quickly. What’s a bleeding edge technology company to do?

I’ve gotten through 4 of the endings of the Black Mirror movie Bandersnatch on Netflix. It makes you think, but not very hard, which is about the right pace for most of us. And Spotify has a playlist for the movie. You should listen to the Bandersnatch playlist on shuffle play, obviously.
