Zao’s Next Gen DeepFakes

The Zao app, by Changsha Shenduronghe Network Technology Co Ltd, was released on the Chinese iTunes store a week ago and was popularized in a tweet by Allan Xia.

It is not currently available through iTunes in the U.S. but with a bit of hard work I was finally able to install a copy. I was concerned that the capabilities of the app might be exaggerated but it actually exceeded my expectations. As a novelty app, it is fascinating. As an indicator of the current state and future of deepfakes, it is a moment of titanic proportions.

As of a year ago, when the machine learning tool FakeApp was released, a decent deepfake took tens of hours and some fairly powerful hardware to generate. The idea of being able to create one in less than 30 seconds on a standard smartphone seemed a remote possibility at the time. Even impossible.

The Zao app also does some nice things I’ve never gotten to work well with deepfakes/faceswap or deepfacelab – for instance, handling facial hair.

… or even no hair. (This is also a freaky way to see what you’ll look like in 15-20 years.)

What is particularly striking is the way it handles movement and multiple face angles as with this scene from Trainspotting and a young Obi Wan Kenobi. In the very first scene, it even skips over several faces and just automatically targets the particular one you specify. (In other snippets that include multiple characters, the Zao app allows you to choose which face you want to swap out.)

All this indicates that the underlying algos are quite different from the autoencoder based ones from last year. I have some ideas about how they have managed to generate deepfakes so quickly and with a much smaller set of data.

Back in the day, deepfakes required a sample of 500 source faces and 500 target faces to train the model. In general, the source images were rando and pulled out of internet posted videos. For the Zao app, there is a ten second process in which selfies are taken of you in a few different poses: mouth closed, mouth open, raised head, head to the left and blinking. By ensuring that the source images are the “correct” source images rather than random ones, they are able to make that side of the equation much more efficient.

While there is a nice selection of “target” videos and gifs for face swapping, it is still a limited number (I’d guess about 200). Additionally, there is no way to upload your own videos (as far as I could tell with the app running on a phone in one hand and Bing translator running on a second phone in the other – the app is almost entirely in simplified Chinese). The limited number of short target videos may simply be part of a curation process to make sure that the face angles are optimized for this process, mostly facing forward and with good lighting. I suspect, though, that the quantity is limited because the makers of the Zao app have also spent a good amount of time feature mapping the faces in order to facilitate the process. It’s a clever sleight of hand, combined with amazing technology, used to create a social app people are afraid of.

The deeper story is that deepfakes are here to stay and they have gotten really, really good over the past year. And deepfakes are like a box of chocolates. You can try to hide them because they are potentially bad for you. Or you can try to understand them better in order to 1) educate others about the capabilities of deepfakes and 2) find ways to spot them either through heuristics or CV algorithms.

Consider what happened with Photoshopping. We all know how powerful this technology is and how easy it is, these days, to fake an image. But we don’t worry about it today because we all know it can be done. It is not a mysterious process anymore.

Making people more aware of this tech, even popularizing it as a way of normalizing and then trivializing it, may be the best way to head off a deepfake October surprise in the 2020 U.S. elections. Because make no mistake: we will all be seeing a lot of deepfakes in October, 2020.

Things of Note 01-02-2019

Mike Taulty has a great series on Project Prague: https://mtaulty.com/category/projectprague/

Christopher Diggins has listed out some undocumented APIs in the 3DS Max .NET SDK: https://cdiggins.github.io/blog/undocumented-3dsmax-dotnet-assemblies.html that nobody knows about. Tres cool.

Magic Leap received approximately 6,000 submissions for its Independent Creator program. I think I’m associated with at least 20 of those. Considering that cash is Magic Leap’s biggest asset, this is a great way to be using its muscle. Building up an app ecosystem for mixed reality is a great thing and will help other vendors like Microsoft and Apple down the road. And who knows. Maybe one of those 6,000 proposals is the killer MR app.

There are rumors of a HoloLens v2 announcement in January, but don’t hold your breath. There are also rumors of a K4A announcement in January, which seems more likely. In the past, we’ve seen one tech announcement (Windows MR [really Windows VR]) substitute for silence regarding HoloLens, so this may be another instance of that.

Many people are asking what’s happening with the Microsoft MRTK vNext branch. It’s still out there but … who knows?

The Lumin SDK 0.19 started rolling out a couple of weeks ago and the LuminOS is now on version 0.94.0. HoloLens always got dinged for iterating too slowly while Magic Leap gets dinged for changing too quickly. What’s a bleeding edge technology company to do?

I’ve gotten through 4 of the endings of the Black Mirror movie Bandersnatch on Netflix. It makes you think, but not very hard, which is about the right pace for most of us. And Spotify has a playlist for the movie. You should listen to the Bandersnatch playlist on shuffle play, obviously.

The HoloCoder’s Bookshelf

Professions are held together by touchstones such as a common jargon that both excludes outsiders and reinforces the sense of inclusion among insiders based on mastery of the jargon. On this level, software development has managed to surpass more traditional practices such as medicine, law or business in its ability to generate new vocabulary and maintain a sense that those who lack competence in using the jargon simply lack competence. Perhaps it is part and parcel with new fields such as software development that even practitioners of the common jargon do not always understand each other or agree on what the terms of their profession mean. Stack Overflow, in many cases, serves merely as a giant professional dictionary in progress as developers argue over what they mean by de-coupling, separation of concerns, pragmatism, architecture, elegance, and code smell.

Cultures, unlike professions, are held together not only by jargon but also by shared ideas and philosophies that delineate what is important to the tribe and what is not. Between a profession and a culture, the members of a professional culture, in turn, share a common imaginative world that allows them to discuss shared concepts in the same way that other people might discuss their favorite TV shows.

This post is an experiment to see what the shared library of augmented reality and virtual reality developers might one day look like. Digital reality development is a profession that currently does not really exist but which is already being predicted to be a multi-billion dollar industry by 2020.

HoloCoding, in other words, is a profession that exists only virtually for now. As a profession, it will envelop concerns much greater than those considered by today’s software developers. Whereas contemporary software development is mostly about collecting data, reporting on data and moving data from point A to points B and C, spatial software development will be more concerned with environments and will have to draw on complex mathematics as well as design and experiential psychology. The bookshelf of a holocoder will look remarkably different from that of a modern data coder. Here are a few ideas regarding what I would expect to find on a future developer’s bookshelf in five to ten years.

 

1. Understanding Media by Marshall McLuhan – written in the 60’s and responsible for concepts such as ‘the global village’ and hot versus cool media, McLuhan pioneered the field of media theory.  Because AR and VR are essentially new media, this book is required reading for understanding how these technologies stand side-by-side with or perhaps will supplant older media.

2. Illuminations by Walter Benjamin – while the whole work is great, the essay ‘The Work of Art in the Age of Mechanical Reproduction’ is a must read for discussing how traditional notions about creativity fit into the modern world of print and now digital reproduction (which Benjamin did not even know about). It also deals at an advanced level with how human interactions work on stage versus film and the strange effect this creates.

3. Sketching User Experiences by Bill Buxton – this classic was quickly adopted by web designers when it came out. What is sometimes forgotten is that the book largely covers the design of products and not websites or print media – products like those that can be built with HoloLens, Magic Leap and Oculus Rift. Full of insights, Buxton helps his readers to see the importance of lived experience when we design and build technology.

4. Bergsonism by Gilles Deleuze – though Deleuze is probably most famous for his collaborations with Felix Guattari, this work on the philosophical meaning of the term ‘virtual reality’, not as a technology but rather as a way of approaching the world, is a gem.

5. Passwords by Jean Baudrillard – what Deleuze does for virtual reality, Baudrillard does for other artifacts of technological language in order to show their place in our mental cosmology. He also discusses virtual reality along the way, though not as thoroughly.

6. Mathematics for 3D Game Programming and Computer Graphics by Eric Lengyel – this is hardcore math. You will need this. You can buy it used online for about $6. Go do that now.

7. Linear Algebra and Matrix Theory by Robert Stoll – this is a really hard book. Read the Lengyel before trying this. This book will hurt you, by the way. After struggling with a page of this book, some people end up buying the Manga Guide to Matrix Theory thinking that there is a fun way to learn matrix math. Unfortunately, there isn’t and they always come back to this one.

8. Phenomenology of Perception by Maurice Merleau-Ponty – when it first came out, this work was often seen as an imitation of Heidegger’s Being and Time. It may be the case that it can only be truly appreciated today when it has become much clearer, thanks to years of psychological research, that the mind reconstructs not only the visual world for us but even the physical world and our perception of 3D spaces. Merleau-Ponty pointed this out decades ago and moreover provides a phenomenology of our physical relationship to the world around us that will become vitally important to anyone trying to understand what happens when more and more of our external world becomes digitized through virtual and augmented reality technologies.

9. Philosophers Explore the Matrix – just as The Matrix is essential viewing for anyone in this field, this collection of essays is essential reading. This is the best treatment available of a pop theme being explored by real philosophers – actually most of the top American philosophers working on theories of consciousness in the 90s. Did you ever think to yourself that The Matrix raised important questions about reality, identity and consciousness? These professional philosophers agree with you.

10. Snow Crash by Neal Stephenson – sometimes to understand a technology, we must extrapolate and imagine how that technology would affect society if it were culturally pervasive and physically ubiquitous. Fortunately Neal Stephenson did that for virtual reality in this amazing book that combines cultural history, computer theory and a fast paced adventure.

$5 eBooks from Packt

The technical publisher Packt is offering eBooks for $5 through January 6th, 2015 as a holiday promotion. I encourage you to look very carefully through their selection and see what appeals. If you have time to read on, however, I’d like to explain in greater detail my mixed feelings about Packt (this was probably not the marketing department’s intention when they sent me an email asking me to publicize the promotion but I think it will ultimately be helpful to them).

Packt Publishing has always been hit or miss for me. They are typically much more adventurous regarding computer book topics than other publishers like Apress or O’Reilly (Apress is my publisher, by the way, and are pretty fantastic to work with and very professional). At the same time, I have the impression that Packt’s bar for accepting authors tends to be lower than other publishers’, which allows them to be prolific in their offerings but at the same time entails that they produce, quite honestly, some clunkers.

A specific example of one of their clunkers would be the Packt book Unity iOS Game Development Beginner’s Guide by Greg Pierce. The topic sounds great (at least it did to me) but it turns out the book mostly just copies from publicly available documentation.

To quote from one of the Amazon reviews from 2012 by C Toussieng:

“This book is unbelievably bad. What specifically? All of it. It takes information which can be easily garnered from the Unity and/or Apple websites, distills it down to a minimally useful amount, then charges you for it.”

And this one from 2012 by JasonR:

“The book basically covers a few pages of the Unity docs, then goes into 3rd party plugins they recommend, each plugin gets a couple of pages. Frankly, a simple search on Google will give you more insight.”

This is a shame since, even as learning material – often free – keeps appearing on the Internet and displacing technical books from their traditional place in the software ecosystem, there is still an important role for print books (and their digital equivalent, the eBook). While online material can be thrown out quickly, often covering about a fifth to a tenth of a chapter of a book that goes through the print publishing industry, it tends to lack the cohesiveness that is only possible in a work that has taken months to write and rewrite. A 300-page software book is a distillation of experience which has undergone multiple revisions and fact checking. A really good software book tries to tell a story.

The flip side, of course, is that modern technical books quickly become outdated while technical blog posts simply disappear. All in all, though, I find that sitting down with a book that tries to explain the broader impact of a given technology serves a different and more important purpose than a web tutorial that only shows how to perform streamlined – and often ideal –  tasks.

A propos of the thesis that good software books are distillations of years of experience – we could even say distillations of 10,000 hours of experience – I’d like to point you to some of the gems I’ve discovered through Packt Publishing over the years.

All of the Packt OpenCV books are interesting. I’m particularly fond of Mastering OpenCV with Practical Computer Vision Projects by Daniel Lélis Baggio, but I think all of them – at least the ones I’ve read – are pretty good. Daniel’s bio says that he “…started his works in computer vision through medical image processing at InCor (Instituto do Coração – Heart Institute) in São Paulo, Brazil.”

Another great one is Mastering openFrameworks: Creative Coding Demystified by Denis Perevalov. According to his bio, Denis is a computer vision research scientist at the Ural Branch of the Russian Academy of Sciences and co-author of two Russian patents on robotics.

One I really like simply because the topic is so specific is Kenny Lammers’ Unity Shaders and Effects Cookbook. His bio states that Kenny has been in the game industry for 13 years working for companies like “… Microsoft, Activision, and the late Surreal Software.”

I hope a theme is emerging here. The people who write these books actually have a lot of experience and are trying to pass their knowledge on to you in something more than easily digestible exercises. Best of all – ignoring the example from above – the material is typically highly original. It isn’t copied and pasted from 20 other websites covering the same material. Instead, the reader gets an opinionated and distinct take on the technology covered in each of these books.

What I especially appreciate about the $5 promotions Packt occasionally surfaces is that, for five dollars, you aren’t really obligated to try to read the entire book to get your money’s worth. I’ve taken advantage of similar deals in the past to simply read very specific chapters that are of interest to me such as Basic heads-up-display with custom GUI from Dr. Sebastian Koenig’s Unity for Architectural Visualization or Lighting and Rendering from Jen Rizzo’s Cinema 4D Beginner’s Guide. It’s also a great price when all I want to do is to skim a book on a topic I know pretty well in order to find out if there are any holes in my knowledge. Mastering Leap Motion by Brandon Sanders was extremely helpful for this and, indeed, there were holes in my knowledge.

According to his biography, by the way, Brandon is “… an 18-year-old roboticist who spends much of his time designing, building, and programming new and innovative systems, including simulators, autonomous coffee makers, and robots for competition. At present, he attends Gilbert Finn Polytechnic (which is a homeschool) as he prepares for college. He is the founder and owner of Mechakana Systems, a website and company devoted to robotic systems and solutions.”

The Modern Tech-Savvy Bourgeoisie

My good friend Corey Schuman and I were talking about modern life this afternoon.  Modern life is basically pretty good – if your concern is culture.

Chinese manufacturing has kept clothes prices basically level for the past twenty years – America probably has one of the best dressed populations in the world.  If you like fashion, you are covered.

Electronic books have not only made access to classics ridiculously easy but also practically free.  With a Kindle you can download the complete works of Shakespeare, Dickens, Proust, Joyce, Plato and Aristotle for under ten dollars.  Life is good for the reader.

If you like music, a subscription to a standard music service like Zune or Rhapsody for about fifteen dollars a month (or a la carte with iTunes) will grant you access to all the great performances of the Western musical tradition, from Bach to Beethoven to Mozart to John Williams.  For very little money, anyone can hole up for a few months and become an expert on the greatest music known to man.

Fans of fine art, likewise, have free and remarkable access to high-rez images of – well — anything.  What more need be said.

And if you like movies, there is always Netflix.  According to a recent report, a third of all Internet traffic after dinner is dedicated to streaming movies from Netflix … and it isn’t all crap.  The classics are well represented.  I promised my friend a list of movies worth streaming and currently available for streaming — here they are.  The Italian masters are well-represented on Netflix, as is the French Nouvelle Vague.  German cinema from the 70’s (as well as lots of Hitchcock) was recently removed, but hopefully they will come back soon. 

The only caveat is that my cinematic tastes are heavily influenced by the Marxist film criticism I read in college, so my apologies in advance for that:

Kurosawa:
Yojimbo
Seven Samurai
Ran
Rashomon
Ikiru

Yasujiro Ozu:
Tokyo Story

De Sica:
The Bicycle Thief
Umberto D.

Pasolini:
Gospel According to St. Matthew

Hitchcock:
The 39 Steps

Truffaut:
Jules and Jim
The 400 Blows
Shoot the Piano Player

Godard:
Breathless

Jean Renoir:
The Rules of the Game
Grand Illusion

Jean Cocteau:
Beauty and the Beast

Bunuel:
Un Chien Andalou

Fritz Lang:
M
Testament of Dr. Mabuse

Fellini:
La Strada
La Dolce Vita
8 1/2

Marcel Carne:
Children of Paradise

Bergman:
Smiles of a Summer Night
Virgin Spring
Wild Strawberries
Persona
Fanny and Alexander

Eisenstein:
The Battleship Potemkin

American Cinema:
His Girl Friday
All About Eve
The Lady Eve
The Palm Beach Story
My Man Godfrey
The Third Man
The Grapes of Wrath
The Bells of St Mary’s
On the Waterfront
Bonnie and Clyde
The Sting
Butch Cassidy and the Sundance Kid
Network
Doctor Zhivago
Mean Streets
Manhattan
A Clockwork Orange
Apocalypse Now
Blazing Saddles
Jaws

Mike Strobel’s WPF Blog

I’ve been working with Mike Strobel for several weeks now at my current client.  He is an amazingly able WPF developer who has been plugging away at the technology since the days when we were still calling it Avalon (I must say, I really prefer the Microsoft code-names for their technologies to the utilitarian acronyms they eventually morph into – for instance, isn’t the “Atlas” moniker much superior to “ASP.NET AJAX”).

A day doesn’t go by that I don’t look at his code and have a small mental epiphany usually accompanied by the mumbled statement “I didn’t know you could do that with WPF…”

After a bit of encouragement Mike has finally started blogging.  The first post is an explication of his history with WPF and the various ways he is currently using it.  One of the interesting things he reveals is that he not only is doing both game development and enterprise development using WPF, but that he is able to apply techniques he discovers (invents?) in one domain to the other.

You can find Mike’s blog here: http://codedreams.blogspot.com .

I hope that over the next few weeks you will be as blown away as I have been with the remarkable things he has been able to do on the WPF platform.

UI Design Pattern Resources

I had a great time at Codestock this year.  It wasn’t the pot-smoking, free-loving, mind-altering tribal experience I was afraid of and I didn’t see a single guitar the entire time – though I did talk with one guitar player.

The organizers – Michael Neel, Alan Stevens and Wally McClure — did a fantastic job and the Knoxville community is quite amazing and enthusiastic.  I also got to meet many people I had previously only known by reputation.

I presented on “The Uses and Abuses of UI Design Patterns” on Saturday afternoon.  I want to build out some examples of using a modified MV-VM pattern in WinForms and ASP.NET before publishing the code samples, but in the meantime I want to make sure I publish the references from the slide deck.

The point of the references is that while UI design patterns are all plagued by a tendency to have fuzzy boundaries – that is, it can be difficult to compare patterns between different technologies and sometimes can even be difficult to distinguish patterns used in one technology – there is still a paper trail on the Internet that gives us clues as to where the design patterns (MVC, MVC Model 2, MVP, SVC, PV, PM, MVVM) came from and how they were originally intended to be used.

In general, the Supervising Controller and Passive View patterns are the best defined, while MVC is perhaps the least well defined.  MV-VM, on the other hand, has some of the best examples on usage – not least because it is so tightly associated with WPF.

1988 – MVC: Journal of Object Oriented Programming Vol 1 Issue 3 http://portal.acm.org/citation.cfm?id=50759&dl=GUIDE&coll=GUIDE&CFID=41635617&CFTOKEN=20661505

1997-99 – MVC Model 2: Java Sun JSP Architecture http://java.sun.com/blueprints/guidelines/…/application_scenarios/index.html

2004 – MVP: Martin Fowler  http://martinfowler.com/eaaDev/uiArchs.html

2004 – Presentation Model: Martin Fowler  http://martinfowler.com/eaaDev/PresentationModel.html

2006 – Passive View and Supervising Controller: Martin Fowler http://martinfowler.com/eaaDev/ModelViewPresenter.html

2005-09 – MVVM: John Gossman http://blogs.msdn.com/johngossman/archive/2005/10/08/478683.aspx

MVVM: Josh Smith  http://msdn.microsoft.com/en-us/magazine/dd419663.aspx

An additional resource is this site http://c2.com/cgi/wiki?ModelViewControllerHistory from which we learn that the first MVC pattern can be attributed to Trygve Reenskaug sometime in the 70’s.

Perhaps the most authoritative source for the origins of the MVVM pattern, in turn, comes from the WPF MVVM toolkit (available on Codeplex) where we are told:

Model-View-ViewModel (MVVM) is a derivative of MVC that takes advantage of particular strengths of the Windows Presentation Foundation (WPF) architecture to separate the Model and the View by introducing an abstract layer between them: a “Model of the View,” or ViewModel. The origins of this pattern are obscure, but it probably derives from the Smalltalk ApplicationModel pattern, as does the PresentationModel pattern described by Martin Fowler. It was adapted for WPF use by the Expression team as they developed version 1 of Blend. Without the WPF-specific aspects, the Model-View-ViewModel pattern is identical with PresentationModel.
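
To make that description a little more concrete, here is a minimal sketch of the “Model of the View” idea. It is written in Python rather than WPF, so everything in it is an assumption for illustration: the class names (Person, PersonViewModel, ConsoleView), the subscribe method and the display_name property are hypothetical, and the hand-rolled listener list only stands in for what WPF data binding would do for you. It is not taken from the MVVM toolkit.

```python
from typing import Callable, List


class Person:
    """Model: plain domain data with no knowledge of any UI."""

    def __init__(self, first: str, last: str) -> None:
        self.first = first
        self.last = last


class PersonViewModel:
    """ViewModel: a 'model of the view' exposing display-ready state."""

    def __init__(self, model: Person) -> None:
        self._model = model
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        # Hypothetical stand-in for data binding: the view asks to be
        # told when a named property changes.
        self._listeners.append(listener)

    @property
    def display_name(self) -> str:
        # Formatting for display lives here, not in the model or the view.
        return f"{self._model.last}, {self._model.first}"

    def rename(self, first: str, last: str) -> None:
        # Update the model, then notify every subscribed view.
        self._model.first = first
        self._model.last = last
        for listener in self._listeners:
            listener("display_name")


class ConsoleView:
    """View: renders what the ViewModel exposes; never touches the Model."""

    def __init__(self, view_model: PersonViewModel) -> None:
        self._vm = view_model
        self.rendered: List[str] = []  # what a real view would draw on screen
        view_model.subscribe(self._on_changed)
        self._render()

    def _on_changed(self, property_name: str) -> None:
        if property_name == "display_name":
            self._render()

    def _render(self) -> None:
        self.rendered.append(self._vm.display_name)


view_model = PersonViewModel(Person("Trygve", "Reenskaug"))
view = ConsoleView(view_model)
view_model.rename("John", "Gossman")
```

The view only ever talks to the ViewModel, and the Model knows about neither of them, which is the separation the toolkit text is describing. Take away the binding mechanism and what is left looks a lot like Fowler’s Presentation Model, which is the toolkit’s point.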

Atlanta .NET User Group

I will be presenting on “Working with new ASP.NET features in .NET Framework 3.5 Service Pack 1” at the Atlanta .NET User Group on Monday, October 27th.  Magenic will be providing refreshments, as usual.   The meeting will begin at 6:00 PM at Microsoft’s Offices in Alpharetta.  It’s a lot of material to pack into an hour long presentation, but I think I have a few good strategies for working with that.  The presentation will cover Dynamic Data, Entity Framework, Data Services, the Silverlight Media Control, the Ajax browser history feature built into the Script Manager, and Script Combining.  That gives me about 10 minutes per technology.  Whew.

Microsoft Corporation
1125 Sanctuary Pkwy.
Suite 300
Atlanta, GA 30004

Piratical Reads

Apparently it is that time of year again.  It’s Talk Like a Pirate Day.

Apropos of that, I’d like to recommend two good pirate books.  Pirate books, as a genre, have never seemed to quite catch on.  With Treasure Island they seem to have plateaued out, and pretty much just went underground after that.  Nevertheless, pirate books have caught the attention of some good writers willing to take the genre out for a spin.

Two of my favorites are Gene Wolfe’s Pirate Freedom and Tim Powers’ On Stranger Tides.  The first is a time travelling pirate story with Wolfe’s typically unreliable narrator, while the second revolves around Blackbeard, voodoo rituals, and Powers’ common concerns with Catholic teaching and the occult.  In the spirit of the day, I’m going to rip off some passages from these two fine novels.  From Pirate Freedom:

“What really happened was that they hollered for a parlay.  They swore they would not hurt anybody we sent to talk to them, but they would not send anybody out to talk to us.  There was a lot of jawing back and forth about that because nobody on their side could speak much French and Melind could not speak much Spanish.

“That was when I did one of the dumbest things I have ever done in my life.  I told him I spoke Spanish better than he did, and I would translate for him.  So before long Melind and I left our muskets and knives behind and went up the beach and into the edge of the rain forest to talk to them.

“There were two, a Spanish officer and a Spanish farmer. From what I saw, the officer had about ten soldiers and the farmer maybe a hundred other farmers.  Once they got us into the trees they grabbed us and searched us for weapons, and of course they found my money belt and kept the money.  Melind protested and I yelled my head off, but it did no good.  Before long they told us they would kill us both if we did not shut up about it.

“That was when I tried to jump them.  A farmer standing pretty near me had a big knife in his belt, with the handle sticking out.  I grabbed it and went for the Spanish officer.  I would have killed them all then and there if I could, and I have never hated anybody in my life the way I hated that guy.  That was my money, I had earned it with worry, hard work, and tough decisions, and they had sworn we would be okay if we left our weapons behind and came over.

“I got that officer in the side, before somebody hit me.  When I was conscious again (and feeling like something scraped off a shoe), my hands were tied behind me, and so were Melind’s.”

And from On Stranger Tides:

“‘Come on, devil,’ Blackbeard raged, a fearsome sight with his teeth and the whites of his mad eyes glittering in the glow of the smoldering match-cords woven into his mane, ‘wave some more bushes in my face!’  Not even waiting for the foreign loa’s response he waded straight into the primeval rain forest, shouting and whirling his cutlass.  ‘Coo yah, you quashie pattu-owl!’ he bellowed, reverting almost entirely to what Shandy could now recognize as Jamaican mountain tribe patois.  ‘It takes more than one deggeh bungo duppy to scare off a tallowah hunsi kanzo!’”

Shandy could hardly see Blackbeard now, though he saw the vines jumping and heard the chopping of the cutlass and the clatter and splash of wrecked verdure flying in all directions.  Crouched back and gripping his knife, Shandy had a moment to wonder if this maniacal raging was the only way Blackbeard allowed himself to vent fear — and then the giant pirate had burst back out of the jungle, some of his beard-trimming match-cords extinguished but his fury as awesome as before.