Natural User Interface and Semiotics

[Image: phone7_icons]

The icons above are prescribed (promoted?) for Windows Phone 7 development.

When dealing with limited screen real estate, it is important to use what space one has effectively.  One way to accomplish this is to use icons rather than text to cue the user about the functionality of a phone application.

Iconography is intimately tied to the development of what people are calling the Natural User Interface (NUI), which is to be distinguished from the Graphical User Interface (GUI) that has predominated in software devices and PCs since the mid-’90s.

The main input device for GUIs has (let us say, in a qualified manner, “always”) been the mouse.  The main input device for NUIs, on the other hand, is the finger, facilitated by touch-sensitive devices and touch recognition software.

(Missing from these moments in the evolution of the software interface is Speech UI, or SUI, which has never seemed to catch on.  This was the dominant UI foreseen by the precogs of the 1950s and 1960s.  In part this may be because SUI is a technology that can easily be described in print sci-fi literature, whereas NUI is less so.  A generation raised on Star Trek, to which we are indebted for the design of our cell phones as well as the sliding doors at Walmart, thought it was a sure thing.  Strangely, it has been largely skipped over at the very time when the processing power of computers and devices makes it plausible.  One possible explanation is that it simply took too long to get the hardware up to snuff, and another generation of designers, one influenced by Tom Cruise’s Minority Report rather than Star Trek, came along.  With advances in CGI, NUI is much more compelling than SUI in movies, and discards the print-based underpinnings of speech UI.  Another possibility, however, is that speech recognition simply got bogged down by the baggage of AI and the desire not only to make speech interfaces effective but also to give them a personality.)

So what makes a Natural User Interface natural?  For one thing, it removes the input device – whether a mouse or a keyboard – as the intermediary between the user and the device.  It is an unmediated interface.

But is this truly more natural?  Isn’t one of the definitions of humanity that we are tool-using creatures?  Our interaction with the world is always mediated by tools that extend our natural capabilities.  We have swords in place of claws, books in place of memory, and wireless keyboard/mouse combinations in place of … what?  Fingers?

Looking at this from a different perspective, one of the criteria for the success of NUI will be that people intuitively understand how to use it.  For this, however, more will be required than simply having touch-sensitive devices.  The UIs created with NUI must also be intuitive.  People must be able to use an application without first reading the manual (which they haven’t done for generations, anyway).

Here we have an opening for a discussion of semiotics.  Semiotics is simply the study of signs.  (Tom Hanks, the other Tom, plays a semiotician in the Dan Brown-inspired movies, though he is not called that.)  The study of signs includes language.  It includes codes.  It especially includes icons.

One of the common observations about signs is that they are natural and universal.

One can look at Egyptian hieroglyphics and, even without a Rosetta Stone, have a feeling that one understands them.  Smoke always indicates fire.  Red, in the natural world, often seems to indicate danger.  An arrow always seems to draw one’s attention in a certain direction: up, down, left, right.  We intuitively understand why our distant ancestors used signs as a primitive form of writing: the meaning can be gleaned from symbols without apparent cultural context.  Symbols (and gestures) can be used to communicate when there is no other common language between people.

One of the remarkable findings of semiotics as a professional discipline is that there is always a cultural aspect to signs.  Even in using simple signs, two people have to quickly agree, in the moment, on what they mean by apparently obvious glyphs.  Does a yellow light mean “slow down” or “speed up”?  When I point, do I mean this item here or the one slightly to its left?  When I laugh, am I laughing with you or at you?

All meaning is based on agreement.  As Umberto Eco might put it, we know this because we can disagree about meaning.  The definition of a sign, then, is not simply something that stands for something else, but rather something that can be misunderstood.

In designing glyphs for the Windows Phone 7 device, then, we must work with signs that can be misunderstood and are obligated to try to make them univocal in meaning.  This is one reason why the designers of the WP7 platform strongly recommend that everyone use this particular set of icons.  If everyone uses the same glyphs, there is less room for misunderstanding.

The requirement that we all come to a common understanding of what these glyphs represent also opens up the possibility that the meaning of these glyphs will change over time.  There is also the theoretical possibility that these glyphs currently mean nothing at all.  Each glyph is simply an arrangement of pixels.  It will be up to Windows Phone application designers to determine how they should be used.

Paul L. Snyder, a friend on the “Semiotics and Technology” forum, makes the following observations about these signs:

Consider the types of symbols in this collection.  We have:

* Symbols that depict a particular object, with the intent of evoking an association with that object’s primary function (an SLR camera, an envelope, a film-driven movie camera, a file folder, a present, a trash can, a pencil)

* Symbols that suggest a general idea, in the hopes that it can be related to a contextual activity (a minus sign, a plus sign, arrows, a check mark, an ‘X’, a star, a question mark)

* Symbols that rely primarily on already-established conventions outside of the computer realm (play/pause/ff/rewind)

* Composite symbols that try to suggest a more complicated idea (a star with a superimposed ‘+’, arrows pointing to a bar, two curved arrows suggesting a loop)

It’s interesting to consider these symbols in light of the theory of ‘icons’ that Umberto Eco critiques in chapter 3 (see p. 191; I know we aren’t there yet).  Clearly, these icons are striving to be icons in a similar sense, as they are intended to:

* have the same properties as their objects,

* be similar to their objects,

* be analogous to their objects, or

* be motivated by their objects

(though I’m less sure about that last one).  Each of the symbols (with the possible exception of the large circle, for which I can’t guess a probable meaning) is hoping to rely on pre-existing culturally-coded conventions, with lesser or greater degrees of similitude.  By providing a hook that is “similar” to the action that the icon is intended to indicate, the hope is to make it easier for the user to store a mental model of these associations.  In some cases (such as the gear) this may be quite loose.  It’s interesting that several of the icons (movie camera, floppy disk) use depictions of outmoded technology in an attempt to make their meaning clearer.

Perhaps one thing that semiotics can do here is to point us to the culturally coded nature of the associations being used, which can be a starting point for assessing the probable efficacy of the symbol selections (though even so, nothing is a substitute for actual end-user testing).