The Imaginative Universal

Studies in Virtual Phenomenology -- @jamesashley

Do computers think?

by James Ashley 25. September 2009 16:20


The online Stanford Encyclopedia of Philosophy has just published David Cole’s update to the entry on The Chinese Room Argument.

The thought problem was posed by John Searle almost 30 years ago and has been a lightning rod for discussions about theories of consciousness and AI ever since.

For those unfamiliar with it, the argument is not against the notion that machines in general can think – Searle believes that minds are built on biological machines, after all – but rather against certain projects in AI that attempt to explain consciousness through computational theories. Searle’s argument is that computational models are a dead end and that thinking machines must be investigated in a different (apparently “biological”) way.

Of course, if biology can itself be reduced to a computational model, then Searle’s argument may apply to all machines, and we will have to search for consciousness elsewhere.

Here’s the crux of the argument, from the SEP entry:

“The heart of the argument is an imagined human simulation of a computer, similar to Turing's Paper Machine. The human in the Chinese Room follows English instructions for manipulating Chinese symbols, where a computer “follows” a program written in a computing language. The human produces the appearance of understanding Chinese by following the symbol manipulating instructions, but does not thereby come to understand Chinese. Since a computer just does what the human does—manipulate symbols on the basis of their syntax alone—no computer, merely by following a program, comes to genuinely understand Chinese.”
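For the programmers in the audience, the room can be caricatured as a lookup table. This is only a toy sketch: the rulebook entries, the fallback reply, and the `chinese_room` function are hypothetical stand-ins of my own, not anything taken from Searle or the SEP entry.

```python
# A toy "Chinese Room": replies are chosen purely by matching the shape
# of the incoming symbols against a rulebook. Nothing here models meaning.
# The rulebook entries below are made up for illustration.
RULEBOOK = {
    "你好吗": "我很好",
    "你是谁": "我是一个房间",
}

# Reply used when no rule matches (also a hypothetical choice).
FALLBACK = "请再说一遍"


def chinese_room(symbols: str) -> str:
    """Return the reply the rulebook dictates for the given symbols."""
    return RULEBOOK.get(symbols, FALLBACK)


if __name__ == "__main__":
    print(chinese_room("你好吗"))
```

The function produces a plausible reply without containing anything one would call an understanding of Chinese, which is the intuition the thought experiment trades on. Whether a vastly larger rulebook changes that is exactly what the replies to Searle argue about.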

If this sort of problem excites you, as it does me, then you may want to examine some of the articles about and around consciousness collected on David Chalmers’ website: http://consc.net/online .


Playing with the Kindle 2's Web Browser

by James Ashley 1. March 2009 00:08


I have been spending the day trying to upload PDFs from my safaribooksonline account to my Kindle, so far without much success. Mobipocket Creator, which is recommended for converting various file formats to the Mobi format used by the Kindle, seems to get mixed up over the images. I am currently trying to see if Amazon.com's converter handles them any better.

On the other hand, I've found that the new http://m.safaribooksonline.com site works fairly well on the Kindle's simplified browser (though not perfectly).  I can access my bookshelf and browse through my books.

The Basic Web browser seems very well suited for twittering, though. You can access your Twitter account on the Kindle by going through http://m.twitter.com.

To access the Basic Web browser on the Kindle, click on the Menu button from your home page. Then select Experimental. From the Experimental page, you will be able to start the Basic Web browser, which lets you search Google, search Wikipedia, or simply browse to a URL.

Also, contrary to my expectations, the Text-to-Speech feature on the Kindle 2 is actually rather good.  It even attempts to modify intonation based on the sentence structure.  Still not up to Morgan Freeman standards, however.


Finding the correct metaphor for text-to-speech

by James Ashley 12. February 2009 21:27


A recent Associated Press story about the Authors Guild's concerns over the Kindle 2's text-to-speech feature left many computer programmers guffawing. It occurs to me, though, that for those not familiar with text-to-speech technology the humorous implications may not be self-evident, so I will attempt to parse it:

"NEW YORK (AP) — The guild that represents authors is urging writers to be wary of a text-to-speech feature on Amazon.com Inc.'s updated Kindle electronic reading device.

"In a memo sent to members Thursday, the guild says the Kindle 2's 'Read to Me' feature 'presents a significant challenge to the publishing industry.'

"The Kindle can read text in a somewhat stilted electronic voice. But the Authors Guild says the quality figures to 'improve rapidly.' And the guild worries that could undermine the market for audio books."

The quality of text-to-speech depends on the library of phonemes available on the reading device and the algorithms used to put them all together. A simple example is when you call the operator and an automated voice reads back a phone number to you with a completely unnatural intonation, and you realize that the pronunciation of each number has been clipped and then taped back together without any sort of context. That is a case, moreover, where the relationship between vocalization and semantics is one-to-one. The semantic meaning of the number "1" is always mapped to the sound of someone pronouncing the word "one". In the case of text-to-speech, no one has been sitting with the OED and carefully pronouncing every word for a similar one-to-one mapping. Instead, the software program on the reading device must use an algorithm to guess at the set of phonemes that are intended by a collection of letters and generate the sounds it associates with those phonemes.

The problem of intonation is still there, along with the additional issue of the peculiarities of English spelling. If you have a GPS system in your car, then you are familiar with the results. Bear in mind that your GPS system, in turn, is bungling what is actually a very particularized vocabulary. The books that the Kindle's "Read to Me" feature will be dealing with have more in common with Borges's labyrinth than Rand McNally's road atlas.
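To make the contrast concrete, here is a rough sketch of the two approaches: a one-to-one table for digits, and a rule-based guess at phonemes for arbitrary words. Everything in it is an assumption made for illustration. The rule list, the phoneme labels, and the function names are mine, and this is not how Nuance's engine or any real speech product works.

```python
from typing import List

# One-to-one mapping: every digit has exactly one recorded "clip".
DIGIT_CLIPS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

# Hypothetical letter-to-sound rules. The phoneme labels are invented
# shorthand for this example, not a real phoneme inventory.
LETTER_RULES = [
    ("ough", "AH F"),  # right for "tough", wrong for "though"
    ("th", "TH"),
]


def read_phone_number(number: str) -> List[str]:
    """Look up the fixed clip for each digit, skipping punctuation."""
    return [DIGIT_CLIPS[ch] for ch in number if ch in DIGIT_CLIPS]


def guess_phonemes(word: str) -> List[str]:
    """Guess phonemes by applying the first spelling rule that matches."""
    word = word.lower()
    sounds = []
    i = 0
    while i < len(word):
        for pattern, sound in LETTER_RULES:
            if word.startswith(pattern, i):
                sounds.append(sound)
                i += len(pattern)
                break
        else:
            # No rule matched: fall back to the bare letter.
            sounds.append(word[i].upper())
            i += 1
    return sounds


if __name__ == "__main__":
    print(read_phone_number("555-0142"))
    print(guess_phonemes("tough"))
    print(guess_phonemes("though"))
```

The digit table can never be wrong, only flat. The rule-based guesser, by contrast, applies the same "ough" rule to "tough" and "though" and so gets one of them wrong, which is the peculiarity of English spelling in miniature.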

 

While text-to-speech technology will indeed improve over time, it won't be improving in the Kindle 2, which comes with one software bundle that reads in just one way. I worked on a text-to-speech program a while back (if you have Vista, you can download it here) that combined an Eliza engine with the Vista operating system's text-to-speech functionality. One of the first things I wanted to do was switch out voices, and what I quickly found out was that I couldn't get any new voices. Vista came with a feminine voice with an American accent, and that was about it unless one wanted to use a feminine voice with a Pidgin-English accent that is included with the Chinese speech pack. The only masculine voice Microsoft provided was available for Windows XP, and it wasn't forward compatible.

It simply isn't easy to switch out voices, much less switch out speech engines on a given platform, and seeing that we aren't paying for a software package when we buy the Kindle but rather only the device (with much less power than a Microsoft operating system), it can be said with some confidence that the Kindle 2 is never going to be able to read like Morgan Freeman.

The Kindle 2's text-to-speech capability, or lack of it, is not going to undermine the market for audio books any more than public lectures by Stephen Hawking will undermine sales of his books. They are simply different things.

"It is telling authors and publishers to consider asking Amazon to disable the audio function on e-books it licenses."

This is what is commonly referred to as the business requirement from hell. It assumes that something is easy out of a serious misunderstanding of how a given technology actually works. Text-to-speech technology is not based on anything inherent to the books Amazon is trying to peddle. It isn't, for what it's worth, even associated with metadata about the books Amazon is trying to peddle. Instead, it is a free-roaming program that will attempt to read any text you feed it. Rather than a CD that is sold with the book, it has a greater similarity to a homunculus living inside your computer and reading everything out loud to you.

The proposal from the Authors Guild assumes that something must be taken off of the e-books in order to disable the text-to-speech feature. In fact, instructions not to read certain e-books must be added to the e-book metadata, and each Kindle 2 homunculus must in turn be taught to look for those instructions and act accordingly, in order to fulfill this requirement. This is a non-trivial rewrite of the underlying Kindle software as well as of the thousands of e-book images that Amazon will be selling -- nor can the files already living on people's devices be recalled to add the additional metadata.
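A small sketch shows why this is an addition rather than a removal. The `EBook` class, the `tts_allowed` field, and both functions are hypothetical names invented for this example. I have no knowledge of how the Kindle's file format or software actually represents any of this.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EBook:
    """A stand-in for an e-book file: some text plus a metadata dictionary."""
    title: str
    text: str
    metadata: dict = field(default_factory=dict)


def may_read_aloud(book: EBook) -> bool:
    """The reader has to be taught to look for a flag.

    Files that lack the (hypothetical) 'tts_allowed' key, such as every
    file already sitting on a device, default to being readable.
    """
    return book.metadata.get("tts_allowed", True)


def read_to_me(book: EBook) -> Optional[str]:
    """Return the text that would be handed to the speech engine, if any."""
    if not may_read_aloud(book):
        return None
    return book.text


if __name__ == "__main__":
    old_file = EBook("Already on my Kindle", "Call me Ishmael.")
    new_file = EBook("Newly flagged", "Once upon a time.", {"tts_allowed": False})
    print(read_to_me(old_file))
    print(read_to_me(new_file))
```

Both halves of the change are visible here: the new check in the reader, and the new key in each file. A book that was shipped before the flag existed has nothing to check, so the reader goes on reading it.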

"Amazon spokesman Drew Herdener said the company has the proper license for the text-to-speech function, which comes from Nuance Communications Inc."

This is just legalese on Amazon's part that intentionally misunderstands the Authors Guild's concerns as well as the legal issues involved. The Authors Guild isn't accusing Amazon of not having rights to the text-to-speech software. They are asking whether using text-to-speech on their works doesn't violate pre-existing law.

The answer to that, in turn, concerns metaphors, as many legal matters ultimately do. What metaphor does text-to-speech fall under? Is it like a CD of a reading of a book, which generates additional income from an author's labor? Or is it like hiring Morgan Freeman to read Dianetics to you, in which case, beyond the price of the physical book, Mr. Freeman should certainly be paid, but the Church of Scientology should not?

