More on Mac OS X voice recognition
Jul. 28th, 2003 12:31 pm
- Mac OS X's built-in speech recognition isn't a freewheeling, parse-everything dictation system like Dragon's software; at any given point, it must have a list of possible words and phrases that the user might utter. Through clever programming (and lots of it) you can come up with some fairly natural "conversation trees", but that's the best you can hope for. You can't "teach" the system new words via voice -- unless you're willing to literally spell them out, which wouldn't make a very sexy primary interface.
- The "cleverly-placed microphone" idea probably won't float; SR apparently needs very close proximity to work at all well. I imagine that a <= $100 wireless mic, clipped to collar or dropped in shirt pocket, might do the trick, but it would eliminate the completely "hardware-free" interaction I was envisioning.
- I still like the idea of a physically static computer with speakers placed throughout a house, which monitors stuff and makes announcements and proclamations when interesting things float by. Simplest cases: your favorite weblogger makes a new post, or you just got new email, or the weather forecast just changed. If you wanted to, you could ask -- physically ask -- the computer to elaborate, and it would respond with a context-appropriate speech; for email, perhaps, it would read the body of the message aloud (a rough sketch of that case is just below). The speak-and-listen UI only gets you so far, of course, and eventually you'd want to sit at the PC to use its more conventional, and faster, UI. But I'm quite curious what life would be like with these sorts of interactive aural cues, and it strikes me that experimenting with this wouldn't take too much work or expense.
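As a first stab at the email case, here's a rough sketch. It assumes Apple's Mail.app and the standard say command, and it would have to be run periodically (or saved as a stay-open applet); it's just enough to start playing with the idea, not a real implementation.

-- Hypothetical "new mail" announcer: speak a short summary whenever there is unread mail.
-- Assumes Mail.app's scripting dictionary exposes the inbox's unread count.
tell application "Mail"
    set unreadTotal to unread count of inbox
end tell
if unreadTotal is greater than 0 then
    say "You have " & unreadTotal & " unread messages." using "Victoria"
end if

Hanging an announcer like this off each interesting event (new weblog post, weather change) would mostly be a matter of swapping out the check at the top.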
no subject
Date: 2003-07-28 10:25 am (UTC)

no subject
Date: 2003-07-28 03:59 pm (UTC)

It also seemed fairly competent at interpreting user-created commands such as 'Connect to Granicus' and 'Display Live Journal'. The way I suspected it worked was that the voice recognition system had its own set of 'standard listenable phonemes' ... The same 'text to speech' routine could be used to piece together the listenable phonemes into a single phrase. The system could work if its tolerances were relatively low and if there were only a few dozen phrases to try to match up (unlike a full speech recognition system, which would have to recognize thousands of words).
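For what it's worth, I think those user-created commands are just items dropped into ~/Library/Speech/Speakable Items/, where the file name is the phrase the system listens for; an item can be an AppleScript, so a command like 'Display Live Journal' might be nothing more than a one-line script. This is only a guess at how that particular command was set up, not the actual script:

-- Hypothetical contents of a script saved as
-- ~/Library/Speech/Speakable Items/Display Live Journal
-- (the file name is the phrase Speakable Items listens for)
open location "http://www.livejournal.com/"

Since open location is a Standard Additions command, it just opens the URL in whatever the default browser happens to be.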
I eventually turned it off and kept it off because it had a bad habit of crashing the computer (especially when turning 'speakable items' on and off). When on, it would sometimes hear the TV and do something dumb like "Quit all applications" or "Add this to Startup Items". I could have set it to 'only listen when a special hotkey is pressed', but that sort of defeated the purpose.
I still have the computer do things like speak error messages and speak text in AIM. I've even considered trying to get AppleScript to pipe text from a terminal program through MacInTalk so it would talk while I was on a MU* or something, but I haven't worked on that one lately.
One thing I wish people would do (and I haven't been able to find anything I could BUY) would be to create new voices for MacInTalk. It comes with about 15-20 voices but only about 4 or 5 of them are functionally useful. I'd like to have ones with different accents like Italian or German.
no subject
Date: 2003-07-28 04:01 pm (UTC)

no subject
Date: 2003-07-28 06:09 pm (UTC)

$ osascript -e 'say "I like ham"'
From AppleScript, you can just invoke the say command directly.
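Something like this, for example (the optional "using" parameter picks one of the built-in voices; Zarvox is one of the stock ones):

say "I like ham" using "Zarvox"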
no subject
Date: 2003-07-29 02:17 am (UTC)

no subject
Date: 2003-07-29 06:06 am (UTC)

Mac OS X is indeed much easier to program and script. Not only does it come with free (and excellent) programming tools from Apple, but it ships with a bunch of popular Unix programming languages, like Perl.