More on Mac OS X voice recognition
Jul. 28th, 2003 12:31 pm
- Mac OS X's built-in speech recognition isn't a freewheeling, parse-everything dictation system like Dragon's software; at any given point, it must have a list of possible words and phrases that the user might utter. Through clever programming (and lots of it) you can come up with some fairly natural "conversation trees", but that's the best you can hope for. You can't "teach" the system new words via voice -- unless you're willing to literally spell them out, which wouldn't make a very sexy primary interface.
- The "cleverly-placed microphone" idea probably won't float; SR apparently needs very close proximity to work at all well. I imagine that a <= $100 wireless mic, clipped to collar or dropped in shirt pocket, might do the trick, but it would eliminate the completely "hardware-free" interaction I was envisioning.
- I still like the idea of a physically static computer with speakers placed throughout a house, which monitors stuff and makes announcements and proclamations when interesting things float by. Simplest cases: your favorite weblogger makes a new post, or you just got new email, or the weather forecast just changed. If you wanted to, you could ask -- physically ask -- the computer to elaborate, and it would respond with a context-appropriate speech; for email, perhaps, it would read the body of the message aloud (a rough sketch of that case is just below). The speak-and-listen UI only gets you so far, of course, and eventually you'd want to sit at the PC to use its more conventional, and faster, UI. But I'm quite curious what life would be like with these sorts of interactive aural cues, and it strikes me that experimenting with this wouldn't take too much work or expense.
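As a first stab at the email case, here's a rough sketch. It assumes Apple's Mail.app and the standard say command, and it would have to be run periodically (or saved as a stay-open applet); it's just enough to start playing with the idea, not a real implementation.

-- Hypothetical "new mail" announcer: speak a short summary whenever there is unread mail.
-- Assumes Mail.app's scripting dictionary exposes the inbox's unread count.
tell application "Mail"
    set unreadTotal to unread count of inbox
end tell
if unreadTotal is greater than 0 then
    say "You have " & unreadTotal & " unread messages." using "Victoria"
end if

Hanging an announcer like this off each interesting event (new weblog post, weather change) would mostly be a matter of swapping out the check at the top.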
no subject
Date: 2003-07-28 10:25 am (UTC)

no subject
Date: 2003-07-28 03:59 pm (UTC)

It also seemed fairly competent at interpreting user-created commands such as 'Connect to Granicus' and 'Display Live Journal'. The way I suspected it worked was that the voice recognition system had its own set of 'standard listenable phonemes' ... The same 'text to speech' routine could be used to piece together the listenable phonemes into a single phrase. The system could work if its tolerances were relatively low and if there were only a few dozen phrases to try to match up (unlike a full speech recognition system, which would have to recognize thousands of words).
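For what it's worth, I think those user-created commands are just items dropped into ~/Library/Speech/Speakable Items/, where the file name is the phrase the system listens for; an item can be an AppleScript, so a command like 'Display Live Journal' might be nothing more than a one-line script. This is only a guess at how that particular command was set up, not the actual script:

-- Hypothetical contents of a script saved as
-- ~/Library/Speech/Speakable Items/Display Live Journal
-- (the file name is the phrase Speakable Items listens for)
open location "http://www.livejournal.com/"

Since open location is a Standard Additions command, it just opens the URL in whatever the default browser happens to be.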
I eventually turned it off and kept it off because it had a bad habit of crashing the computer (especially when turning 'speakable items' on and off). When on, it would sometimes hear the TV and do something dumb like "Quit all applications" or "Add this to Startup Items". I could have set it to 'only listen when a special hotkey is pressed', but that sort of defeated the purpose.
I still have the computer do things like speak error messages and speak text in AIM. I've even considered trying to get AppleScript to pipe text from a terminal program through MacInTalk so it would talk while I was on a MU* or something, but I haven't worked on that one lately.
One thing I wish people would do (and I haven't been able to find anything I could BUY) would be to create new voices for MacInTalk. It comes with about 15-20 voices but only about 4 or 5 of them are functionally useful. I'd like to have ones with different accents like Italian or German.
no subject
Date: 2003-07-28 04:01 pm (UTC)

no subject
Date: 2003-07-28 06:09 pm (UTC)

$ osascript -e 'say "I like ham"'
From AppleScript, you can just invoke the say command directly.
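Something like this, for example (the optional "using" parameter picks one of the built-in voices; Zarvox is one of the stock ones):

say "I like ham" using "Zarvox"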
no subject
Date: 2003-07-29 02:17 am (UTC)

no subject
Date: 2003-07-29 06:06 am (UTC)

Mac OS X is indeed much easier to program and script. Not only does it come with free (and excellent) programming tools from Apple, but it ships with a bunch of popular Unix programming languages, like Perl.