When captioning goes awry

I’m not sure whether the captioning for this year’s online Worldcon panels is automated or human transcription. In one panel that I watched, the captioning was pretty much unreadable, but in another, it was pretty good. But even in the good one, there were some mistakes. Such as when one of the panelists used the word […]

AI-aided text-to-speech

I’ve written here in the past about speech recognition (column DD, and brief notes on Google Voice), but I haven’t written much about speech synthesis, except for a post about song synthesis and an aside in column iii. So I’m pleased to note that Google has made some remarkable improvements in text-to-speech lately. For example, […]

iii: So, Mimi…

Whenever we speak, we make music. Well, okay, that’s not entirely true. But in voiced (not whispered) spoken English, every speech sound has a pitch. In most utterances, the pitches stay fairly constant from one syllable to another, though pitch rises at the end of a question and falls at the end of a statement; […]

DD: Excuse me, what was that?

Sometime around the early 1980s, Logical Business Machines (creators of computers with names like David and Goliath, and a programming language called English) released a computer system called Mike. This system came with speech-recognition software and a microphone. An executive of the company attempted to demonstrate the product on a television show; he stood at […]