By Scott Canon MCT
COLUMBIA, Mo. – It’s the place where cloak-and-dagger eavesdroppers take a sympathetic seat next to your hard-of-hearing grandmother.
The cocktail party.
It’s tough for a spy to listen in electronically on the mumblings of two people in a crowd when the jabbering of everyone else creates a conversational gumbo. Just ask Granny and she’ll tell you that her darned hearing aids seem to crank up the noise of jangling silverware and whispers in a crowd as much as the person she’s trying to listen to.
Math to the rescue.
Two University of Missouri researchers appear to have hit on a solution – at least as far as algebraic geometry is concerned – to a problem that has vexed scientists for a half century.
“We’ve found that if you sample enough of the sound,” said Dan Edidin, one of the MU mathematicians, “you can do this.”
That discovery could someday soon dramatically improve how humans bark out orders to machines, let you watch a movie unbothered by the chatty couple in the next row, give crash investigators and crime scene detectives a new tool for recreating events – and enable Big Brother to overhear more of what you say.
The cocktail party problem was first identified in the 1950s. In those days, commercial air traffic controllers would sit together in rooms, with everyone listening to the same loudspeaker to carry on overlapping conversations with scores of pilots.
“Hearing the intermixed voices of many pilots,” wrote one researcher, “made the controller’s task very difficult.”
If you’re actually at the cocktail party, it’s much easier. Without thinking about it, you read the lips of the person you want to hear, watch their hand gestures and account for the rise in volume, change in pitch, accent, cadence and all that makes up conversation. Research even suggests you anticipate a pattern of words to better understand what’s being said.
The problem comes when all those visual cues disappear and a jumble of other voices is crammed into the sonic mix. It creates what researchers call the “blind source separation problem” – meaning that when the human brain can rely only on sound, it is easily befuddled trying to sort out several different sounds. Machines, because they lack the intuition and experience that come from a lifetime of listening, have an even harder time picking out a single voice.
At the German research and development firm Siemens Corp., scientists such as Radu Balan have been toying with different methods to unlock the technology puzzle of crowd noise, and to pull out the sound of single voices from a mob.
Balan and his collaborators were working on an old problem: If you sample enough of a sound, can you re-create that sound and break it into different parts without information about the pitch?
The Siemens folks made progress by using two microphones to more selectively capture sound in stereo much the same way a set of ears might.
But to refine that engineering, Balan needed better algorithms, and building those algorithms required a math breakthrough. So he turned to Peter Casazza, an applied mathematician and a colleague of Edidin’s at MU.
“I asked them,” Balan said, “‘Is it possible to do this a smarter way or a different way?’”
Casazza turned to Edidin, who specializes in the highly theoretical world of pure math. At first, they thought the task was impossible.
“I said there’s not a chance in hell this will work,” Casazza said. “And there was no way to attack. We had no (math) tools to use on it.”
In fact, Edidin set out to construct a counterexample to show that what Balan wanted simply couldn’t be done.
But when he couldn’t establish its impossibility, he figured, well, perhaps it could be done.
At its deepest level, the insight that then struck him is the stuff that only people who have spent their lives exploring math can understand fully.