Listener was an app I initially wrote to respond to calls for my attention while I wore my isolating headphones (Bose TriPorts, before they broke). The goal was that within two calls, the other party should get the idea that I have registered their request for my attention and am responding. The deeper problem was that every conversation began at a very high volume, and someone had to walk over and stand in front of me to get my attention.
Listener, in its original form, just sampled the surroundings regularly for high-intensity sounds and alerted me using Geektool. But there were some fundamental problems with using Geektool. First, it takes up some screen real estate. Even on my 23” screen, I can recall moments when its presence in one of the corners was a pain. Also, the intensity technique didn’t work so well: the intensity threshold needed to be re-calibrated every single time the surroundings changed.
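To make the intensity idea concrete, here is a minimal sketch (not the actual Listener code). It assumes the sampled audio arrives as 16-bit little-endian mono PCM bytes; the names rms and is_loud and the threshold value are made up for illustration.

```python
import math
import struct


def rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian mono PCM (assumed format)."""
    count = len(chunk) // 2  # two bytes per sample
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, chunk[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_loud(chunk: bytes, threshold: float) -> bool:
    """True when the chunk's RMS amplitude exceeds a hand-picked threshold."""
    return rms(chunk) > threshold
```

The threshold is a fixed number here, which is exactly the weakness described above: a value that works in a quiet room will fire constantly in a noisy one.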
Now, Listener features a more powerful technique for detecting attention calls. It includes a VAD (voice activity detection) algorithm by S. Milanovic, Z. Lukac, and A. Domazetovic; their paper is available on the internet. As another check for call detection, sampled sound going over 48 dB is considered for notification. So far, the only case where Listener has wrongly recognized speech was in a lobby very close to a piano, so the current setup is not ideal for that kind of environment.
Also, if there is a Skype call going on, you will be talking, and Listener would go mad detecting your own speech, which you don’t want. So notifications are turned off during a Skype call.
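The checks above (VAD result, the 48 dB level, and muting during a Skype call) could be combined roughly like this. This is only a sketch under assumptions: the VAD decision and the call state are passed in as plain booleans rather than computed, the dB value is measured against an arbitrary reference amplitude of 1.0, and I assume both the VAD and the level check have to pass. The function names are hypothetical.

```python
import math

NOTIFY_DB = 48.0  # level mentioned in the post; the reference it is measured against is assumed


def level_db(rms_amplitude: float, reference: float = 1.0) -> float:
    """Convert an RMS amplitude to decibels relative to an assumed reference."""
    if rms_amplitude <= 0:
        return float("-inf")
    return 20.0 * math.log10(rms_amplitude / reference)


def should_notify(speech_detected: bool, rms_amplitude: float, in_skype_call: bool) -> bool:
    """Notify only for loud-enough speech, and never while a call is in progress."""
    if in_skype_call:
        return False  # your own voice would trigger the detector
    return bool(speech_detected) and level_db(rms_amplitude) > NOTIFY_DB
```

For example, should_notify(True, 1000.0, False) passes all three checks, while the same input with in_skype_call=True is suppressed.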
For sampling ambient sound, I use the pyaudio module, an excellent piece of work (dude’s from MIT… duh). You can find pyaudio at: http://people.csail.mit.edu/hubert/pyaudio/ .
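For reference, grabbing one chunk of microphone input with pyaudio looks roughly like the sketch below. The sample rate, chunk size, and function names are assumptions chosen for illustration, not the values Listener uses. pyaudio is imported inside the function so the helper above it can be used without the module installed.

```python
def chunk_duration_seconds(frames: int, rate: int) -> float:
    """How much audio (in seconds) a chunk of `frames` samples covers."""
    return frames / rate


def read_chunk(rate: int = 16000, frames: int = 1024) -> bytes:
    """Record one chunk of 16-bit mono audio from the default input device."""
    import pyaudio  # lazy import: only needed when actually recording

    pa = pyaudio.PyAudio()
    try:
        stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                         input=True, frames_per_buffer=frames)
        try:
            return stream.read(frames)  # raw PCM bytes
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        pa.terminate()
```

With these assumed defaults, each chunk covers 1024 / 16000 seconds of sound, so a loop around read_chunk samples the room many times per second.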
For notifications, I use Growl. I think that to use this dinky script you will need the Growl Python bindings, which can be obtained from: http://growl.info/downloads_developers.php .
If you can’t compile Growl on your machine, don’t despair: the script repeat.sh writes the printed output of this script to the file /var/tmp/audio.log, so you can still set this up with Geektool. (That is pretty simple; if you ask in the comments, I will show you how to do that.)
Once you get all that working, get everything from http://shriphani.com/scripts/listener/ . Once you have placed all those files in a directory, you need to run:
> sh repeat.sh
and it should get things going.
Screenshots! These are all that I can provide. I can’t record video right now. So, you are stuck with a screenshot of a Growl notification:
And yeah, ignore that dtella thingy you see. There’s lots of fundamental research on networks that is conducted on my MacBook Pro.
Goals
I want to use a stronger VAD, but that will have to wait. Also, I want to eventually drop the use of AppleScript (there is a Skype Python module around, and I can compile stuff on my machine again). Because of AppleScript, the whole thing looks like a hack, and there is an incredible amount of I/O for an app with such modest functionality. Dropping AppleScript also means moving to a 100% Python implementation of Listener’s functionality. I have no plans to stop working on this app, as it is pretty darn useful for me.
Have a nice day and thanks for visiting.



