Visual Microphone: Passive extraction of sound from just seeing objects

Image of demonstration of recovering sound from video
[Source: Davis et al. 2014]

Hi guys, in this new post from the "All about intelligent machines" blog, I would like to walk you through a very interesting advancement from recent years.

As you may have guessed from the heading, this post is about the extraction of sound. And yes, you guessed right. Today I am going to shed light on a technology that can partially recover the words we say to each other just by pointing a high-speed camera, from a distance, at an object near us (a plastic wrapper, leaves, etc.). Yes, you read that right, so let's jump into it.


Some background on how sound carries information:

Sound waves are fluctuations in pressure that travel through a medium. When sound hits an object, it causes the surface of that object to move. Depending on various conditions, the surface may move with the surrounding medium or deform according to its vibration modes. In both cases, the pattern of motion contains useful information that can be used to recover sound or learn about the object’s structure.

Vibrations in objects due to sound have been used in recent years for remote sound acquisition, but the techniques available until now were active in nature: they need to project something, such as a laser beam, onto the vibrating surface. The technique proposed by Abe Davis and his research team showed that it is possible to recover comprehensible speech and music in a room from just a video of a bag of chips or any other material that visibly vibrates. Since it only needs to watch the object, it is passive in nature.

When I first came across this paper, I was astonished. It felt like I was living in a science fiction world where mind-boggling things can happen and nothing is impossible.


Some Technical Details:

Davis and his team recovered sounds from high-speed footage of a variety of objects with different properties, using both real and simulated data to examine some of the factors that affect the ability to visually recover sound.

Note that a high-speed camera is used here because sound makes objects vibrate hundreds to thousands of times per second, far faster than a normal video camera's frame rate can sample, and the resulting motion is much too subtle for the naked eye to see.
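To put a rough number on why frame rate matters, here is a small Python sketch of the standard sampling (Nyquist) rule: a camera filming at a given frame rate can only represent vibrations up to half that rate. The function name and the frame rates below are my own illustrative assumptions, not figures taken from the paper.

```python
def highest_recoverable_hz(frames_per_second: float) -> float:
    """Nyquist limit: the highest frequency a given frame rate can represent."""
    return frames_per_second / 2.0


# Illustrative frame rates (assumed for this example, not taken from the paper).
for fps in (30, 60, 2000, 20000):
    limit = highest_recoverable_hz(fps)
    print(f"{fps:>6} fps -> vibrations up to {limit:>7.0f} Hz")
```

The pitch of a human voice usually sits at roughly 100 Hz or above, so an ordinary 30 or 60 fps video, sampled one frame at a time, cannot represent it directly.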

Photo of seeing the sound on graph

They evaluated the quality of the recovered sounds using intelligibility and SNR (signal-to-noise ratio) metrics, and provided input and recovered audio samples for direct comparison. They also explored how to leverage the rolling shutter in regular consumer cameras to recover audio from standard frame-rate videos: a rolling-shutter sensor records each row of a frame at a slightly different time, so it effectively samples the scene faster than its frame rate. Finally, they used the spatial resolution of their method to visualize how the sound-related vibrations vary over an object’s surface, which can then be used to recover the vibration modes of an object.
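For readers who have not met SNR before, here is a minimal Python sketch of one common way to compute it: the power of the original sound divided by the power of the error, in decibels. This is a plain whole-signal SNR on made-up toy data; the exact variant the authors used may differ, and the name `snr_db` and the sample rate are my assumptions.

```python
import math


def snr_db(reference, recovered):
    """SNR in decibels, treating (recovered - reference) as the noise."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((y - r) ** 2 for r, y in zip(reference, recovered))
    if noise_power == 0:
        return math.inf  # recovered signal is identical to the reference
    return 10.0 * math.log10(signal_power / noise_power)


# Toy data: a 440 Hz tone, and a "recovered" copy polluted by a weaker tone.
RATE = 8000  # samples per second (assumed for this example)
reference = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]
recovered = [s + 0.1 * math.sin(2 * math.pi * 1234 * n / RATE)
             for n, s in enumerate(reference)]

print(f"SNR: {snr_db(reference, recovered):.1f} dB")
```

A higher value means the recovered audio is closer to the original. Intelligibility is a separate question, since a noisy recording can still be understandable.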

So, their method visually detects small vibrations in an object responding to sound, and converts those vibrations back into an audio signal, turning visible everyday objects into potential microphones.
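To make the idea concrete, here is a toy Python sketch of turning tiny motion into a signal. As far as I understand, the paper itself works with local phase changes in a complex steerable pyramid; this sketch swaps in a much simpler gradient-based least-squares estimate on a simulated one-dimensional "image", so treat it as an illustration of the principle and not as the authors' method. Every name and number in it is hypothetical.

```python
import math


def pattern(x):
    """Brightness of a smooth 1-D texture at position x (hypothetical)."""
    return math.sin(2 * math.pi * x / 32.0)


def estimate_shift(reference, frame):
    """Least-squares sub-pixel shift of `frame` relative to `reference`.

    Uses the small-motion approximation frame ~ reference - shift * gradient.
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(reference) - 1):
        grad = (reference[i + 1] - reference[i - 1]) / 2.0  # central difference
        num += (frame[i] - reference[i]) * grad
        den += grad * grad
    return -num / den


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Simulated setup (all values assumed): a 200 Hz tone shakes the texture by
# one hundredth of a pixel while a 2000 fps camera films it.
WIDTH, FRAME_RATE, TONE_HZ, AMPLITUDE_PX, N_FRAMES = 64, 2000, 200, 0.01, 200

true_motion = [AMPLITUDE_PX * math.sin(2 * math.pi * TONE_HZ * t / FRAME_RATE)
               for t in range(N_FRAMES)]
reference = [pattern(x) for x in range(WIDTH)]
frames = [[pattern(x - d) for x in range(WIDTH)] for d in true_motion]

# One shift estimate per frame gives one audio sample per frame.
recovered = [estimate_shift(reference, f) for f in frames]
print(f"correlation with true motion: {pearson(true_motion, recovered):.4f}")
```

The point to take away is that averaging over many pixels lets you measure a shift far smaller than one pixel, and the sequence of per-frame shifts is the recovered sound.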


Limitations:

Other than sampling rate, this technique is mostly limited by the magnification of the lens. As a result, to recover intelligible sound from far away objects, this technique may need a powerful zoom lens.
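A back-of-the-envelope Python sketch shows why distance hurts. It uses the usual pinhole-camera approximation (valid when the object is much farther away than the focal length), in which motion on the sensor shrinks in proportion to distance and grows in proportion to focal length. The function name and all the numbers are made up for illustration and do not come from the paper.

```python
def displacement_in_pixels(motion_m, focal_length_m, distance_m, pixel_pitch_m):
    """Approximate on-sensor motion in pixels (assumes distance >> focal length)."""
    return motion_m * focal_length_m / (distance_m * pixel_pitch_m)


# All values below are assumptions chosen only to show the trend.
MOTION = 1e-6        # the surface moves by one micrometre
PIXEL_PITCH = 10e-6  # each sensor pixel is ten micrometres wide

for focal_length, distance in ((0.1, 4.0), (0.1, 8.0), (0.2, 8.0)):
    px = displacement_in_pixels(MOTION, focal_length, distance, PIXEL_PITCH)
    print(f"lens {focal_length * 1000:.0f} mm at {distance:.0f} m -> {px:.5f} px")
```

Doubling the distance halves the motion seen by the camera, and doubling the focal length wins it back, which is why faraway objects call for a long zoom lens.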

You can watch the detailed video here:
Courtesy: SIGGRAPH 2014

The Conclusion:

So, let me conclude with something to think about. This new technique has important applications in surveillance and security, such as eavesdropping on a conversation from afar, but it can also be extended to many other applications, like:
  • Recovering sound lost due to corruption in the audio file
  • Studying the properties of a material based on how it vibrates
  • Visualizing sound based on the vibrations it causes
And many more. If you have a better idea of how this technology can be used, please comment down below, as I will be most happy to brainstorm it with you.

If you are interested in a deep dive into this topic, you can follow this link for more insights.

And now, it's time for me to go and leave you to explore more of this astonishing universe. As Nikola Tesla once said:

If you want to find the secrets of the universe, think in terms of energy, frequency and vibration.