BdSound works across many devices, operating systems and scenarios, always with the same goal: make audio sound clear. We even named our technology suite after this goal: S2C, which means Simply Sounds Clear.
Our S2C suite is a full set of proprietary technologies and IP that our engineers use with care and expertise to always provide the best audio experience to our customers, and to their customers.
We often need to deal with noisy environments, and we need to use our technologies to remove noise.
In this series of posts, we describe these proprietary BdSound technologies. The first episode described solutions based on acquisition from multiple microphones through a set of techniques called beamforming. In this episode, we cover the basic processing steps for performing noise reduction with just one microphone.
Different kinds of noises
We live in a noisy world, and our voice communication is always affected by noise. However, not all noises are equal. Imagine a typical in-car scenario, where road noise, wind and speed bumps can make our calls quite difficult to understand. We can distinguish two main classes of noise:
- stationary noises are characterized by a clear pattern that is repeated over time, for example the road noise of a car that is proceeding with constant velocity;
- non-stationary noises do not have such a clear pattern, because they are too short (for example, the noise due to a car hitting a speed bump) or their pattern is not constant and varies too much (for example, the road noise of a car that is accelerating).
In this article we show a common procedure to reduce stationary noise. The idea is to emulate what our body and brain do. We commonly stop noticing a noise if it is stationary and not too loud: road noise, heating, ventilation, air conditioning, etc. In some way, it is as if our brain were able to capture a footprint of the noise and automatically remove it. This is, however, a tiring process, and a noise reduction algorithm can greatly relieve the listener of that effort.
Listen to the noise and remove it
The first step in mimicking the human ability to capture this footprint is to know when it is a good moment to take an audio snapshot of the noise. The best moment is usually when nobody is talking, and we have a clear “view” of the noise. For this reason, the first step employs a Voice Activity Detector, or VAD, a module that detects whether or not there is voice activity in the audio under consideration.
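To make the idea concrete, here is a minimal sketch of an energy-based VAD. It is only an illustration and not BdSound's detector: the function names, the 6 dB margin and the toy frames are all assumptions, and real detectors typically use richer features than frame energy.

```python
import math

def frame_energy_db(frame):
    """Mean power of one frame of samples, in dB (tiny offset avoids log(0))."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)

def is_speech(frame, noise_floor_db, margin_db=6.0):
    """Hypothetical energy VAD: flag speech when the frame energy
    exceeds the known noise floor by more than margin_db."""
    return frame_energy_db(frame) > noise_floor_db + margin_db

# Toy data (assumed): a quiet "noise only" frame and a much louder "speech" frame.
quiet = [0.01 * math.sin(0.3 * n) for n in range(160)]
loud = [0.5 * math.sin(0.3 * n) for n in range(160)]

# Use the quiet frame as the reference noise floor.
noise_floor_db = frame_energy_db(quiet)

print(is_speech(quiet, noise_floor_db))
print(is_speech(loud, noise_floor_db))
```

A fixed margin above the noise floor is the simplest possible decision rule; it fails when the noise itself gets louder, which is one reason practical detectors are more elaborate.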
When nobody is talking, it is the moment to estimate noise, which means, to capture a footprint of the noise. Of course, like following actual footprints, the estimation stage makes use of previous steps. By constantly estimating the noise, it is possible to refine its footprint or update it if the noise changes.
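One common way to build an estimate that "makes use of previous steps" is exponential smoothing of a per-band noise magnitude. The sketch below assumes that approach; the smoothing factor `alpha`, the three-band spectrum and the choice to freeze the estimate completely during speech are illustrative simplifications, not a description of BdSound's estimator.

```python
def update_noise_estimate(noise_est, frame_spectrum, speech_active, alpha=0.9):
    """Blend the previous per-band noise estimate with the current frame.

    During speech the estimate is left unchanged (a simplification);
    otherwise each band moves a fraction (1 - alpha) toward the new frame.
    """
    if speech_active:
        return list(noise_est)
    return [alpha * n + (1.0 - alpha) * x
            for n, x in zip(noise_est, frame_spectrum)]

# Toy data (assumed): constant noise magnitudes in three frequency bands.
true_noise = [1.0, 0.5, 0.2]
noise_est = [0.0, 0.0, 0.0]

# Fifty noise-only frames refine the footprint step by step.
for _ in range(50):
    noise_est = update_noise_estimate(noise_est, true_noise, speech_active=False)

# A frame flagged as speech does not disturb the footprint.
frozen = update_noise_estimate(noise_est, [9.0, 9.0, 9.0], speech_active=True)
```

A larger `alpha` gives a smoother, more stable footprint, at the cost of reacting more slowly when the noise changes.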
The noise is mainly estimated during non-speech instants, while it is always removed, both during speech and non-speech sections. Think of the removal stage as a simple math operation: since the microphone is capturing voice plus noise, clean voice is the result of microphone minus the estimated noise. Et voilà, you can hear a frustrating call from a noisy car turning into a pleasant conversation.
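The "microphone minus estimated noise" operation is often carried out on per-band magnitudes, a technique usually called spectral subtraction. The following is a minimal sketch under that assumption; the band values and the `floor` parameter are hypothetical and chosen only to show the arithmetic.

```python
def subtract_noise(mic_spectrum, noise_est, floor=0.05):
    """Per-band magnitude subtraction.

    The result is never allowed to drop below floor * mic magnitude,
    so a band where the estimate exceeds the input cannot go negative.
    """
    return [max(m - n, floor * m) for m, n in zip(mic_spectrum, noise_est)]

# Toy data (assumed): magnitudes in three frequency bands.
mic = [1.5, 0.6, 0.2]    # voice plus noise, as captured by the microphone
noise = [1.0, 0.5, 0.3]  # current noise footprint

clean = subtract_noise(mic, noise)
print(clean)
```

In the third band the estimated noise is larger than the microphone magnitude, so the floor takes over. Keeping a small residual rather than clamping to zero is a common choice, because bands that switch abruptly between zero and non-zero tend to produce audible artifacts.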
It’s not that easy
The noise reduction algorithm iterates the noise estimation and removal steps to find the best estimate of the noise and to adapt it to possible changes in the environment. This kind of technique is therefore very effective when the noise to be removed is mainly stationary: while the noise is not changing, the system can keep refining its internal estimate until it reaches convergence, i.e., the point at which the noise estimate, and therefore the noise reduction level, is optimal.
For example, listen to this audio, in which the driver is accelerating and the noise profile is therefore changing.
Here is the cleaned audio: we can clearly hear that this kind of algorithm takes some time to adapt the noise profile.
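The adaptation delay can be illustrated with a small simulation. It assumes the noise estimate is updated by exponential smoothing, which is a common choice but not necessarily what a given product uses; `frames_to_adapt`, `alpha` and `tolerance` are hypothetical names introduced for this sketch.

```python
def frames_to_adapt(old_level, new_level, alpha=0.9, tolerance=0.1):
    """Count the frames an exponentially smoothed estimate needs to close
    all but `tolerance` of the gap after the noise level jumps."""
    gap = abs(new_level - old_level)
    est = old_level
    frames = 0
    while abs(est - new_level) > tolerance * gap:
        est = alpha * est + (1.0 - alpha) * new_level
        frames += 1
    return frames

# Toy scenario (assumed): the noise magnitude doubles when the car accelerates.
fast = frames_to_adapt(1.0, 2.0, alpha=0.9)
slow = frames_to_adapt(1.0, 2.0, alpha=0.99)
print(fast, slow)
```

The remaining error shrinks by a factor of `alpha` on every frame, so a smoother estimator (larger `alpha`) needs many more frames to catch up. A noise that keeps changing, or one that is shorter than this adaptation time, is gone before the estimate has followed it.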
In this post we provided an overview of the basic processing steps behind many noise reduction algorithms. As a matter of fact, our solutions are also based on some of these steps. Thanks to our expertise, we refined and optimized our noise reduction IPs to make voice sound clear in every product. This is the reason why you can find our noise removal technology in many devices, which operate in a wide variety of scenarios. For example, in the automotive market, several car models use our IP to provide the best audio experience to their users. Moreover, as part of our services, we provide assistance during ITU-T P.1110 and P.1100, Apple CarPlay® and Alexa Auto certifications, successfully walking with our customers from design to production.
We have also explained that there are some scenarios that cannot be handled just using this basic processing, for example rapidly changing noise scenarios, or even impulsive sounds, such as typing on a keyboard or noise due to a car hitting a speed bump, which are much shorter than the required convergence time.
Of course, BdSound technology is able to deal with harsh noisy situations and fast-changing noises: this will be shown in the next episode. We will describe how our engineers have developed an innovative noise reduction technology based on artificial intelligence, a foundational technology that we already use every day, e.g. when we unlock a phone with our face or check in at the airport.
Stay tuned to learn how BdSound has overcome the problem of critical noise conditions toward its mission to make a world without noise.