Sony published an article today introducing a “black technology” innovation: AI sound separation, a technology that can extract a single sound from a mixed audio source. Because an audio signal carries at most two channels, separating individual sounds with traditional signal processing is very difficult. In 2013, however, Sony began applying AI to push past that limit.
So far the technology has been used to restore classic films, suppress noise on smartphones, and enable real-time karaoke features in music streaming services, and it will be applied to more fields in the future.
Sony R&D staff member Yuki Koto said in an interview that AI sound separation can remove unwanted noise from audio data and extract only human voices or specific instruments. When we listen to a performance in which many sounds are mixed together, we can pick out each instrument; in a conversation, even when surrounded by a crowd, we naturally focus on a single voice. These abilities come easily to humans, but until recently they were extremely difficult for computers. Some describe the task as mixing two juices and then extracting one of them back out. Over the past few years, however, new AI methods have greatly improved the technology.
Mitsuto Yuki said that the separation is performed by AI, and that people can teach computers to accomplish the task. A guitar, for example, has a characteristic sound and frequency profile that a neural network can learn. No matter how many sounds are mixed together, the AI system can recognize those characteristics.
Yuliqi, another Sony developer, said that neural networks learn to recognize audio features through a process called training. During training, the network is exposed to a vast amount of music (more than a person hears in a lifetime) together with the target sound it should extract. That information is enough for the network to learn sound separation.
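The idea that a network learns which frequencies belong to which instrument can be illustrated with a hand-crafted frequency mask. The sketch below is not Sony's system; it uses NumPy and synthetic tones (a 440 Hz stand-in for a voice, 1760 Hz for a guitar) to show the masking principle that a trained network would learn from data instead of having hard-coded:

```python
import numpy as np

def separate_by_frequency_mask(mixture, sr, band):
    """Isolate one source from a mixture by keeping only its frequency band.

    A trained network would predict this mask from examples; here it is
    hand-crafted for two pure tones to illustrate the principle.
    """
    spectrum = np.fft.rfft(mixture)
    freqs = np.fft.rfftfreq(len(mixture), d=1.0 / sr)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return np.fft.irfft(spectrum * mask, n=len(mixture))

sr = 8000
t = np.arange(sr) / sr                  # one second of audio
voice = np.sin(2 * np.pi * 440 * t)     # stand-in for a vocal
guitar = np.sin(2 * np.pi * 1760 * t)   # stand-in for a guitar
mixture = voice + guitar

recovered = separate_by_frequency_mask(mixture, sr, band=(300, 600))
error = np.max(np.abs(recovered - voice))
```

Real recordings have overlapping, time-varying spectra, which is exactly why a simple fixed mask fails and a learned, data-driven mask is needed.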
Naijatechnews has learned that in many classic films the dialogue and sound effects sit on the same audio track; to optimize them, the human voice must first be extracted. Sony’s AI system can successfully pull a single element from the master. For the 4K remastered ultra-high-definition versions of two films, “Lawrence of Arabia” and “Gandhi”, Sony Pictures Entertainment’s sound engineers used the technology to extract the dialogue and remix it in Dolby Atmos, creating an immersive sound field.
According to IT House, Sony’s AI sound separation technology can also be applied outside of film, such as cleaning up voices recorded through a microphone. Sony’s autonomous entertainment robot dog aibo can use it to pick out human voices and remove background noise, improving its speech recognition. The most practical application for consumers is using sound separation to strip the original vocals from a song, turning it into a karaoke accompaniment.
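Before AI, a crude version of this karaoke feature exploited the fact that lead vocals are usually mixed identically into both stereo channels: subtracting one channel from the other cancels anything panned to the center. A minimal NumPy sketch of that classic trick (synthetic signals, not Sony's method) shows the goal AI separation now achieves far more cleanly:

```python
import numpy as np

def remove_center_vocals(stereo):
    """Classic karaoke trick: subtract the right channel from the left.

    Anything mixed identically into both channels (often the lead vocal)
    cancels out. `stereo` is an array of shape (n_samples, 2).
    """
    return stereo[:, 0] - stereo[:, 1]

sr = 8000
t = np.arange(sr) / sr
vocal = np.sin(2 * np.pi * 440 * t)    # centered: equal in left and right
backing = np.sin(2 * np.pi * 220 * t)  # panned hard left for this demo
left = vocal + backing
right = vocal.copy()
stereo = np.stack([left, right], axis=1)

karaoke = remove_center_vocals(stereo)  # vocal cancels, backing survives
```

The trick also deletes centered bass and drums and fails on any vocal not mixed dead-center, which is why learned separation is the practical replacement.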
Mitsuto Yuki also said he hopes the technology can work like a time machine, letting past and present artists collaborate across time and space: “Sony PCL and Sony Music Solutions have just begun to use our technology to provide services externally, so there will definitely be more applications, and I look forward to what comes next.”