Sick of Zoom meetings? Of course you are! So put on your headphones and play the video above. It's not an actual Zoom call but a mock-up of what one would sound like with High Fidelity's very impressive 3D spatial audio. As you can hear, callers displayed on the left or right of the screen sound as if they are literally sitting in front of you on the left or right. The demo is meant to show how spatial audio can make Zoom calls less artificial and exhausting.
As High Fidelity founder Philip Rosedale explains:
[D]o you notice how much easier it is to understand who is speaking when Spatial Audio is on? It should be easier to make out the words when two or more people talk at the same time, too. Over the course of the day, it really makes a difference in reducing fatigue and enjoying conversations...
Why Haven’t Zoom or Teams Already Added Spatial Audio As A Feature?
Because it’s quite difficult. We’ve been working on it for six years with a highly skilled audio engineering and networking team... To address these problems, we’ve moved the spatialization to the cloud. That way, each person can receive just one stereo stream of audio that has the spatialized sounds of everyone talking, already fit into it. We’ve designed the spatialization into the compression and mixing, and designed it to work with short audio frames, so that no latency is added.
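To make the "one stereo stream per person" idea concrete, here is a rough sketch of a server-side mix. It is not High Fidelity's method: it substitutes simple constant-power panning for true 3D spatialization, and the names `pan_gains` and `mix_for_listener`, the azimuth convention (-90 is hard left, 90 is hard right), and the frame layout are all assumptions made for illustration.

```python
import math


def pan_gains(azimuth_deg):
    """Constant-power (left, right) gains for an azimuth in [-90, 90] degrees.

    Assumed convention: -90 is hard left, 0 is centre, 90 is hard right.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = (clamped + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)


def mix_for_listener(frames, azimuths, listener):
    """Mix every other talker's mono frame into one stereo frame.

    frames:   {talker: [mono samples]}, all frames the same length
    azimuths: {talker: azimuth in degrees as heard by this listener}
    Returns a list of (left, right) sample pairs.
    """
    if not frames:
        return []
    length = len(next(iter(frames.values())))
    left = [0.0] * length
    right = [0.0] * length
    for talker, samples in frames.items():
        if talker == listener:
            continue  # listeners do not hear their own voice
        gain_l, gain_r = pan_gains(azimuths[talker])
        for i, sample in enumerate(samples):
            left[i] += gain_l * sample
            right[i] += gain_r * sample
    return list(zip(left, right))
```

Calling `mix_for_listener` once per participant for each short audio frame yields a single stereo frame for each of them, so every client only has to receive and play one stream no matter how many people are talking.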
There's much more on the process on High Fidelity's blog. The good news for Zoom (and hopefully all of us exhausted Zoom users) is that Philip concludes the post by announcing an API license for High Fidelity spatial audio.
I do hope Zoom takes Philip up on the offer, because spatial audio does improve the calling experience. However, I think Zoom would need to change its checkerboard format to truly leverage the technology. In the demo video above, I still find it a bit difficult to identify which person is talking, because the individual callers are bunched too closely together on the screen. Instead, you'd want to space them out much more on the display -- perhaps placing them around the image of a boardroom table or an auditorium, so users also have a visual reference to anchor the audio. (I.e., "The guy by the whiteboard is talking now, the lady near the water pitcher is speaking up", etc.)
My understanding is that Zoom (and MS Teams) will be releasing new client layouts in forthcoming updates that more closely resemble real-life environments, such as lecture theatres or sitting around a meeting table, in which case you can locate attendees by rotational location (turn your head).
Posted by: Cyberpsych | Thursday, December 10, 2020 at 04:15 PM
Seems like High Fidelity is following in the footsteps of Dirac, which talked about this a few months ago in a Forbes article: https://www.forbes.com/sites/marksparrow/2020/08/07/why-do-we-suffer-from-zoom-fatigue-its-all-about-the-sound/?sh=7a2cb5da4d87
Dirac is planning to license an SDK to build their tech into clients like Zoom. I wonder what High Fidelity means by API? A web service vs. an SDK? Closer to Vivox than Dirac? Don't know, but it's a better plan than the 2D dots things. Seems to be a space with only half a dozen customers though: Zoom, Microsoft, Discord, a few others... and all large enough to build this in-house.
Posted by: seph | Friday, December 11, 2020 at 01:13 PM