Above: One of these pics is reality, the other is Microsoft Flight Simulator -- but which?
I'm blogging Matthew Ball's must-read, nine-part metaverse primer this summer; my take on Part 1 is here, and my coverage of Part 2 is here.
Part 3 of Matthew's Metaverse Primer, Networking and the Metaverse, covers similar territory to the paper I wrote for Samsung Next last year, but with a big difference. (More on that below.) This section explores all the networking technology needed to make highly immersive, highly multiplayer applications possible. As an example, Matthew describes how Microsoft's Flight Simulator (current version) uses streaming to display a mirror world:
Microsoft Flight Simulator is the most realistic and expansive consumer simulation in history. It includes 2 trillion individually rendered trees, 1.5 billion buildings and nearly every road, mountain, city and airport globally… all of which look like the ‘real thing’, because they’re based on high-quality scans of the real thing. But to pull this off, Microsoft Flight Simulator requires over 2.5 petabytes of data — or 2,500,000 gigabytes... Microsoft Flight Simulator works by storing a core amount of data on your local device (which also runs the game, like any console game and unlike cloud-based game-streaming services like Stadia). But when users are online, Microsoft then streams immense volumes of data to the local player’s device on an as-needed basis.
This becomes even more complicated when other users, not to mention new content they create, are part of your virtual world. And while broadband penetration and bandwidth continue to increase, any multi-user Metaverse worth the name will run up against a simple fact: our broadband can't ever run faster than the speed of light:
[W]hile the Metaverse isn’t a fast-twitch AAA game, its social nature and desired importance means it will require low latency. Slight facial movements are incredibly important to human conversation — and we’re incredibly sensitive to slight mistakes and synchronization issues (hence the uncanny valley problem in CGI)...
Unfortunately, latency is the hardest and slowest to fix of all network attributes. Part of the issue stems from, as mentioned above, how few services and applications need ultra-low latency delivery. This constrains the business case for any network operator or latency-focused content-delivery network (CDN) — and the business case here is already challenged and in contention with the fundamental laws of physics.
At 11,000–12,500km, it takes light 40–45ms to travel from NYC to Tokyo or Mumbai. This meets all low-latency thresholds. Yet while most of the internet backbone is fiber optics, fiber-optic cable falls ~30% short of the speed of light as it’s rarely in a vacuum (+ loss is typically 3.5 dB/km)... Furthermore, network congestion can result in traffic being routed even less directly in order to ensure reliable and continuous delivery, rather than minimizing latency. This is why average latency from NYC to Tokyo is over 3× the time it takes light to travel between the two cities, and 4–5× from NYC to Mumbai.
Emphasis mine. Or to put it another way: The metaverse of our dreams will always be hobbled by the speed of light.
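To put rough numbers on Matthew's point, here's a minimal back-of-the-envelope sketch. (The ~10,800 km NYC–Tokyo great-circle distance and the ~30% fiber slowdown are approximations drawn from the quote above; real routes are longer and congestion adds more delay.)

```python
# Speed of light in vacuum: ~299,792 km/s, i.e. ~299.8 km per millisecond
C_VACUUM_KM_PER_MS = 299_792.458 / 1000

# Light in fiber travels roughly 30% slower than in vacuum
FIBER_FACTOR = 0.70

def one_way_latency_ms(distance_km, medium_factor=FIBER_FACTOR):
    """Best-case propagation delay -- ignores routing detours, congestion,
    and processing, which in practice multiply this several times over."""
    return distance_km / (C_VACUUM_KM_PER_MS * medium_factor)

# NYC -> Tokyo is roughly 10,800 km along a great circle
print(round(one_way_latency_ms(10_800, medium_factor=1.0)))  # vacuum: 36 ms
print(round(one_way_latency_ms(10_800)))                     # fiber: 51 ms
```

Even the physically impossible best case (a straight vacuum path) eats most of a low-latency budget; real measured latency, as Matthew notes, runs 3–5× higher.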
In fact, the way Matthew lays out all the bandwidth and latency hurdles facing metaverse applications leaves me significantly less confident than I was while writing the Samsung Next article. Which takes me to my own thoughts on Part 3:
It's very possible that the networking challenges Matthew lays out cannot be fully addressed. In fact, they might even get somewhat worse, if VR and its even higher graphics requirements demand a streaming solution. If your vision of the metaverse is highly realistic avatars from around the world which respond to facecam and motion capture, in contiguous spaces where hundreds of users can interact -- even create content -- at the same time, you may want to wait patiently for another 2-3 decades or so.
This is probably why the most successful virtual worlds described as metaverses do not rely on users with high-end desktop computers or heavy, dedicated bandwidth. Rather, these metaverses -- ROBLOX, Fortnite, a few others -- are optimized to run wirelessly on mobile, and consequently are very much not dependent on high-end, ultra-realistic graphics. (Fortnite, for example, has very engaging, highly responsive physics, but graphics which are decidedly cartoonish, and avatars with a limited range of customization and expression options.)
And that's quite OK, if we're willing to accept a mass-market metaverse which sacrifices high fidelity in order to achieve high concurrency and low latency among a global userbase. Because of this speed-of-light hurdle, we may also see the rise of "local" micro-metaverses, with all the high-fidelity graphics affordances we aspire to -- but confined to players within, say, a 500-mile radius of the local server.
Whatever evolves, I'm fairly confident the latency challenge isn't going away. When it comes to the metaverse, Einstein is the ultimate cockblocker.
Thanks for surfacing that series of metaverse articles -- I listened through the first three chapters and it feels quite adequate, plus it gives some fresh thoughts on the latency limitation problem.
Posted by: Lex4art | Tuesday, July 27, 2021 at 11:26 PM
This speed-of-light latency problem has me pondering quantum entanglement and instant changes at infinite distances. Also, just as we now have AI upscaling images or choosing what it thinks it needs to render, some element of forward prediction helps with latency mitigation. Kind of like games already do: if packets drop, they predict your next position (not always accurately, but it's a start).
Posted by: Epredator | Wednesday, July 28, 2021 at 01:33 PM
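[The position prediction Epredator mentions is usually called dead reckoning; a toy sketch of the idea, not any particular engine's netcode:]

```python
def dead_reckon(last_pos, velocity, elapsed_s):
    """Extrapolate where an entity probably is now from its last
    known position and velocity -- the trick games use to keep
    avatars moving smoothly when packets drop."""
    return tuple(p + v * elapsed_s for p, v in zip(last_pos, velocity))

# Last update: avatar at (0, 0) moving 2 m/s east, 1 m/s north;
# no packet for 0.5 s, so we guess it has reached (1.0, 0.5)
print(dead_reckon((0.0, 0.0), (2.0, 1.0), 0.5))  # -> (1.0, 0.5)
```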
Spot on with latency. On human expression and lip sync, though, I have to wonder if anyone has done an experiment on what people actually view in a virtual world -- I mean a breakdown, in percentages, of where that "camera" time is spent.
Is it scenery, close-ups on faces, whatever they happen to be interacting with? That would be an important first step toward prioritizing what's visually important to the end user. If people do indeed focus on an avatar's face while it's speaking, then sure, expressions can be important. That may be especially true for meetings, or romance.
But as Wagner says, it doesn't seem to matter in existing successful worlds.
Posted by: Kyz | Wednesday, July 28, 2021 at 01:40 PM
"This speed of light latency problem has me pondering quantum entanglement"
Holy cow, does this mean we can't get an ideal metaverse until we figure out APPLIED QUANTUM ENTANGLEMENT technology?!
Posted by: Wagner James Au | Wednesday, July 28, 2021 at 05:42 PM
>>Holy cow, does this mean we can't get an ideal metaverse until we figure out APPLIED QUANTUM ENTANGLEMENT technology?!
Yep.
Posted by: Lex4art | Thursday, July 29, 2021 at 10:58 AM
But anyway, latency is only the tip of the iceberg of problems to solve. I've finished listening through the whole "Metaverse Primer" (the last three chapters were quite huge and watery for my taste, no clear picture). IMHO, some significant tech improvements are needed even for a far-from-ideal, compromise-laden, single-country-scale "not quite a metaverse, just a decent virtual world" -- like somehow starting to mass-produce "vacuum fibers" (to have a full-light-speed medium that allows 40-60ms latency coverage, at least for the US, with servers located near the geographical center). Math is also a problem: even very coarse, real-world-like physics for clothing on all characters, fleshy soft bodies, and decent destruction is beyond what math-based processors can do. So we need something I'd call "context-based processors" -- e.g., just as current CPUs run the x86_64/ARM/etc. command sets to process a mathematical context, these "context processors" would run a special, non-math context command set. But there aren't many breakthroughs on that horizon -- dedicated neural-network CPUs are still math-based monstrosities, only abridged down to the core range of NN math operations...
Meh.
Posted by: Lex4art | Thursday, July 29, 2021 at 11:16 AM
But something interesting can appear even with current-gen tech -- some types of virtual fun (like slowly building stuff in Minecraft) don't need much latency; just hide the delays from the user in a smart enough way when he interacts with the world, and this works well already. A world-scale virtual world also may not be worth it, simply because language and culture barriers make person-to-person interaction not that interesting and quite clumsy. And if some super cool virtual art is created in one distant country, maybe it will be enough to simply copy it to all the other countries' data centers, to share at least the art with good latency & content downloading speed... we'll see ).
Posted by: Lex4art | Thursday, July 29, 2021 at 11:26 AM
Oh, and how could I forget the "cherry on top of the metaverse cake" -- the networking model for that kind of project is server-side-does-most-of-the-stuff, so we can have secure payments & content distribution, no cheaters, and no trespassing in VIP/personal zones. This is how Second Life, Sinespace, and World of Tanks are built -- but it also means 2x latency: you hit a movement key -> it goes to the server, which calculates the movement amount/permissions -> the result returns to you, so your character animates locally using the received data. So, a very limited set of active metaverse activities for that kind of connection, but it's the only way to do things securely.
Posted by: Lex4art | Thursday, July 29, 2021 at 12:05 PM
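[Lex4art's server-authoritative round trip can be expressed as a tiny model -- the 5 ms server processing figure below is an assumption for illustration, not a measured number:]

```python
def perceived_input_delay_ms(one_way_ms, server_processing_ms=5):
    """Server-authoritative input: the keypress travels to the server,
    which validates and simulates it, then the result travels back --
    so the user feels a full round trip plus server time."""
    return 2 * one_way_ms + server_processing_ms

# With a 50 ms one-way path, a movement key feels ~105 ms laggy
print(perceived_input_delay_ms(50))  # -> 105
```

This doubling is exactly why client-side prediction exists: the client animates immediately and the server's answer only corrects it after the fact.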