In the broader context of AR, VR and the metaverse, we’re talking about a few different types of experiences. We can think of the virtual side in a fiction-versus-nonfiction paradigm. Some of our virtualized experiences will take place in virtual worlds that are entirely fictional, while other virtual worlds will be “non-fictional” digital twins of real-world spaces. But in an augmented or mixed reality (AR/MR) context, virtual objects can be placed into and manipulated within a given space in our AR-assisted visualization of the real world.
For these experiences to feel real, to be proportional and to create the sense of space we experience in the physical world, real-world spaces have to be mapped in such a way that our highly visual brains can process that information.
Enter Spatial Mapping
Spatial mapping is the process by which AR/MR devices combine the sensory data they collect about their environment and perform complex geometric calculations on that data to reconstruct a three-dimensional rendering of a space. These devices use cameras, infrared light and LiDAR sensors to collect the data, and the computational algorithms can run on the device itself, on a network node, or some combination of the two.
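To make the reconstruction step concrete, here is a minimal sketch of one core operation inside that process: back-projecting a depth image (the kind of data a LiDAR or infrared depth sensor produces) into a 3-D point cloud using the pinhole camera model. The intrinsics (fx, fy, cx, cy) and the tiny depth grid are illustrative values, not data from any specific device.

```python
# Minimal sketch: turning a 2-D grid of depth samples (metres) into
# (x, y, z) points via pinhole-camera back-projection.
# fx, fy: focal lengths in pixels; cx, cy: principal point.

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project each valid depth pixel into a 3-D point."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # no sensor return at this pixel
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# A toy 2x2 "depth image": every pixel one metre from the camera.
cloud = depth_to_point_cloud([[1.0, 1.0], [1.0, 1.0]],
                             fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

A real pipeline would then fuse many such clouds over time and convert them into a triangle mesh; this sketch only shows the per-frame geometry.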
A lot of this technology will exist in the background and won’t necessarily be the highlight of the user experience, but an accurately mapped space will dramatically impact how real the experience feels to the user. In these virtualized experiences of the future, real-world surfaces can be overlaid with virtualized colors, textures and images, while virtual objects can be “placed” on a real, physical surface.
In a mixed and augmented reality context, the spatial mapping has to be done in real time, using sensors and cameras on the device itself. These devices will have the capability to create a live representation of the environment, so that virtual elements (such as wall textures, advertisements, or Pokémon) will interact much better with the real environment than they did in the early iterations of augmented reality applications like Pokémon Go.
In order for the metaverse (or even augmented reality) to really take off, it will need to move toward becoming a device-independent and decentralized platform, much like the internet is today. It will need its own protocols and infrastructure as well. Open standards like OpenXR are already attempting to solve this.
Under the OpenXR standard, each device exposes a common set of APIs, and development platforms with rendering engines (such as Unity and Unreal) are built on top of those APIs. Unity, for example, doesn’t do the spatial mapping itself, but rather relies on the spatial mapping functionality of each device in the system. There will be many approaches to solving these problems. The devices can gather and create the spatial map information – point cloud or mesh information – then, through a common API, deliver that information to a rendering engine and back to the display.
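The common-API pattern described above can be sketched as follows. This is a hypothetical illustration of the idea, not real OpenXR, Unity or Unreal code: each device hides its native spatial-mapping implementation behind one shared interface, and the rendering layer consumes meshes without knowing which device produced them.

```python
# Hypothetical sketch of the "common API" pattern: device-specific
# spatial mapping behind a shared interface. All class and method
# names here are illustrative, not from any real SDK.

from abc import ABC, abstractmethod

class SpatialMapProvider(ABC):
    @abstractmethod
    def get_mesh(self):
        """Return (vertices, triangle indices) for the mapped space."""

class StubHeadset(SpatialMapProvider):
    def get_mesh(self):
        # A real device would return its live reconstruction;
        # this stub returns a single hard-coded triangle.
        return ([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])

def render(provider: SpatialMapProvider):
    # The engine only sees the common interface, never the device.
    vertices, triangles = provider.get_mesh()
    return f"rendering {len(triangles)} triangle(s) over {len(vertices)} vertices"

print(render(StubHeadset()))
```

The design point is that adding a new headset means writing one new provider, not changing the engine.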
Edge Computing Will Be Key
All of this has to happen almost instantaneously – for VR, end-to-end latency needs to stay below roughly twenty milliseconds to prevent nausea in users. Spatial mapping is therefore a very compelling use case for edge computing. As these experiences begin moving toward a distributed model, the companies involved will have to move some of the spatial mapping functionality to the edge. The devices, especially AR devices, will be too small and lightweight to handle complex spatial mapping computations onboard, and will need to offload to much more powerful edge servers to improve rendering performance.
There may eventually be several different approaches to what is done at the edge and what is not. For example, you could shift all of your sensor information (such as LiDAR with camera data) to the edge and only encode the video that’s being sent back up to the device. Or you could generate the spatial map on the device, then transport that data to the network where it could be merged with a larger database of spatial map data created by other users.
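The second approach – generating the map on-device and merging it with other users’ contributions in the network – can be sketched in a few lines. This is a simplification under a strong assumption: the device pose is reduced to a plain translation, whereas a real system would apply a full 6-DoF pose before merging.

```python
# Sketch: merge a device-generated local point cloud into a shared
# world-frame map contributed by other users. The pose is simplified
# to a translation; real systems use a full 6-DoF transform.

def to_world(local_points, device_position):
    """Shift local-frame points into the shared world frame."""
    dx, dy, dz = device_position
    return [(x + dx, y + dy, z + dz) for (x, y, z) in local_points]

# Points already uploaded by other users (world frame).
shared_map = {(0.0, 0.0, 0.0)}

# This device's local scan, captured relative to its own origin.
local_scan = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

shared_map.update(to_world(local_scan, device_position=(5.0, 0.0, 0.0)))
```

Using a set here also hints at the deduplication problem: overlapping scans from different users must be reconciled, not just appended.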
Right now, most of the work being done in this area is proprietary and each company has its own devices. It’s a very fragmented landscape, for now. But as time goes on, these proprietary solutions will grow and merge together, and standards will be created so that you can access the same virtual environments as anyone else, no matter which device or platform you’re using.
It remains to be seen which standards body (or bodies) will take the lead on this kind of standardization project. Whatever it is, it will need to look something like what 3GPP does for cellular wireless or what MPEG does for video encoding: large groups of industry players collaborating on common standards to ensure interoperability and high-quality user experiences.
In some ways, this standardization work is already underway in certain areas. For example, 3GPP is working on standards that are specific to XR, and MPEG is working on immersive video standards related to point cloud compression. There are other groups working on standards to optimize transport for XR data. At some point, there will be a need to standardize spatial mapping data, but the community hasn’t evolved to that point yet.
Why is that important? Because if anyone can share spatial mapping data, mesh data, and point cloud data into the network, then anyone will be able to download that data and use it on any device. That’s not possible yet, but the work will get done, and AR/MR experiences will undoubtedly be part of our future.