The Rise of Spatial Intelligence

While much of today’s AI revolution is focused on text and language (think ChatGPT, large language models, and generative text tools) or on 2D image understanding (e.g. OCR or image segmentation) and 2D image generation (e.g. Midjourney), another dimension of intelligence is quietly taking shape, one that is equally fundamental to how we, as humans, interact with the world: spatial intelligence.

Spatial intelligence represents a holistic approach to interacting with the physical world. It doesn’t just perceive its surroundings (as perceptual computing does) or manipulate 3D spaces (as spatial computing does). It layers intelligence and action on top of understanding, enabling machines to make meaningful decisions in context and respond accordingly.

But what exactly does this mean, and why does it matter?

Defining Spatial Intelligence

Spatial intelligence builds upon the concepts of perceptual computing and spatial computing, which we explored in our previous blog post.

  • Perceptual computing enables machines to perceive and understand their surroundings, much like human senses. It allows them to gather and process data from the environment, thereby "sensing" the world.
  • Spatial computing goes a step further by enabling machines to interact with and manipulate 3D spaces, transforming how we interact with technology by making these interactions more natural and intuitive.

Spatial intelligence takes these ideas to the next level by combining three core components: perception, understanding, and action.

  • Perception: Borrowing from perceptual computing, spatial intelligence begins with a machine’s ability to gather and process data about its environment—seeing, sensing, and interpreting the world as humans do.
  • Understanding: It builds context—analyzing what it perceives to make sense of what it sees. This is where spatial intelligence diverges from image recognition. Instead of simply identifying objects or patterns, it understands relationships between them, situational context, and implications for interaction.
  • Action: Finally, spatial intelligence makes intelligent decisions and takes action. Unlike traditional autonomous algorithms that merely avoid obstacles or follow preset instructions, it adapts to its environment in meaningful ways—whether collaborating with humans, augmenting real-world workflows, or enhancing digital experiences in physical spaces.

In short, spatial intelligence bridges the gap between machines that simply recognize the world and those that actively participate in it.
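To make the three components concrete, here is a minimal Python sketch of a perceive, understand, act loop. Everything in it is a hypothetical illustration of ours: the `Detection` type, the `near_m` threshold, and the `slow_down_and_yield` and `continue_route` decisions are assumed names, not the API of any product mentioned in this post, and real systems work on far richer 3D data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Perception output: a labeled object at a 2D floor position (meters)."""
    label: str
    x: float
    y: float


def perceive(raw_readings):
    """Perception: turn raw (label, x, y) sensor tuples into typed detections."""
    return [Detection(label, x, y) for label, x, y in raw_readings]


def understand(detections, near_m=1.5):
    """Understanding: derive relationships between objects, not just labels.

    near_m is an assumed proximity threshold in meters.
    """
    relations = []
    for i, a in enumerate(detections):
        for b in detections[i + 1:]:
            if math.hypot(a.x - b.x, a.y - b.y) <= near_m:
                relations.append((a.label, "near", b.label))
    return relations


def act(relations):
    """Action: pick a behavior based on context rather than a fixed script."""
    for a, _, b in relations:
        if {a, b} == {"robot", "person"}:
            return "slow_down_and_yield"
    return "continue_route"


# Hypothetical warehouse scene: a person close to the robot, a pallet far away.
readings = [("robot", 0.0, 0.0), ("person", 1.0, 0.5), ("pallet", 6.0, 2.0)]
decision = act(understand(perceive(readings)))
print(decision)
```

The decision depends on the relationship between the robot and the person, not on the mere presence of either object, which is the distinction between recognition and understanding drawn above.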

Examples of Spatial Intelligence in Action

This emerging field has sparked exciting developments. Let’s take a look at two companies at the forefront:

World Labs

World Labs is led by Fei-Fei Li, widely regarded as the 'Godmother of AI', alongside a co-inventor of Neural Radiance Fields (NeRFs) and a team of other influential researchers. While specific details about their technology remain under wraps, the company recently secured $230 million in funding to build a "Large World Model." This ambitious model aims to perceive, generate, and interact with the physical world, redefining how machines engage with their surroundings.

Fei-Fei Li has been a vocal proponent of this vision, outlining her belief in the transformative potential of machine perception in multiple talks, including a TED Talk she gave just six months ago. In it, she discussed how AI can extend beyond recognition to meaningful understanding and interaction with the world, a foundational philosophy for World Labs.

Niantic

Niantic, known for revolutionizing augmented reality gaming with Pokémon GO, has spent over a decade building foundational spatial technologies. They’re now leveraging this expertise to move beyond gaming and create a geospatial platform for understanding and interacting with the physical world.

[Image courtesy of Niantic]

Niantic’s platform includes tools like Scaniverse, which enables detailed 3D object capture, and Visual Positioning System (VPS), which anchors these objects in real-world locations for augmented interactions. These technologies underpin their vision for spatial intelligence: use cases like enhancing real-world locations with AR visuals and audio, optimizing warehousing and logistics workflows, and enabling AR-based remote collaboration with 3D objects.

By combining years of expertise with forward-looking use cases, Niantic is laying the groundwork for a more intelligent interaction between people, machines, and the world around us.

Why Spatial Intelligence Matters

Spatial intelligence is more than a technological breakthrough—it’s a paradigm shift in how we think about machines’ role in the physical world. By moving beyond recognition to understanding and action, spatial intelligence holds the potential to redefine industries, from logistics to entertainment to urban planning.

Imagine robots that not only navigate spaces but work seamlessly alongside humans in warehouses. Or augmented reality tools that enable architects and designers to interact with life-sized 3D models in real-world settings. These applications, and many more, represent the next wave of innovation in AI and computer vision.

The convergence of perceptual computing, spatial computing, and spatial intelligence will open doors to solutions that are more intuitive, interactive, and impactful. It’s an exciting frontier, and we’re only beginning to see what’s possible.