Will AI’s Future Form Factor Be AR Glasses?

Introduction

Throughout human evolution, vision and hearing have been our primary ways of perceiving and interacting with the world. Yet for the past several decades of modern computing, technology has confined us to less natural forms of interaction: keyboards, mice, and touchscreens. These tools, while transformative, have required us to adapt to machines rather than the other way around.

Artificial Intelligence, as we know it today, followed the same trajectory, with chat windows and text prompts dominating its early interactions. However, 2024 marked a shift toward more natural engagement, powered by advances in Natural Language Processing (NLP) and Natural Language Understanding (NLU). Take Google's recent commercials, for example, which showcase AI as a voice companion on the latest Pixel phones.

But last week, Google made announcements that signal an even bigger leap forward, one that could redefine how we engage with AI. Could AR glasses become the interface that unlocks AI's true potential? Let's explore.

Google’s Announcements: Gemini 2.0 and Android XR

Google made two major announcements last week in collaboration with Samsung and Qualcomm:

  1. Gemini 2.0: Google's next-generation AI model, now equipped with advanced vision and spatial understanding.
  2. Android XR: A new operating system designed for AR and XR (Extended Reality) devices, built to harness the power of Gemini.

Together, these announcements provide a roadmap for AI’s future—one where devices understand and reason about the physical world in ways that align with our natural senses.

Gemini 2.0: AI with Vision and Reasoning

At the heart of this shift is Gemini 2.0, Google’s latest AI model. This iteration is designed not just for text-based interactions but for engaging with the physical world. Some key features include:

  • Visual Understanding: Gemini 2.0 can read signs, interpret bus routes, and identify plants in real time, all from live video.
  • NLP and NLU Capabilities: It continues to excel at natural language understanding and real-time translation.
  • Flash Model: A smaller, more efficient version of the model, designed for phones and wearables, that nonetheless outperforms the full Gemini 1.5 Pro model.
  • Reasoning and Memory: Gemini can remember and reason about the information it gathers, making it more contextually aware and useful.

The implication: AI with vision and reasoning can transform how we understand and interact with the real world.
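
To make the visual-understanding point concrete, here is a rough sketch of how an app might ask a Gemini model about a single captured frame. It assumes Google's @google/generative-ai JavaScript SDK and an API key in GEMINI_API_KEY; the model identifier and the loadImageAsBase64 helper are illustrative placeholders, not details from the announcement.

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFileSync } from "node:fs";

// Illustrative helper: read a captured frame (or photo) from disk as base64.
function loadImageAsBase64(path: string): string {
  return readFileSync(path).toString("base64");
}

async function identifyPlant(imagePath: string): Promise<string> {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? "");
  // Model name is a placeholder; check Google's docs for current identifiers.
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

  // Multimodal request: one image part plus one text part.
  const result = await model.generateContent([
    { inlineData: { mimeType: "image/jpeg", data: loadImageAsBase64(imagePath) } },
    { text: "Identify the plant in this photo and suggest one care tip." },
  ]);
  return result.response.text();
}

identifyPlant("./garden.jpg").then(console.log).catch(console.error);
```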

Android XR: A New OS for AR and XR Devices

To complement Gemini’s capabilities, Google also announced Android XR, a new operating system for AR and XR devices. Developed in partnership with Samsung and Qualcomm, Android XR aims to:

  • Enable Immersive Experiences: From virtual reality (VR) to augmented reality (AR), Android XR facilitates seamless transitions between virtual and physical spaces.
  • Build a Thriving App Ecosystem: By supporting OpenXR, WebXR, and Unity, Google is encouraging developers to create a wide variety of apps, from productivity tools to immersive entertainment (see the WebXR sketch after this list). With Gemini at its core, could this ecosystem someday rival the success of the iPhone App Store or even Google's own Play Store?
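
For developers, "supporting WebXR" means standard web code should run on these devices. Below is a minimal TypeScript sketch of requesting an immersive AR session through the WebXR Device API; it is generic WebXR code rather than anything Android XR-specific, and it stops before the rendering setup.

```typescript
// Minimal sketch: requesting an immersive AR session via the WebXR Device API.
// Runs in a browser context; WebXR typings (e.g., @types/webxr) are assumed.

async function startArSession(): Promise<void> {
  if (!navigator.xr) {
    console.log("WebXR is not available on this device.");
    return;
  }
  const supported = await navigator.xr.isSessionSupported("immersive-ar");
  if (!supported) {
    console.log("Immersive AR sessions are not supported here.");
    return;
  }
  // 'hit-test' lets the app anchor virtual content to real-world surfaces.
  const session = await navigator.xr.requestSession("immersive-ar", {
    optionalFeatures: ["hit-test", "dom-overlay"],
  });
  session.addEventListener("end", () => console.log("AR session ended."));
  // Rendering setup (WebGL layer, reference spaces, frame loop) would follow.
}
```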

The first hardware device running Android XR, codenamed Project Moohan, is being developed by Samsung and is expected to launch next year. Google also showed prototype glasses that will soon enter real-world testing.

The long-term goal is an ecosystem of devices and apps that makes AR glasses a compelling platform.

The Future: AI-First AR Glasses

These announcements hint at Google’s vision for AI’s “future form factor.” By embedding Gemini into AR glasses powered by Android XR, Google is positioning these devices as the ultimate interface for AI. Imagine:

  • Glasses that not only show you directions but understand your environment to offer real-time recommendations.
  • An AI assistant that combines voice, vision, and spatial awareness to provide proactive insights.
  • Seamless transitions between work, play, and exploration—all without pulling out your phone.

Yet, while the potential is exciting, the path to mainstream adoption of AR glasses is far from guaranteed. Google isn’t alone in this pursuit—competitors like Apple and Meta are also exploring AR and XR ecosystems. But by combining AI, AR, and a robust OS, Google’s approach could set a new standard. The question is: will developers and consumers buy into this vision?

Conclusion

The announcements of Gemini 2.0 and Android XR mark a significant step forward in aligning technology with our natural modes of interaction. While the road to mainstream adoption of AR glasses is still long, these innovations provide a compelling glimpse into the future. One question remains: Will AR glasses truly become AI’s future form factor? Google might think so—and they’re making bold moves to turn that vision into reality.