My AI prediction for 2026
Looking ahead to the near future
No regular tech blog is complete without at least one end-of-the-year prediction, and this one will be no exception. Of course we’ll get model updates every few weeks with ever-increasing benchmark scores. Beyond that, though, I think there is one significant development that will change the way we think about and interact with AI. What I’m talking about is real-time AI generated interactive video. I believe this is a highly under-appreciated and under-discussed development, and it’s imminent. As I’ll explain, there are significant technical hurdles to overcome before we see truly interactive video, but it’s definitely on the horizon.
Imagine this. There’s a grinding noise coming from your car engine. Right now you can pop the hood and point your phone camera at your engine. The ChatGPT app can currently see and hear what you do, but it can only respond with text or spoken words. It can make suggestions, but can’t show you anything. Imagine if you could look at your phone screen and see an AI-generated image that looks just like your engine, with an AI-generated person showing you exactly what to do to diagnose, and subsequently fix, the problem. They can point directly to the engine part that you should look at. As you reach in and pull on a tube, they can mirror your actions and show you where to look next. That kind of interaction will be much more effective than trying to verbally walk you through what to do.
Now imagine you are planning a trip, and you want to virtually explore where you are going in advance. You can do something like this now with Google Street View, but that’s still a far cry from a realistic walk-through. With interactive video, you also won’t be limited to the present. You’ll be able to virtually walk around ancient Egypt, or 18th-century Vienna. Or an alien world. You’ll be able to interact with any people you encounter, and ask them questions about their surroundings, their life in that time and place, or anything else. Many video games now offer this kind of experience, but they’re limited to a specific setting. With AI interactive video, you’ll be able to experience anywhere you can think of. This technology will not only be useful for the curious traveler or the student of history, it will trigger a revolution in real-time generative video games.
Of course, the most basic AI application that we’ve become familiar with, chatbots, will become more realistic. You’ll be able to see a virtual person that you can talk to, creating a much more personal experience. AI companion apps will be completely transformed. For better or for worse, you can imagine the implications for more adult AI interactions.
The lines between all these applications will become fuzzy. For example, imagine watching your favorite episode of Friends, and being able to take control of a character halfway through. Is that a video game? Do the other characters become AI companions? Is this a new form of entertainment?
Is interactive video a good thing?
Many people will read all this and be horrified. They’re right to be concerned:
As real-time AI generated video becomes viable, it will likely increase the amount of time we all spend staring at computer screens, rather than interacting with the real world and each other. This could lead to more loneliness and mental health issues.
Model providers will have to be much more careful with age-related controls; it’s bad enough when children are exposed to inappropriate sexual and violent content, but it’ll be much worse when they can be active participants.
All the concerns people have with AI in general will be amplified with interactive video: concerns about energy, intellectual property, deep fakes, etc.
On the other hand, interactive video will open up many positive use-cases. Interactive 3D environments are not new. They already exist in everything from flight simulators for training pilots to surgical simulators for training doctors. Adding AI to these systems only makes them more realistic, which adds to their utility. With interactive AI-generated video, we’ll have better tutoring systems for education. Scientists will be able to run more accurate virtual experiments, etc.
One of the biggest pushes for AI-generated interactive video is for robotics. Training a robot requires many trial-and-error interactions with its environment. In the physical world, this takes a very long time. In a virtual world the same interactions can be sped up significantly, leading to more functional robots. The more realistic that virtual world is, the better the robot will be at transferring what it learned there to the physical world.
With AI interactive video, as with all general-purpose technology, I prefer not to think in terms of “good” and “bad.” Specific applications will fall more clearly into one of those two categories, but the technology itself resists such a black-and-white categorization.
How close are we?
None of this will become reality until we overcome several technical hurdles. The biggest problem is that the speed of AI inference is going to have to dramatically improve to generate video in real-time. However, speed improvements tend to fall on an exponential curve: if a single image now takes 30 seconds to generate, in a few months that’ll be down to 15 seconds, then 7.5 seconds a few months after that, etc. It won’t take long until we’re down to a fraction of a second, which is what is required for real-time video.
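To make the halving arithmetic concrete, here is a minimal Python sketch. It takes the 30-second starting point from the example above and assumes that generation time halves cleanly at each step. The function name `halvings_needed` and the 24-frames-per-second target are illustrative assumptions, not figures from any particular model.

```python
def halvings_needed(start_seconds: float, target_seconds: float) -> int:
    """Count how many times start_seconds must be halved
    before it is at or below target_seconds."""
    count = 0
    current = start_seconds
    while current > target_seconds:
        current /= 2
        count += 1
    return count


# Assumption: one image takes 30 seconds today (the figure used in the text).
START_SECONDS = 30.0

# Assumption: "real time" means 24 frames per second, so each frame
# has a budget of 1/24 of a second.
FRAME_BUDGET_SECONDS = 1 / 24

if __name__ == "__main__":
    steps = halvings_needed(START_SECONDS, FRAME_BUDGET_SECONDS)
    print(f"Halvings needed to reach real time: {steps}")
```

Under these assumptions it takes 10 halvings to get from 30 seconds per image to under 1/24 of a second per frame. A looser target of half a second per frame would need only 6.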
There are a few shortcuts to mitigate the speed issue, and we’re already seeing some real-time video generation that takes advantage of these. One is to restrict the kind of video that is being generated in some way, so that a simpler AI model can be used. For example, Alibaba recently announced a generative video AI application called Live Avatar in which users interact with a stationary virtual character. Even more recently, AI video company Runway released three real-time video models, one of which is another virtual, interactive avatar.
Another significant hurdle is going to be the underlying intelligence of the AI that powers the generated video. In order to walk around a realistic virtual world, the AI model will have to understand the physics of the environment and be able to predict how all the elements there interact. Such an AI model is called a world model, and many companies are working on this. Google has made great strides with its impressive Genie 3 model, which allows for real-time single-user interactions with a simulated physical environment for up to a few minutes. As mentioned above, Runway released three generative video models, one of which is a direct competitor to Genie 3. The startup World Labs is also built around developing world models, and its product, Marble, looks very promising.
Finally, being able to see an AI-generated character isn’t going to change how intelligent it seems when you interact with it. Like current chatbots, these characters will still mix surprisingly intelligent insights with confident hallucinations. Those interactions are only going to be as good as the current state of AI in general, which is improving with every model release. The closer we get to AGI, the more realistic interactions in an AI-generated virtual world will be.
2026 won’t necessarily be the year in which interactive AI generated video goes mainstream, but we’ll definitely see significant progress. By the end of the year you’ll be able to interact with a virtual “person” who looks and sounds like they’re as real as someone on the other end of a Zoom call. You’ll also be able to navigate through (and control) an entire photorealistic world that you create with a prompt.
This may not be the year everyone replaces their mechanic with an app, but it will be the year we stop thinking of video as something we simply watch. For decades, screens have been windows we look at. By the end of 2026, they will become doors we can walk through. That will make today’s internet look like a silent film.
David Bachman is a professor of Mathematics, Data Science, and Computer Science. To learn more about David’s work, visit his AI speaking and consulting site, his faculty page, or explore his mathematical art portfolio.


