Google DeepMind’s new generative model makes Super Mario–like games from scratch

OpenAI’s recent reveal of its stunning generative model Sora pushed the envelope of what’s possible with text-to-video technology. Now Google DeepMind brings us text-to-video games. The new model, called Genie, can take a short description, a hand-drawn sketch, or a photo and turn it into a playable video game in the style of classic 2D…

“It’s cool work,” says Matthew Guzdial, an AI researcher at the University of Alberta, who developed a similar game generator a few years ago. 

Genie was trained on 30,000 hours of video of hundreds of 2D platform games taken from the internet. Others have taken that approach before, says Guzdial. His own game generator learned from videos to create abstract platformers. Nvidia used video data to train a model called GameGAN, which could produce clones of games like Pac-Man.

But all these examples trained the model on input actions, such as button presses on a controller, as well as video footage: a video frame showing Mario jumping was paired with the “jump” action, and so on. Tagging video footage with input actions takes a lot of work, which has limited the amount of training data available.

In contrast, Genie was trained on video footage alone. It then learned which of eight possible actions would cause the game character in a video to change its position. This turned countless hours of existing online video into potential training data. 
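
To make the idea of inferring actions from footage concrete, here is a toy sketch. It is not Genie's method: it assumes the character's position in each frame is already known and uses a hand-written table of eight hypothetical actions, whereas a real model would have to learn both from raw pixels. The names `CODEBOOK`, `infer_action`, and `label_video` are invented for illustration.

```python
# Toy illustration only: eight hypothetical actions, each described by the
# (dx, dy) shift it would cause in the character's position between frames.
CODEBOOK = [
    (0, 0),    # 0: stay
    (1, 0),    # 1: right
    (-1, 0),   # 2: left
    (0, 1),    # 3: up
    (0, -1),   # 4: down
    (1, 1),    # 5: up-right
    (-1, 1),   # 6: up-left
    (2, 0),    # 7: dash right
]


def infer_action(pos_before, pos_after):
    """Return the index of the action whose shift best explains the movement."""
    dx = pos_after[0] - pos_before[0]
    dy = pos_after[1] - pos_before[1]

    def squared_error(index):
        code_dx, code_dy = CODEBOOK[index]
        return (code_dx - dx) ** 2 + (code_dy - dy) ** 2

    # Pick the closest entry; ties go to the lowest index.
    return min(range(len(CODEBOOK)), key=squared_error)


def label_video(positions):
    """Assign one inferred action to each pair of consecutive frames."""
    return [infer_action(a, b) for a, b in zip(positions, positions[1:])]


# Character positions in five consecutive frames of an unlabeled video.
labels = label_video([(0, 0), (1, 0), (1, 1), (1, 1), (0, 1)])
```

No controller input appears anywhere in the sketch: each label comes only from how the character moved between frames.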

[Image: a game generated from a crayon sketch. Caption: Genie can generate simple games from hand-drawn sketches. Credit: Google DeepMind]

Genie generates each new frame of the game on the fly depending on the action the player takes. Press Jump, and Genie updates the current image to show the game character jumping; press Left and the image changes to show the character moved to the left. The game ticks along action by action, each new frame generated from scratch as the player plays. 
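
The action-by-action loop described above can be sketched in a few lines. This is a minimal illustration under loud assumptions: `generate_next_frame` is a hypothetical stand-in for the model, a "frame" is just a row of text with `@` marking the character, and only Left and Right are handled (Jump is left out to keep the frame one-dimensional).

```python
WIDTH = 8  # hypothetical frame width, in characters


def generate_next_frame(frame, action):
    """Hypothetical stand-in for the model: build the next frame from the
    current frame and the player's action."""
    x = frame.index("@")
    if action == "left":
        x = max(0, x - 1)
    elif action == "right":
        x = min(WIDTH - 1, x + 1)
    # Any other action leaves the character where it is.
    return "." * x + "@" + "." * (WIDTH - 1 - x)


def play(first_frame, actions):
    """Step through the game one action at a time, producing one frame per action."""
    frames = [first_frame]
    for action in actions:
        frames.append(generate_next_frame(frames[-1], action))
    return frames


frames = play("@.......", ["right", "right", "left"])
```

Each frame here is produced only from the previous frame and the latest action, which is the structure the article describes; there is no stored level map behind the scenes.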

Future versions of Genie could run faster. “There is no fundamental limitation that prevents us from reaching 30 frames per second,” says Tim Rocktäschel, a research scientist at Google DeepMind who leads the team behind the work. “Genie uses many of the same technologies as contemporary large language models, where there has been significant progress in improving inference speed.” 

Genie learned some common visual quirks found in platformers. Many games of this type use parallax, where the foreground moves sideways faster than the background. Genie often adds this effect to the games it generates.  
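
For readers unfamiliar with parallax, here is a minimal sketch of how a conventional, hand-coded platformer might compute it. Genie is not programmed this way; it picked up the effect from video. The layer names and speed factors are assumptions chosen for illustration.

```python
# Hypothetical speed factors: 1.0 scrolls with the camera, smaller values lag behind.
LAYER_SPEEDS = {"background": 0.25, "midground": 0.5, "foreground": 1.0}


def layer_offsets(camera_x, layer_speeds):
    """Return the horizontal shift of each layer for a given camera position.

    As the camera moves right, every layer shifts left on screen, but distant
    layers shift by a smaller amount, which creates the impression of depth.
    """
    return {name: -camera_x * speed for name, speed in layer_speeds.items()}


offsets = layer_offsets(100, LAYER_SPEEDS)
# offsets == {"background": -25.0, "midground": -50.0, "foreground": -100.0}
```

With the camera 100 pixels to the right, the foreground has moved four times as far across the screen as the background.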

While Genie is an in-house research project and won’t be released, Guzdial notes that the Google DeepMind team says it could one day be turned into a game-making tool—something he’s working on too. “I’m definitely interested to see what they build,” he says.