A computer was trained to play Qbert and immediately broke the game in a way no human ever has


Machine learning researchers taught a machine how to play Qbert for Atari.
The computer program found a bizarre way to rack up 1 million points by playing the game "in what seems to be a random manner" and making the entire stage flash.
Artificial intelligence agents often find techniques to win games that a human would never discover.

While the jury's still out on whether today's machine-learning techniques will ever create a program that could rival human intelligence, one thing about the future of artificial intelligence is clear: The machines are really good at playing games.

And when the machines get good at the games, sometimes they come up with bizarre strategies and tactics that a human never would.

For example: In a new paper posted on arXiv that has not yet been peer reviewed, which we saw through a tweet from researcher Miles Brundage, three researchers from the University of Freiburg in Germany used evolution strategies to train an agent to play eight Atari games that are more than 30 years old.

For one of the games, "Qbert," the AI found a way to exploit a bug in between levels, make the entire stage flash, and then rack up unlimited points.

Seriously. Even if you have never played "Qbert," you can tell that the agent is crushing the game. (The goal of the game is to visit every square in the level and change its color by jumping on it.)

Here's the video — the glitch starts at about 20 seconds in.

Here's how Patryk Chrabaszcz and the other researchers describe what the agent is doing in the paper:

In the second interesting solution, the agent discovers an in-game bug. First, it completes the first level and then starts to jump from platform to platform in what seems to be a random manner. For a reason unknown to us, the game does not advance to the second round but the platforms start to blink and the agent quickly gains a huge amount of points (close to 1 million for our episode time limit). Interestingly, the policy network is not always able to exploit this in-game bug and 22/30 of the evaluation runs (same network weights but different initial environment conditions) yield a low score.

The strategies that AI agents use to win games are often fascinating. When Google's AlphaGo Zero agent surpassed the earlier versions of AlphaGo that had beaten the world's best Go players, its lead designer bragged that it had found strategies that hadn't been used in the thousands of years the game has been played. "It found these human moves, it tried them, then ultimately it found something it prefers," AlphaGo's lead programmer David Silver said at the time.

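It's also worth noting that the Qbert agent described in the new paper was trained with a different machine-learning technique than AlphaGo Zero. Instead of AlphaGo Zero's reinforcement learning, the researchers used evolution strategies, which repeatedly make random changes to the agent's neural network and keep the changes that score best.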
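For the curious, here is a rough sketch of how an evolution strategy works in general. It is not the researchers' code: the function names, the settings, and the toy scoring function are all made up for illustration, and the score is a simple math formula standing in for a real Atari game.

```python
import random


def evolve(fitness, dim, generations=200, population=20, parents=5,
           sigma=0.1, seed=0):
    """Toy evolution strategy that tries to maximize `fitness`.

    All names and default values here are illustrative assumptions,
    not settings taken from the paper.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim  # current parameters (a stand-in for network weights)
    for _ in range(generations):
        # Create offspring by adding random Gaussian noise to the parameters.
        offspring = [
            [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(population)
        ]
        # Score every offspring and keep the best few. In an Atari setting,
        # the score would come from playing an episode of the game.
        offspring.sort(key=fitness, reverse=True)
        best = offspring[:parents]
        # The next generation starts from the plain average of the best
        # offspring (a simplification; real variants often weight them).
        mean = [sum(child[i] for child in best) / parents for i in range(dim)]
    return mean


def toy_score(params):
    # Hypothetical stand-in for a game score: highest when every value is 1.0.
    return -sum((p - 1.0) ** 2 for p in params)


solution = evolve(toy_score, dim=3)
```

The loop never needs to know why a change helped, only which versions scored highest. That is part of why an agent trained this way can stumble onto something like a scoring bug: anything that raises the score gets kept.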

The bottom line is that machine-learning researchers love games. The rules are clear, you can run them thousands or millions of times, and they're just plain fun, even when the machines start breaking the games.

Read the entire paper here.
