DeepMind AI Beats Professional StarCraft II Players

Enterprise & IT | Jan 25, 2019

Google-owned artificial intelligence lab DeepMind announced on Thursday that its new "AlphaStar" AI had beaten two of the world's best StarCraft II players.

Dario "TLO" Wünsch and Grzegorz "MaNa" Komincz, ranked 44th and 13th in the world respectively, both lost 5-0 to DeepMind's AlphaStar in two separate five-game series.

The victories are being hailed as a major breakthrough by academics and AI industry watchers, considering the complexity of StarCraft.

StarCraft is a complex real-time strategy (RTS) game, and building an AI capable of beating the best human players has emerged as a "grand challenge" for the field over the last few years. Made by Blizzard Entertainment, StarCraft II is a PC game with various playing modes. The 1v1 match-up is the most popular in esports, and this is the one DeepMind focused on.

“The history of AI has been marked by a number of significant benchmark victories in different games,” David Silver, DeepMind’s research co-lead, said after the matches. “And I hope — though there’s clearly work to do — that people in the future may look back at [today] and perhaps consider this as another step forward for what AI systems can do.”

To start a match, a player chooses one of three playable races - Zerg, Protoss or Terran - each with distinctive characteristics and abilities (professional players tend to specialise in one race). Each player starts with a number of worker units, which gather basic resources to build more units and structures and create new technologies. These in turn allow a player to harvest other resources, build more sophisticated bases and structures, and develop new capabilities that can be used to outwit the opponent. To win, a player must carefully balance big-picture management of their economy - known as macro - with low-level control of their individual units - known as micro.

The need to balance short- and long-term goals and adapt to unexpected situations poses a huge challenge for systems that have often tended to be brittle and inflexible. Mastering this problem requires breakthroughs in several AI research challenges, including:

  • Game theory: StarCraft is a game where, just like rock-paper-scissors, there is no single best strategy. As such, an AI training process needs to continually explore and expand the frontiers of strategic knowledge.
  • Imperfect information: Unlike games like chess or Go where players see everything, crucial information is hidden from a StarCraft player and must be actively discovered by “scouting”.
  • Long-term planning: As in many real-world problems, cause and effect are not instantaneous. Games can take up to an hour to complete, meaning actions taken early in the game may not pay off for a long time.
  • Real time: Unlike traditional board games, where players take alternating turns, StarCraft players must perform actions continually as the game clock progresses.
  • Large action space: Hundreds of different units and buildings must be controlled at once, in real time, resulting in a combinatorial space of possibilities. On top of this, actions are hierarchical and can be modified and augmented. DeepMind's parameterization of the game yields an average of approximately 10^26 legal actions at every time-step; the toy calculation after this list gives a feel for where numbers like this come from.
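
To see how quickly composite actions multiply, here is a back-of-the-envelope sketch. The counts below (action types, selectable units, map points) are made-up but plausible stand-ins, not DeepMind's actual parameterization:

```python
# Toy illustration of the combinatorial action space. All numbers are
# illustrative assumptions, NOT DeepMind's actual figures.
A = 300          # distinct action types (move, attack, build, cast, ...)
U = 30           # currently selectable units -> 2**U possible subsets
P = 128 * 128    # discretised target points on the map

composite_actions = A * (2 ** U) * P
print(f"~{composite_actions:.2e} composite actions per step")  # ~5.28e+15
```

Even this crude count lands around 10^15 per step; richer parameterizations with continuous targets and extra action arguments push the figure far higher.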

AlphaStar’s behaviour is generated by a deep neural network that receives input data from the raw game interface (a list of units and their properties), and outputs a sequence of instructions that constitute an action within the game.
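
As a rough illustration of that description, the sketch below builds a tiny policy network that consumes a variable-length list of unit feature vectors and emits logits over a fixed set of abstract actions. It is a minimal toy under assumed sizes, not AlphaStar's actual architecture, which is far larger and combines several specialised components:

```python
import torch
import torch.nn as nn

class ToyPolicy(nn.Module):
    def __init__(self, unit_feats: int = 16, hidden: int = 64, n_actions: int = 10):
        super().__init__()
        # Encode each visible unit's property vector independently.
        self.unit_encoder = nn.Sequential(nn.Linear(unit_feats, hidden), nn.ReLU())
        # Score a fixed set of abstract actions from a pooled summary.
        self.action_head = nn.Linear(hidden, n_actions)

    def forward(self, units: torch.Tensor) -> torch.Tensor:
        # units: (num_units, unit_feats) -- the "list of units and their
        # properties" from the raw interface.
        encoded = self.unit_encoder(units)   # (num_units, hidden)
        summary = encoded.mean(dim=0)        # pool over units -> (hidden,)
        return self.action_head(summary)     # (n_actions,) action logits

policy = ToyPolicy()
observation = torch.randn(32, 16)            # 32 visible units, 16 features each
action = torch.distributions.Categorical(logits=policy(observation)).sample()
print(action)
```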

AlphaStar also uses a novel multi-agent learning algorithm. The neural network was initially trained by supervised learning on anonymised human games released by Blizzard. This allowed AlphaStar to learn, by imitation, the basic micro- and macro-strategies used by players on the StarCraft ladder. This initial agent defeated the built-in “Elite” level AI - around gold level for a human player - in 95% of games.
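
That supervised phase can be pictured as standard behavioural cloning. The sketch below shows the shape of one such training step with an assumed toy policy and a fabricated replay batch; it illustrates the idea, not DeepMind's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed toy policy: 16 observation features in, 10 action logits out.
policy = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def imitation_step(obs: torch.Tensor, human_action: torch.Tensor) -> float:
    """One behavioural-cloning step: push the policy's action
    distribution toward the action the human took in the replay."""
    loss = F.cross_entropy(policy(obs), human_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Fabricated stand-in batch: 8 observations, 8 human action labels.
print(imitation_step(torch.randn(8, 16), torch.randint(0, 10, (8,))))
```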

In order to train AlphaStar, DeepMind's team built a highly scalable distributed training setup using Google's v3 TPUs that supports a population of agents learning from many thousands of parallel instances of StarCraft II. The AlphaStar league was run for 14 days, using 16 TPUs for each agent. During training, each agent experienced up to 200 years of real-time StarCraft play. The final AlphaStar agent consists of the components of the Nash distribution of the league - in other words, the most effective mixture of strategies that have been discovered - and runs on a single desktop GPU.
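
In spirit, league training looks something like the loop below: a learner plays games against opponents sampled from a growing population of frozen past agents, and snapshots of the learner are periodically added back to the league. Everything here - play_game, rl_update, the snapshot schedule - is a placeholder sketch; the real system used prioritised matchmaking and massive parallelism:

```python
import copy
import random

def play_game(learner, opponent) -> float:
    """Placeholder for a full StarCraft II game; returns 1.0 on a win."""
    return float(random.random() < 0.5)

def rl_update(learner, outcome: float) -> None:
    """Placeholder for the reinforcement-learning update."""

learner = {"params": 0}                 # stand-in for a neural-network agent
league = [copy.deepcopy(learner)]       # frozen snapshots of past agents

for step in range(1000):
    opponent = random.choice(league)    # sample an opponent from the league
    rl_update(learner, play_game(learner, opponent))
    if step % 100 == 99:                # periodically freeze a new snapshot
        league.append(copy.deepcopy(learner))
```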

Professional StarCraft players such as TLO and MaNa are able to issue hundreds of actions per minute (APM) on average.

In its games against TLO and MaNa, AlphaStar had an average APM of around 280, significantly lower than that of the professional players, although its actions may be more precise. This lower APM is, in part, because AlphaStar starts its training using replays and thus mimics the way humans play the game. Additionally, AlphaStar reacts with a delay between observation and action of 350ms on average.
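
AlphaStar's modest APM emerged largely from imitating human play rather than from a hard limit, but a budget like the sketch below is one simple way such constraints can be enforced in agent-versus-human evaluations. The numbers mirror the article; the queueing scheme itself is an illustrative assumption, not how DeepMind implemented it:

```python
import collections
import time

class RateLimitedActor:
    """Caps actions per minute and models observation-to-action latency."""

    def __init__(self, max_apm: int = 280, delay_s: float = 0.35):
        self.max_apm = max_apm
        self.delay_s = delay_s
        self.recent = collections.deque()   # timestamps within the last minute

    def try_act(self, action) -> bool:
        now = time.monotonic()
        # Forget actions older than 60 seconds, then check the budget.
        while self.recent and now - self.recent[0] > 60.0:
            self.recent.popleft()
        if len(self.recent) >= self.max_apm:
            return False                    # over budget: drop this action
        time.sleep(self.delay_s)            # crude model of reaction delay
        self.recent.append(time.monotonic())
        print(f"executing {action}")
        return True

actor = RateLimitedActor()
actor.try_act("move_camera")
```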

Subsequent to the matches, DeepMind's team developed a second version of AlphaStar. Like human players, this version of AlphaStar chooses when and where to move the camera, its perception is restricted to on-screen information, and action locations are restricted to its viewable region.
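
A minimal sketch of that restriction, assuming simple unit records with map coordinates and an illustrative camera rectangle (the field names and the 24x16 camera size are assumptions, not values from DeepMind):

```python
from dataclasses import dataclass

@dataclass
class Unit:
    x: float
    y: float
    unit_type: int

def visible_units(units, cam_x, cam_y, cam_w=24.0, cam_h=16.0):
    """Return only the units inside the current camera rectangle."""
    return [u for u in units
            if cam_x <= u.x < cam_x + cam_w and cam_y <= u.y < cam_y + cam_h]

units = [Unit(10, 5, 1), Unit(40, 20, 2)]
print(visible_units(units, cam_x=0.0, cam_y=0.0))  # only the first unit is on-screen
```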

[Figure: Performance of AlphaStar using the raw interface versus the camera interface - the newly trained camera agent rapidly catches up with, and almost equals, the performance of the raw-interface agent.]

DeepMind trained two new agents against the AlphaStar league, one using the raw interface and one that had to learn to control the camera. Each agent was initially trained by supervised learning from human data, followed by the reinforcement learning procedure outlined above. DeepMind says that the version of AlphaStar using the camera interface was almost as strong as the raw-interface version, exceeding 7,000 MMR on the company's internal leaderboard. In an exhibition match, MaNa defeated a prototype version of AlphaStar using the camera interface that had been trained for just seven days.

"We hope to evaluate a fully trained instance of the camera interface in the near future," Deepmind said.

These results suggest that AlphaStar’s success against MaNa and TLO was in fact due to superior macro and micro-strategic decision-making, rather than superior click-rate, faster reaction times, or the raw interface.

While StarCraft is just a game, albeit a complex one, DeepMind thinks the techniques behind AlphaStar could be useful in solving other problems. For example, its neural network architecture is capable of modelling very long sequences of likely actions - with games often lasting up to an hour with tens of thousands of moves - based on imperfect information. Each frame of StarCraft is used as one step of input, with the neural network predicting the expected sequence of actions for the rest of the game after every frame. The fundamental problem of making complex predictions over very long sequences of data appears in many real-world challenges, such as weather prediction, climate modelling, language understanding and more. "We’re very excited about the potential to make significant advances in these domains using learnings and developments from the AlphaStar project," DeepMind added.
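
The per-frame sequence modelling described here can be pictured with a small recurrent core: one observation per frame goes in, hidden state threads forward across the whole game, and an action prediction comes out at every step. Sizes below are toy values; AlphaStar's actual core is far larger:

```python
import torch
import torch.nn as nn

core = nn.LSTM(input_size=16, hidden_size=64, batch_first=True)
head = nn.Linear(64, 10)             # logits over abstract actions

frames = torch.randn(1, 1000, 16)    # 1,000 frames of 16 features each
hidden_states, _ = core(frames)      # hidden state threads across all frames
action_logits = head(hidden_states)  # one action prediction per frame
print(action_logits.shape)           # torch.Size([1, 1000, 10])
```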

DeepMind believes that some of its training methods may prove useful in the study of safe and robust AI. One of the great challenges in AI is the number of ways in which systems can go wrong, and StarCraft pros have previously found it easy to beat AI systems by finding inventive ways to provoke these mistakes. AlphaStar's league-based training process finds the approaches that are most reliable and least likely to go wrong.

Tags: DeepMind, Artificial Intelligence