Writing an AI for a turn-based game

“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” (Edsger Dijkstra)

I took the last week off work to try to make a video game. While the game is still a work in progress, I really enjoyed making the AI for it. And I think the AI turned out really well.

You can see a demo here of the final product

The game

Automatank (working title) is a turn-based robot programming game. It’s inspired by one of my favourite board games: RoboRally.

Each player is given 9 random moves. The player then picks 5 of these moves and arranges them in an order for the robot to execute. Each move either moves the robot forward, moves it backward, or rotates the robot.

There are hazards along the way: walls, conveyor belts, bottomless pits, and other obstacles.

The goal is to capture all the flags before the other robots.

A Basic AI

A game AI isn’t an AI in the machine learning sense. It’s just a program which attempts to play a game in a semi-competent way.

The AI I’m writing needs to try to determine the best move it can make based on the current game state.

It can be tempting, when faced with a task like this, to try to encode the behaviours a human would make when playing the game. Things like:

Turn towards the next flag if you aren’t facing it
Move forward if facing the next flag

This approach doesn’t really work. There may be walls or hazards in the way. The AI may not have the moves it needs. Writing special cases for these quickly explodes in complexity and isn’t really feasible.

Instead what I do is consider all possible hands and pick the best one.

export default class SimpleAI {
  constructor({ map, robot }) {
    this.map = map;
    this.robot = robot;
  }

  chooseBestHand() {
    let bestHand = null;
    let bestScore = -Infinity;

    for (const candidateHand of this.possibleHands()) {
      const candidateScore = this.scoreOfHand(candidateHand);

      if (candidateScore > bestScore) {
        bestScore = candidateScore;
        bestHand = candidateHand;
      }
    }

    return bestHand;
  }

  * possibleHands() {
    // I cheat here.
    // instead of choosing 5 moves from 9 random moves and finding all permutations,
    // I just generate 100 random 5 card hands.
    // It feels about the same.

    let iterations = 100;
    while (iterations--) {
      const hand = [randomCard(), randomCard(), randomCard(), randomCard(), randomCard()];
      yield hand;
    }
  }

  scoreOfHand(hand) {
    // TODO
  }
}

This approach is good. Something similar should work for most turn-based games. The challenge is how to score each hand.
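For comparison, here is a rough sketch of the exhaustive version that the comment in possibleHands() says it skips: every ordered choice of 5 moves out of the 9 dealt. The helper name orderedHands and the card labels are made up for illustration; they are not part of the game’s code.

```javascript
// Hypothetical helper: yields every ordered selection of `size` cards from `cards`.
// Cards are tracked by index, so two identical cards count as separate choices.
function* orderedHands(cards, size, used = new Set(), prefix = []) {
  if (prefix.length === size) {
    yield [...prefix];
    return;
  }
  for (let i = 0; i < cards.length; i++) {
    if (used.has(i)) continue;
    used.add(i);
    prefix.push(cards[i]);
    yield* orderedHands(cards, size, used, prefix);
    prefix.pop();
    used.delete(i);
  }
}

// Nine made-up cards standing in for a dealt hand
const dealt = ['F1', 'F2', 'F3', 'B1', 'L', 'R', 'U', 'F1', 'L'];

let count = 0;
for (const hand of orderedHands(dealt, 5)) {
  count++;
}
console.log(count); // 15120, which is 9 * 8 * 7 * 6 * 5
```

That is about 150 times more simulations per turn than the 100 random hands, which gives a sense of what the shortcut saves.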

In order to evaluate how good a hand is, the AI needs to be able to simulate the outcome of each hand and then pick the one with the best outcome.

Fortunately, I kept this in mind when writing the game itself. The code which simulates the game can be run standalone and I can reuse this for the AI.

Here’s the first attempt at an AI:

scoreOfHand(hand) {
  const robotAfterSimulation = simulateCommands({
    robot: this.robot,
    map: this.map,
    commands: hand
  });

  if (robotAfterSimulation.dead) {
    // Falling off cliff = bad
    return -10000;
  } else {
    // Collecting flags = good
    return 10000 * robotAfterSimulation.flagsCollected;
  }
}

This works (somewhat). The AI simulates its potential choices, so it definitely won’t pick commands which lead to its death. If it comes across a hand which will have it collect another flag, it will do that. Aside from that, it will move randomly (but safely).
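To make that behaviour concrete, here is a self-contained toy version. Everything in it is an assumption for illustration: the grid format, the single-letter commands, and this simulateCommands are simplified stand-ins, not the game’s real simulator.

```javascript
// Toy stand-in for the game's simulator (NOT the real one).
// Grid cells: '.' floor, 'O' pit, 'F' flag. Directions: 0=N, 1=E, 2=S, 3=W.
const DX = [0, 1, 0, -1];
const DY = [-1, 0, 1, 0];

function simulateCommands({ robot, map, commands }) {
  // Work on a copy so the caller's robot is never mutated
  const r = { ...robot };
  for (const cmd of commands) {
    if (r.dead) break;
    if (cmd === 'L') {
      r.direction = (r.direction + 3) % 4;
    } else if (cmd === 'R') {
      r.direction = (r.direction + 1) % 4;
    } else {
      // 'F' moves one cell forward, 'B' one cell backward
      const step = cmd === 'F' ? 1 : -1;
      r.x += step * DX[r.direction];
      r.y += step * DY[r.direction];
      const cell = (map[r.y] || '')[r.x];
      if (cell === undefined || cell === 'O') {
        r.dead = true; // fell off the board or into a pit
      } else if (cell === 'F') {
        r.flagsCollected += 1; // toy: no tracking of which flag was taken
      }
    }
  }
  return r;
}

function scoreOfHand(robot, map, hand) {
  const after = simulateCommands({ robot, map, commands: hand });
  return after.dead ? -10000 : 10000 * after.flagsCollected;
}

const map = [
  '..F',
  '.O.',
  '...',
];
// Bottom-left corner, facing east
const robot = { x: 0, y: 2, direction: 1, dead: false, flagsCollected: 0 };

const intoThePit = scoreOfHand(robot, map, ['F', 'L', 'F', 'F', 'F']);
const toTheFlag = scoreOfHand(robot, map, ['F', 'F', 'L', 'F', 'F']);
const spinning = scoreOfHand(robot, map, ['L', 'R', 'L', 'R', 'L']);
console.log(intoThePit, toTheFlag, spinning); // -10000 10000 0
```

Any hand that neither dies nor reaches the flag scores 0, so the AI has no way to prefer one safe hand over another.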

Pathing

Since the simulation only looks one turn ahead, the behaviour doesn’t seem “intelligent”. If it doesn’t see a way to collect a flag in one turn, it just moves randomly.

The AI doesn’t know what moves it will get next round, so it can’t simulate past the current hand. Even if it could, testing every possible future hand until one reaches a flag would be far too slow, since the number of combinations grows exponentially with each extra turn.

Instead, I gave it better hints about when it is making progress towards the goal.

One quick, obvious heuristic is to reduce the distance between the robot and the next flag.

function distanceBetween(a, b) {
  // Manhattan distance
  return Math.abs(a.x - b.x) + Math.abs(a.y - b.y);
}

let score = 0;

// Collecting flags = very good
score += 10000 * robotAfterSimulation.flagsCollected;

// Being near flags = also good
const nextFlag = this.map.flags[robotAfterSimulation.nextFlag];
score -= distanceBetween(robotAfterSimulation, nextFlag);

return score;

This is much better. Instead of meandering until it randomly ends up close to a flag, the AI immediately tries to close its distance to the flag.

It’s not without issues. Although the short-term planning understands the workings of walls and conveyors, the long-term planning doesn’t. This is pretty apparent when the robot comes across a large wall between it and the flag. To a human player it’s obvious to just go around, but our Manhattan distance metric doesn’t give us any help here.

To have better pathing, we can use the shortest path as a metric instead.

Using shortest path we can answer the question “if I get all the moves I need, how many moves will it take to reach the flag?”.
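A small worked example shows how far apart the two metrics can be. This sketch makes simplifying assumptions: walls fill whole cells, the robot can step in any of four directions, and facing is ignored. The grid and helper names are invented for illustration.

```javascript
// '#' is a wall, 'R' the robot, 'F' the flag (toy layout)
const grid = [
  'R#F',
  '.#.',
  '...',
];

function locate(ch) {
  for (let y = 0; y < grid.length; y++) {
    const x = grid[y].indexOf(ch);
    if (x !== -1) return { x, y };
  }
  return null;
}

function manhattan(a, b) {
  return Math.abs(a.x - b.x) + Math.abs(a.y - b.y);
}

// Breadth-first search, one layer of cells per step
function bfsSteps(start, goal) {
  const seen = new Set([`${start.x},${start.y}`]);
  let frontier = [start];
  let steps = 0;
  while (frontier.length > 0) {
    const next = [];
    for (const p of frontier) {
      if (p.x === goal.x && p.y === goal.y) return steps;
      for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
        const q = { x: p.x + dx, y: p.y + dy };
        const cell = (grid[q.y] || '')[q.x];
        const key = `${q.x},${q.y}`;
        if (cell === undefined || cell === '#' || seen.has(key)) continue;
        seen.add(key);
        next.push(q);
      }
    }
    frontier = next;
    steps++;
  }
  return Infinity;
}

const robotPos = locate('R');
const flagPos = locate('F');
const straightLine = manhattan(robotPos, flagPos);
const aroundTheWall = bfsSteps(robotPos, flagPos);
console.log(straightLine, aroundTheWall); // 2 6
```

Manhattan distance says the flag is 2 cells away, but the robot actually needs 6 steps. Worse, the first step of the real route moves it further from the flag by the Manhattan measure.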

Shortest Path

To solve for the shortest path, I used a breadth-first search (BFS). But I think it’s useful to consider this a simplified case of Dijkstra’s algorithm, one where every edge has the same cost.

With BFS we visit nodes in a queue. We start at the source node and add its immediate neighbours. These neighbours (at a distance of 1 from our source) are visited and their unvisited neighbours are added to the queue. We continue this way until we arrive at the destination, and return its distance.

In the case of this game, the nodes represent the possible positions of our robots (x, y, and direction) and the node’s neighbours are other positions which can be reached in one turn.
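As an illustration of what a node and its neighbours look like, here is a toy sketch on an empty board. The four commands and their names are assumptions standing in for the game’s real move cards, and there are no walls or hazards.

```javascript
// Directions: 0=N, 1=E, 2=S, 3=W (y grows downwards)
const DX = [0, 1, 0, -1];
const DY = [-1, 0, 1, 0];

// Hypothetical single-step commands; each maps one node to the next
const commands = {
  forward: ({ x, y, direction }) => ({ x: x + DX[direction], y: y + DY[direction], direction }),
  backward: ({ x, y, direction }) => ({ x: x - DX[direction], y: y - DY[direction], direction }),
  turnLeft: ({ x, y, direction }) => ({ x, y, direction: (direction + 3) % 4 }),
  turnRight: ({ x, y, direction }) => ({ x, y, direction: (direction + 1) % 4 }),
};

function neighbours(node) {
  return Object.entries(commands).map(([name, apply]) => ({ name, node: apply(node) }));
}

// A robot at (1, 1) facing north
const result = neighbours({ x: 1, y: 1, direction: 0 });
for (const { name, node } of result) {
  console.log(name, node);
}
// forward   -> { x: 1, y: 0, direction: 0 }
// backward  -> { x: 1, y: 2, direction: 0 }
// turnLeft  -> { x: 1, y: 1, direction: 3 }
// turnRight -> { x: 1, y: 1, direction: 1 }
```

Note that turning produces a different node on the same square, which is why a 30-by-30 board has 3600 nodes rather than 900.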

Once again we’re able to reuse our method to simulate the game.

function shortestPath(map, source, target) {
  const visitedList = new VisitedList();

  // Start at our source node, which is 0 moves away from itself
  const queue = [{ ...source, distance: 0 }];

  while (queue.length !== 0) {
    // Loop over nodes in FIFO order
    const position = queue.shift();

    // Only visit each node once (the shortest path to it)
    if (visitedList.isVisited(position)) {
      continue;
    }
    visitedList.setVisited(position);

    if (target.x === position.x && target.y === position.y) {
      // We've reached the target!
      return position.distance;
    }

    for (const cmd of allPossibleCommands) {
      // Simulate the command from the current position
      const robot = simulateCommands({
        robot: buildRobot({ position }),
        map,
        commands: [cmd]
      });

      // If this is a valid and safe move, add it to our queue
      if (isAlive(robot)) {
        queue.push({
          ...robot.position,
          distance: position.distance + 1
        });
      }
    }
  }

  // The target can't be reached from the source
  return Infinity;
}
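The VisitedList used above isn’t shown. One plausible minimal version, assuming a position is a plain object with x, y and direction, is a Set keyed on those three values. This is a guess at its shape, not the game’s actual class.

```javascript
// Hypothetical VisitedList: remembers which (x, y, direction) nodes were seen
class VisitedList {
  constructor() {
    this.seen = new Set();
  }

  // Facing is part of the key, so the same square can be visited once per direction
  key({ x, y, direction }) {
    return `${x},${y},${direction}`;
  }

  isVisited(position) {
    return this.seen.has(this.key(position));
  }

  setVisited(position) {
    this.seen.add(this.key(position));
  }
}

const visited = new VisitedList();
visited.setVisited({ x: 2, y: 3, direction: 0, distance: 4 });

console.log(visited.isVisited({ x: 2, y: 3, direction: 0 })); // true
console.log(visited.isVisited({ x: 2, y: 3, direction: 1 })); // false
```

Extra fields such as distance are ignored by the key, so queue entries can be passed in directly.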

The AI is now eerily smart.

It can navigate mazes. It will seek out conveyor belts that move it closer to the flag. Since each direction is its own node, it will even try to end the turn pointing in a good direction in anticipation of the next turn.

It’s a bit slow, though.

Precomputing distances

What was previously a few simple math operations is now a full simulation of the game from (potentially) every single position on the board. And the AI needs to do this 100 times per turn!

I could use A* or add some caching, both of which would be faster. But what I’d really like is to precompute the paths, so that the scoring function can be constant time again.

There are algorithms to calculate all-pairs shortest paths, but they are a bit slow and need a bit too much storage: for a 30-by-30 map, (30 * 30 * 4) ** 2 == 12960000 entries, which is about 12 megabytes even if each distance is stored as a single byte.

Fortunately, the AI doesn’t need paths between all pairs, just paths leading to the four flags on the map.
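The difference in storage is easy to check with a quick calculation, using the 30-by-30 map, four directions, four flags, and one byte per distance mentioned above:

```javascript
const width = 30;
const height = 30;
const directions = 4;
const flags = 4;

// One node per (x, y, direction)
const states = width * height * directions;

// All pairs: one distance for every (source, destination) combination
const allPairsBytes = states ** 2;

// Per flag: one distance from every node to each of the four flags
const perFlagBytes = states * flags;

console.log(states);        // 3600
console.log(allPairsBytes); // 12960000
console.log(perFlagBytes);  // 14400
```

Storing only the distances to the flags is 900 times smaller: roughly 14 kilobytes instead of 12 megabytes.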

It happens that Dijkstra’s algorithm (or the simpler BFS variant) actually solves for the shortest paths from one source to every destination. So it does exactly what we need, only backwards. To use this, I generate a reversed adjacency list by simulating every move from every possible starting position. Then I run BFS from the flag, storing a distance each time it visits a new position, until it has stored a distance for every position that can reach the flag.
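The “backwards” part matters because moves aren’t always reversible: a conveyor belt can carry a robot one way only. A tiny made-up graph shows why the edges are flipped before searching from the flag. The node names and helpers here are illustrative only.

```javascript
// Forward edges: "from X, one move can reach Y".
// A -> B is one-way (think of a conveyor); B and C connect both ways.
const forward = { A: ['B'], B: ['C'], C: ['B'] };

// Flip every edge so that reversed[Y] lists the nodes that can reach Y
function reverseEdges(edges) {
  const reversed = {};
  for (const [from, tos] of Object.entries(edges)) {
    for (const to of tos) {
      (reversed[to] = reversed[to] || []).push(from);
    }
  }
  return reversed;
}

// Plain BFS that records the distance to every node it can reach
function bfsDistances(edges, start) {
  const dist = { [start]: 0 };
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of edges[node] || []) {
      if (dist[next] === undefined) {
        dist[next] = dist[node] + 1;
        queue.push(next);
      }
    }
  }
  return dist;
}

// Say the flag is at C
const toFlag = bfsDistances(reverseEdges(forward), 'C');
console.log(toFlag); // { C: 0, B: 1, A: 2 }

// Searching the forward edges from C answers a different question
// ("how far is each node FROM the flag?") and never finds A
const fromFlag = bfsDistances(forward, 'C');
console.log(fromFlag); // { C: 0, B: 1 }
```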

// Convert a { x, y, direction } map position to an integer index
function positionIndex(map, { x, y, direction }) {
  return (y * map.width * 4) + (x * 4) + direction;
}

function buildEdgeList({ map }) {
  const edgeList = [];

  // From each possible start position
  for (let start of allValidStartPositions({ map })) {
    let destinations = [];
    const startIndex = positionIndex(map, start);

    // Simulate each command
    for (let cmd of allCommands) {
      const dest = simulateCommands({
        robot: buildRobot({ position: start }),
        map,
        commands: [cmd]
      });

      // Skip the command if it kills the robot
      if (!isAlive(dest)) {
        continue;
      }

      const destIndex = positionIndex(map, dest.position);

      edgeList[destIndex] = edgeList[destIndex] || [];
      edgeList[destIndex].push(startIndex);
    }
  }

  return edgeList;
}

const DISTANCE_MAX = 255;

export default class ShortestPath {
  constructor({ target, map }) {
    this.map = map;

    this.distances = new Uint8Array(map.width * map.height * 4);
    this.distances.fill(DISTANCE_MAX);

    const edgeList = buildEdgeList({ map });
    const queue = [];

    // We don't care which direction we face when capturing the flag,
    // so add them all with distance 0
    for (let direction = 0; direction < 4; direction++) {
      queue.push({
        idx: positionIndex(map, { ...target, direction }),
        distance: 0
      });
    }

    while (queue.length !== 0) {
      const position = queue.shift();
      const { idx, distance } = position;

      if (this.distances[idx] !== DISTANCE_MAX) {
        continue;
      }

      this.distances[idx] = distance;

      for (let nextIdx of edgeList[idx] || []) {
        queue.push({ idx: nextIdx, distance: distance + 1 });
      }
    }
  }

  distanceFrom(from) {
    return this.distances[positionIndex(this.map, from)];
  }
}

This is pretty much instant, and can be done once when the game loads and shared among all AIs.
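With the tables built, the scoring function could go back to a constant-time lookup. The sketch below is one way that might look. The pathsByFlag array, the stub tables, and the shape of robotAfterSimulation (including a position field) are assumptions made so the example runs on its own; in the game, each table would be a ShortestPath instance.

```javascript
// Scores a simulated outcome using one precomputed distance table per flag
function scoreOfHand({ robotAfterSimulation, pathsByFlag }) {
  if (robotAfterSimulation.dead) {
    return -10000;
  }

  // Collecting flags = very good
  let score = 10000 * robotAfterSimulation.flagsCollected;

  // Being few moves away from the next flag = also good
  const table = pathsByFlag[robotAfterSimulation.nextFlag];
  if (table) {
    score -= table.distanceFrom(robotAfterSimulation.position);
  }

  return score;
}

// Stub tables standing in for precomputed ShortestPath instances
const pathsByFlag = [
  { distanceFrom: () => 0 },
  { distanceFrom: ({ x, y }) => Math.abs(x - 5) + Math.abs(y - 5) },
];

const robotAfterSimulation = {
  dead: false,
  flagsCollected: 1,
  nextFlag: 1,
  position: { x: 3, y: 4, direction: 0 },
};

const score = scoreOfHand({ robotAfterSimulation, pathsByFlag });
console.log(score); // 9997
```

Because the tables only depend on the map, every AI player can share the same pathsByFlag array.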

Conclusion

I really like how this AI turned out. Strategies don’t need to be explicitly defined but can be inferred by reusing the game’s implementation.

Watch some AIs duke it out: http://automatank.butt.team/demo

Or play against them at: http://automatank.butt.team/versus_ai