Nerde Nolzda

Tioj2000: APF and Genetic Algorithms

Background

I dug up this post from the drafts folder from more than two years ago. No idea why I did not publish it then. Anyway, I cleaned it up and here it is.


I ran into this interesting problem on TIOJ a while ago. Basically, you have to write an AI for (a slightly simplified version?) the game Charles.

I used an artificial potential field approach to solve the problem. Basically, the balls, walls (so that we are less likely to be stuck in corners), and other stuff we want to stay away from have a repulsive force, while tools have an attractive force. The forces are in the form of a * (d ^ r) where d is the distance and a and r are constants. A small, random force is also added to prevent getting stuck.

#include "lib2000.h"
#include <bits/stdc++.h>
using namespace std;

float params[] = {7.664862, 1.5181553, 2.0193303, 0.8539136, 1.7233956, 1.9903361, 2.9100385, 3.1485682, -0.9199132, 0.26406258, 0.00005, -1};
#define X_SIDE_A params[0]
#define X_SIDE_R params[1]
#define Y_SIDE_A params[2]
#define Y_SIDE_R params[3]
#define BALL_A params[4]
#define BALL_R params[5]
#define SPEC_A params[6]
#define SPEC_R params[7]
#define TOOL_A params[8]
#define TOOL_R params[9]
#define RND_FORCE 0.00465

mt19937_64 rng;

pos_type operator +(const pos_type &lhs, const pos_type &rhs) {
  return { lhs.x + rhs.x, lhs.y + rhs.y };
}
pos_type operator +=(pos_type &lhs, const pos_type &rhs) {
  lhs.x += rhs.x, lhs.y += rhs.y;
  return lhs;
}
typedef pair<pos_type, pos_type> PP;
PP operator +=(PP &lhs, const PP &rhs) {
  lhs.first += rhs.first, lhs.second += rhs.second;
  return lhs;
}

typedef pos_type vect;

void init() { rng = mt19937_64(71227122); }

vect createForce(pos_type o, pos_type pos, float a, float r) {
  float dx = o.x - pos.x, dy = o.y - pos.y, d = hypot(dx, dy);
  d = max(1E-6F, d);
  float force = a / pow(d, r);
  force = max(-1E6F, min(force, 1E6F));
  vect res = { force * dx / d, force * dy / d };
  return res;
}

float gunDir = 0;

pos_type play() {
  auto p = get_pos();
  vect force = { 0, 0 };
  force += createForce(p, { 100, p.y }, X_SIDE_A, X_SIDE_R);
  force += createForce(p, { 0, p.y }, X_SIDE_A, X_SIDE_R);
  force += createForce(p, { p.x, 50 }, Y_SIDE_A, Y_SIDE_R);
  force += createForce(p, { p.x, 0 }, Y_SIDE_A, Y_SIDE_R);
  auto balls = get_ball(), balls_spec = get_ball_spec();
  if (gun_time() != -1) {
    set_gun(gunDir += 0.1);
  } else {
    for (auto b: balls) force += createForce(p, b, BALL_A, BALL_R);
    for (auto b: balls_spec) force += createForce(p, b, BALL_A, BALL_R);
    auto tools = get_tool();
    for (auto t: tools) {
      if (t.type) force += createForce(p, t.pos, TOOL_A, TOOL_R);
    }
    auto specs = get_spec();
    for (auto s: specs) force += createForce(p, s.pos, SPEC_A, SPEC_R);
  }
  uniform_real_distribution<float> dis(-RND_FORCE, RND_FORCE);
  force += { dis(rng), dis(rng) };
  return p + force;
}

To tune the parameters, I applied a genetic algorithm. This part is written in Scala, communicating with the program via standard output, with the original lib2000.h modified to accept the parameters from argv and output only the score to stdout. This way, I can easily run the simulations in parallel with .par.

object Config {
  val Creatures = 1000
    val Heat = 1F
    val HeatMult = 0.99F
    val MutateChance = 0.1F
    val CrossChance = 0.95F
    val GeneLength = 10
    val RunCount = 20
    val ProgName = "./a.out"
    def initRange(x: Int): (Float, Float) = x match {
      case x if x % 2 == 1 => (-2, 2) // R
        case x if x < 8 => (0, 20) // Repulse
        case x if x >= 8 => (-20, 0) // Attract
    }
  def maxRange(x: Int) = (initRange(x)._1 * 2.5F, initRange(x)._2 * 2.5F)
    def maxAdd(x: Int): Float = x match {
      case x if x % 2 == 1 => 1.5F // R
      case _ => 15 // A
    }
}

object Utils {
  def lowerBound[A](seq: Seq[A], value: A)(implicit o: Ordering[A]) : Int = {
    import scala.collection.Searching._
      Math.max(0, seq.search(value).insertionPoint - 1)
  }

  def randFloat(l: Float, r: Float) = util.Random.nextFloat() * (r - l) + l;
  def randFloat(r: Float): Float = randFloat(-r, r)
}

case class Gene(values: Seq[Float]) {
  lazy val fitness: Int = {
    import sys.process._
      (0 until Config.RunCount).par.map { _ =>
        (Config.ProgName + " " + values.mkString(" ")).!!.trim.toInt
      }.sum / Config.RunCount
  }

  def cross(rhs: Gene): Gene = {
    val idx = util.Random.nextInt(values.size)
      Gene(values.take(idx) ++ rhs.values.drop(idx))
  }

  def mutate(heat: Float): Gene = {
    val idx = util.Random.nextInt(values.size)
      val ((l, r), v) = (Config.maxRange(idx), values(idx))
      val ma = Config.maxAdd(idx) * heat
      val n = Utils.randFloat(Math.max(l, v - ma), Math.min(r, v + ma))
      Gene(values.updated(idx, n))
  }
}

object GA {
  var best: Gene = _

    @annotation.tailrec
    def run(pool: IndexedSeq[Gene], gen: Int = 0, heat: Float = Config.Heat) {
      pool.par.foreach(_.fitness)
        val sorted = pool.sortBy(_.fitness)
        if (best.fitness < sorted.last.fitness) best = sorted.last
          log(sorted, gen, heat)
            val sum = sorted.view.map(_.fitness).sum.toFloat
            val chance = sorted.map(_.fitness / sum).scanLeft(0.0F)(_ + _)
            val newCount = Math.round(Config.Creatures * Config.CrossChance)
            val newPool = (0 until newCount).par.map { _ =>
              val cs =
                List.fill(2)(sorted(Math.min(sorted.size - 1,
                        Utils.lowerBound(chance, util.Random.nextFloat()))))
                val crossed = cs(0) cross cs(1)
                if (util.Random.nextFloat() < Config.MutateChance) crossed.mutate(heat)
                else crossed
            }.toIndexedSeq ++ sorted.drop(newCount)
      run(newPool, gen + 1, Math.max(heat * Config.HeatMult, 1E-4F))
    }

  def log(pool: IndexedSeq[Gene], gen: Int, heat: Float) {
    println(s"Gen $gen / Fit ${best.fitness} / Best $best / Heat $heat")
      val p = new java.io.PrintWriter(new java.io.File(s"log${gen % 2}.txt"))
      try { p.println(s"$pool"); p.println(s"$best") } finally { p.close() }
  }

  def initPool(): IndexedSeq[Gene] = {
    (0 until Config.Creatures) map { _ =>
      Gene((0 until Config.GeneLength) map { x =>
          val r = Config.initRange(x)
          Utils.randFloat(r._1, r._2)
          })
    }
  }

  def main(args: Array[String]): Unit = run(initPool())
}

Possible Improvments

Currently, the best score I can get is about 12000 rounds (240 seconds). Here are some possible improvments that may be done:

  • Use the tools better, e.g., try to “kill” as many balls using the blades
  • Fine-tuned coefficients for different tools
  • Try to escape if many balls are surrounding and trapping you
    • Compare the total absolute value of the forces?
  • Different optimization method

Related Posts

0 comments

Post a comment

Send an email to comment@nerde.pw.