Constructing Agents for Connect-4: Opening Databases

This post is the 6th part of a series of 7 articles:

  1. Constructing Agents for Connect-4: Initial Notes
  2. Constructing Agents for Connect-4: Tree Search Algorithms
  3. Constructing Agents for Connect-4: Board Representations
  4. Constructing Agents for Connect-4: Move Ordering
  5. Constructing Agents for Connect-4: Transposition Tables
  6. Constructing Agents for Connect-4: Opening Databases
  7. Constructing Agents for Connect-4: Final Considerations

As with many other games, the further course and outcome of a game for Connect-4 is already dependent on the opening phase. For example, the 1st player must place his first stone in the middle column to win the game. In all other cases, only a draw is reached or the game is even lost. For this reason, a strong player must be able to control the opening in order to avoid own mistakes and to be able to punish the opponent’s mistakes immediately. Often, however, it is very difficult to analyze the positions of the opening phase, because a high search depth is necessary to determine the exact game theoretical values.
In order to master the opening phase, many games therefore use opening books, which usually were created with high computing effort. These can typically give the computer program an advantage and save a lot of computing time during the opening phase. For games like Connect-4, there is another advantage: During the course of the game, the state space decreases further and further, so that after leaving the opening book, individual positions can often be correctly evaluated without further assistance and in acceptable time.

  Note: The related Java framework for Connect-4 is available on GitHub and can be downloaded from: http://github.com/MarkusThill/Connect-Four.

The 8-ply Database

The first extensive database, which is also usable for other programmers, was published by John Tromp in 1994. The calculation of the database was carried out on several computers and lasted 40,000 hours (approx. four and a half years). The strongest Connect-4 programs - including Mustrum and a program by John Tromp himself - use this database in some way. The database contains all non-terminal positions with eight stones, but all positions which are identical after mirroring along the middle column are only present in one variant in the database. Of the total of 67,557 positions, 44,473 (65.83%) positions lead to the 1st player’s win, 16,635 (24.62%) to a defeat and 6,449 (9,55%) to a draw. However, all positions with at least one immediate threat to either player have not been calculated. As a result, the database is not quite complete. Positions with an immediate threat to the player who is moving next don’t have to be included in the database, as he will win the game in the next ply anyway. However, since positions with an immediate threat to the the player to move can be neutralized in the next ply and the game is then continued, it is not possible to make a direct assessment of these situations using the opening database. The missing positions had to be calculated and added to the database. A total of 19,336 positions were missing, which were analyzed with an early version of our Alpha Beta agent on a Pentium-4 computer in about two weeks. The now complete 8-Ply database consists of 86,893 positions and assigns the theoretical outcome of the game (victory / defeat / draw from the point of view of the player to move) to each position. Thus, all positions with eight stones and less easily can be evaluated by the alpha-beta algorithm, which cancels the search for eight stones and retrieves the result from the database. This enables an exact evaluation of positions with eight or fewer stones within a few milliseconds. After leaving the opening book, the alpha beta agent - created in the course of our work - can analyze practically all positions in less than one second. This is quite sufficient for most purposes. Only a few positions with 9-12 pieces can lead to a slightly longer thinking time (up to 10 seconds).

The 12-ply Database

The 8-Ply Opening Database, which was reviewed in the previous section, has a major drawback that is not immediately obvious: all of the positions contain scores for a win, loss or draw. However, no statement can be made as to how far victories or defeats are away from the current position. However, these distances are particularly important for the 2nd player in order to prevent early defeats. For this reason, an alpha-beta agent often has to search beyond the limits of the opening book in order to avoid bad moves for the trailing player. This information could also be helpful for the 1st player. Although it always wins in perfect play, early wins should generally be given preference. For this reason, and in order to achieve further runtime advantages, it was decided to have a further opening database - this time, however, with all game positions with 12 pieces - calculated.
For this purpose, all possible legal 4,200,899 positions were determined. Here too, identical positions (mirrored at the middle column) were only stored in one variant. This leads to significant savings in memory and computing time.
Subsequently, the individual positions were analyzed and the theoretical result including distance to the end of the game was determined. The computation was carried out on a Pentium-4 computer and took about three weeks. To validate the results obtained, an 8-Ply database was created from the 12-Ply database and compared with the existing database. For simple position evaluations without the need of distances, a 12 ply database is sufficient, which contains only information on win/loss or draw of the positions. In this case, a separate opening book is also available. As described below, it requires less than 1/3 of the original memory requirements. Furthermore, a 10-Ply-database was created, which is not really used in this project.

Compressing the Databases with suitable Representations

Since the individual databases contain many game positions, a suitable coding of the individual positions has to be found, which minimizes the required memory. This can be done by taking advantage of the fact that all positions in a database contain the same number of stones which is comparatively small. The coding of a game state in 2 x 42 bit does not seem to make sense here, since the 12-Ply database with 4,200,899 positions alone would require about 50 MB of memory. Instead, we will look at a coding that corresponds to a Huffman coding and exploits the gravity character of the game. The following observation is important: Above an empty field there can only be more empty fields. For a clear description of a state of play, it is therefore sufficient to list the stones of both players for each column in a defined order, without the empty fields being relevant. Only the columns have to be separated from each other. In addition to the two possible stones, only a separator character is needed to achieve a unique encoding. This is illustrated in the figure below.

alt and id

Figure 1: Huffman encoding of the two possible stones (S1 & S2) and the separator character (TR).

To code a position, proceed as follows: The columns are evaluated from left to right, the cells within the columns from bottom to top. Depending on the configuration of the respective fields, the corresponding code word is appended to the current sequence. If an empty field is reached, append the code word for the separator to the sequence and proceed to the next column. Once all columns have been processed, no separator is required for the last column and the sequence is complete. This means that a total of six separators and four or six stones are required for both players. For this reason, the code word of length one is assigned to the separator. The player who moves does not need to be coded, as he can be determined from the position itself, so the state of play is completely described by the position. If the position is coded, the value of the position is appended to the sequence for the sake of simplicity. If there are no distances to the end of the game, two bits are sufficient to rate the position as a win/defeat of the 1st player or draw. This means that a position with eight or twelve stones including evaluation can be coded in exactly three or four bytes. The following Figure shows an example of how to code a position with eight blocks.

alt and id

Figure 2: Encoding of a game position with eight stones: 24 bits (3 bytes) are required. The last two bits represent the value of the position. In this case, the 1st player is expected to win in the course of the game. The position would be encoded as 0 10 11 10 0 0 11 10 11 10 11 0 0 0 11.

Five bytes are reserved for the 12-ply database with the respective distances to the end of the game, whereby the fifth byte contains the value. The following representation is used for this:

where the distance indicates the theoretical distance to the end of the game from the current position (consisting of 12 stones).

Further Compressing of the Databases

If you consider the fact that all databases are complete and therefore contain all necessary legal positions, you can further reduce the size of the databases with the simple position evaluations (victory / defeat / draw). If you remove all positions from the 8- and 12-ply database with an expected win of the 1st player - which make up about 60% of the positions in both databases - the exact evaluation of a position can still be done. If a position does not show up in the database, it is simply assumed that the 1st player will win. However, the mirrored variant of the position must also be taken into consideration. This step reduces the 8-Ply database from 86,893 (250 kB) to 34,286 game positions (100 kB). The new 12-Ply database now contains 1,735,945 positions (6.6 MB) of originally 4,200,899 (16.0 MB). Since it is impossible to discard any position in the 12-Ply database with the exact winning distances (otherwise the distance information would be lost), all 4,200,899 positions corresponding to a size of 20.0 MB are kept in it. Due to the fact that under certain circumstances a high number of accesses to the databases could occur in a very short time, they should be loaded into the working memory at runtime. To speed up the search within the databases, they have been sorted so that you can use the binary search. This reduces the complexity of the search to O (log (n)), which means a significant increase compared to the linear search. In many cases, the mirrored variant of a position has to be searched in the database as well. As a result, the maximum number of accesses per position is approx. 2x 15 (8-Ply database), 2x21 (12-Ply database without winning distances) and 2x 22 (12-Ply database with exact winning distances).

Downloading the Databases

The databases can all be downloaded from GitHub:

  1. The 8-Ply database
  2. The 12-Ply database without winning distances
  3. The 12-Ply database with winning distances

Appendix

Java-Classes for Loading the Databases

CFour/src/openingBook/Book.java view raw
 package openingBook;

import java.io.IOException;
import java.io.InputStream;

/**
 * There are 3 different opning-books available: <br>
 *         1. Opening book with all positions with 8 pieces (win/draw/loss) <br>
 *         2. Opening book with all positions with 12 pieces (win/draw/loss) <br>
 *         3. Opening book with all positions with 12 pieces (win/draw/loss with
 *         exact distance) <br>
 * 
 *         One of these books can be selected by setting the bookNr to 0,1 or 2.
 *         
 * @author Markus Thill 
 */
public class Book {

	// Constants

	private static final String BOOKPATH[] = { "book.dat", "bookDeep.dat",
			"bookDeepDist.dat" };
	private static final int BOOKSIZE[] = { 34286, 1735945, 4200899 };
	private static final int STONENUM[] = { 8, 12, 12 };
	private static final int MASKPOSITION = 0xFFFFFFFC;
	private static final int MASKVALUE = 0x3;
	private static final long MSBINT = 0x80000000L;

	public static final int NORMALBOOK = 0;
	public static final int DEEPBOOK = 1;
	public static final int DISTDEEPBOOK = 2;

	// Selected book-Nr

	private int bookNr;

	// Input-stream for reading the book

	private InputStream file = null;

	// All rows of the opening-book. Each row is coded in a special format

	// (exact 24- or 32-Bit)

	private int book[];

	// Only for Deep-book with Exact Distance

	byte vals[];

	/**
	 * @param bookNr
	 *            Selected book
	 */
	public Book(int bookNr) {
		this.bookNr = bookNr;
	}

	/**
	 * Open the selected book from the selected path
	 * 
	 * @throws IOException
	 */
	public void openBook() throws IOException {
		file = getClass().getResourceAsStream(BOOKPATH[bookNr]);
		if (file == null)
			throw (new IOException("Could not open File"));
	}

	/**
	 * Close the selected book
	 * 
	 * @throws IOException
	 */
	public void closeBook() throws IOException {
		file.close();
	}

	/**
	 * Read the book from the HDD and save in the RAM.
	 * 
	 * @throws IOException
	 */
	public void readBook() throws IOException {
		book = new int[BOOKSIZE[bookNr]];

		if (bookNr == DISTDEEPBOOK)
			vals = new byte[BOOKSIZE[bookNr]];

		int shiftNum, temp;
		for (int i = 0; i < BOOKSIZE[bookNr]; i++) {
			shiftNum = (bookNr == NORMALBOOK ? 16 : 24);
			temp = 0;
			for (; shiftNum >= 0; shiftNum -= 8) {
				int b = file.read();
				temp |= (b << shiftNum);
			}
			book[i] = temp;

			if (bookNr == DISTDEEPBOOK) {
				byte b = (byte) file.read();
				vals[i] = b;
			}
		}
	}

	/**
	 * Search in the opening book for the coded board and return the value for
	 * this board. A fast binary-search is used.
	 * 
	 * @param codedPos
	 *            Position coded in an Integer (see class ConnectFour for
	 *            details)
	 * @param codedPosMirrored
	 *            Mirrored Position coded in an Integer (see class ConnectFour
	 *            for details)
	 * @return Game-Theoretic Value for this board
	 */
	public int getValue(int codedPos, int codedPosMirrored) {
		int code = 0, code2 = 0, pos, pos2, step;
		pos = pos2 = step = BOOKSIZE[bookNr] - 1;

		// Binary Search

		while (step > 0) {
			step = (step != 1 ? (step + (step & 1)) >> 1 : 0);
			if (pos < BOOKSIZE[bookNr] && pos >= 0)
				code = book[pos] & MASKPOSITION;
			if (pos2 < BOOKSIZE[bookNr] && pos2 >= 0)
				code2 = book[pos2] & MASKPOSITION;

			if (codedPos < code)
				pos -= step;
			else if (codedPos > code)
				pos += step;
			else if (codedPos == code)
				if (bookNr != DISTDEEPBOOK)
					return (book[pos] & MASKVALUE);
				else
					return vals[pos];

			if (codedPosMirrored < code2)
				pos2 -= step;
			else if (codedPosMirrored > code2)
				pos2 += step;
			else if (codedPosMirrored == code2)
				if (bookNr != DISTDEEPBOOK)
					return (book[pos2] & MASKVALUE);
				else
					return vals[pos2];
		}
		return 2; //Value was not found in database, must be a win for X

	}

	/**
	 * Get a Board and its value at an specified index of this opening-book
	 * 
	 * @param index
	 *            Position in the opening-book
	 * @param board
	 *            Return-Value: Contains the board
	 * @return Value for the board
	 */
	public int getBoard(int index, int board[][]) {
		int hCode = book[index];
		int col = 0, row = 0;

		long b1 = 0, b2 = 0;

		if (board == null)
			board = new int[7][6];
		if (bookNr == NORMALBOOK)
			hCode <<= 8; // Move to Front of Integer

		for (int i = 0; i < STONENUM[bookNr] + 6; i++) {
			b1 = hCode & MSBINT;
			hCode <<= 1;
			// Column is Full

			if (b1 == 0L) {
				col++;
				row = 0;
				continue;
			}

			b2 = hCode & MSBINT;
			hCode <<= 1;
			if (b2 != 0L)
				board[col][row++] = 2;
			else
				board[col][row++] = 1;
		}
		if (bookNr != DISTDEEPBOOK)
			return book[index] & MASKVALUE;
		return vals[index];
	}

	/**
	 * @return Size of the opening-book in bytes
	 */
	public int getBookSize() {
		return BOOKSIZE[bookNr];
	}
}

 

CFour/src/openingBook/BookSum.java view raw
 package openingBook;

import java.io.IOException;

/**
 * @author Markus Thill
 * 
 *         Load all opening Books in this class.
 */
public class BookSum {

	private Book openingBook = null;
	private Book openingBookDeep = null;
	private Book openingBookDeepDist = null;

	public BookSum() {
	}

	public Book getOpeningBook() {

		if (openingBook == null) {
			openingBook = new Book(Book.NORMALBOOK);
			try {
				openingBook.openBook();
				openingBook.readBook();
				openingBook.closeBook();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
		return openingBook;
	}

	public Book getOpeningBookDeep() {
		if (openingBookDeep == null) {
			openingBookDeep = new Book(Book.DEEPBOOK);
			try {
				openingBookDeep.openBook();
				openingBookDeep.readBook();
				openingBookDeep.closeBook();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
		return openingBookDeep;
	}

	public Book getOpeningBookDeepDist() {
		if (openingBookDeepDist == null) {
			openingBookDeepDist = new Book(Book.DISTDEEPBOOK);
			try {

				openingBookDeepDist.openBook();
				openingBookDeepDist.readBook();
				openingBookDeepDist.closeBook();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
		return openingBookDeepDist;
	}
}

 

Note: Several paragraphs of this blog post are in parts taken from my Master’s thesis or from a translation of my Bachelor thesis and some earlier project work.

Markus Thill

Markus Thill
I studied computer engineering (B.Sc.) and Automation & IT (M.Eng.). Generally, I am interested in machine learning (ML) approaches (in the broadest sense), but particularly in the fields of time series analysis, anomaly detection, Reinforcement Learning (e.g. for board games), Deep Learning (DL) and incremental (on-line) learning procedures.

Deriving a Closed-Form Solution of the Fibonacci Sequence using the Z-Transform

The Fibonacci sequence might be one of the most famous sequences in the field of mathmatics and computer science. Already high school stu...… Continue reading