Winboard Forum

by **H.G.Muller** » 28 Mar 2013, 22:15

As there seems to be little progress with the Chess lessons, I will start a short course myself for how to build a Chess engine, based on a real-life example. The case at hand will be Fairy-Max, for which I am currently doing a re-write to make it look more like a conventional engine, rather than its ultra-simple ancestor micro-Max. This re-write uses intelligible variable names.

Despite this, Fairy-Max remains a very simple engine, and is therefore very suitable as an introduction to Chess programming. The current source is only about 1100 lines, of which the AI takes up only 500. Almost each of these 500 lines is commented, to further explain what it does.

It will be assumed that the reader is a programmer, with good command of the C programming language. The Fairy-Max code makes extensive use of bit-wise operations & | and ^, as well as logical operators && || and !, and the fact that boolean expressions are equivalent to integers 0 or 1 for false and true, and integers are equivalent to boolean expressions (0 to false, and the rest to true).

by **H.G.Muller** » 31 Mar 2013, 19:07

Before discussing actual code, I will first explain the structure of the search routine in pseudo-code, allowing us to build up the rather complex search algorithm step by step.

1. Negamax

A negamax search always considers scores from the point of view of the side to move, and assumes that player will always play the best move, so that the score of the node become the score of the best move. As a result, it always searches for a maximum score. At the end of the branches, it uses a static evaluation to assign a score to the position.

Code: Select all: int Search(int depth) { int bestScore = -INF; MOVE bestMove = INVALID; if(depth == 0) return Evaluate(); // at branch end, use evaluation heuristic on position for(EVERY move) { // loop over moves MakeMove(move); // updates game state (board, side to move) score = -Search(depth-1); // recursion (flip sign because we change POV) UnMake(); if(score > bestScore) { // score accounting: remember best (from side-to-move POV) bestScore = score; bestMove = move; } } return { bestScore, bestMove }; // bestMove only used in root, to actually play it }

by **H.G.Muller** » 31 Mar 2013, 19:09

2. Alpha-beta pruning

Some branches need not be searched, because whatever their score, one of the players will get a better one by deviating from it earlier.
This is expressed by the 'search window' {alpha, beta}: in any node, alpha is the score of the best move for the current side to move that has already been found in any of the nodes on the path leading to (and including) the current node. Beta is the same for the opponent, but seen from the current mover's POV. If a move scores below alpha, we are not interested how much below alpha it scores, as we are not going to play it, because we already did have a fully verified move that scores alpha, and will prefer that. If we have a move that scores above beta, it is not important how much above beta it scores, as our score above beta for the opponent looks as a score below his alpha,
which is the situation described in the previous sentence: he will now prefer the verified move that scores beta (from our POV), so we will never get the chance to play the move that (for us) scored above it.

So we can stop searching alternative moves from a position as soon as we find one move that scores above beta. This move is then a refutation of the opponent's preceding move, which is apparently so bad he is never going to play it. Trying to show it is even more bad is a waste of time, so we will abort the search of that node at this point. This is called a beta cutoff.

This sounds pretty complex, but the program that does it is actually quite simple. Every time we find a better move, we increase alpha.
This is passed to all daughter nodes, which, two ply later, will include their best moves in it (if they are even better), etc. In the intermediate node this our alpha becomes the opponent's beta (after the sign flip to effect the change in POV characteristic of negamax), and finally passed to the grand-children of the current node as alpha again:

Code: Select all: int Search(int alpha, int beta, int depth) // *** pass search window {alpha, beta} { int bestScore = -INF; MOVE bestMove = INVALID; if(depth == 0) return Evaluate(); // at branch end, use evaluation heuristic on position for(EVERY move) { // loop over moves MakeMove(move); // updates game state (board, side to move) score = -Search(-beta, -alpha, depth-1); // recursion (flip sign because we change POV) UnMake(); if(score > bestScore) { // *** score accounting: remember best (from side-to-move POV) bestScore = score; if(score > alpha) { // *** narrow search window for remaining moves alpha = score; bestMove = move; if(score >= beta) break; // *** beta cutoff: previous ply is refuted by this move, so we can stop here to take that back } } } // next move return { bestScore, bestMove }; // bestMove only used in root, to actually play it }

(New or changed lines will be marked with *** in their accompanying comment, to highlight them.)

by **H.G.Muller** » 31 Mar 2013, 19:13

3. Iterative Deepening

The efficiency of alpha-beta benefits extremely from searching the best move first. If one of the moves in the node will produce a beta cutoff,
none of the other moves need to be searched. So every move you searched before it in that node was a pure waste of time. The effect of good move ordering is so large, that it pays to first do a pilot search at lower depth. Then it hurts less when we don't immediately find the cut-move, and waste time on some other moves first. But then we can try that cut-move as first move in the search of higher depth, in the hope that the position was not so complex that the lower-depth search was completely off. Together the searches will then take less time (because alpha-beta cuts away more nodes) than a single full-depth search that plunges in without any information.

By the same argument the pilot search can be made cheaper by using an even less deep pilot search for that, etc. So we iterate over depth, starting with low depth, and then gradually cranking it up. Each depth iteration then can start with information it obtained from the previous one, which highly speeds it up. This iteration can be done both in the root of the search tree (the current game position), as well as in deeper search nodes ('Internal' Iterative Deepening).

Code: Select all: int Search(int alpha, int beta, int depth) { int bestScore; MOVE bestMove = INVALID; if(depth == 0) return Evaluate(); // at branch end, use evaluation heuristic on position for(iterDepth from 1 to depth) { // *** loop over all depths int iterAlpha = alpha; // *** as we touch alpha, we need to reset it to the original value for each new depth bestScore = -INF; // for every depth we start with a clean slate PutInFront(bestMove); // *** put best move in front of list, so this depth iteration tries it first for(EVERY move) { // loop over moves MakeMove(move); // updates game state (board, side to move) score = -Search(-beta, -alpha, depth-1); // recursion (flip sign because we change POV) UnMake(); if(score > bestScore) { // score accounting: remember best (from side-to-move POV) bestScore = score; if(score > iterAlpha) { iterAlpha = score; bestMove = move; if(score >= beta) break; // beta cutoff: previous ply is refuted by this move } } } // next move } // next depth return { bestScore, bestMove }; // bestMove only used in root, to actually play it }

by **H.G.Muller** » 31 Mar 2013, 19:17

3. Transposition table

Due to (I)ID as well as move transpositions, the same position will be visited many times during the search, for being searched at consecutively higher depth (IID) the same depth (transpositions) or lower depth (after wasting of tempos). It therefore pays to store information on nodes that already have been searched in a global 'hash' table, so that the info can be re-used when the node is re-visited, rather than having to redo the search that obtained it.

Code: Select all: int Search(int alpha, int beta, int depth) { int bestScore; MOVE bestMove = INVALID; int startDepth = 1; HASHENTRY *hash = HashProbe(); // *** check if this position was visited before, and if we still have info about it if(depth == 0) return Evaluate(); // at branch end, use evaluation heuristic on position if(hash) { // *** position was searched before, and found in the table if(UsefulScore(hash, alpha, beta)) { // *** upper/lower bound might be useless with current {alpha, beta} startDepth = hash->depth + 1; // *** skip all depth iterations that were already done bestScore = hash->score; // *** this only in case we do iterDepth loop zero times (i.e. hash->depth >= depth) } bestMove = hash->move; // *** use move anyway, for lack of a better alternative } for(iterDepth from startDepth to depth) { // loop over all depths (*** note start depth!) int iterAlpha = alpha; // as we touched alpha, we need to reset it to the original value bestScore = -INF; // for every depth we start with a clean slate PutInFront(bestMove); // put best move in front of list, so this depth iteration tries it first for(EVERY move) { // loop over moves MakeMove(move); // updates game state (board, side to move) score = -Search(-beta, -alpha, depth-1); // recursion (flip sign because we change POV) UnMake(); if(score > bestScore) { // score accounting: remember best (from side-to-move POV) bestScore = score; if(score > iterAlpha) { iterAlpha = score; bestMove = move; if(score >= beta) break; // beta cutoff: previous ply is refuted by this move } } } // next move HashStore(iterDepth, move, // *** save what we need in next depth iteration in global table bestScore, alpha, beta); // *** alpha and beta are needed to determine meaning of score (bound type) } // next depth return { bestScore, bestMove }; // bestMove only used in root, to actually play it }

Bound types

This is a crucial consequence of alpha-beta pruning: the scores we pass around are only bounds on the true score, and seldom exact scores. The variable bestScore is only the best score SO FAR, until we have really searched all moves. But if we take a beta cutoff because we found a move that is good enough to refute the preceding opponent move, we skip the remaining moves, and there is no limit to how good these could be. All we now was that bestScore would be AT LEAST the score we found, (but what we found was already good enough to satisfy us), so it is a lower bound on the true score. Conversely, in the parent node, the score for this refuted move will be an upper bound, because we flip its sign. So score can be lower bounds (we took a beta cutoff), upper bounds (a daughter node took a beta cutoff, and might have worse punishment up it sleeve), or exact (our best move had an exact score, and on all the others we have an upper bound that was lower).

We have to be very much aware of the meaning of the score when we pass it around from one node to another through the hash table. In the node that came up with the score we know alpha and beta, and by comparing the score against those we can see what type of bound it is. But different nodes could have different {alpha, beta} (depending on the path to it, and the scores of its side branches). So a score we 'imported' from another node searching the same position cannot be interpreted against the current {alph, beta}. We will have to tranfer the information on its bound type with it through the hash table. And if we now have different alpha or beta, the work of the earlier search might not do us any good in the current node. E.g. if the retrieved result was made with a pretty low beta, (say -200), a quite poor move (say score -100) might have been enough to give a beta cutoff, so that -100 is a lower bound. But if we have beta = +100 in the current node, that same move will not produce a beta cutoff now. We would have to search the moves we cut off before, in the hope there is one that will score above +100. And it is cheapest to start searching for such moves at low depth. So search results with a bound type that is inconclusive are best ignored, and redone with the currently valid {alpha, beta}.

Code: Select all: void HashStore(HASHENTRY hash, int depth, MOVE move, int score, int alpha, int beta) { char boundType = 0; if(!hash) hash = ReserveHashEntry(); hash->move = move; hash->depth = depth; hash->score = score; if(score > alpha) boundType |= LOWERBOUND; // if >= beta the true score could be higher, because we did ignore moves after cutoff if(score < beta) boundType |= UPPERBOUND; // if <= alpha true score could be lower, because the daughters had beta cutoffs hash->bound = boundType; // note that scores between alpha and beta are exact, so both upper and lower bound } Boolean UsefulScore(HASHENTRY hash, int alpha, int beta) { // scores above alpha must be lower bounds, or the current alpha is lower than the one used for calculating the score if(hash->score > alpha && !(hash->bound & LOWERBOUND)) return 0; // scores below beta must be upper bounds, or the current beta is larger than the one used for calculating the score if(hash->score < beta && !(hash->bound & UPPERBOUND)) return 0; return 1; // if the score was exact, true upper bound <= alpha or true lower bound >= beta, it is meaningful }

by **H.G.Muller** » 31 Mar 2013, 19:31

4. Supplying the moves

We will now take a closr look to the pseudo-code responsible for looping over all moves. So far this was written (even more schematically):

Code: Select all: for(EVERY move) { // Make / Search / UnMake // score accounting }

We will now worry about how this looping over all moves is done in detail. One might expect there would be something like the generation of all moves in the loop initialization, writing the moves in a list, and then looping through that list. But we already started with a hash move, which we don't have to generate ourselves, and in many nodes this would be a cut move, and thus the only move needed. So we will try to postpone the generation of moves to until after it has been established the hash move was not a cut move. The supply of new moves, including their generation, is thus moved to the end of the loop, after the search code. That code could then be invoked before the move generation, to search the hash move.

Because we might not always have a hash move, the search code will have to be made conditional on the presence of one:

Code: Select all: move = hash->move; // somewhere in hash-probe code current = 0; // rewind move list in each depth iteration do { if(move != INVALID) { // skip search if we entered the loop with invalid hash move // Make / Search / UnMake // score accounting // this could break from the loop through beta cutoff } move = NextMove(move); // get next move, generating them when necessary } while(move != INVALID); MOVE NextMove(MOVE move) { if(phase == HASHMOVE) { // initialized like this when we entered the node GenerateAllMoves(); // after hash move we need to generate our own PutInFront(move); // but put the hash move (if any) in front of them current = -(move == INVALID); // put it one before next move, -1 without hash (start with 0), 0 if we had (start with 1) phase = CAPTURES; // after hash move captures are next in line } if(phase == CAPTURES) { // the captures have to be sorted in the right order before searching them move = ExtractMostPromisingCapture(current); // swaps the move to the front of the remaining moves if(!move) { // out of captures ExtractKillerMove(); // put killers (if found) in front of the remaining moves phase = NONCAPTS; // continue searching the non-captures } } move = moveList[++current]; // take the next move from the list return move; }

by **H.G.Muller** » 31 Mar 2013, 19:39

5. Quiescence Search

In QS we search only captures. The underlying assumption is that we almost always have a non-capture that conserves the current evaluation score. So either we take this score ('stand pat'), or, if it is not good enough, we try to improve it by capturing something.

Code: Select all: int Search(int alpha, int beta, int depth) { int bestScore = -INF; MOVE bestMove = INVALID; if(depth <= 0) { // *** in QS bestScore = Evaluate(); // *** use evaluation heuristic on position if(bestScore > alpha) { // *** score accounting for stand pat alpha = bestScore; // *** if(bestScore >= beta) return bestScore; // *** stand-pat cutoff } } for(EVERY move) { // loop over moves if(depth == 0 && !IsCapture(move)) continue; // *** skip all non-captures MakeMove(move); // updates game state (board, side to move) score = -Search(-beta, -alpha, depth-1); // recursion (flip sign because we change POV) UnMake(); if(score > bestScore) { // score accounting: remember best (from side-to-move POV) bestScore = score; if(score > alpha) { alpha = score; bestMove = move; if(score >= beta) break; // beta cutoff: previous ply is refuted by this move } } } // next move return { bestScore, bestMove }; // bestMove only used in root, to actually play it }

Note that IID makes no sense; the loop over moves is never executed more than once. The IID loop of the normal search starts at an iterDepth of at least 1, which is a 1-ply search, already more than QS, because it also searches captures. One cannot really say QS is a '0-ply search', though, because the moves that are searched (the captures) are searched at 1-ply. It is more like the first (and only) depth iteration is only performed half (the captures part). This is in fact a very useful way to look at things, and suggests how we can use the normal Search routine for QS with minimal changes.

We let the IID loop run to max(depth, 1) instead of depth, so that even for depth <= 0 the first iteration can be done. The stand-pat code can be done (conditionally to depth <= 0) in place of the bestScore = -INF initialization at the start of each depth iteration. So in the first (and in fact every) iteration of a 1-ply search (or deeper) bestScore will be set to -INF as usual, but in QS, where there only will be one iteration, it will be set to the evaluation of the current position. Because the moves will already be sorted captures first, we can break out of the moves loop as soon as we see the first capture. For example by having NextMove return INVALID instead of the killer move when we are in QS.

The sorting of the moves in QS is very important. We must try to capture the most valuable piece first, for two reasons:
1) to make it more difficult for the opponent to get even (as when he cannot, we can stand pat, and it is done)
2) to minimize the number of counter-strikes he has (as valuable pieces usually have many moves, amongst which many captures)
So it limits the QS tree both in depth and in width! A secondary concern is to capture with the least valuable piece first, as this in general produces the best result when the piece is protected. This is called MVV/LVA ordering, and we achieve it by assigning a sort key like victimValue - attackerPieceType to each capture. Here victimValue is in centiPawn, or in any case increases in big steps, while the attacker piece type just numbers the piece types 1, 2, 3..., in order of increasing value. The routine ExtractMostPromisingCapture() then looks for the move with the largest sort key in the list of remaining moves.

Code: Select all: int Search(int alpha, int beta, int depth) { int bestScore; MOVE bestMove = INVALID; int startDepth = 1; int phase = HASHMOVE; HASHENTRY *hash = HashProbe(); if(hash) { // position was searched before, and found in the table if(UsefulScore(hash, alpha, beta)) { // upper/lower bound might be useless with current {alpha, beta} startDepth = hash->depth + 1; // skip all depth iterations that were already done bestScore = hash->score; // this only in case we do iterDepth loop zero times (i.e. hash->depth >= depth) } move = hash->move; // *** use move anyway, for lack of a better alternative (put in 'move', not bestMove) } for(iterDepth from startDepth to max(1,depth)) { // loop over all depths (and always allow iterDepth=1 iteration) int iterAlpha = alpha; // as we touched alpha, we need to reset it to the original value int current = 0; // move number if(depth <= 0) { // *** in QS bestScore = Evaluate(); // *** use evaluation heuristic on position (as guessed score of best non-capture) if(bestScore > alpha) { // *** score accounting for stand pat alpha = bestScore; // *** if(bestScore >= beta) return bestScore; // *** stand-pat cutoff } if(!IsCapture(move)) bestMove = INVALID; // in QS we only can use a hash move that is a capture! } else bestScore = -INF; // for every depth we start with a clean slate do { // loop over moves if(move) { MakeMove(move); // updates game state (board, side to move) score = -Search(-beta, -alpha, depth-1); // recursion (flip sign because we change POV) UnMake(); if(score > bestScore) { // score accounting: remember best (from side-to-move POV) bestScore = score; if(score > iterAlpha) { iterAlpha = score; bestMove = move; if(current) PutInFront(bestMove); // *** put best move in front of list (if not yet there), so next depth iteration tries it first if(score >= beta) break; // beta cutoff: previous ply is refuted by this move } } } // INLINED NextMove CODE if(phase == HASHMOVE) { // initialized like this when we entered the node GenerateAllMoves(); // after hash move we need to generate our own PutInFront(move); // but put the hash move (if any) in front of them current = -(move == INVALID); // put it 1 before next move, -1 if we had no hash (to start with 0), 0 if we had (start with 1) phase = CAPTURES; // after hash move captures are next in line } if(phase == CAPTURES) { // the captures have to be sorted in the right order before searching them move = ExtractMostPromisingCapture(current); // swaps the move to the front of the remaining moves if(!move) { // out of captures if(depth <= 0) { move = INVALID; break; } // *** in QS we can now break out of the move loop, as we won't consider non-captures ExtractKillerMove(); // put killers (if found) in front of the remaining moves phase = NONCAPTS; // continue searching the non-captures } } move = moveList[++current]; // take the next move from the list } while(move != INVALID); // next move HashStore(iterDepth, move, // save what we need in next depth iteration in global table bestScore, alpha, beta); // alpha and beta are needed to determine meaning of score (bound type) } // next depth return { bestScore, bestMove }; // bestMove only used in root, to actually play it }

Note that we moved putting the best move in front to the place where we do find a new best move. This because the code has evolved such that the hash move is treated special, not taken from the list, but already assigned to the 'move' variable when we first enter the loop. It has the additional advantage that any move that is certified good (i.e. not a fail low that could be arbitrarily bad) will float to the head of the list, not just the single best.

by **Robert Pope** » 10 May 2013, 15:22

Just curious - did you ever finish your Fairy-Max rewrite? I looked on your site but didn't find anything.

by **H.G.Muller** » 13 May 2013, 21:44

I haven't had much time lately. It participated in a 10x8 tourney as 'Max-Plus', but this revealed a bug (not seeing mate coming), which did not reproduce, and thus is not fixed yet. No soyuurces were published so far.

Winboard Forum

Anatomy of a simple engine: Fairy-Max

Anatomy of a simple engine: Fairy-Max

Negamax

Alpha-beta pruning

Iterative Deepening

Transposition table

Move supply

Quiescence Search

Re: Anatomy of a simple engine: Fairy-Max

Re: Anatomy of a simple engine: Fairy-Max

Who is online