OTHELLO
Implementation by Arificial Intelligence





oIntroduction :

Othelo is basically a very simple game and as we beleive the popularity of this game is due to this very simple reason.The moves and the rules are very easy compare to its originator Chess.But the game itself requires players with highly adaptive thinking and foresight.

The game is very much like chess.The board is a 8X8 matrics with the adjacent squares colored differently,oscilating between black and white.The game starts with 2 pieces for both of the team,positioned in the middle diagonally.
 

---- ---- ---- ---- ---- ---- ---- ----
. . . . . . . .
. . . . .. . . .
. . . ...B ...W . . .
. . . ....W ....B . . .
. . . . . . . .
. . . . . . . .
. . . .. . . . .

 
 

oRules:


 

oImplementation:

Opening knowledge

All strong programs use opening books and update their books automatically after each game. Most of the top programs have opening books which to some extent are based on IOS games, so the books have a large overlap. What differs is how the information from the games is used. A common approach (used by e.g. Hannibal, Logistello, Zebra, Brutus and Turtle) is to go through all positions from all games in the game database and determine the best move not played in any database game. This is time-consuming as a deep search must be performed for each position, but once this is done, updating the book on the fly is easy - after each game played, all new positions (and maybe some old) are searched for the best deviation. Using the resulting set of positions and moves (both deviations and moves actually played) the best book lines for both sides can be determined using a simple minimax search.

Some other programs (e.g. Tadpole and Spif) don't store deviations from each position but instead calculate deviations on the fly. The main advantage of this approach is that there is no need to calculate the deviations (which is very time-consuming). There are several drawbacks though; this kind of opening book is much more vulnerable to weak moves in the game database which can lead to dead lost positions straight out of book.

In our programme for the simplicity of the programme we kept the opening to the oponenet.As the huge databsae to store the opening books were not introduced to us we refrained from any kind of learning in the game.The programme immedietly starts seraching for the best move just after the first move by the oponent.
 

Endgame

As the number of moves available to each player decreases towards the end of the game, the programs can search much deeper in the endgame phase of the game. This enables them to play the endgame much better than any human; For computer programs, the endgame starts in what most humans would refer to the late midgame; with 20-30 moves left of the game.

The search algorithms used in the midgame work well in the endgame as well (including a variation of MPC). The objective changes though: For a program, the endgame starts when it can calculate who is winning, i.e., follow all variations until no player has any move left. This is usually with 24-30 moves left for the top programs. When the program knows who is winning, it starts to optimize the score (win by as much as possible). This usually is much harder than determining who is winning (unless the position is a draw) and can usually be accomplished with 21-26 moves left.
 
 
 

oSearching

All the strong programs of today determine the best move in a position by searching many moves ahead: Obviously, the number of positions arising even after only a few moves grows very fast; in most midgame positions, each players have about 10 moves each, a search in the depth of 9 results in a billion (109) positions to investigate. Fortunately, it can be proved (mathematically) that a program need not consider all these positions in order to find the best move. Instead a procedure called alpha-beta pruning (invented in the mid-70's by Donald Knuth and others) is used which greatly reduces the number of positions which need be examined. A good Othello program would only have to look at 100'000-1'000'000 positions in order to look 9 moves ahead instead of a billion without alpha-beta. There are several variations of alpha-beta, the most widespread being nega-scout (discovered by Alexander Reinefeld), but they do not differ fundamentally from plain alpha-beta pruning. For alpha-beta pruning, it is essential that the best moves in a position are searched first. This can be accomplished using several heuristics. One is to store "killer responses" to each move - if my opponent plays the X-square g2, I definitely should look at h1 first to see if the corner can be captured, and if this is a wise thing to do. Another useful heuristic is to perform shallow searches; before starting on a depth 12 search, the program searches the position to depth 2 and finds the move which looks best after the shallow search, which obviously takes negligeable time compared to the full search. The strongest Othello programs of today (Logistello, Hannibal, Zebra, Turtle and Brutus) all use some form of selective search mechanism. The idea is very simple and resembles the thinking of human players: In most positions, many moves are obviously bad and need only be looked at long enough to determine that they are bad. This is usually implemented as follows: When searching a position to depth 12 (i.e., 12 moves ahead), the program starts by searching all moves available to depth 4 and discards the moves which seemed really bad when searched to depth 4. The remaining moves are searched to depth 12. The concept of "really bad" is formalized by generating lots of statistics of how well a search to depth 4 predicts the result of a search to depth 12. This procedure is called Multi Prob-Cut (invented by Michael Buro) and the example above uses the "cut pair" 4/12. Obviously other cut pairs can be used, and most programs use several cut pairs (usually one for each search depth up to 13 or so).

The search strategy used by us:

Assume for the time being that we have an evaluation function at our disposal. We use a simple technique called "MiniMax Search" to look ahead. MiniMax search is called thusly because when we try to determine the best move, we always consider the best moves of each player. We have to consider alternatingly the maximum and the minimum of the evaluation function's results.

At some point, we have to stop looking ahead - simply because even in a simple game like Reversi, there are far too many possibilities to consider. We just cannot look ahead from the first move until the end of the game. Therefore, we need to limit the search depth of our look-ahead. When we have reached the last level, which is called the horizon because we can't look beyond it, we use the evaluation function to find out whether the position we arrived at is good. Finally, we get a game tree like the one below:

                                   Top

                                    |

                    +---------------+----------------+

                    |                                |

                    A                                B              (0) MAX

         +----------+---------+           +----------+----------+

         a                    b           c                     d   (1) MIN

    +----+----+          +----+----+ +----+----+                |

    A    B    C          D         E F         G                H   (2) MAX
 

 
 
 

 

This figure shows the game tree resulting from a 2-move look-ahead. On the lowest level (2), we just call the evaluation function to find out which of the moves A to C, D and E, F and G, and H, is the best. We take the maximum of these evaluations as the evaluation for moves a, b, c, and d on level one. One that level, we now take the minimum of these values as the evaluation for moves A and B on level 0. On the top level, we again take the maximum and then we execute the chosen move.

We used MiniMax Search in a slightly modified fahision. Assuming human player is good enough we have to consider his/her good moves. We assumed that human player should move play his/her best. Our evaluation function returns values with respect to individual players (not always from computer's view). So we also assumed that human player will move accordingly his maximum benefit. We take the path which returns the maximum result for the evaluation function for both human and computer moves.
 
 
 

oPosition evaluation

This is by far the most important part of the program, if this module is weak it matters little if the program has great search algorithms. There are some different paradigms for creating evaluation functions. They describe the evolution of Zebra's knowledge and most Othello programs can be placed in one of these categories.

The idea behind this type of evaluation function is that different squares have different values - corners are good and the squares next to corners are bad. Disregarding symmetries, there are 10 different positions on a board, and each of these is given a value for each of the three possibilities: black disk, white disk and empty. A more sophisticated approach is to have different values for each position during the different stages of the game; e.g. corners are more important in the opening and early midgame than in the endgame.

Programs with this type of evaluation are invariably weak On the other hand, it is very easy to program this evaluation function, so we started with this evaluation function.But it was soon discovered that this evaluation most of the time creates a bad move.We soon switched over to other more dynamic evaluation function.

This more sophisticated approach substitutes the extremely local view of the board used in the disk-square tables by a more global view of the board. The key observation is that most human players strive to maximize mobility (number of moves available) and minimize frontier disks (disks adjacent to empty squares). These measures, or at least good approximations, can be found very quickly if coded efficiently, and they increase playing strength a lot.

Most programs with mobility-based evaluation functions also have some knowledge of edge and corner configurations and try to minimize the number of disks during the early midgame, another strategy used by human players.

As mentioned above, programs of moderate strength often incorporate some knowledge about edge and corner configurations. Mobility maximization and frontier minimization are global features, but they can be broken down into local configurations which can be added together. The same goes for disk minimization.

This leads to the following generalization: Only use local configurations (patterns) in the evaluation function. The usual implementation is to evaluate each row, column, diagonal and corner configuration separately and add together the values. For this to work out well, lots of different patterns have to be evaluated - there are 3321 different edge configurations alone, and when all the configurations are counted, there are tens of thousands. To make bad things worse, the relative values of the different configurations are highly game-stage dependent. Clearly, the process of determining values for all configurations must be done automatically. This is done by taking a large database of games played between strong players and calculating statistics for each configuration in each game stage from all the games. Exactly how this is done varies from program to program .

In the process of determining the relative values of the different configurations, one must decide what the evaluation function should try to predict. The most common choices are

Our Evaluation function:

Our evaluation function uses a combination of mobility and corner advantages. Atfirst we had considered mobility only, but the result was not good enough. It was not very difficult to win over the computer. So we included corner advantages in our evaluation function. We also keep some values for current material advantages. We think that corners are most important positions of the board. Becasue it cannot be flipped by the opponent. So every player would like to move to the corners as soon as he/she get a chance. Mobility is also an important issue. A better criterion for determining a position is to count the number of moves a player can make. If you only have a few moves, it's likely that your opponent can play such that you are forced to make bad moves. Therefore, if your mobility is high, the position is better for you than if it's low. We cannot neglect the current material advantage. We used this but with less importance than the other issues.
 
 
 

oSome concepts that were not considered:

Good Reversi players often will try not to play an edge move.The reason is that edge moves interact with the pieces on the adjacent edges, turning more pieces and thus creating opportunities for the opponent. Our programme has no notion of this.It keeps a blind eye to this matter. Having access to a certain region on the board is important. If we cannot access a region, our opponent can. A player can for instance lose access because all the edge discs in that area are his own. We don't think it's necessary to deal with access explicitly: the potential mobility already should take care of that. Parity is especially important in corner regions. A player can lose parity if s/he loses access to a region of an odd number of fields. This may prove a disadvantage later on because players set pieces alternatingly, so the player might be forced to make a bad move. The programme lacks a opening library to start a certain attack pattern.This led us to force the oponent to move first.Simple mobilty is sufficient enough to avoid parllel opening.
 
 
 

oConclusion:

The game gave us a very first hand experience in programming a game with the computer thinking.The theoritical knowledge from the class really came to help in every step of programming the game.The search strategies and Evaluation functions designed were time and time evaluated and changed to create better thinking programme.The satire that we faced while evaluating its performance was time and time we were praying for our own programme to defeat us.In a way we were praying for a Frankenestein.We hope we were able to create one in the end!

Want to play it?

Return to my home