OTHELLO
Implementation
by Arificial Intelligence
Introduction
:
Othelo is basically a very simple game and as we beleive the popularity of this game is due to this very simple reason.The moves and the rules are very easy compare to its originator Chess.But the game itself requires players with highly adaptive thinking and foresight.
The game is very
much like chess.The board is a 8X8 matrics with the adjacent squares colored
differently,oscilating between black and white.The game starts with 2 pieces
for both of the team,positioned in the middle diagonally.
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| . | . | . | . | . | . | . | . |
| . | . | . | . | .. | . | . | . |
| . | . | . | ...B | ...W | . | . | . |
| . | . | . | ....W | ....B | . | . | . |
| . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . |
| . | . | . | .. | . | . | . | . |
Rules:
Implementation:
Opening knowledge
All strong programs use opening books and update their books automatically after each game. Most of the top programs have opening books which to some extent are based on IOS games, so the books have a large overlap. What differs is how the information from the games is used. A common approach (used by e.g. Hannibal, Logistello, Zebra, Brutus and Turtle) is to go through all positions from all games in the game database and determine the best move not played in any database game. This is time-consuming as a deep search must be performed for each position, but once this is done, updating the book on the fly is easy - after each game played, all new positions (and maybe some old) are searched for the best deviation. Using the resulting set of positions and moves (both deviations and moves actually played) the best book lines for both sides can be determined using a simple minimax search.
Some other programs (e.g. Tadpole and Spif) don't store deviations from each position but instead calculate deviations on the fly. The main advantage of this approach is that there is no need to calculate the deviations (which is very time-consuming). There are several drawbacks though; this kind of opening book is much more vulnerable to weak moves in the game database which can lead to dead lost positions straight out of book.
In our programme
for the simplicity of the programme we kept the opening to the oponenet.As
the huge databsae to store the opening books were not introduced to us
we refrained from any kind of learning in the game.The programme immedietly
starts seraching for the best move just after the first move by the oponent.
Endgame
As the number of moves available to each player decreases towards the end of the game, the programs can search much deeper in the endgame phase of the game. This enables them to play the endgame much better than any human; For computer programs, the endgame starts in what most humans would refer to the late midgame; with 20-30 moves left of the game.
The search algorithms
used in the midgame work well in the endgame as well (including a variation
of MPC). The objective changes though: For a program, the endgame starts
when it can calculate who is winning, i.e., follow all variations until
no player has any move left. This is usually with 24-30 moves left for
the top programs. When the program knows who is winning, it starts to optimize
the score (win by as much as possible). This usually is much harder than
determining who is winning (unless the position is a draw) and can usually
be accomplished with 21-26 moves left.
Searching
The search strategy used by us:
At some point, we have to stop looking ahead - simply because even in a simple game like Reversi, there are far too many possibilities to consider. We just cannot look ahead from the first move until the end of the game. Therefore, we need to limit the search depth of our look-ahead. When we have reached the last level, which is called the horizon because we can't look beyond it, we use the evaluation function to find out whether the position we arrived at is good. Finally, we get a game tree like the one below:
Top | +---------------+----------------+ | | A B (0) MAX +----------+---------+ +----------+----------+ a b c d (1) MIN +----+----+ +----+----+ +----+----+ | A B C D E F G H (2) MAX
This figure shows the game tree resulting from a 2-move look-ahead. On the lowest level (2), we just call the evaluation function to find out which of the moves A to C, D and E, F and G, and H, is the best. We take the maximum of these evaluations as the evaluation for moves a, b, c, and d on level one. One that level, we now take the minimum of these values as the evaluation for moves A and B on level 0. On the top level, we again take the maximum and then we execute the chosen move.
We used MiniMax Search
in a slightly modified fahision. Assuming human player is good enough we
have to consider his/her good moves. We assumed that human player should
move play his/her best. Our evaluation function returns values with respect
to individual players (not always from computer's view). So we also assumed
that human player will move accordingly his maximum benefit. We take the
path which returns the maximum result for the evaluation function for both
human and computer moves.
Position
evaluation
This is by far the most important part of the program, if this module is weak it matters little if the program has great search algorithms. There are some different paradigms for creating evaluation functions. They describe the evolution of Zebra's knowledge and most Othello programs can be placed in one of these categories.
Programs with this type of evaluation are invariably weak On the other hand, it is very easy to program this evaluation function, so we started with this evaluation function.But it was soon discovered that this evaluation most of the time creates a bad move.We soon switched over to other more dynamic evaluation function.
Most programs with mobility-based evaluation functions also have some knowledge of edge and corner configurations and try to minimize the number of disks during the early midgame, another strategy used by human players.
This leads to the following generalization: Only use local configurations (patterns) in the evaluation function. The usual implementation is to evaluate each row, column, diagonal and corner configuration separately and add together the values. For this to work out well, lots of different patterns have to be evaluated - there are 3321 different edge configurations alone, and when all the configurations are counted, there are tens of thousands. To make bad things worse, the relative values of the different configurations are highly game-stage dependent. Clearly, the process of determining values for all configurations must be done automatically. This is done by taking a large database of games played between strong players and calculating statistics for each configuration in each game stage from all the games. Exactly how this is done varies from program to program .
In the process of determining the relative values of the different configurations, one must decide what the evaluation function should try to predict. The most common choices are
Our evaluation function
uses a combination of mobility and corner advantages. Atfirst we had considered
mobility only, but the result was not good enough. It was not very difficult
to win over the computer. So we included corner advantages in our evaluation
function. We also keep some values for current material advantages. We
think that corners are most important positions of the board. Becasue it
cannot be flipped by the opponent. So every player would like to move to
the corners as soon as he/she get a chance. Mobility is also an important
issue. A better criterion for determining a position is to count the number
of moves a player can make. If you only have a few moves, it's likely that
your opponent can play such that you are forced to make bad moves. Therefore,
if your mobility is high, the position is better for you than if it's low.
We cannot neglect the current material advantage. We used this but with
less importance than the other issues.
Some
concepts that were not considered:
Conclusion:
The game gave us a very first hand experience in programming a game with the computer thinking.The theoritical knowledge from the class really came to help in every step of programming the game.The search strategies and Evaluation functions designed were time and time evaluated and changed to create better thinking programme.The satire that we faced while evaluating its performance was time and time we were praying for our own programme to defeat us.In a way we were praying for a Frankenestein.We hope we were able to create one in the end!