Can a Computer Learn?

Abstract

Even as a child, everyone knows how to play tic tac toe. The fun game of Xs and Os on a 3x3 board to make a line of 3 in a row. But can a computer learn to do it, from not knowing even what the goal is, to winning every time you play against it? After testing with all the settings 3 times, I found that, yes, it can learn how to play, but thus far, none of the settings are perfect.

Purpose

The purpose of my experiment is to determine if my computer can learn how to play the simple game tic-tac-toe from start to finish. Can it learn how to stop making invalid moves, can it learn how to become undefeatable, or will it learn at all?

Hypothesis

I think that if I create a learning tic-tac-toe player, then it can learn how to play. For this to work the best, there are configurations that I think need to be made to learn the fastest and play best. I am testing multiple options for neural network learning. Including network type A or B (A using 2 input nodes per position on the board and a squash function that results in values between 0 and 1. B uses 1 input node per position and a squash function that results in values between -1 and 1), number of hidden layers and number of nodes per layer. I hypothesize that option B; 2 hidden layers; and 9 nodes/layer will be the most optimal answer. I think that if it was A then it would take longer to learn to play because more nodes divides the processing. If it had 1 hidden layer, it would learn how to play. However, it wouldn't be a good player. With 3 and 4 hidden layers it would miss important rules so it won't be able to learn. I think 2 hidden layers is the balance. With 1-6 nodes per layer, it takes longer to train because it doesn't look at a lot of information. With more that 9 and it will understand more about tic-tac-toe at the cost of more games to learn the rules. I think the most balanced answer will be 9 nodes per layer.

Materials list

Ubuntu computer version 14.4.2 desktop version available at http://www.ubuntu.com/download/alternative-downloads
JDK ( version 1.7.0_75)
JRE ( version 1.7.0_75)
Learning tic tac toe program at kevin.bloomquist.com/ttt. Link to the source code: java_ttt_neural_network.tgz and/or the playable version

Procedure

Install programs from kevin.bloomquist.com/ttt into the folder Documents/java/TTT then create a folder inside it called config.
Open a terminal, change directory to TTT and run the command "java ttt init". This will create config files in TTT/config. These new config files are the settings for our program.
Now run the command "java ttt config/[config file].xml". If you want to save the output to a file, add to the end ">[filename]" to create a new file or ">>[filename]" to add to a file already created.
Read and examine data. The program will output one line of information per game for a total of 9,999 games. Each line indicating win/loss, invalid game, winning percent, and invalid game percent. For example 5569:1:0:70.0:0.0:A:4:12 represents results for game #5569 as a win, no invalid moves made. The running average wins over the last 1000 games is 70.0%, no invalid games in the last 10000 games. Network type of A, with 4 hidden layers and 12 nodes per layer.

Consolidated results from 1,079,892 games played

Results

In this experiment, I found that the most significant variable isn't the network type, or hidden layers. The most consequential variable is the nodes per layer.

Conclusion

In conclusion, my hypothesis was incorrect. I thought that the settings B,2,9 were the best. However, this setting learned how to play, but not how to win with a score of just 52.5%. The best was actually A,2,18. I found that if the network option was A then it would actually learn better slightly, probably because of more input nodes, rather than longer. I found that the number of hidden layers is not a major factor. With 1-9 nodes per layer, it never learns how to win. In this experiment, I learned what a neural network is and learned a handful more about java programming.