Abstract
This thesis is split into two independent parts.
The first is an investigation of some practical aspects of Marcus Hutter's Universal Artificial Intelligence theory.
The main contributions are to show how a very general agent can be built and analysed using the mathematical tools of this theory.
Before the work presented in this thesis, it was an open question whether this theory had any relevance to reinforcement learning practitioners.
This work suggests that it is indeed relevant and worthy of future investigation.
The second part of this thesis looks at self-play learning in two-player, deterministic, adversarial, turn-based games.
The main contribution is the introduction of a new technique for training the weights of a heuristic evaluation function from data collected by classical game tree search algorithms.
This method is shown to outperform previous self-play training routines based on Temporal Difference learning when applied to the game of Chess.
In particular, the main highlight was using this technique to construct a Chess program that learnt to play master-level Chess by tuning a set of initially random weights through self-play games.