1 - Interface to FANN
$ cd src/tictactoe ; pushd `bundle list ruby-fann`
$ cd ext/ruby_fann
Pick up changes to FANN from ruby-fann
$ make clean ; make
Training Artificial Neural Network (ANN)
ruby_fann.c::Init_ruby_fann()
  Provides the Ruby wrapper around the C FANN library
fann_train_data.c::fann_train_on_data()
  fann_train_epoch()
    fann_train_epoch_irpropm()
      Cycles through each training sample
      fann_train.c::fann_compute_MSE()
        Calculates the difference between the output neurons and the expected output given in the training sample
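The error calculation noted above can be sketched in plain Ruby. This is a simplified illustration, not FANN's C code: the method names `squared_error` and `epoch_mse` are hypothetical, the network's forward pass is stood in for by a block, and FANN's own bookkeeping (for example, storing per-neuron errors for backpropagation) is left out.

```ruby
# Sum of squared differences between the output neurons and the expected
# outputs for one training sample. (Hypothetical helper, not part of FANN.)
def squared_error(actual, expected)
  raise ArgumentError, 'size mismatch' unless actual.size == expected.size

  actual.zip(expected).sum { |a, e| (e - a)**2 }
end

# One epoch: cycle through each training sample, run the network on its
# input (the block plays the role of the forward pass), and accumulate the
# squared error. Returns the mean over all output values seen.
def epoch_mse(samples)
  total = 0.0
  count = 0
  samples.each do |sample|
    outputs = yield(sample[:input])
    total += squared_error(outputs, sample[:expected])
    count += outputs.size
  end
  count.zero? ? 0.0 : total / count
end
```

With two single-output samples whose expected values are 1.0 and 0.0, a stand-in network that always answers 0.5 is off by 0.5 on each, so `epoch_mse(samples) { |_input| [0.5] }` evaluates to 0.25.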
2 - Reinforcement learning
Software agents taking actions in an environment so as to maximise some notion of cumulative reward
In economics and game theory, reinforcement learning may be used to explain how equilibrium may arise under bounded rationality
Typically formulated as a Markov decision process (MDP)
Mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of the decision maker
MDP is a discrete-time stochastic control process
At each time step the process is in some state "s" and the decision maker may choose any action "a" that is available in state "s"
The process responds at the next time step by randomly moving into a new state "s'" and giving the decision maker a corresponding reward R_a(s,s')
Given "s" and "a", the next state "s'" is conditionally independent of all previous states and actions
Extension of Markov chains: if only one action exists for each state and all rewards are the same (e.g. zero), then an MDP reduces to a Markov chain
$ cd ~/Documents/ann ; wget -r -np -k https://webdocs.cs.ualberta.ca/~sutton/book/ebook/
  -r  = recursive
  -np = don't follow links to parent directories
  -k  = make links in downloaded HTML/CSS point to local files
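The MDP description above can be made concrete with a small Ruby sketch. Everything in it is an illustrative assumption: the two-state example, its probabilities and rewards, and the method names (`example_mdp`, `actions`, `step`, `rollout`, `markov_chain?`) are made up for these notes and come from no library. The rollout uses a random policy and a plain undiscounted sum as one simple notion of cumulative reward.

```ruby
# Hypothetical two-state MDP. mdp[s][a] is a list of
# [probability, next_state, reward] triples, i.e. P(s' | s, a) and R_a(s, s').
def example_mdp
  {
    low:  { wait:     [[1.0, :low, 0.0]],
            recharge: [[0.8, :high, 0.0], [0.2, :low, -1.0]] },
    high: { wait:     [[1.0, :high, 1.0]],
            search:   [[0.7, :high, 2.0], [0.3, :low, 2.0]] }
  }
end

# Actions available in state s.
def actions(mdp, s)
  mdp.fetch(s).keys
end

# One time step: randomly move to s' and return [s', reward].
# Only s and a are consulted, never earlier history (the Markov property).
def step(mdp, s, a, rng)
  outcomes = mdp.fetch(s).fetch(a)
  roll = rng.rand
  cumulative = 0.0
  outcomes.each do |prob, next_state, reward|
    cumulative += prob
    return [next_state, reward] if roll < cumulative
  end
  outcomes.last.drop(1) # fallback in case of floating-point rounding
end

# Follow a uniformly random policy for n steps; return the summed reward.
def rollout(mdp, start, n, rng)
  state = start
  total = 0.0
  n.times do
    action = actions(mdp, state).sample(random: rng)
    state, reward = step(mdp, state, action, rng)
    total += reward
  end
  total
end

# True when every state has exactly one action and all rewards are equal,
# in which case the MDP is just a Markov chain.
def markov_chain?(mdp)
  one_action = mdp.values.all? { |by_action| by_action.size == 1 }
  rewards = mdp.values.flat_map do |by_action|
    by_action.values.flatten(1).map { |_prob, _next_state, reward| reward }
  end
  one_action && rewards.uniq.size <= 1
end
```

Passing a seeded generator such as `Random.new(42)` to `rollout` makes a run repeatable, which helps when comparing policies. `markov_chain?(example_mdp)` is false because each state offers two actions; stripping each state down to a single zero-reward action makes it true.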