Supervised learning (also known as behavioral cloning when the agent is learning what to do in an observe-act cycle) is a type of decision making method.
constituents
- input space: \(\mathcal{X}\)
- output space: \(\mathcal{Y}\)
- hypothesis/model/prediction: \(h : \mathcal{X} \to \mathcal{Y}\)
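For example (a hypothetical sketch, assuming \(\mathcal{X} = \mathbb{R}\), \(\mathcal{Y} = \mathbb{R}\), and a linear form, none of which are fixed by these notes), a hypothesis is just a function from inputs to predictions:

```python
# Hypothetical example: X = R, Y = R, and a linear hypothesis
# h_theta(x) = theta_0 + theta_1 * x.
def h(x: float, theta_0: float = 0.0, theta_1: float = 1.0) -> float:
    """A hypothesis h : X -> Y mapping an input to a prediction."""
    return theta_0 + theta_1 * x

print(h(2.0, theta_0=0.5, theta_1=3.0))  # prediction for the input x = 2.0
```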
requirements
Our ultimate goal is to learn a good model \(h\) from the training set:
- what “good” means is hard to define
- we generally want to use the model on new data, not just the training set
A continuous \(\mathcal{Y}\) makes this a regression problem; a discrete \(\mathcal{Y}\) makes it a classification problem.
That is, we want our hypothesis to behave as \(h_{\theta}\qty (x^{(i)}) \approx y^{(i)}\).
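As a minimal sketch of this fit-then-generalize idea (assuming a linear model fitted by least squares on synthetic data; both choices are illustrative, not prescribed by these notes):

```python
import numpy as np

# Hypothetical data: y = 2x + 1 plus noise. Fit theta on a training
# split, then check h_theta(x) ~= y on points the model never saw.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2 * x + 1 + rng.normal(0, 0.5, size=50)

x_train, y_train = x[:40], y[:40]
x_new, y_new = x[40:], y[40:]

# Least-squares fit of h_theta(x) = theta_0 + theta_1 * x.
A = np.stack([np.ones_like(x_train), x_train], axis=1)
theta, *_ = np.linalg.lstsq(A, y_train, rcond=None)

pred_new = theta[0] + theta[1] * x_new
print("mean absolute error on new data:", np.mean(np.abs(pred_new - y_new)))
```

The check on `x_new` is the point of "good": a useful \(h_{\theta}\) keeps \(h_{\theta}\qty(x) \approx y\) on data it has never seen, not just on the training set.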
additional information
training set
The training set is a set of pairs:
\begin{equation} \qty {\qty(x^{(1)}, y^{(1)}) \dots \qty (x^{(n)}, y^{(n)})} \end{equation}
such that \(x^{(j)} \in \mathcal{X}, y^{(j)} \in \mathcal{Y}\).
We call \(n\) the training set size.
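Concretely (a hypothetical example, with \(\mathcal{X} = \mathbb{R}^2\) and \(\mathcal{Y} = \mathbb{R}\)), a training set is just a collection of such pairs:

```python
# A training set of n = 3 pairs (x, y); here X = R^2 and Y = R.
training_set = [
    ((1.0, 2.0), 3.0),   # (x^(1), y^(1))
    ((0.5, 0.5), 1.0),   # (x^(2), y^(2))
    ((4.0, 1.0), 5.0),   # (x^(3), y^(3))
]
n = len(training_set)    # training set size
```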
main procedure
- provide the agent with some examples
- use an automated learning algorithm to generalize from those examples
This works well when the agent faces situations that resemble its training examples, but if the agent is thrown into a completely unfamiliar situation, supervised learning cannot do better than imitating the examples it was given.
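A toy sketch of this procedure for behavioral cloning, assuming a 1-nearest-neighbour rule as the (deliberately simple) learning algorithm; the observations and actions are made up:

```python
# Step 1: examples provided to the agent -- (observation, action) pairs
# from a demonstrator. Values here are hypothetical.
examples = [
    ((0.0, 1.0), "left"),
    ((1.0, 0.0), "right"),
    ((0.5, 0.5), "forward"),
]

# Step 2: a very simple learning algorithm that generalizes from the
# examples: act like the nearest demonstrated observation.
def act(observation):
    def dist(pair):
        obs, _ = pair
        return sum((a - b) ** 2 for a, b in zip(obs, observation))
    _, action = min(examples, key=dist)
    return action

print(act((0.1, 0.9)))  # close to a demonstrated situation -> "left"
print(act((9.0, 9.0)))  # unfamiliar situation: it can only copy an example
```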
disadvantages
- the labeled data is finite
- performance is limited by the quality of the behaviour demonstrated in the training data
- the model can only interpolate between the finitely many demonstrated states; it does not extrapolate to unseen ones
cost function
see cost function