Key components
- Task/Objective (“Automated Driving to reach destination [here]”)
- Resources (state) (“sensors, fuel, etc.”)
- Uncertainties (“What in the world is happening”)
- Actions (“turn left”)
In one line: an agent makes decisions by weighing what it observes against what remains uncertain. Repeating this step (observe, act, observe again) is called the observe-act cycle.
See also connectionism
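A minimal sketch of the observe-act cycle, assuming a toy one-dimensional world in which the agent only ever receives a noisy reading of its position. The names `observe`, `choose_action`, and `run_cycle`, the noise level, and the 0.5 dead band are all illustrative assumptions, not part of any particular framework.

```python
import random

def observe(true_position, noise, rng):
    """Return a noisy reading; the agent never sees the true position."""
    return true_position + rng.uniform(-noise, noise)

def choose_action(observation, goal):
    """Decide from the observation alone: step toward the goal, or stay put."""
    if observation < goal - 0.5:
        return +1
    if observation > goal + 0.5:
        return -1
    return 0

def run_cycle(start, goal, steps=50, noise=1.0, seed=0):
    """Repeat observe -> act for a fixed number of steps; return visited positions."""
    rng = random.Random(seed)
    position = start
    history = [position]
    for _ in range(steps):
        obs = observe(position, noise, rng)   # uncertainty enters here
        action = choose_action(obs, goal)     # decision uses only the observation
        position += action                    # the world changes in response
        history.append(position)
    return history

if __name__ == "__main__":
    print(run_cycle(start=0, goal=10))
```

Because the readings are noisy, the agent may keep jittering around the goal rather than settling exactly on it, which is the kind of behavior uncertainty tends to produce.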
Applications
- Stock shelving
- Automated driving
- Space missions
- Sports
- Congestion modeling
- Online dating
- Traffic light control
Decision-making methods
- explicit programming: “just code it up”. Try this first if you are building something, since it establishes a baseline: enumerate all possible states and hard-code a strategy for each of them
- supervised learning: manually solve representative states, hard-code strategies for them, and have a model interpolate between them
- optimization: create an optimization objective connected to a model of the environment, then optimize that objective
- planning: use a model of the environment directly to predict the best moves
- reinforcement learning: have the agent interact with the environment directly and optimize its score of success there, without a model
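To make the contrast between two of the methods above concrete, here is a small sketch on an assumed toy problem: a five-cell corridor where the goal is the rightmost cell. `HARD_CODED_POLICY`, `model`, and `plan` are hypothetical names chosen for illustration, and breadth-first search stands in for planning only as one simple option.

```python
from collections import deque

N_STATES = 5          # corridor cells 0..4 (assumed toy world)
GOAL = 4
ACTIONS = (-1, +1)    # step left or step right

# Explicit programming: a strategy written out by hand for every state.
HARD_CODED_POLICY = {0: +1, 1: +1, 2: +1, 3: +1, 4: 0}

def model(state, action):
    """Known dynamics: move one cell, clamped to the corridor ends."""
    return min(max(state + action, 0), N_STATES - 1)

def plan(start, goal=GOAL):
    """Planning: search through the model for a shortest action sequence."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for action in ACTIONS:
            nxt = model(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [action]))
    return None  # goal unreachable under the model

if __name__ == "__main__":
    print(plan(0))               # action sequence found by search
    print(HARD_CODED_POLICY[0])  # action looked up from the hand-written table
```

The hand-written table has to be edited whenever the corridor changes, whereas the planner only needs an updated `model`.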
Method | Model Visible? | Strategy Hard-Coded? |
---|---|---|
explicit programming | yes, all states fully known | yes |
supervised learning | no, only a sample of it | yes, only a sample of it |
optimization | no, except reward | no |
planning | yes | no |
reinforcement learning | no, only experienced through interaction | no |
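As a rough illustration of the reinforcement learning entry, the sketch below uses tabular Q-learning (one common technique, chosen here as an assumption) on an assumed five-cell corridor. The agent only calls `env_step` and sees the resulting state and reward; it never reads the rules inside it. All names and hyperparameters are illustrative.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)

def env_step(state, action):
    """The environment: the agent calls this but never inspects its rules."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # step cap per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)       # explore
            else:
                action = greedy(q, state, rng)     # exploit current estimates
            nxt, reward, done = env_step(state, action)
            target = reward if done else reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q

if __name__ == "__main__":
    q = train()
    print({s: greedy(q, s, random.Random(0)) for s in range(GOAL)})
```

The strategy is never written down by hand: it is read off the learned table `q` after training.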