Create an approximation of the value function \(U_{\phi}\) using Approximate Value Function, and use Policy Gradient to optimize a Monte Carlo tree search policy.
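
As a rough illustration of the two learning updates, here is a minimal sketch on a toy problem. Everything in it is an assumption made for the example and not part of the original note: the 5-state chain environment, the linear form \(U_{\phi}(s) = \phi^\top f(s)\), the tabular softmax policy, the REINFORCE-style update with \(U_{\phi}\) as a baseline, and all names and hyperparameters. The tree search itself is omitted; the learned policy and \(U_{\phi}\) are the pieces a Monte Carlo tree search could use to guide expansion and to evaluate leaves.

```python
import math
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (-1, 1)   # action 0 moves left, action 1 moves right
GAMMA = 0.9

def features(s):
    # Assumed feature map f(s): a bias term plus the normalized position.
    return [1.0, s / (N_STATES - 1)]

def value(phi, s):
    # Linear value approximation U_phi(s) = phi . f(s).
    return sum(p * f for p, f in zip(phi, features(s)))

def action_probs(theta, s):
    # Softmax policy with one logit per (state, action) pair.
    m = max(theta[s])
    exps = [math.exp(l - m) for l in theta[s]]
    z = sum(exps)
    return [e / z for e in exps]

def rollout(theta, rng, max_steps=20):
    # Sample one episode from state 0; reward 1 on reaching the goal.
    s, traj = 0, []
    for _ in range(max_steps):
        a = 0 if rng.random() < action_probs(theta, s)[0] else 1
        s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        traj.append((s, a, 1.0 if s_next == N_STATES - 1 else 0.0))
        s = s_next
        if s == N_STATES - 1:
            break
    return traj

def train(episodes=3000, lr_phi=0.05, lr_theta=0.1, seed=0):
    rng = random.Random(seed)
    phi = [0.0, 0.0]
    theta = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        # Discounted Monte Carlo returns, computed backwards through the episode.
        g, samples = 0.0, []
        for s, a, r in reversed(rollout(theta, rng)):
            g = r + GAMMA * g
            samples.append((s, a, g))
        # Advantages and action probabilities use the pre-update parameters.
        batch = [(s, a, g - value(phi, s), action_probs(theta, s))
                 for s, a, g in samples]
        for s, a, adv, probs in batch:
            # Value fit: gradient step on the squared error (g - U_phi(s))^2.
            for i, f in enumerate(features(s)):
                phi[i] += lr_phi * adv * f
            # Policy gradient: grad of log softmax is one_hot(a) - probs.
            for b in range(len(ACTIONS)):
                theta[s][b] += lr_theta * adv * ((1.0 if b == a else 0.0) - probs[b])
    return phi, theta
```

Calling `phi, theta = train()` returns the fitted value weights and policy logits; `action_probs(theta, s)` and `value(phi, s)` can then be queried for any state `s`. Subtracting \(U_{\phi}(s)\) from the return is a variance-reduction choice in this sketch, not something the note prescribes.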