We want to solve large POMDPs in the real world, but the belief space is enormous: a belief over \(n\) states is a point in an \((n-1)\)-dimensional simplex. The saving grace is that the set of beliefs actually reachable from a given initial belief is typically much smaller, lying near a low-dimensional surface of that simplex.
Why is vanilla PCA bad?
PCA can be read as a denoising procedure: it assumes the observed data are points on a low-dimensional subspace corrupted by Gaussian noise. That assumption does not hold here: beliefs are probability distributions (non-negative, summing to one, often sparse and sharply peaked), so the deviations from the low-dimensional structure are not Gaussian, and PCA reconstructions can go negative.
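To make the failure concrete, here is a minimal sketch, assuming made-up sparse beliefs drawn from a Dirichlet, showing that a rank-limited PCA reconstruction of peaked belief vectors can contain negative entries and so fails to be a valid belief:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 sparse, peaked "beliefs" over 50 states.
n_beliefs, n_states = 200, 50
B = rng.dirichlet(np.full(n_states, 0.05), size=n_beliefs)

# Vanilla PCA: SVD of the centered data, keep 3 components.
mean = B.mean(axis=0)
U, s, Vt = np.linalg.svd(B - mean, full_matrices=False)
k = 3
B_hat = mean + U[:, :k] * s[:k] @ Vt[:k]   # rank-k reconstruction

print("min reconstructed entry:", B_hat.min())             # typically < 0
print("fraction of negative entries:", (B_hat < 0).mean())
```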
Better PCA: E-PCA
Vanilla PCA finds a low-rank factorization of the data by minimizing the squared-error (Euclidean) loss
\begin{equation} L(U,V) = \| X - UV \|^{2} \end{equation}
where \(X\) is the matrix of sampled beliefs, \(U\) is the feature (basis) matrix, and \(V\) holds the low-dimensional coefficients.

E-PCA keeps the factorization but replaces the squared Euclidean loss with a generalized Bregman loss. Specifically, the divergence between a data value \(y\) and its reconstruction parameter \(z\) is
\begin{equation} D(y \,\|\, z) = F(z) - yz + F^{*}(y) \end{equation}
where \(F\) is any problem-specific convex function you choose and \(F^{*}\) is its convex dual. For beliefs, choosing \(F(z) = e^{z}\) makes the reconstruction \(\exp(UV)\); the exponential link forces the reconstructed beliefs to be non-negative, which the squared-error loss cannot guarantee.
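As a concrete sketch of minimizing this loss with the exponential link \(F(z) = e^{z}\): the total loss over the belief matrix is \(\sum_{ij} \exp((UV)_{ij}) - X_{ij}(UV)_{ij}\), which equals the summed Bregman divergence up to a constant in \(U, V\). The plain gradient descent below is my own choice for illustration, not the Newton-style updates used in the original E-PCA work, and all names are invented:

```python
import numpy as np

def epca_exponential(X, k, steps=3000, lr=1e-2, seed=0):
    """Fit X ~ exp(U @ V) by gradient descent on the E-PCA loss with
    F(z) = e^z. X: (n_states, n_beliefs), columns are sampled beliefs."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = 0.01 * rng.standard_normal((n, k))
    V = 0.01 * rng.standard_normal((k, m))
    for _ in range(steps):
        Z = U @ V
        G = np.exp(Z) - X             # gradient of the loss w.r.t. Z
        gU, gV = G @ V.T, U.T @ G     # chain rule through Z = U @ V
        U -= lr * gU                  # step size may need tuning
        V -= lr * gV
    return U, V

# Toy usage: compress 100 sampled beliefs over 30 states to 3 dimensions.
rng = np.random.default_rng(1)
X = rng.dirichlet(np.full(30, 0.1), size=100).T   # columns = beliefs
U, V = epca_exponential(X, k=3)
B_hat = np.exp(U @ V)                 # non-negative by construction
print("min reconstructed entry:", B_hat.min())    # always >= 0
```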
Overall Methods
- collect sample beliefs, e.g. by simulating the POMDP under a random or heuristic policy
- apply E-PCA to the sampled beliefs to get low-dimensional coordinates
- discretize the E-PCA'd beliefs into a new state space \(S\)
- recompute \(R\) (\(R(b) = b \cdot R(s)\), the expected immediate reward under the belief) and \(T\) (for each discretized \(b\) and action \(a\), we simply sample \(o\), calculate \(update(b,a,o)\), and snap the result to the nearest discretized belief) for that state space \(S\); congratulations, you are now solving an MDP
- run value iteration (a sketch of these last steps follows this list)
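A minimal end-to-end sketch of the discretize-and-solve steps, under invented assumptions: a hypothetical two-state POMDP (the \(T\), \(O\), \(R\) arrays below are made up), the belief used directly as the low-dimensional coordinate (with two states the belief is already one-dimensional, so the E-PCA step is skipped), nearest-neighbor discretization, Monte-Carlo transition estimates, and tabular value iteration:

```python
import numpy as np

rng = np.random.default_rng(2)
gamma = 0.95

# Hypothetical two-state, two-action, two-observation POMDP.
# T[a, s, s'] transitions, O[a, s', o] observations, R_s[s, a] rewards.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.5, 0.5]]])
R_s = np.array([[1.0, 0.0],
                [0.0, 1.0]])

def update(b, a, o):
    """Bayes filter: posterior belief after taking a and observing o."""
    b_new = O[a, :, o] * (b @ T[a])
    return b_new / b_new.sum()

# Discretize the belief coordinate b(s0) into a grid: this grid is the
# new state space S (standing in for discretized E-PCA coordinates).
grid = np.linspace(0.0, 1.0, 21)
beliefs = np.stack([grid, 1.0 - grid], axis=1)

def nearest(b):
    return int(np.argmin(np.abs(grid - b[0])))

n_S, n_A, n_samples = len(grid), 2, 500
R_tilde = beliefs @ R_s                       # R(b) = b . R(s)
T_tilde = np.zeros((n_S, n_A, n_S))
for i, b in enumerate(beliefs):
    for a in range(n_A):
        p_o = O[a].T @ (b @ T[a])             # P(o | b, a)
        for _ in range(n_samples):            # sample o, apply update(b, a, o)
            o = rng.choice(2, p=p_o / p_o.sum())
            T_tilde[i, a, nearest(update(b, a, o))] += 1
T_tilde /= n_samples

# Value iteration on the resulting belief-space MDP.
V = np.zeros(n_S)
for _ in range(500):
    V = (R_tilde + gamma * T_tilde @ V).max(axis=1)
print("V over the belief grid:", np.round(V, 2))
```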