Reinforcement Learning and POMDP
Posted by Betamoo on Stack Overflow, 2010-05-01
Tags: machine-learning | markov-models | reinforcement-learning | probability | neural-network
- I am trying to use a multi-layer NN to implement the transition probability function in a Partially Observable Markov Decision Process (POMDP).
- I thought the inputs to the NN would be: current state, selected action, result state. The output is a probability in [0,1] (the probability that performing the selected action in the current state leads to the result state).
- In training, I fed these inputs into the NN and taught it output = 1.0 for each case that had actually occurred.
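For concreteness, here is a minimal sketch of what I mean. The state/action counts, the one-hot encoding, the network shape, and the use of PyTorch are all illustrative assumptions, not my exact code:

```python
import torch
import torch.nn as nn

# Illustrative sizes -- my real problem has different dimensions.
N_STATES, N_ACTIONS = 10, 4
IN_DIM = 2 * N_STATES + N_ACTIONS  # one-hot (state, action, result state)

# Multi-layer NN whose sigmoid output is meant to be
# P(result state | current state, action), a value in [0, 1].
net = nn.Sequential(
    nn.Linear(IN_DIM, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),
)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.BCELoss()

def encode(s, a, s2):
    """Concatenated one-hot encoding of (state, action, result state)."""
    x = torch.zeros(IN_DIM)
    x[s] = 1.0
    x[N_STATES + a] = 1.0
    x[N_STATES + N_ACTIONS + s2] = 1.0
    return x

# Hypothetical logged transitions (s, a, s') from some episodes.
observed_transitions = [(0, 1, 3), (3, 0, 3), (3, 2, 7)]

# Training exactly as described: every transition that occurred
# gets the target 1.0 -- and nothing ever gets the target 0.0.
for s, a, s2 in observed_transitions:
    pred = net(encode(s, a, s2))
    loss = loss_fn(pred, torch.ones(1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```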
The problem:
For nearly all test cases the output probability is near 0.95; no output was below 0.9!
Even for nearly impossible result states, it gives that high a probability.
PS: I think this is because I taught it only the cases that happened, not the ones that did not. But I cannot, at each step of the episode, teach it output = 0.0 for every action that did not happen!
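The closest workaround I can think of is to sample just a few un-happened result states per observed transition as 0.0 targets, instead of enumerating them all. A sketch, reusing the assumed encoding and network from above (the sample size k is an arbitrary guess):

```python
import random

def training_examples(s, a, s2, k=3):
    """The observed transition with target 1.0, plus k randomly chosen
    result states that did NOT occur, with target 0.0. Sampling instead
    of enumerating keeps the cost per step small."""
    examples = [((s, a, s2), 1.0)]
    non_occurring = [t for t in range(N_STATES) if t != s2]
    for neg in random.sample(non_occurring, k):
        examples.append(((s, a, neg), 0.0))
    return examples

# Replaces the all-positive training loop above.
for s, a, s2 in observed_transitions:
    for (ss, aa, tt), target in training_examples(s, a, s2):
        pred = net(encode(ss, aa, tt))
        loss = loss_fn(pred, torch.tensor([target]))
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Whether a handful of sampled negatives is enough to pull the outputs down to calibrated probabilities is exactly what I am unsure about.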
Any suggestions on how to overcome this problem? Or maybe another way to use a NN, or another way to implement the probability function?
Thanks