Reinforcement learning and POMDP

Posted by Betamoo on Stack Overflow, 2010-05-01

  • I am trying to use a multi-layer NN to implement the transition probability function in a Partially Observable Markov Decision Process (POMDP).
  • The inputs to the NN are: current state, selected action, result state. The output is a probability in [0, 1]: the probability that performing the selected action in the current state will lead to the result state.
  • In training, I fed these inputs into the NN and taught it output = 1.0 for each case that actually occurred (see the sketch after this list).
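
For concreteness, here is a minimal sketch of the setup described above. It assumes discrete states and actions fed in as one-hot vectors, and uses PyTorch purely for illustration; the sizes (N_STATES, N_ACTIONS, HIDDEN) and all names are made up, since the post does not specify them.

```python
# A minimal sketch of the setup above, not the original code.
# Assumption: discrete states/actions, encoded one-hot; sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_STATES, N_ACTIONS, HIDDEN = 10, 4, 32  # made-up sizes

class TransitionNet(nn.Module):
    """Estimates P(result state | current state, action) as one sigmoid output."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * N_STATES + N_ACTIONS, HIDDEN),
            nn.Tanh(),
            nn.Linear(HIDDEN, 1),
            nn.Sigmoid(),  # output constrained to [0, 1]
        )

    def forward(self, s, a, s_next):
        # Concatenate one-hot encodings of (current state, action, result state)
        x = torch.cat([
            F.one_hot(s, N_STATES).float(),
            F.one_hot(a, N_ACTIONS).float(),
            F.one_hot(s_next, N_STATES).float(),
        ], dim=-1)
        return self.net(x).squeeze(-1)

# Example: estimated probability that action 2 in state 3 leads to state 7
model = TransitionNet()
p = model(torch.tensor([3]), torch.tensor([2]), torch.tensor([7]))
```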

The problem:
For nearly all test cases the output probability is near 0.95; no output was under 0.9! Even for nearly impossible result states, it gave that high a probability.

PS: I think this is because I taught it the cases that happened, but not the ones that did not happen. But I cannot, at each step of the episode, teach it output = 0.0 for every un-happened action!
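
To make that difficulty concrete: rather than labelling every un-happened (state, action, result) triple with 0.0, one common workaround (not something the post itself does) is to sample just a few random non-occurring result states per observed transition as negatives. A rough sketch, with all names and the k_negatives value illustrative:

```python
# Sketch of sampled negatives: a few random "un-happened" result states
# per observed transition, instead of enumerating all of them.
# Not from the original post; names and k_negatives are made up.
import random

def make_batch(observed, n_states, k_negatives=3):
    """observed: list of (s, a, s_next) transitions that actually occurred."""
    batch = []
    for s, a, s_next in observed:
        batch.append((s, a, s_next, 1.0))           # happened case -> target 1.0
        for _ in range(k_negatives):
            s_fake = random.randrange(n_states)     # random candidate result state
            if s_fake != s_next:
                batch.append((s, a, s_fake, 0.0))   # un-happened case -> target 0.0
    return batch
```

With mixed 0.0/1.0 targets, a constant output near 1.0 no longer minimises the training error, whereas all-positive data lets the network satisfy the loss with exactly that constant.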

Any suggestions on how to overcome this problem? Or maybe another way to use an NN, or another way to implement the probability function?

Thanks
