Update Rule in Temporal Difference

Posted by Betamoo on Stack Overflow
Published on 2010-05-28T12:45:08Z

Filed under: artificial-intelligence | machine-learning | markov-models | q-learning | temporal-difference

The update rule for TD(0) Q-learning:

Q(t-1) = (1-alpha) * Q(t-1) + alpha * (Reward(t-1) + gamma * Max(Q(t)))

where Max(Q(t)) is the maximum Q value obtainable in the next state.

Then take either the current best action (to optimize) or a random action (to explore).
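For reference, the one-step rule above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the table layout (a dict keyed by (state, action)), the function names q_update and choose_action, and the epsilon parameter for the random-action probability are all hypothetical and not part of the original post.

```python
import random
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply Q(t-1) = (1-alpha)*Q(t-1) + alpha*(Reward(t-1) + gamma*Max(Q(t))).

    Q is assumed to map (state, action) pairs to floats (hypothetical layout).
    """
    # Max(Q(t)): the largest Q value over all actions in the next state.
    max_next_q = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] = (1 - alpha) * Q[(state, action)] + alpha * (reward + gamma * max_next_q)
    return Q[(state, action)]


def choose_action(Q, state, actions, epsilon=0.1, rng=random):
    """Take a random action with probability epsilon, else the current best one."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    return max(actions, key=lambda a: Q[(state, a)])  # optimize


# Usage with made-up states and actions.
Q = defaultdict(float)
Q[("s1", "left")] = 2.0
new_value = q_update(Q, "s0", "right", reward=1.0, next_state="s1",
                     actions=["left", "right"], alpha=0.5, gamma=0.9)
greedy = choose_action(Q, "s1", ["left", "right"], epsilon=0.0)
```

Note that in this sketch the update and the action choice are separate functions: the update only consumes the reward that was observed, whichever way the action was picked.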

But for TD(1), I think the update rule would be:

Q(t-2) = (1-alpha) * Q(t-2) + (alpha) * (Reward(t-2) + gamma * Reward(t-1) + gamma * gamma * Max( Q(t) ) )
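The two-step rule as written above could be sketched like this. Again, this is a sketch under assumptions: the function name two_step_q_update, the dict-based table, and the argument names are hypothetical. It simply transcribes the formula, with Reward(t-2) and Reward(t-1) passed in as the rewards that were observed on those two steps.

```python
from collections import defaultdict


def two_step_q_update(Q, state, action, reward_1, reward_2, state_after_two,
                      actions, alpha=0.1, gamma=0.9):
    """Apply Q(t-2) = (1-alpha)*Q(t-2)
    + alpha*(Reward(t-2) + gamma*Reward(t-1) + gamma^2*Max(Q(t))).

    reward_1 stands for Reward(t-2), reward_2 for Reward(t-1); both are
    whatever rewards were observed, however the actions were chosen.
    """
    # Max(Q(t)): the largest Q value over all actions in the state reached at t.
    max_q = max(Q[(state_after_two, a)] for a in actions)
    target = reward_1 + gamma * reward_2 + gamma ** 2 * max_q
    Q[(state, action)] = (1 - alpha) * Q[(state, action)] + alpha * target
    return Q[(state, action)]


# Usage with made-up values.
Q2 = defaultdict(float)
Q2[("s2", "a")] = 1.0
updated = two_step_q_update(Q2, "s0", "a", reward_1=1.0, reward_2=2.0,
                            state_after_two="s2", actions=["a", "b"],
                            alpha=0.5, gamma=0.5)
```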

My question:
The term gamma * Reward(t-1) seems to mean that I will always take my best action at t-1, which I think will prevent exploring.
Can someone give me a hint?

Thanks

© Stack Overflow or respective owner
