Search Results

Search found 3771 results on 151 pages for 'social intelligence'.

Page 35/151 | < Previous Page | 31 32 33 34 35 36 37 38 39 40 41 42  | Next Page >

  • Static Evaluation Function for Checkers

    - by Kamikaze
    Hi all! I'm trying to write an evaluation function for a game of checkers that I'm developing but I can't find the right documentation. I've read several documents on the web witch describe different techniques for either writing one or letting the computer find it(using genetic algorithms or Bayesian learning) but they're too complicated for a novice like me. All documents pointed a reference to "Some studies in machine learning using the game of checkers" by A.L.Samuel but I couldn't get my hands on it yet :(. I've only read the follow up "Some studies in machine learning using the game of checkers -II" and found some good info there, but it doesn't explain what the eval parameters mean (I think I don't have the whole article). Thanks for your help!

    Read the article

  • Find optimal strategy and AI for the game 'Proximity'?

    - by smci
    'Proximity' is a strategy game of territorial domination similar to Othello, Go and Risk. Two players, uses a 10x12 hex grid. Game invented by Brian Cable in 2007. Seems to be a worthy game for discussing a) optimal algorithm then b) how to build an AI. Strategies are going to be probabilistic or heuristic-based, due to the randomness factor, and the insane branching factor (20^120). So it will be kind of hard to compare objectively. A compute time limit of 5s per turn seems reasonable. Game: Flash version here and many copies elsewhere on the web Rules: here Object: to have control of the most armies after all tiles have been placed. Each turn you received a randomly numbered tile (value between 1 and 20 armies) to place on any vacant board space. If this tile is adjacent to any ally tiles, it will strengthen each tile's defenses +1 (up to a max value of 20). If it is adjacent to any enemy tiles, it will take control over them if its number is higher than the number on the enemy tile. Thoughts on strategy: Here are some initial thoughts; setting the computer AI to Expert will probably teach a lot: minimizing your perimeter seems to be a good strategy, to prevent flips and minimize worst-case damage like in Go, leaving holes inside your formation is lethal, only more so with the hex grid because you can lose armies on up to 6 squares in one move low-numbered tiles are a liability, so place them away from your main territory, near the board edges and scattered. You can also use low-numbered tiles to plug holes in your formation, or make small gains along the perimeter which the opponent will not tend to bother attacking. a triangle formation of three pieces is strong since they mutually reinforce, and also reduce the perimeter Each tile can be flipped at most 6 times, i.e. when its neighbor tiles are occupied. Control of a formation can flow back and forth. Sometimes you lose part of a formation and plug any holes to render that part of the board 'dead' and lock in your territory/ prevent further losses. Low-numbered tiles are obvious-but-low-valued liabilities, but high-numbered tiles can be bigger liabilities if they get flipped (which is harder). One lucky play with a 20-army tile can cause a swing of 200 (from +100 to -100 armies). So tile placement will have both offensive and defensive considerations. Comment 1,2,4 seem to resemble a minimax strategy where we minimize the maximum expected possible loss (modified by some probabilistic consideration of the value ß the opponent can get from 1..20 i.e. a structure which can only be flipped by a ß=20 tile is 'nearly impregnable'.) I'm not clear what the implications of comments 3,5,6 are for optimal strategy. Interested in comments from Go, Chess or Othello players. (The sequel ProximityHD for XBox Live, allows 4-player -cooperative or -competitive local multiplayer increases the branching factor since you now have 5 tiles in your hand at any given time, of which you can only play one. Reinforcement of ally tiles is increased to +2 per ally.)

    Read the article

  • Applications for the Church Programming Language

    - by Chris S
    Has anyone worked with the programming language Church? Can anyone recommend practical applications? I just discovered it, and while it sounds like it addresses some long-standing problems in AI and machine-learning, I'm sceptical. I had never heard of it, and was surprised to find it's actually been around for a few years, having been announced in the paper Church: a language for generative models.

    Read the article

  • Implementing crossover in genetic programming

    - by Name
    Hi, I'm writing a genetic programming (GP) system (in C but that's a minor detail). I've read a lot of the literature (Koza, Poli, Langdon, Banzhaf, Brameier, et al) but there are some implementation details I've never seen explained. For example: I'm using a steady state population rather than a generational approach, primarily to use all of the computer's memory rather than reserve half for the interim population. Q1. In GP, as opposed to GA, when you perform crossover you select two parents but do you create one child or two, or is that a free choice you have? Q2. In steady state GP, as opposed to a generational system, what members of the population do the children created by crossover replace? This is what I haven't seen discussed. Is it the two parents, or is it two other, randomly-selected members? I can understand if it's the latter, and that you might use negative tournament selection to choose members to replace, but would that not create premature convergence? (After a crossover event the population contains the two original parents plus two children of those parents, and two other random members get removed. Elitism is inherent.) Q3. Is there a Web forum or mailing list focused on GP? Oddly I haven't found one. Yahoo's GP group is used almost exclusively for announcements, the Poli/Langdon Field Guide forum is almost silent, and GP discussions on general/game programming sites like gamedev.net are very basic. Thanks for any help you can provide!

    Read the article

  • Rush Hour - Solving the game

    - by Rubys
    Rush Hour if you're not familiar with it, the game consists of a collection of cars of varying sizes, set either horizontally or vertically, on a NxM grid that has a single exit. Each car can move forward/backward in the directions it's set in, as long as another car is not blocking it. You can never change the direction of a car. There is one special car, usually it's the red one. It's set in the same row that the exit is in, and the objective of the game is to find a series of moves (a move - moving a car N steps back or forward) that will allow the red car to drive out of the maze. I've been trying to think how to solve this problem computationally, and I can really not think of any good solution. I came up with a few: Backtracking. This is pretty simple - Recursion and some more recursion until you find the answer. However, each car can be moved a few different ways, and in each game state a few cars can be moved, and the resulting game tree will be HUGE. Some sort of constraint algorithm that will take into account what needs to be moved, and work recursively somehow. This is a very rough idea, but it is an idea. Graphs? Model the game states as a graph and apply some sort of variation on a coloring algorithm, to resolve dependencies? Again, this is a very rough idea. A friend suggested genetic algorithms. This is sort of possible but not easily. I can't think of a good way to make an evaluation function, and without that we've got nothing. So the question is - How to create a program that takes a grid and the vehicle layout, and outputs a series of steps needed to get the red car out? Sub-issues: Finding some solution. Finding an optimal solution (minimal number of moves) Evaluating how good a current state is Example: How can you move the cars in this setting, so that the red car can "exit" the maze through the exit on the right?

    Read the article

  • Good implementations of reinforced learning?

    - by Paperino
    For an ai-class project I need to implement a reinforcement learning algorithm which beats a simple game of tetris. The game is written in Java and we have the source code. I know the basics of reinforcement learning theory but was wondering if anyone in the SO community had hands on experience with this type of thing. What would your recommended readings be for an implementation of reinforced learning in a tetris game? Are there any good open source projects that accomplish similar things that would be worth checking out? Thanks in advanced Edit: The more specific the better, but general resources about the subject are welcomed. Follow up: Thought it would be nice if I posted a followup. Here's the solution (code and writeup) I ended up with for any future students :). Paper / Code

    Read the article

  • Genetic Programming Online Learning

    - by Lirik
    Has anybody seen a GP implemented with online learning rather than the standard offline learning? I've done some stuff with genetic programs and I simply can't figure out what would be a good way to make the learning process online. Please let me know if you have any ideas, seen any implementations, or have any references that I can look at.

    Read the article

  • Neural Network problems

    - by Betamoo
    I am using an external library for Artificial Neural Networks in my project.. While testing the ANN, It gave me output of all NaN (not a number in C#) The ANN has 8 input , 5 hidden , 5 hidden , 2 output, and all activation layers are of Linear type , and it uses back-propagation, with learning rate 0.65 I used one testcase for training { -2.2, 1.3, 0.4, 0.5, 0.1, 5, 3, -5 } ,{ -0.3, 0.2 } for 1000 epoch And I tested it on { 0.2, -0.2, 5.3, 0.4, 0.5, 0, 35, 0.0 } which gave { NaN , NaN} Note: this is one example of many that produces same case... I am trying to discover whether it is a bug in the library, or an illogical configuration.. The reasons I could think of for illogical configuration: All layers should not be linear Can not have descending size layers, i.e 8-5-5-2 is bad.. Only one testcase ? Values must be in range [0,1] or [-1,1] Is any of the above reasons could be the cause of error, or there are some constraints/rules that I do not know in ANN designing..? Note: I am newbie in ANN

    Read the article

  • Is the board game "Go" NP complete?

    - by sharkin
    There are plenty of Chess AI's around, and evidently some are good enough to beat some of the world's greatest players. I've heard that many attempts have been made to write successful AI's for the board game Go, but so far nothing has been conceived beyond average amateur level. Could it be that the task of mathematically calculating the optimal move at any given time in Go is an NP-complete problem?

    Read the article

  • What algorithms are suitable for this simple machine learning problem?

    - by user213060
    I have a what I think is a simple machine learning question. Here is the basic problem: I am repeatedly given a new object and a list of descriptions about the object. For example: new_object: 'bob' new_object_descriptions: ['tall','old','funny']. I then have to use some kind of machine learning to find previously handled objects that had similar descriptions, for example, past_similar_objects: ['frank','steve','joe']. Next, I have an algorithm that can directly measure whether these objects are indeed similar to bob, for example, correct_objects: ['steve','joe']. The classifier is then given this feedback training of successful matches. Then this loop repeats with a new object. a Here's the pseudo-code: Classifier=new_classifier() while True: new_object,new_object_descriptions = get_new_object_and_descriptions() past_similar_objects = Classifier.classify(new_object,new_object_descriptions) correct_objects = calc_successful_matches(new_object,past_similar_objects) Classifier.train_successful_matches(object,correct_objects) But, there are some stipulations that may limit what classifier can be used: There will be millions of objects put into this classifier so classification and training needs to scale well to millions of object types and still be fast. I believe this disqualifies something like a spam classifier that is optimal for just two types: spam or not spam. (Update: I could probably narrow this to thousands of objects instead of millions, if that is a problem.) Again, I prefer speed when millions of objects are being classified, over accuracy. What are decent, fast machine learning algorithms for this purpose?

    Read the article

  • Programming Technique: How to create a simple card game

    - by Shyam
    Hi, As I am learning the Ruby language, I am getting closer to actual programming. So I was thinking of creating a simple card game. My question isn't Ruby orientated, but I do know want to learn how to solve this problem with a genuine OOP approach. In my card game I want to have four players. Using a standard deck with 52 cards, no jokers/wildcards. In the game I won't use the Ace as a dual card, it is always the highest card. So, the programming problems I wonder about are the following: How can I sort/randomize the deck of cards? There are four types, each having 13 values. Eventually there can be only unique values, so picking random values could generate duplicates. How can I implement a simple AI? As there are tons of card games, someone would have figured this part out already, so references would be great. I am a truly Ruby nuby, and my goal here is to learn to solve problems, so pseudo code would be great, just to understand how to solve the problem programmatically. I apologize for my grammar and writing style if it's unclear, for it is not my native language. Also pointers to sites where such challenges are explained, would be a great resource! Thank you for your comments, answers and feedback!

    Read the article

  • Higher-order unification

    - by rwallace
    I'm working on a higher-order theorem prover, of which unification seems to be the most difficult subproblem. If Huet's algorithm is still considered state-of-the-art, does anyone have any links to explanations of it that are written to be understood by a programmer rather than a mathematician? Or even any examples of where it works and the usual first-order algorithm doesn't?

    Read the article

  • How do I find the most “Naturally" direct route using A-star (A*)

    - by Greg B
    I have implemented the A* algorithm in AS3 and it works great except for one thing. Often the resulting path does not take the most “natural” or smooth route to the target. In my environment the object can move diagonally as inexpensively as it can move horizontally or vertically. Here is a very simple example; the start point is marked by the S, and the end (or finish) point by the F. | | | | | | | | | | |S| | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | |F| | | | | | | | | | | | | | | | | | | | | | | | | | | | | As you can see, during the 1st round of finding, nodes [0,2], [1,2], [2,2] will all be added to the list of possible node as they all have a score of N. The issue I’m having comes at the next point when I’m trying to decide which node to proceed with. In the example above I am using possibleNodes[0] to choose the next node. If I change this to possibleNodes[possibleNodes.length-1] I get the following path. | | | | | | | | | | |S| | | | | | | | | | |x| | | | | | | | | | |x| | | | | | | | | | |x| | | | | | | | |x| | | | | | | | |x| | | | | | | | |F| | | | | | | | | | | | | | | | | | | | | | | | | | | | | And then with possibleNextNodes[Math.round(possibleNextNodes.length / 2)-1] | | | | | | | | | | |S| | | | | | | | | |x| | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | x| | | | | | | | | | |F| | | | | | | | | | | | | | | | | | | | | | | | | | | | | All these paths have the same cost as they all contain the same number of steps but, in this situation, the most sensible path would be as follows... | | | | | | | | | | |S| | | | | | | | | |x| | | | | | | | | |x| | | | | | | | | |x| | | | | | | | | |x| | | | | | | | | |x| | | | | | | | | |F| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Is there a formally accepted method of making the path appear more sensible rather than just mathematically correct?

    Read the article

  • Find optimal/good-enough strategy and AI for the game 'Proximity'?

    - by smci
    'Proximity' is a strategy game of territorial domination similar to Othello, Go and Risk. Two players, uses a 10x12 hex grid. Game invented by Brian Cable in 2007. Seems to be a worthy game for discussing a) optimal algorithm then b) how to build an AI. Strategies are going to be probabilistic or heuristic-based, due to the randomness factor, and the insane branching factor (20^120). So it will be kind of hard to compare objectively. A compute time limit of 5s per turn seems reasonable. Game: Flash version here and many copies elsewhere on the web Rules: here Object: to have control of the most armies after all tiles have been placed. Each turn you received a randomly numbered tile (value between 1 and 20 armies) to place on any vacant board space. If this tile is adjacent to any ally tiles, it will strengthen each tile's defenses +1 (up to a max value of 20). If it is adjacent to any enemy tiles, it will take control over them if its number is higher than the number on the enemy tile. Thoughts on strategy: Here are some initial thoughts; setting the computer AI to Expert will probably teach a lot: minimizing your perimeter seems to be a good strategy, to prevent flips and minimize worst-case damage like in Go, leaving holes inside your formation is lethal, only more so with the hex grid because you can lose armies on up to 6 squares in one move low-numbered tiles are a liability, so place them away from your main territory, near the board edges and scattered. You can also use low-numbered tiles to plug holes in your formation, or make small gains along the perimeter which the opponent will not tend to bother attacking. a triangle formation of three pieces is strong since they mutually reinforce, and also reduce the perimeter Each tile can be flipped at most 6 times, i.e. when its neighbor tiles are occupied. Control of a formation can flow back and forth. Sometimes you lose part of a formation and plug any holes to render that part of the board 'dead' and lock in your territory/ prevent further losses. Low-numbered tiles are obvious-but-low-valued liabilities, but high-numbered tiles can be bigger liabilities if they get flipped (which is harder). One lucky play with a 20-army tile can cause a swing of 200 (from +100 to -100 armies). So tile placement will have both offensive and defensive considerations. Comment 1,2,4 seem to resemble a minimax strategy where we minimize the maximum expected possible loss (modified by some probabilistic consideration of the value ß the opponent can get from 1..20 i.e. a structure which can only be flipped by a ß=20 tile is 'nearly impregnable'.) I'm not clear what the implications of comments 3,5,6 are for optimal strategy. Interested in comments from Go, Chess or Othello players. (The sequel ProximityHD for XBox Live, allows 4-player -cooperative or -competitive local multiplayer increases the branching factor since you now have 5 tiles in your hand at any given time, of which you can only play one. Reinforcement of ally tiles is increased to +2 per ally.)

    Read the article

  • Designing bayesian networks

    - by devoured elysium
    I have a basic question about Bayesian networks. Let's assume we have an engine, that with 1/3 probability can stop working. I'll call this variable ENGINE. If it stops working, then your car doesn't work. If the engine is working, then your car will work 99% of the time. I'll call this one CAR. Now, if your car is old(OLD), instead of not working 1/3 of the time, your engine will stop working 1/2 of the time. I'm being asked to first design the network and then assign all the conditional probabilities associated with the table. I'd say the diagram of this network would be something like OLD -> ENGINE -> CAR Now, for the conditional probabilities tables I did the following: OLD |ENGINE ------------ True | 0.50 False | 0.33 and ENGINE|CAR ------------ True | 0.99 False | 0.00 Now, I am having trouble about how to define the probabilities of OLD. In my point of view, old is not something that has a CAUSE relationship with ENGINE, I'd say it is more a characteristic of it. Maybe there is a different way to express this in the diagram? If the diagram is indeed correct, how would I go to make the tables? Thanks

    Read the article

  • Sharepoint BDC Error: The title property of entity tblStaff is set to an invalid value

    - by Christopher Rathermel
    I am just starting to create our Business Data Catalog(s) for our practice management system and I am running into an issue w/ our staff table. Background: I am using Business Data Catalog Definition Editor to create my ADF. I am using the RevertToSelf Authentication Mode. I have tried a few other tables and they seem to work just fine thus far.. only issue is w/ the staff table. If I removed all the columns for the staff entity except the ID and a few columns for the name it actually works. So it has a problem w/ one of my columns in tblStaff. I receive this error even when I set up an ADF w/ just this one entity. So w/ no associations.. When attempting to view the record: http://servername/ssp/admin/Content/tblstaff.aspx?StaffID={0} w/ {0} replaced w/ an actual staff ID I get the following error: The title property of entity tblStaff is set to an invalid value. Things I have tried: I noticed that I do have a column in my staff table called "Title" and removed it from ADF w/ no luck... Same error.. I tried to use bdc meta man to create my ADF and I got the same error... Any ideas? Chris

    Read the article

  • chess AI for GAE

    - by Richard
    I am looking for a Chess AI that can be run on Google App Engine. Most chess AI's seem to be written in C and so can not be run on the GAE. It needs to be strong enough to beat a casual player, but efficient enough that it can calculate a move within a single request (less than 10 secs). Ideally it would be written in Python for easier integration with existing code. I came across a few promising projects but they don't look mature: http://code.google.com/p/chess-free http://mariobalibrera.com/mics/ai.html

    Read the article

  • How to find minimum cut-sets for several subgraphs of a graph of degrees 2 to 4

    - by Tore
    I have a problem, Im trying to make A* searches through a grid based game like pacman or sokoban, but i need to find "enclosures". What do i mean by enclosures? subgraphs with as few cut edges as possible given a maximum size and minimum size for number of vertices that act as soft constraints. Alternatively you could say i am looking to find bridges between subgraphs, but its generally the same problem. Given a game that looks like this, what i want to do is find enclosures so that i can properly find entrances to them and thus get a good heuristic for reaching vertices inside these enclosures. So what i want is to find these colored regions on any given map. The reason for me bothering to do this and not just staying content with the performance of a simple manhattan distance heuristic is that an enclosure heuristic can give more optimal results and i would not have to actually do the A* to get some proper distance calculations and also for later adding competitive blocking of opponents within these enclosures when playing sokoban type games. Also the enclosure heuristic can be used for a minimax approach to finding goal vertices more properly. Do you know of a good algorithm for solving this problem or have any suggestions in things i should explore?

    Read the article

  • Help with Neuroph neural network

    - by user359708
    For my graduate research I am creating a neural network that trains to recognize images. I am going much more complex than just taking a grid of RGB values, downsampling, and and sending them to the input of the network, like many examples do. I actually use over 100 independently trained neural networks that detect features, such as lines, shading patterns, etc. Much more like the human eye, and it works really well so far! The problem is I have quite a bit of training data. I show it over 100 examples of what a car looks like. Then 100 examples of what a person looks like. Then over 100 of what a dog looks like, etc. This is quite a bit of training data! Currently I am running at about one week to train the network. This is kind of killing my progress, as I need to adjust and retrain. I am using Neuroph, as the low-level neural network API. I am running a dual-quadcore machine(16 cores with hyperthreading), so this should be fast. My processor percent is at only 5%. Are there any tricks on Neuroph performance? Or Java peroformance in general? Suggestions? I am a cognitive psych doctoral student, and I am decent as a programmer, but do not know a great deal about performance programming.

    Read the article

  • how to work with strings and integers as bit strings in python?

    - by Manuel
    Hello! I'm developing a Genetic Algorithm in python were chromosomes are composed of strings and integers. To apply the genetic operations, I want to convert these groups of integers and strings into bit strings. For example, if one chromosome is: ["Hello", 4, "anotherString"] I'd like it to become something like: 0100100100101001010011110011 (this is not actual translation). So... How can I do this? Chromosomes will contain the same amount of strings and integers, but this numbers can vary from one algorithm run to another. To be clear, what I want to obtain is the bit representation of each element in the chromosome concatenated. If you think this would not be the best way to apply genetic operators (such as mutation and simple crossover) just tell me! I'm open to new ideas. Thanks a lot! Manuel

    Read the article

  • A* (A-star) implementation in AS3

    - by Bryan Hare
    Hey, I am putting together a project for a class that requires me to put AI in a top down Tactical Strategy game in Flash AS3. I decided that I would use a node based path finding approach because the game is based on a circular movement scheme. When a player moves a unit he essentially draws a series of line segments that connect that a player unit will follow along. I am trying to put together a similar operation for the AI units in our game by creating a list of nodes to traverse to a target node. Hence my use of Astar (the resulting path can be used to create this line). Here is my Algorithm function findShortestPath (startN:node, goalN:node) { var openSet:Array = new Array(); var closedSet:Array = new Array(); var pathFound:Boolean = false; startN.g_score = 0; startN.h_score = distFunction(startN,goalN); startN.f_score = startN.h_score; startN.fromNode = null; openSet.push (startN); var i:int = 0 for(i= 0; i< nodeArray.length; i++) { for(var j:int =0; j<nodeArray[0].length; j++) { if(!nodeArray[i][j].isPathable) { closedSet.push(nodeArray[i][j]); } } } while (openSet.length != 0) { var cNode:node = openSet.shift(); if (cNode == goalN) { resolvePath (cNode); return true; } closedSet.push (cNode); for (i= 0; i < cNode.dirArray.length; i++) { var neighborNode:node = cNode.nodeArray[cNode.dirArray[i]]; if (!(closedSet.indexOf(neighborNode) == -1)) { continue; } neighborNode.fromNode = cNode; var tenativeg_score:Number = cNode.gscore + distFunction(neighborNode.fromNode,neighborNode); if (openSet.indexOf(neighborNode) == -1) { neighborNode.g_score = neighborNode.fromNode.g_score + distFunction(neighborNode,cNode); if (cNode.dirArray[i] >= 4) { neighborNode.g_score -= 4; } neighborNode.h_score=distFunction(neighborNode,goalN); neighborNode.f_score=neighborNode.g_score+neighborNode.h_score; insertIntoPQ (neighborNode, openSet); //trace(" F Score of neighbor: " + neighborNode.f_score + " H score of Neighbor: " + neighborNode.h_score + " G_score or neighbor: " +neighborNode.g_score); } else if (tenativeg_score <= neighborNode.g_score) { neighborNode.fromNode=cNode; neighborNode.g_score=cNode.g_score+distFunction(neighborNode,cNode); if (cNode.dirArray[i]>=4) { neighborNode.g_score-=4; } neighborNode.f_score=neighborNode.g_score+neighborNode.h_score; openSet.splice (openSet.indexOf(neighborNode),1); //trace(" F Score of neighbor: " + neighborNode.f_score + " H score of Neighbor: " + neighborNode.h_score + " G_score or neighbor: " +neighborNode.g_score); insertIntoPQ (neighborNode, openSet); } } } trace ("fail"); return false; } Right now this function creates paths that are often not optimal or wholly inaccurate given the target and this generally happens when I have nodes that are not path able, and I am not quite sure what I am doing wrong right now. If someone could help me correct this I would appreciate it greatly. Some Notes My OpenSet is essentially a Priority Queue, so thats how I sort my nodes by cost. Here is that function function insertIntoPQ (iNode:node, pq:Array) { var inserted:Boolean=true; var iterater:int=0; while (inserted) { if (iterater==pq.length) { pq.push (iNode); inserted=false; } else if (pq[iterater].f_score >= iNode.f_score) { pq.splice (iterater,0,iNode); inserted=false; } ++iterater; } } Thanks!

    Read the article

  • Update Rule in Temporal difference

    - by Betamoo
    The update rule TD(0) Q-Learning: Q(t-1) = (1-alpha) * Q(t-1) + (alpha) * (Reward(t-1) + gamma* Max( Q(t) ) ) Then take either the current best action (to optimize) or a random action (to explorer) Where MaxNextQ is the maximum Q that can be got in the next state... But in TD(1) I think update rule will be: Q(t-2) = (1-alpha) * Q(t-2) + (alpha) * (Reward(t-2) + gamma * Reward(t-1) + gamma * gamma * Max( Q(t) ) ) My question: The term gamma * Reward(t-1) means that I will always take my best action at t-1 .. which I think will prevent exploring.. Can someone give me a hint? Thanks

    Read the article

  • Entropy using Decision Tree's

    - by Matt Clements
    Train a decision tree on the data represented by attributes A1, A2, A3 and outcome C described below: A1 A2 A3 C 1 0 1 0 0 1 1 1 0 0 1 0 For log2(1/3) = 1.6 and log2(2/3) = 0.6, answer the following questions: a) What is the value of entropy H for the given set of training example? b) What is the portion of the positive samples split by attribute A2? c) What is the value of information gain, G(A2), of attribute A2? d) What is IFTHEN rule(s) for the decision tree?

    Read the article

  • Pong: How does the paddle know where the ball will hit?

    - by Roflcoptr
    After implementing Pacman and Snake I'm implementing the next very very classic game: Pong. The implementation is really simple, but I just have one little problem remaining. When one of the paddle (I'm not sure if it is called paddle) is controlled by the computer, I have trouble to position it at the correct position. The ball has a current position, a speed (which for now is constant) and a direction angle. So I could calculate the position where it will hit the side of the computer controlled paddle. And so Icould position the paddle right there. But however in the real game, there is a probability that the computer's paddle will miss the ball. How can I implement this probability? If I only use a probability of lets say 0.5 that the computer's paddle will hit the ball, the problem is solved, but I think it isn't that simple. From the original game I think the probability depends on the distance between the current paddle position and the position the ball will hit the border. Does anybody have any hints how exactly this is calculated?

    Read the article

< Previous Page | 31 32 33 34 35 36 37 38 39 40 41 42  | Next Page >