
DPRG: Sbasic , gambling , and probability

Subject: DPRG: Sbasic , gambling , and probability
From: Corey Hansen nzhansen at ihug.co.nz
Date: Sun Jan 11 14:46:57 CST 1998

>Interesting. There are a couple of ways you could do this, and I could
>make it one degree of difficulty harder by also considering what
>movement it just completed as another input.  Or even two prior
>movements may be additional inputs.

I see. So the robot remembers its earlier moves in case, say, the first
move goes well but the next move leaves it stuck with all sensors on.

>For example, if you look at the sensor inputs, you would choose a
>movement based on the pattern. Say it says go forward. It goes forward
>and hits something, which based on the sensor inputs, the response is to
>go backwards, where it then ends up in the same position as it was in
>when the sensors told it to go forward. Now it is stuck, forward, back,
>forward, back... Adding previous two movements to the input is extra
>information to decide not to go forward again and repeat the cycle.
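A minimal sketch of the "previous two movements as extra input" idea, in Python for illustration only. The class name MoveHistory, the "none" placeholder, and the move names are all hypothetical; the thread does not say how the history would be stored.

```python
from collections import deque

class MoveHistory:
    """Remember the last few moves so they can be part of the lookup key."""

    def __init__(self, depth=2, initial="none"):
        # A bounded deque drops the oldest move automatically.
        self.moves = deque([initial] * depth, maxlen=depth)

    def record(self, move):
        self.moves.append(move)

    def state_key(self, sensors):
        # The key combines the sensor pattern with the remembered moves,
        # so "forward, back" with the same sensors is a different state
        # from a fresh start and can get its own scores.
        return (sensors, *self.moves)

history = MoveHistory()
history.record("forward")
history.record("back")
print(history.state_key(0b00000001))
```

With a key like this, the forward/back loop shows up as its own entry, so the robot can learn a different response for it than for the plain sensor pattern.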

That's where the gambling and probability effect comes in. If the robot
goes forward, it remembers its last move. When the sensor inputs change,
it looks at those. If the number is higher than before, the move is
considered bad. The robot then degrades the forward move for the previous
sensor input and upgrades all the other moves, or vice versa if the move
was good. Finally, the
robot looks at the current sensor input and goes to that table of motor variables.
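The degrade/upgrade step could be sketched roughly like this, in Python for illustration. The function name update_weights, the step size of 1, and leaving the weights alone when the sensor number is unchanged are assumptions; only the "higher number means bad move" rule and the 8-bit limit come from this thread.

```python
def update_weights(weights, move, prev_reading, new_reading, step=1):
    """Adjust, in place, the motor weights stored for the PREVIOUS sensor input.

    weights      -- list of motor weights (0..255) for the previous sensor input
    move         -- index of the move that was just made
    prev_reading -- sensor number before the move
    new_reading  -- sensor number after the move
    """
    if new_reading == prev_reading:
        return weights  # assumption: an unchanged reading teaches nothing
    bad = new_reading > prev_reading  # higher sensor number = bad move
    for i in range(len(weights)):
        if i == move:
            delta = -step if bad else step   # degrade or upgrade the move made
        else:
            delta = step if bad else -step   # do the opposite to all the others
        # Clamp so the value still fits in an 8-bit variable.
        weights[i] = max(0, min(255, weights[i] + delta))
    return weights

weights = [128] * 7
update_weights(weights, move=0, prev_reading=0b00000000, new_reading=0b00000011)
print(weights)
```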

The gambling and probability are very important to the whole robot.
When it starts, all of the values in the 256 sensor input variables are set
to 128, because I'll limit them to 8-bit variables. If the robot has no
sensor inputs, the binary number representing this would be 00000000, so
%00000000 is first of the 256 different sensor input variables. 0 will
probably be the label for that. The program goes to 0 to execute the code.
In that space would be the 7 motor variables for THAT sensor input. All of
them have the value of 128. The robot needs to do a gambling-type effect to
choose which motor output to do. It also needs to take into account the
values of those variables.
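As a rough picture of that layout, here is a sketch in Python rather than SBasic. The constant and function names are hypothetical, and it assumes the 256 entries come from 8 on/off sensors read as one byte.

```python
NUM_SENSOR_INPUTS = 256   # one entry per 8-bit sensor pattern, %00000000 first
NUM_MOTOR_MOVES = 7       # the 7 motor variables held for each sensor input
INITIAL_VALUE = 128       # starting value; fits in an 8-bit variable (0..255)

def make_table():
    """Build a 256 x 7 table with every motor variable set to 128."""
    # A fresh inner list per row, so changing one row never touches another.
    return [[INITIAL_VALUE] * NUM_MOTOR_MOVES for _ in range(NUM_SENSOR_INPUTS)]

table = make_table()
no_sensors = 0b00000000        # no sensor inputs -> entry 0
print(table[no_sensors])       # the 7 motor variables for that sensor input
```

On a small microcontroller this is 256 x 7 = 1792 bytes if each value takes one byte, which is worth checking against the available RAM.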

Imagine this: You're at the races. Plilly-Pinhead has 20/1 on her,
Jacky-Jub has 3/1, and Billy-Boo has 100/1. Who will probably win?
Jacky-Jub. But what if someone else wins? He'd get better odds on himself
and the others would get worse odds.

This would apply well to my robot, just using the processor and not race
scores sent via satellite.

>Where I think you are getting at, is to have a list of possible
>movements with a score for each movement for each set of inputs.
>For example, if the sensors report the way is clear, then the score for
>go forward is say 10 and the score for turn right is 7 and the score for
>turn left is 7.  In the case of take the best choice you would always go
>forward (because 10 is the highest number). If you wanted to add some
>randomness to it, you would generate a uniform random number between 1
>and the sum of all the scores for the movement, (in this simple case 10
>+ 7 + 7 = 24).  If the random number is between 1 and 10 you go forward,
>if it is between 11 and 17 you go right and if it is between 18 and 24
>you go left.

That would work.
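The score-weighted random choice described in the quoted text can be sketched as follows, in Python for illustration. The function names move_for_roll and choose_move are hypothetical; the scores 10, 7, 7 and the ranges 1-10, 11-17, 18-24 are the ones from the example.

```python
import random

def move_for_roll(scores, roll):
    """Map a roll in 1..sum(scores) to a move, walking the score ranges in order."""
    upper = 0
    for move, score in scores.items():
        upper += score          # upper end of this move's range
        if roll <= upper:
            return move
    raise ValueError("roll is larger than the sum of the scores")

def choose_move(scores):
    """Pick a move at random, with probability proportional to its score."""
    total = sum(scores.values())
    if total < 1:
        raise ValueError("at least one score must be positive")
    return move_for_roll(scores, random.randint(1, total))  # 1..total inclusive

scores = {"forward": 10, "right": 7, "left": 7}   # sums to 24
print(move_for_roll(scores, 10))   # forward (rolls 1-10)
print(move_for_roll(scores, 11))   # right   (rolls 11-17)
print(move_for_roll(scores, 18))   # left    (rolls 18-24)
print(choose_move(scores))         # random, forward most likely
```

A move whose score has dropped to 0 gets an empty range and is never picked, so scores may need a floor above 0 if every move should stay possible.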

>You should preload the scores with what you would guess they should be,
>or preload them with random numbers or preload them with the same
>score.  If the movement was successful you could add one to the score
>for that response for that set of inputs.  If the movement was not
>successful you could subtract one from the score for that set of inputs.

I like uploading my own numbers.......

>The hard part is determining what is successful movement or not.  I
>would try at a first attempt to base it on the  ability to move for a
>certain period of time successfully or not?  (Say 1 or 2 seconds, you
>may have to play with it to see what makes sense).

I'll try that........
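One possible reading of "able to move for a certain period of time" is sketched below in Python. Treating any triggered sensor as a failure, the 1.5 second default, and the names move_succeeded and sensors_triggered are all assumptions to be tuned on the real robot.

```python
import time

def move_succeeded(sensors_triggered, duration=1.5, poll=0.05):
    """Return True if no sensor triggers for `duration` seconds after a move.

    sensors_triggered -- hypothetical callable returning True when any
                         bump sensor is currently on
    """
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        if sensors_triggered():
            return False        # hit something before the time was up
        time.sleep(poll)        # wait a little before checking again
    return True                 # moved freely for the whole period

# Example with a stand-in sensor that never triggers.
print(move_succeeded(lambda: False, duration=0.1))
```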

>Maybe you or someone
>else has a different idea of what makes a successful or not successful
>movement.
>This feedback of what is good or what is not good is important in any
>learning system. The system can only learn if there is a decision as to
>what constitutes a good/bad (pain/pleasure) move.

