My own solution to the CartPole challenge. I use only the pole angle, nothing else, and I keep a short history of it (the memory capacity). With a memory capacity of 1 it does not work; with 2 I need luck; 3 and more look good.

I had already solved it with all 4 parameters, but that did not match my personal experience of balancing a pole on my finger. The only thing I observe here is the angle.

    import gym
    import random
    import numpy as np

    class HillClimbingAgent():
        def __init__(self, env):
            self.action_size = env.action_space.n
            self.state = [0, 0, 0]  # memory capacity
            self.input_size = len(self.state)
            self.W = 1e-4 * np.random.rand(self.action_size, self.input_size)
            self.best_W = np.copy(self.W)
            self.best = -np.inf
            self.noise_scale = 1e-5

        def _append_to_state(self, state):
            n = len(self.state)
            for i in range(0, n-1):
                self.state[n-i-1] = self.state[n-i-2...
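The angle-history idea can also be tried on its own, without the gym environment. The following is only a minimal sketch under my own assumptions, not the agent from the listing: the names `AngleHistory` and `choose_action` are hypothetical, the history is kept newest-first in a `deque`, and the weight matrix is hand-picked for illustration rather than learned by hill climbing.

```python
from collections import deque

import numpy as np


class AngleHistory:
    """Fixed-length memory of the most recent pole angles, newest first."""

    def __init__(self, capacity=3):
        # Start with zeros so the policy always sees `capacity` inputs.
        self.buffer = deque([0.0] * capacity, maxlen=capacity)

    def push(self, angle):
        # With maxlen set, appendleft discards the oldest value on the right.
        self.buffer.appendleft(float(angle))

    def as_array(self):
        return np.array(self.buffer)


def choose_action(W, history):
    # Linear policy: one score per action, take the highest-scoring one.
    return int(np.argmax(W @ history.as_array()))


history = AngleHistory(capacity=3)
# In CartPole the pole angle is the third entry of the observation
# (index 2); here the angles are made-up values for illustration.
for angle in [0.01, 0.02, 0.04]:
    history.push(angle)

# Hand-picked weights (an assumption): action 1 scores higher when the
# newest angle is positive, action 0 when it is negative.
W = np.array([[-1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])

print(history.as_array())
print(choose_action(W, history))
```

With a capacity of 1 the policy sees only the current angle, so it cannot tell whether the pole is falling or recovering; with two or more entries the differences between consecutive angles carry that information, which may be why a longer memory behaves better.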