Here are some results from an implementation of adaptive sampling for tuning learning rates. 25 simulation threads are run, each with different learning rates (the current learning rate plus a multiple of delta). Every n iterations of the IGO algorithm, the current learning rate is set to that of the thread that attained the minimum mean of the objective over those n iterations, across all 25 threads. The process is then repeated.
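The selection loop can be sketched roughly as follows. This is a minimal illustration, not the code that produced the results below. It assumes the 25 threads form a 5 x 5 grid (multiples -2..2 of delta applied to each of the two learning rates), which is a guess based on the thread count. The names candidate_rates, pick_best, tune and toy_segment are hypothetical, and toy_segment is a deterministic stand-in for running n IGO iterations in one thread and returning the mean objective.

```python
import itertools


def candidate_rates(current, delta, multiples=(-2, -1, 0, 1, 2)):
    """All combinations of current[i] + m * delta.

    ASSUMPTION: two learning rates and five multiples give 5 x 5 = 25 threads.
    """
    per_dim = [[lr + m * delta for m in multiples] for lr in current]
    return [list(combo) for combo in itertools.product(*per_dim)]


def pick_best(candidates, mean_objectives):
    """Return the rates of the thread with the lowest mean objective."""
    best = min(range(len(candidates)), key=lambda i: mean_objectives[i])
    return candidates[best]


def tune(run_segment, rates, delta, n, rounds):
    """Repeat: run every thread for n iterations, then move to the best rates."""
    for _ in range(rounds):
        candidates = candidate_rates(rates, delta)
        # In a real run each call would be one simulation thread (hypothetical).
        means = [run_segment(c, n) for c in candidates]
        rates = pick_best(candidates, means)
    return rates


def toy_segment(rates, n):
    """Stand-in for n IGO iterations: pretends rates near 1.5 work best."""
    return sum((r - 1.5) ** 2 for r in rates)


final_rates = tune(toy_segment, [1.0, 1.0], delta=0.25, n=25, rounds=3)
```

The sketch does not carry the IGO distribution state from one round to the next and does not guard against non-positive learning rates; a real implementation would need to handle both.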
Initial learning rates: [1,1]; Number of samples: 100; delta: 0.25; Iterations: 25
Best objective 3.906106261908349e-05 @ [3.141947289734544, 3.141325967480646] attained @ 799th iteration.
Initial learning rates: [1,1]; Number of samples: 100; delta: 0.125; Iterations: 25
Best objective 2.811559213462544e-04 @ [3.142757645858563, 3.141837545625751] attained @ 2392nd iteration.
Initial learning rates: [1,1]; Number of samples: 1000; delta: 0.125; Iterations: 10
Best objective 9.220073825133568e-05 @ [3.141111456149766, 3.141109756733831] attained @ 327th iteration.