I have split the learning rate into two: one for the parameters related to the concentration and mean, and a second for the parameters of the precision matrix.
All simulations were run for 1000 iterations.
Objective function: 2D Rastrigin centered at [pi,pi].
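For reference, a minimal sketch of the objective, assuming the standard Rastrigin form with amplitude A = 10 and the minimum moved from the origin to [pi, pi]. The function name and the `center` and `amplitude` parameters are illustrative and not taken from the original code.

```python
import math


def shifted_rastrigin(x, center=(math.pi, math.pi), amplitude=10.0):
    """Rastrigin function with its global minimum moved to `center`.

    Assumes the standard form A*n + sum(z_i^2 - A*cos(2*pi*z_i)),
    where z = x - center and A = `amplitude` (assumed to be 10).
    """
    total = amplitude * len(x)
    for xi, ci in zip(x, center):
        z = xi - ci  # offset from the shifted minimum
        total += z * z - amplitude * math.cos(2.0 * math.pi * z)
    return total


if __name__ == "__main__":
    # Value at the shifted minimum
    print(shifted_rastrigin([math.pi, math.pi]))
    # Value at one of the best-fit points listed in these notes
    print(shifted_rastrigin([3.140419809603958, 3.141666381043835]))
```

Under these assumptions the function is zero at [pi, pi], and at [3.140419809603958, 3.141666381043835] it evaluates to roughly 2.74e-04, in line with the best fit listed for Eta_D=0.5 with 50 samples.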
Eta_d = 1; Eta_D=0.25; 100 samples
Best fit: 5.092878218704300e-05 @ [3.142098748056062, 3.142098748056062]
Eta_d = 1; Eta_D=0.25; 1000 samples
Best fit: 0.011457103531619 @ [3.134161151183576, 3.140001246486607]
Eta_d = 1; Eta_D=0.5; 25 samples
Best fit: 0.013942156424022 @ [3.149423592902749, 3.138598537797074]
Eta_d = 1; Eta_D=0.5; 50 samples
Best fit: 2.739779972955603e-04 @ [3.140419809603958, 3.141666381043835]
Eta_d = 1; Eta_D=0.5; 75 samples
Best fit: 3.583357357932471e-04 @ [3.142532825223459, 3.140632298254060]
Eta_d = 1; Eta_D=0.5; 100 samples
Best fit: 6.607625318082455e-04 @ [3.140007830023662, 3.140687697105372]
Eta_d = 1; Eta_D=0.5; 325 samples
Best fit: 0.005180560001300 @ [3.136709161988744, 3.140087288810498]
Eta_d = 1; Eta_D=0.5; 550 samples
Best fit: 0.013942156424022 @ [3.140419809603958, 3.138598537797074]
Eta_d = 1; Eta_D=0.5; 1000 samples
Best fit: 9.464108499734891e-04 @ [3.143699809057704, 3.142167429144717]
Eta_d = 1; Eta_D=0.75; 100 samples
Best fit: 0.002806343200142 @ [3.138836948966443, 3.139032993376946]
Eta_d = 1; Eta_D=0.75; 1000 samples
Best fit: 4.175267021224727e-04 @ [3.140161030118904, 3.141358087630181]