Game Theory
[Figure: Empirical Strategy, FP v. FP, RPS. x-axis: Action Index, a; y-axis: π_i(a); legend: p1, p2. File: Tournament-0/Tournament-0/t0-action_distribution.pdf]
[Figure: Learning Path, FP v. FP, RPS. x-axis: Iteration; y-axis: π_i(a); legend: π_0(0), π_0(1), π_0(2), π_1(0), π_1(1), π_1(2). File: Tournament-0/Tournament-0/t0-action_learning_path.pdf]
[Figure: Reward Distribution, FP v. FP, RPS. x-axis: Reward; y-axis: Pr(R = r); legend: p1, p2. File: Tournament-0/Tournament-0/t0-reward_distribution.pdf]
[Figure: Reward History, FP v. FP, RPS. x-axis: Iteration; y-axis: r; legend: p1, p2. File: Tournament-0/Tournament-0/t0-reward_history.pdf]
[Figure: Empirical Strategy, eG v. UCB, RPS. x-axis: Action Index, a; y-axis: π_i(a); legend: p1, p2. File: Tournament-1/Tournament-1/t1-action_distribution.pdf]
[Figure: Learning Path, eG v. UCB, RPS. x-axis: Iteration; y-axis: π_i(a); legend: π_0(0), π_0(1), π_0(2), π_1(0), π_1(1), π_1(2). File: Tournament-1/Tournament-1/t1-action_learning_path.pdf]
[Figure: Reward Distribution, eG v. UCB, RPS. x-axis: Reward; y-axis: Pr(R = r); legend: p1, p2. File: Tournament-1/Tournament-1/t1-reward_distribution.pdf]
[Figure: Reward History, eG v. UCB, RPS. x-axis: Iteration; y-axis: r; legend: p1, p2. File: Tournament-1/Tournament-1/t1-reward_history.pdf]