How does reinforcement learning save the found optimal strategy?
I continue to analyze the code of the program https://github.com/Smeilz/Tic-Tac-Toe-Reinforcemen...
1) The program simulates 200000 games of 3x3 tic-tac-toe between two opponents
2) It saves the learned strategy to a file using pickle
3) You can then play against the trained strategy, which is loaded back from the file with pickle
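For reference, steps 2 and 3 can be sketched roughly like this. This is only a minimal illustration, not the code from the repository: the names `save_policy`, `load_policy` and `policy_example.pkl` are made up, and the two table entries are copied from the printed output. It assumes the strategy is an ordinary Python dict.

```python
import os
import pickle
import tempfile

# Assumed shape of the strategy: (board, action) -> learned value.
# The two entries are taken from the printed dump; everything else is hypothetical.
q_table = {
    (("X", "X", " ", " ", " ", " ", " ", "0", "0"), 3): 1.0,
    (("X", "X", " ", " ", " ", " ", " ", "0", "0"), 4): 0.97,
}


def save_policy(table, path):
    """Serialize the whole table to a binary file."""
    with open(path, "wb") as f:
        pickle.dump(table, f)


def load_policy(path):
    """Read the table back; the result is an equal dict."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical file name, written to the system temp directory.
policy_path = os.path.join(tempfile.gettempdir(), "policy_example.pkl")
save_policy(q_table, policy_path)
restored = load_policy(policy_path)
print(restored == q_table)  # True
```

pickle just stores the Python object as it is, so whatever was in memory after training comes back unchanged when the file is loaded.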
When I print the saved object, a huge block of text like this is displayed:
'X', '0', '0', 'X', 'X', '0'), 3): 1.0, (('X', 'X', ' ', ' ', ' ', ' ', ' ', '0', '0'), 3): 1.203194073499, (('X', 'X', ' ', ' ', ' ', ' ', ' ', '0', '0'), 4): 0.97, (('X', 'X', ' ', ' ', ' ', ' ', ' ', '0', '0'), 5): 1.0, (('X', 'X', ' ', ' ', ' ', ' ', ' ', '0', '0'), 6): 1.0, (('X', 'X', ' ', ' ', ' ', ' ', ' ', '0', '0'), 7): 1.8822040593129998, (('X', 'X', ' ', '0', 'X', ' ', ' ', '0', '0'), 3): 0.92401, (('X', 'X', ' ', '0', 'X', ' ', ' ', '0', '0'), 6): 0.43899999999999995, (('X', 'X', ' ', '0', 'X', ' ', ' ', '0', '0'), 7): 1.8999999669669685, (('X', 'X', ' ', '0', 'X', ' ', '0', '0', '0'), 3): 1.0, (('X', 'X', ' ', '0', 'X', ' ', '0', '0', '0'), 6): 1.0, (('0', ' ', '0', ' ', 'X', ' ', 'X', ' ', ' '), 2): 1.899999952809955, (('0', ' ', '0', ' ', 'X', ' ', 'X', ' ', ' '), 4): 0.707281, (('0', ' ', '0', ' ', 'X', ' ', 'X', ' ', ' '), 6): 1.6262611862579543, .............
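Judging only by this output, the object looks like a dict whose keys are pairs (board as a tuple of 9 cells, move number) and whose values are the learned estimates for that move. The move numbers appear to be 1-based cell positions, since they always match the empty cells of the board. Under that assumption, choosing a move from the table could be sketched as below. The function `best_action` and the `default` value for unseen pairs are hypothetical and not taken from the repository.

```python
# Entries copied from the printed dump for one board position.
board = ("X", "X", " ", " ", " ", " ", " ", "0", "0")
q_table = {
    (board, 3): 1.203194073499,
    (board, 4): 0.97,
    (board, 5): 1.0,
    (board, 6): 1.0,
    (board, 7): 1.8822040593129998,
}


def best_action(table, state, default=1.0):
    """Return the free cell (1-based) with the highest stored value.

    `default` is an assumed value for (state, action) pairs that are
    missing from the table.
    """
    free_cells = [i + 1 for i, cell in enumerate(state) if cell == " "]
    return max(free_cells, key=lambda a: table.get((state, a), default))


print(best_action(q_table, board))  # 7
```

Here move 7 fills the first cell of the bottom row, which already holds two '0' marks, so the highest stored value corresponds to the winning move for '0'.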
With a neural network everything is clear: we save the weights and later recreate the network for another data set.
But what about the result of Q-learning training?