Oleg Petrov, 2018-10-01 15:05:21
Python

How is the strategy saved in reinforcement learning?

I am reading through the code at https://github.com/Smeilz/Tic-Tac-Toe-Reinforcemen...
What I have understood so far:
The program has 2 modules.
Qlearning.py - responsible for training the agents and saving the result of learning
Game.py - describes the course of the game
The question is: how exactly does Qlearning save the strategy?
1) Train.py contains this line:
    game.saveStates()
2) It calls a function in the game.py module:

def saveStates(self):
    self.player1.saveQtable("player1states")
    self.player2.saveQtable("player2states")

3) That function in turn calls saveQtable on the player1 and player2 instances; saveQtable is defined in the QLearning.py module:

def saveQtable(self, file_name):  # save the Q-table to a file
    with open(file_name, 'wb') as handle:
        pickle.dump(self.Q, handle, protocol=pickle.HIGHEST_PROTOCOL)

----------------------------------------------------
So, as far as I understand, the program uses pickle to serialize the strategy obtained from training into a stream of bytes, and deserializes it back when loading.
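To check my understanding, here is a minimal sketch of that round trip outside the game code. The Q-table layout (a dict mapping a board-state string to a list of nine action values) is purely my assumption; I have not confirmed it against the repository. The helper names save_qtable and load_qtable are also mine. As far as I can tell, only the object passed to pickle.dump (self.Q) ends up in the file, not the rest of the player instance.

```python
import os
import pickle
import tempfile

# Hypothetical Q-table: board-state string -> list of 9 action values.
# The real layout of self.Q in the repository may differ.
q_table = {
    "         ": [0.0] * 9,
    "X        ": [0.0, 0.1, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0],
}

def save_qtable(q, file_name):
    # Same call as in saveQtable, but on a plain object instead of self.Q.
    with open(file_name, "wb") as handle:
        pickle.dump(q, handle, protocol=pickle.HIGHEST_PROTOCOL)

def load_qtable(file_name):
    # pickle.load rebuilds the original Python object from the bytes.
    with open(file_name, "rb") as handle:
        return pickle.load(handle)

path = os.path.join(tempfile.mkdtemp(), "player1states")
save_qtable(q_table, path)
restored = load_qtable(path)
print(type(restored), len(restored))
```

If this is right, loading "player1states" the same way and printing type(restored) should already show what structure the strategy has.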
Questions:
1) How exactly is the strategy saved? What is its structure? What does the self parameter hold in this case?
2) Can the code be changed so that the strategy is saved to a file in human-readable form, so that I can see the format?
3) How can the same data be saved as XML?
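For questions 2 and 3, this is roughly what I have in mind, as a sketch only. It assumes the Q-table is a dict whose values are lists of numbers; the function names qtable_to_json and qtable_to_xml and the XML element names (qtable, state, action) are made up for illustration and do not come from the repository.

```python
import json
import xml.etree.ElementTree as ET

def qtable_to_json(q):
    # JSON object keys must be strings, so states are converted with str().
    return json.dumps({str(state): values for state, values in q.items()}, indent=2)

def qtable_to_xml(q):
    # One <state> element per table entry, one <action> child per value.
    root = ET.Element("qtable")
    for state, values in q.items():
        node = ET.SubElement(root, "state", key=str(state))
        for action, value in enumerate(values):
            ET.SubElement(node, "action", index=str(action)).text = repr(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample entry; the real table would come from pickle.load.
sample = {"X        ": [0.0, 0.5]}
json_text = qtable_to_json(sample)
xml_text = qtable_to_xml(sample)
print(json_text)
print(xml_text)
```

Writing json_text or xml_text to a file opened in text mode would give a file that can be read in any editor. If the real keys are tuples or other objects rather than strings, str() loses the original type, so loading back would need an extra conversion step.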
Thanks in advance
