How to train a neural network to predict an agent's action?


Python

kiling, 2019-09-03 14:06:36

I have started learning machine learning, using Python and scikit-learn.
I have a task in which I need to predict an agent's action. The input data describes a situation, and the agent has two possible actions in response to it. The agent can take only one of them.
1 0 - the agent performed the first action.
0 1 - the agent performed the second action.
After training, I want to get the probability of each action for a given event. I.e., if the answer for an event is 0.3 0.7, the prediction is that the agent will perform the first action with a probability of 30% and the second action with a probability of 70%.
For training, I tried different regression models, such as LinearRegression and RandomForestRegressor. As a result, I even seem to get output of the desired form.
So the question is which models are appropriate for such a task and, most importantly, how to evaluate the result. After all, if the prediction is 0.02 0.98 and the agent still performs the first action (1 0), this is not an error, just a low-probability event. As I understand it, such models are usually evaluated with mean squared error. That metric is not suitable for this task, is it?


2 answer(s)

Ruslan., 2019-09-03
@LaRN

Look here:
https://habr.com/ru/company/ods/blog/323890/
This looks like a task for logistic regression.
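A minimal sketch of what that could look like in scikit-learn, under stated assumptions: the features are synthetic (generated with make_classification, since the real data is not shown), and the two actions are encoded as a single label, 0 for the first action (1 0) and 1 for the second (0 1). A classifier's predict_proba returns one probability per action, and log loss or the Brier score (mean squared error applied to probabilities) can be used to judge those probabilities over the whole test set rather than on a single event.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the "situation" features (assumption: real data not shown).
# y == 0 means the first action (1 0), y == 1 means the second action (0 1).
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# One row per event, one column per action; each row sums to 1.
proba = model.predict_proba(X_test)
print(proba[:3])

# Scores for probabilistic predictions, averaged over all test events (lower is better).
test_log_loss = log_loss(y_test, proba)
test_brier = brier_score_loss(y_test, proba[:, 1])

# Baseline for comparison: always predict the action frequencies seen in training.
base_rate = y_train.mean()
baseline = np.column_stack([
    np.full(len(y_test), 1.0 - base_rate),
    np.full(len(y_test), base_rate),
])
baseline_log_loss = log_loss(y_test, baseline)

print(f"log loss: {test_log_loss:.3f} (baseline {baseline_log_loss:.3f})")
print(f"Brier score: {test_brier:.3f}")
```

Comparing against the constant baseline shows whether the model's probabilities carry more information than the raw action frequencies do.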

Sergey, 2019-09-03
@begemot_sun

In fact, from the point of view of correct statistics, you need not only the probability of a given event but also a confidence interval for that occurrence.
Thus, your network can predict 1% for one action and 99% for the other, but if the first one happens, this does not mean that the network was "mistaken"; it simply predicted this outcome with a probability of 1% rather than 95%.
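Since a single unlikely outcome does not show that a prediction was wrong, one way to check the predictions is in aggregate: among all events where the model said about 10%, the action should actually happen about 10% of the time. The sketch below illustrates this with a calibration table in plain Python. The helper calibration_table and the simulated data are hypothetical, made up for illustration; the simulated agent follows the predicted probabilities exactly, so the two columns should roughly agree.

```python
import random
from collections import defaultdict


def calibration_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width probability bins and compare the
    mean predicted probability with the observed frequency in each bin."""
    bins = defaultdict(list)
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_pred = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(y for _, y in pairs) / len(pairs)
        rows.append((mean_pred, observed, len(pairs)))
    return rows


rng = random.Random(0)

# Hypothetical predicted probabilities of the second action for 20000 events.
probs = [rng.random() for _ in range(20000)]
# Simulated agent: takes the second action with exactly the predicted probability.
outcomes = [1 if rng.random() < p else 0 for p in probs]

table = calibration_table(probs, outcomes)
for mean_pred, observed, count in table:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

With real model output in place of the simulated values, a large gap between the predicted and observed columns in a well-populated bin would point to miscalibrated probabilities, whereas one surprising event would not.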