Is there an ML algorithm that handles panel data the way OLS-based panel methods do?
I am working on churn forecasting.
A question came up while constructing the sample. Is it possible to build the sample so that it accounts for the temporal structure of the data? For example, each observation would mark the client's next billing date, the moment at which the client decides whether to keep using the service.
Panel data analysis is common in econometrics, which is why I ask: is there an algorithm that understands that its input is panel data and that each row carries a subscriber id?
For example, take a client named Lech with id 178. He is observed for 6 months, and each month is a separate row with his selected features. Every such row has id 178. Or do I need to set up the df as a panel?
I have seen a lot of work on churn forecasting, but nowhere have I seen a discussion of how such data is structured or whether it can be analyzed as panel data.
You have mixed everything together into one heap.
Panel data is not used only in econometrics. Methods for working with panel data exist in their own right, and econometrics is just one of the fields that uses them.
As for "implementation in econometrics": it would be interesting to see a reference to a specific "implementation" in at least one of the common packages.
As for the opposition between ML and OLS, that is also a very puzzling phrase. OLS is a mathematical method that underlies many applied methods, including regression analysis, which is used widely and quite successfully in machine learning.
Back to panel data. There is a linearmodels module (https://bashtage.github.io/linearmodels/index.html#) with a whole group of methods for working with panel data:
https://bashtage.github.io/linearmodels/panel/inde...
Use it to your heart's content.
PS By the way, in real problems it is often enough to use an ordinary multiple regression in which time is one of the predictors. Try that too.
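A minimal sketch of that simpler route, assuming the rows are pooled and the month number is just another column. The variable names and synthetic data are hypothetical, and plain least squares via NumPy stands in for whatever regression tool you actually use.

```python
import numpy as np

# Hypothetical pooled data: subscriber identity is ignored, month is a feature.
rng = np.random.default_rng(1)
n = 300
month = rng.integers(1, 7, size=n).astype(float)
usage = rng.normal(10.0, 2.0, size=n)
y = 2.0 + 0.3 * month + 1.5 * usage + rng.normal(0.0, 0.5, size=n)

# Design matrix: intercept column, time, and one ordinary feature.
X = np.column_stack([np.ones(n), month, usage])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b_month, b_usage = coef
print(f"intercept={intercept:.2f}, month={b_month:.2f}, usage={b_usage:.2f}")
```

Unlike the panel approach, this treats every row as independent, so it cannot separate a subscriber's own persistent level from the effect of the features.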