Python
Chichi, 2015-12-03 22:59:34

How to handle null variable values in regression (machine learning)?

I am trying to do a regression analysis with many variables (multiple-feature regression). For some data elements, some variables have not been assigned a value and are set to null. For clarity, here is a picture:
77408207e20f46d1aa23b663b165e25b.jpg
As you can see, some elements do not have values for certain categories (features). For now, I've set them to Null. But how should such values be handled when performing a regression? I do not want these Null values to distort the regression model. Unfortunately, I cannot remove the elements that contain Null in any of the categories. I am using Python to plot the regression.


1 answer(s)
⚡ Kotobotov ⚡, 2015-12-03
@ChicoId

As a data scientist, you have to decide for yourself what to do with the gaps.
There are several typical options:
1. Discard the affected records (not suitable if you have very little data and every record is worth its weight in gold).
2. Fill the gaps with typical values: for example, zeros for a numeric count, or an average value for something like a year (for example, 2005).
3. Reconstruct the missing data (this requires dedicated approaches and algorithms, for example nearest neighbors, k-means, or collaborative filtering).
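Options 1 and 2 can be sketched in a few lines of Python. This is only a minimal illustration, assuming NumPy and missing entries stored as NaN; the numbers and the column layout (year, area, price) are made up and not taken from the question's data.

```python
import numpy as np

# Hypothetical rows: [year, area, price]; np.nan marks a missing value.
data = np.array([
    [2005.0, 50.0, 100.0],
    [np.nan, 70.0, 150.0],
    [2015.0, np.nan, 200.0],
    [2010.0, 60.0, 130.0],
])

# Option 1: drop every row that has at least one missing value.
complete_rows = data[~np.isnan(data).any(axis=1)]

# Option 2: fill each gap with the mean of the known values in its column.
column_means = np.nanmean(data, axis=0)
filled = np.where(np.isnan(data), column_means, data)
```

For option 3, one ready-made neighbor-based approach is scikit-learn's `KNNImputer`, which fills a gap from the most similar rows instead of a global average.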
Your task here is quite simple: presumably the price is determined by a combination of the other parameters.
From that you can estimate the effect of each parameter on the price, especially by comparing records with similar values.
Start with the simple cases first: where CONDITION is NEW, it is obvious that YEAR ≈ 2015.
Then you can set up a system of equations of the form
YEAR*x + Storey*y + Area*z + Condition*n + Type*m + District*k = PRICE
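You can easily set up a system of such equations, one per complete record, and find the coefficients in any way convenient for you, for example by Gaussian elimination. (Five equations determine five coefficients, so with all six parameters you need at least six.)
(P.S. If there is not enough data, the district can be neglected.)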
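As a rough sketch of the two steps above, assuming NumPy and invented numbers: the missing year of a NEW item is filled with 2015, then each record contributes one equation and the coefficients are solved for. Least squares (`np.linalg.lstsq`) is used here instead of Gaussian elimination because it also works when there are more records than coefficients. Only three hypothetical numeric features (year, storey, area) are used, the `is_new` flag is an assumed encoding of the condition column, and the prices are synthetic, generated from made-up coefficients so the fit can be checked.

```python
import numpy as np

# Hypothetical rows: [year, storey, area]; np.nan marks a missing year.
features = np.array([
    [2005.0, 1.0, 50.0],
    [np.nan, 2.0, 70.0],
    [2010.0, 3.0, 60.0],
    [2000.0, 5.0, 80.0],
])
# Assumed flag: True where the record's condition is NEW.
is_new = np.array([False, True, False, False])

# Simple rule from the answer: a NEW item with no year gets YEAR = 2015.
year = features[:, 0]
features[:, 0] = np.where(np.isnan(year) & is_new, 2015.0, year)

# Synthetic prices built from made-up coefficients (not real data).
made_up_coef = np.array([0.05, 2.0, 1.5])
price = features @ made_up_coef

# One equation per row: year*x + storey*y + area*z = price.
coef, residuals, rank, _ = np.linalg.lstsq(features, price, rcond=None)
print(coef)
```

With real data the fit will not be exact, and categorical parameters such as condition, type, or district would first need a numeric encoding.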
