How to solve the problem?

Python

Egor8888888, 2020-04-15 21:52:25

Friends, this task was posted on a portal:

Take the data on unemployment in the city of Moscow: https://video.ittensive.com/python-advanced/data-9... Group the data by year and discard any year that has fewer than 6 values. Build a linear regression model over the years for the average monthly ratio of UnemployedDisabled to UnemployedTotal (the percentage of people with disabilities). What value is expected in 2020 if the city of Moscow keeps its current policy?
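The "discard these years" step is easy to get backwards, so here is a minimal sketch of it on made-up numbers. Only the column names come from the task; the values are invented for illustration. In pandas, `groupby(...).filter(func)` keeps the groups for which `func` returns True, so the condition has to describe the years to keep.

```python
import pandas as pd

# Invented data: 2017 has 7 monthly rows, 2018 has only 3.
df = pd.DataFrame({
    "Year": [2017] * 7 + [2018] * 3,
    "UnemployedTotal": [100, 110, 120, 130, 140, 150, 160, 200, 210, 220],
})

# filter() keeps every group for which the function returns True,
# so "discard years with fewer than 6 values" becomes "keep count >= 6".
kept = df.groupby("Year").filter(lambda g: g["UnemployedTotal"].count() >= 6)

print(kept["Year"].unique().tolist())  # [2017]
```

With the comparison written as `< 6` instead, the same call would keep only 2018, the year that was supposed to be dropped.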

I solved it:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

pd.options.display.max_rows = 1000
data = pd.read_csv("https://video.ittensive.com/python-advanced/data-9753-2019-07-25.utf.csv", delimiter=";")
data = data.groupby("Year").filter(lambda x: x["UnemployedTotal"].count() < 6)
data["Year"] = data["Year"].astype("category")
data_group = data.groupby("Year").mean()
x = np.array(data_group.index).reshape(len(data_group.index), 1)
y = np.array(data_group["UnemployedDisabled"] / data_group["UnemployedTotal"] * 100).reshape(len(data_group.index), 1)
model = LinearRegression()
model.fit(x, y)
plt.scatter(x, y, color="orange")
x = np.append(x, [2020]).reshape(len(data_group.index) + 1, 1)
plt.plot(x, model.predict(x), color="blue", linewidth=3)
plt.show()

print(model.predict(np.array(2020).reshape(1, 1)))

# print(data_group)

But the answer doesn't pass verification :(

Help me find what I missed!

Thanks!
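One more thing that may be worth checking is the order of averaging. The task asks for the average of the monthly ratio, which is not the same number as the ratio of the yearly averages. Whether the checker expects the first reading is an assumption based on the task wording. The sketch below uses invented values (only the column names come from the task, and `DisabledPct` is a hypothetical helper column) to show that the two orders can give different results.

```python
import numpy as np
import pandas as pd

# Invented data: two "months" per year, with very different totals.
df = pd.DataFrame({
    "Year": [2016, 2016, 2017, 2017, 2018, 2018],
    "UnemployedDisabled": [10, 30, 12, 30, 15, 30],
    "UnemployedTotal": [100, 1000, 100, 1000, 100, 1000],
})

# Option 1: compute the percentage for every row (month), then average per year.
df["DisabledPct"] = 100 * df["UnemployedDisabled"] / df["UnemployedTotal"]
mean_of_ratios = df.groupby("Year")["DisabledPct"].mean()

# Option 2: average the raw counts per year, then take the percentage.
yearly = df.groupby("Year")[["UnemployedDisabled", "UnemployedTotal"]].mean()
ratio_of_means = 100 * yearly["UnemployedDisabled"] / yearly["UnemployedTotal"]

print(mean_of_ratios)
print(ratio_of_means)
print(np.allclose(mean_of_ratios.values, ratio_of_means.values))  # False
```

For 2016 the first option gives 6.5 (the mean of 10% and 3%), while the second gives roughly 3.64 (20 / 550 * 100). A regression fitted on one series will predict a different 2020 value than one fitted on the other.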