SemenAnnigilator, 2021-09-09 16:54:07
Python

Why does sklearn throw the error "X and y have inconsistent dimensions (68180 != 6818)" during model training?

Here is my code:

`features = data.drop(["Tenure"], axis = 1)
 target = data["Tenure"]
 features = features.to_numpy()
 target = target.to_numpy()
 features_train, features_valid, target_train, target_valid = train_test_split(features, target, test_size = 0.25, random_state = 12345)
 features_train = features_train.reshape(-1, 1)
 target_train = target_train.reshape(-1, 1)
 model_elastic = ElasticNetCV(random_state = 12345)
 model_elastic.fit(features_train, target_train)
 predictions = model_elastic.predict(target_train)
 print(accuracy_score(target, predict))`


Error:

ValueError                                Traceback (most recent call last)
<ipython-input-99-cb6d5d8978a1> in <module>
      7 target_train = target_train.reshape(-1, 1)
      8 model_elastic = ElasticNetCV(random_state = 12345)
----> 9 model_elastic.fit(features_train, target_train)
     10 predictions = model_elastic.predict(features_train)
     11 print(accuracy_score(target, predict))

~\anaconda3\envs\LikeProject\lib\site-packages\sklearn\linear_model\_coordinate_descent.py in fit(self, X, y)
   1261 
   1262         if X.shape[0] != y.shape[0]:
-> 1263             raise ValueError("X and y have inconsistent dimensions (%d != %d)"
   1264                              % (X.shape[0], y.shape[0]))
   1265 

ValueError: X and y have inconsistent dimensions (68180 != 6818)


At first I got an error about a 2D array, which I worked around by converting the DataFrame to a NumPy array and calling reshape. After that I got this error. I need to predict values for target_train.


1 answer(s)
Tipchik, 2021-09-10

There is no need to reshape the data; lines 6 and 7 are redundant.
Your training data has 10 features (10 columns) with 6818 rows each. The reshape collapses all 10 columns into one, giving 6818 x 10 = 68180 rows, while the target still has 6818. The number of rows in the features and in the target must be the same.
Before training the model (the fit method), I recommend printing the data and its shapes and inspecting them yourself to catch errors like this.
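To make the shape mismatch concrete, here is a minimal sketch. It assumes stand-in arrays with the sizes implied by the error message (6818 rows, 10 columns); the zeros are placeholders, not the real data.

```python
import numpy as np

# Stand-in arrays with the shapes implied by the error message
# (6818 rows, 10 feature columns); the values are placeholders.
features_train = np.zeros((6818, 10))
target_train = np.zeros(6818)

# Before any reshape the row counts agree
print(features_train.shape, target_train.shape)  # (6818, 10) (6818,)

# reshape(-1, 1) stacks all 10 columns into a single column
flattened = features_train.reshape(-1, 1)
print(flattened.shape)  # (68180, 1)

rows_match_before = features_train.shape[0] == target_train.shape[0]
rows_match_after = flattened.shape[0] == target_train.shape[0]
print(rows_match_before, rows_match_after)  # True False
```

Printing the shapes like this right before fit is a cheap way to spot the problem early.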
Validating the resulting model in lines 10 and 11 will also give an incorrect result, because:
1. Line 10 passes the wrong argument to the predict method. In your case the argument should be features_valid, and predictions then holds the predicted values for that dataset.
2. Line 11 passes the wrong arguments to accuracy_score(). It should be accuracy_score(target_valid, predictions).
I'm just starting to learn ML myself, so I've written this as I understand it.
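Putting the points above together, the corrected flow might look roughly like the sketch below. The DataFrame is synthetic (random columns named f0..f9 plus a made-up "Tenure"), since the real data is not shown. One further assumption on my part: ElasticNetCV is a regressor and accuracy_score is a classification metric, so the sketch swaps in mean_absolute_error, which is likely a better fit for continuous predictions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNetCV
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for `data`: 10 feature columns plus a "Tenure" target.
# Column names f0..f9 and the target formula are hypothetical.
rng = np.random.default_rng(12345)
data = pd.DataFrame(rng.normal(size=(400, 10)), columns=[f"f{i}" for i in range(10)])
data["Tenure"] = data["f0"] * 2.0 + data["f1"] + rng.normal(scale=0.1, size=400)

features = data.drop(["Tenure"], axis=1)
target = data["Tenure"]

features_train, features_valid, target_train, target_valid = train_test_split(
    features, target, test_size=0.25, random_state=12345
)

# No reshape: X stays 2D (rows, 10 columns) and y stays 1D (rows,)
assert features_train.shape[0] == target_train.shape[0]

model_elastic = ElasticNetCV(random_state=12345)
model_elastic.fit(features_train, target_train)

# Predict from the validation features, then compare with the validation target
predictions = model_elastic.predict(features_valid)
mae = mean_absolute_error(target_valid, predictions)  # regression metric (assumed swap)
print(predictions.shape, mae)
```

The key differences from the original code are that nothing is reshaped, predict receives features_valid rather than a target array, and the metric compares target_valid with predictions.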
