How can you use the same parameters for different Pipeline steps in scikit-learn with grid search?
I am solving a text classification problem with convolutional networks.
The pipeline consists of two steps:
1) the MyPreprocessor preprocessor, which splits the text into words, builds a vocabulary, and replaces each word in the text with its index in that vocabulary;
2) the MyClassifier classifier, which actually trains the network.
However, these two steps share a common set of parameters: the vocabulary size (max_features) and the maximum allowed phrase length (max_len). What should I do so that grid search changes them in both steps synchronously?
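Schematic code:
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# MyPreprocessor and MyClassifier are my own estimators (defined elsewhere)
clf = Pipeline([('vect', MyPreprocessor()),
                ('clf', MyClassifier())])
# Problem: this grid varies the 'vect' and 'clf' values independently
params = {'vect__max_features': [5000, 10000],
          'vect__max_len': [64, 96, 128],
          'clf__max_features': [5000, 10000],
          'clf__max_len': [64, 96, 128]}
gs_clf = GridSearchCV(clf, params, n_jobs=-1)
gs_clf = gs_clf.fit(X_train, Y_train)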
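One possible way to keep the two steps in sync, offered as a sketch rather than a definitive answer: GridSearchCV accepts a list of dictionaries as its parameter grid and explores each dictionary separately, so the grid can be written as one single-point dictionary per tied combination. The helper name tied_param_grid below is hypothetical (it is not part of scikit-learn), and the sketch assumes both steps accept max_features and max_len as constructor parameters, as in the question.

```python
from itertools import product


def tied_param_grid(shared, steps):
    """Return a list of single-point grids in which every pipeline step
    receives the same value for each shared parameter.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    names = list(shared)
    grids = []
    for values in product(*(shared[name] for name in names)):
        grid = {}
        for step in steps:
            for name, value in zip(names, values):
                # GridSearchCV expects every grid value to be a list of candidates
                grid[f"{step}__{name}"] = [value]
        grids.append(grid)
    return grids


params = tied_param_grid(
    {"max_features": [5000, 10000], "max_len": [64, 96, 128]},
    steps=("vect", "clf"),
)

# Pass the list in place of the single dict from the question:
# gs_clf = GridSearchCV(clf, params, n_jobs=-1)
```

With the values from the question this gives 6 tied combinations (2 x 3), whereas the original dictionary expands to 36 combinations, most of which have mismatched settings between the two steps.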