How to properly vectorize data for neural network training?
I am new to this topic, so I have a question: how do I correctly vectorize a dataset column that contains words? (It is not a categorical feature, so one-hot encoding will not work.) I vectorized it using a bag of words, but I'm not sure whether I did it correctly or whether a network will train properly on such data. The dataset also has columns with categorical features, to which I have already applied one-hot encoding. Please point out my mistakes and suggest how to fix them.
An example of rows from a column that I vectorized with a bag of words:
img price png
css font awesome min css
The code I used for this:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

my_df = pd.read_csv('DICT_FOR_LEARN.csv', header=0, sep=';')
# Bag of words over the tokens in the URL path
vectorizer = CountVectorizer()
X1 = vectorizer.fit_transform(my_df['url_path']).toarray()
# One-hot encoding of the categorical columns
X2 = pd.get_dummies(my_df['country'], sparse=True)
X3 = pd.get_dummies(my_df['continent'], sparse=True)
X4 = pd.get_dummies(my_df['timezone'], sparse=True)
X5 = pd.get_dummies(my_df['method'], sparse=True)
X6 = pd.get_dummies(my_df['http'], sparse=True)
X7 = pd.get_dummies(my_df['exit_system'], sparse=True)
X8 = pd.get_dummies(my_df['os'], sparse=True)
X9 = pd.get_dummies(my_df['browser'], sparse=True)
X10 = pd.get_dummies(my_df['device'], sparse=True)
# x_train ends up as a tuple of ten separate blocks
x_train = X1, X2, X3, X4, X5, X6, X7, X8, X9, X10
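One thing that stands out in the code above is that `x_train` is a tuple of ten separate blocks, while most training APIs expect a single matrix with one row per sample. The following is only a minimal sketch of one possible way to combine the blocks, not a definitive fix. It assumes scikit-learn's `OneHotEncoder` in place of `pd.get_dummies` and `scipy.sparse.hstack` for the concatenation; the `toy_df` frame and its values are made up for illustration, and only three of the original columns are used.

```python
import pandas as pd
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for DICT_FOR_LEARN.csv; the values are invented.
toy_df = pd.DataFrame({
    "url_path": ["img price png", "css font awesome min css", "img logo png"],
    "country": ["DE", "US", "DE"],
    "method": ["GET", "GET", "POST"],
})

# Bag of words: keep the sparse result instead of calling .toarray().
vectorizer = CountVectorizer()
X_text = vectorizer.fit_transform(toy_df["url_path"])

# One-hot encode all categorical columns in a single step.
# handle_unknown="ignore" encodes unseen categories as all zeros at predict time.
encoder = OneHotEncoder(handle_unknown="ignore")
X_cat = encoder.fit_transform(toy_df[["country", "method"]])

# Stack the blocks column-wise into one (n_samples, n_features) sparse matrix.
x_train = hstack([X_text, X_cat]).tocsr()
print(x_train.shape)
```

Keeping the fitted `vectorizer` and `encoder` objects means new data can later be passed through `transform` and will get the same column layout. If the framework used for training needs dense input, `x_train.toarray()` converts the matrix, at the cost of memory for large vocabularies.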