How do I correctly pass sentences to pymorphy2 for lemmatization?
I have tokenized a large text and am now trying to lemmatize the resulting rows. I am using pymorphy2 for lemmatization, and the library accepts only a single word at a time. I can't figure out how to feed it a sentence word by word and still have the results saved in the DataFrame sentence by sentence, the same way as in the input.
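import pandas as pd
import pymorphy2

data_clear = pd.read_csv('C:\\Users\\ugrobug\\Desktop\\out_token.csv', sep='\t', encoding='utf-8')

def lemma(data_clear):
    morph = pymorphy2.MorphAnalyzer()
    # Empty frame with a single 'Question' column to collect the results
    final_data = pd.DataFrame(columns=['Question'])
    for i in data_clear['0']:
        # Each row of column '0' is passed to parse() as a whole, not word by word
        c = morph.parse(i)[0]
        lemmas = c.normal_form
        print(lemmas)
        final_data.loc[len(final_data)] = [lemmas]
    # Write the collected rows once, after the loop has finished
    final_data.to_csv('C:\\Users\\ugrobug\\Desktop\\out_lemma.csv', sep='\t', encoding='utf-8')

lemma(data_clear)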
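One possible way to keep one output row per sentence is to split each row into words, lemmatize every word separately, and join the lemmas back into a single string. The sketch below is only an illustration under stated assumptions: the tokens in each row are assumed to be separated by whitespace, and the helper names lemmatize_sentence, lemmatize_sentences and the normalize parameter are hypothetical, not part of pymorphy2. The normalizer is passed in as a function, so the per-sentence logic can be tried out without the morphological dictionaries.

```python
from typing import Callable, Iterable, List


def lemmatize_sentence(sentence: str, normalize: Callable[[str], str]) -> str:
    """Lemmatize one sentence word by word and join the lemmas back together."""
    # Assumption: tokens within a row are separated by whitespace.
    return " ".join(normalize(word) for word in sentence.split())


def lemmatize_sentences(sentences: Iterable[str],
                        normalize: Callable[[str], str]) -> List[str]:
    """Return one lemmatized string per input sentence, preserving the order."""
    # str() guards against non-string cells (for example, numbers read from CSV).
    return [lemmatize_sentence(str(s), normalize) for s in sentences]


# With pymorphy2 and pandas the pieces would plug in roughly like this
# (left as comments so the sketch itself has no third-party dependencies):
#   morph = pymorphy2.MorphAnalyzer()
#   normalize = lambda word: morph.parse(word)[0].normal_form
#   final_data = pd.DataFrame(
#       {'Question': lemmatize_sentences(data_clear['0'], normalize)})
```

Building the full list first and constructing the DataFrame once also avoids appending with final_data.loc row by row, which tends to be slow on a large text.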