Does the algorithm swallow nested tags instead of replacing only the tag text?
Hello. I have the following algorithm:
from bs4 import BeautifulSoup
from word2word import Word2word
from tqdm import tqdm
import nltk

tr = Word2word("en", "ru")
soup = BeautifulSoup(html, "lxml")  # html: page source, loaded beforehand (not shown)

for tag in tqdm(soup.find_all()):
    if tag.string:
        try:
            batch = nltk.word_tokenize(tag.string)  # split the string into words
            # translate each word, build a full sentence, and write it into the tag
            str_to_paste = ""
            for i in batch:
                str_to_paste += tr(i)[0] + " "
            tag.string = str_to_paste
        except Exception:
            # skip the tag if tokenizing or translating any word fails
            continue

with open("index.html", "w", encoding="utf-8") as file:
    file.write(soup.prettify())
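
A note on why nested tags can disappear with the loop above: in Beautiful Soup, `tag.string` on a tag with exactly one child tag resolves to that child's text, and assigning to `tag.string` replaces all of the tag's contents. So for `<p><b>hello</b></p>` the outer `<p>` gets the translated text and the `<b>` is dropped. Tags with mixed content (text plus child tags) have `tag.string` equal to `None`, so their own text is skipped. Below is a minimal sketch of one possible workaround: walk the text nodes instead of the tags and replace each node in place. `TOY_DICT`, `translate_word` and `translate_text_nodes` are hypothetical names made up for this illustration; the toy dictionary stands in for `Word2word`, `str.split()` stands in for `nltk.word_tokenize`, and `html.parser` is used instead of `lxml` so the sketch runs without extra downloads.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

# Hypothetical stand-in for the word2word dictionary, so the sketch runs offline.
TOY_DICT = {"hello": "привет", "world": "мир"}


def translate_word(word):
    """Return the toy translation, or the word unchanged when it is unknown."""
    return TOY_DICT.get(word.lower(), word)


def translate_text_nodes(soup):
    """Translate each text node in place, leaving the tag structure untouched."""
    # Copy the result into a list first because the tree is modified in the loop.
    for node in list(soup.find_all(string=True)):
        if isinstance(node, Comment) or not node.strip():
            continue  # skip HTML comments and whitespace-only nodes
        if node.parent.name in ("script", "style"):
            continue  # do not touch code or stylesheets
        text = str(node)
        # Preserve surrounding whitespace so neighbouring inline tags stay separated.
        lead = text[: len(text) - len(text.lstrip())]
        trail = text[len(text.rstrip()):]
        translated = " ".join(translate_word(w) for w in text.split())
        node.replace_with(lead + translated + trail)


html = "<p><b>hello</b></p><div>hello <i>world</i></div>"

# What the original loop does to <p>: p.string resolves to the text of the
# nested <b>, and assigning to it clears <p>, which removes <b>.
lossy = BeautifulSoup(html, "html.parser")
lossy.p.string = "привет"

# The text-node approach keeps <b> and <i> and also reaches the mixed-content <div>.
kept = BeautifulSoup(html, "html.parser")
translate_text_nodes(kept)

print(lossy)
print(kept)
```

To try this with the real dictionary, `translate_word` could call `tr(word)[0]` and fall back to the original word when the lookup fails. That fallback is an assumption about the desired behaviour, not something stated in the question.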