Python
walkerstech, 2020-02-26 06:33:24

Why isn't my parser writing to Excel?

Hello, I need to write data from tr elements to an Excel table. The data parsed from the HTML file is shown on the screen, but it also has to be written to Excel, and it is not being written. I can't figure out what the problem is. Can anyone help?
The output should be an Excel table containing the data parsed from the HTML page.
Cells A1, B1, C1, D1, E1 should contain "ID", "Nick_Name", "Faction", "Text", "Time".
The remaining cells (A2, A3, ..., B2, B3, and so on) should hold the parsed rows. An example of the data as shown on the screen:
a3783b88f7.png

from bs4 import BeautifulSoup
from openpyxl import load_workbook
import xlwt

# Initialize a workbook
book = xlwt.Workbook()

# Add a sheet to the workbook
sheet1 = book.add_sheet("Лог")

# The data
cols = ["ID", "Nick_Name", "Фракция", "Текст", "Время"]


with open("bank.html", "r", encoding="utf-8") as f:

    contents = f.read()

    soup = BeautifulSoup(contents, 'lxml')

    tags = soup.find_all(['th', 'tr'])

    for tag in tags:

        txt = tag.text.split()

# Loop over the rows and columns and fill in the values
for num in range(50):
    row = sheet1.row(num)
    for index, col in enumerate(cols):
        value = txt[index]
        row.write(index, value)

# Save the result
book.save("test.xls")


1 answer(s)
o5a, 2020-02-26
@o5a

Because the indentation was lost, it's hard to say for sure, but as posted, only the last row of data ends up in txt, because txt = tag.text.split() overwrites it on every iteration. I think the intent was to collect a nested list of strings in txt instead.

txt = []
with open("bank.html", "r", encoding="utf-8") as f:
    contents = f.read()
    soup = BeautifulSoup(contents, 'lxml')
    tags = soup.find_all(['th', 'tr'])
    for tag in tags:
        txt.append(tag.text.split())
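
The difference between overwriting and appending can be seen without any HTML at all. Here is a small illustration that uses plain strings as stand-ins for tag.text; the sample values are made up and only serve to show the behavior.

```python
# Made-up stand-ins for tag.text of three table rows (not real data).
row_texts = [
    "1 Alice Police hello 12:00",
    "2 Bob Army hi 12:05",
    "3 Carol Medics bye 12:10",
]

# Overwriting: each iteration replaces the previous value,
# so only the last row survives the loop.
for text in row_texts:
    last_only = text.split()

# Appending: every row is kept as its own list inside all_rows.
all_rows = []
for text in row_texts:
    all_rows.append(text.split())

print(last_only)
print(len(all_rows))
```

Note that split() breaks on any whitespace, so a message containing several words would spread over several columns. If that matters for your data, reading each cell separately is probably a better fit.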

Then, in the loop that writes to the sheet, iterate over these rows, i.e. change it to something like this:
for i, vals in enumerate(txt):
    row = sheet1.row(i)
    for index, col in enumerate(cols):
        value = vals[index]
        row.write(index, value)
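
The loop above starts writing data at the first spreadsheet row and would raise IndexError on a row with fewer values than columns. Since the question asks for the headers in A1 to E1 and the data from row 2 onward, one possible way to lay out the cells is sketched below. The helper name grid_cells and the sample rows are hypothetical, and padding short rows with an empty string is an assumption about what is wanted.

```python
def grid_cells(cols, rows):
    """Yield (row_index, col_index, value) for a header row plus data rows.

    Hypothetical helper, not part of xlwt or of the original code.
    """
    # The header goes in spreadsheet row 0 (cells A1, B1, ...).
    for c, name in enumerate(cols):
        yield 0, c, name
    # Data starts at spreadsheet row 1 (cells A2, B2, ...).
    for r, vals in enumerate(rows, start=1):
        for c in range(len(cols)):
            # Assumption: a row with too few values gets empty cells
            # instead of raising IndexError.
            value = vals[c] if c < len(vals) else ""
            yield r, c, value


cols = ["ID", "Nick_Name", "Faction", "Text", "Time"]
# Made-up sample in the same shape as the nested list txt.
txt = [["1", "Alice", "Police", "hello", "12:00"], ["2", "Bob"]]

cells = list(grid_cells(cols, txt))
for r, c, value in cells:
    # With xlwt this is where sheet1.write(r, c, value) would go.
    print(r, c, value)
```

Keeping the layout logic separate from the xlwt calls makes it easy to check the cell positions before anything is saved to test.xls.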
