Python
Rotyin, 2021-10-31 21:58:42

How to extract text from a two-column PDF, one column at a time?

I want to extract text from a PDF in which the text is laid out in two columns, but the pdfplumber library reads straight across both columns as if each row were a single line.

I want to read the first column first, and then the second one. How can I do that?

My code:

import pdfplumber


class TextPDF:
  def __init__(self, name):
    self.name = name

  def text(self):
    # Opening the output in "w" mode once also clears any previous content.
    with pdfplumber.open(self.name) as pdf, \
         open("wb_text_shlak.txt", "w", encoding="utf-8") as file:
      for i in range(2, 14):  # pages 3 to 14 (indices are zero-based)
        text = pdf.pages[i].extract_text()
        if text:  # skip pages with no extractable text
          file.write(text + "\n")


wb = TextPDF("4.pdf")
wb.text()
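
One possible approach, sketched here rather than confirmed against this particular file: crop each page into a left and a right region with pdfplumber's `page.crop()` and call `extract_text()` on each region in turn. The sketch assumes the columns meet at the horizontal middle of the page (the `split` parameter is an illustrative knob for moving that boundary), and the helper names `column_boxes`, `page_text_by_column` and `pdf_text_by_column` are made up for this example.

```python
def column_boxes(bbox, split=0.5):
    """Split a page bounding box (x0, top, x1, bottom) into left and right boxes.

    Assumption: the two columns meet at `split` of the page width.
    """
    x0, top, x1, bottom = bbox
    middle = x0 + (x1 - x0) * split
    return (x0, top, middle, bottom), (middle, top, x1, bottom)


def page_text_by_column(page, split=0.5):
    """Return the text of the left column followed by the right column."""
    parts = []
    for box in column_boxes(page.bbox, split):
        # crop() returns a page-like object restricted to the given box.
        text = page.crop(box).extract_text()
        if text:  # an empty region may yield None or an empty string
            parts.append(text)
    return "\n".join(parts)


def pdf_text_by_column(path, first_page, last_page, split=0.5):
    """Read pages[first_page:last_page] of a PDF column by column."""
    import pdfplumber  # third-party package: pip install pdfplumber

    with pdfplumber.open(path) as pdf:
        pages = pdf.pages[first_page:last_page]
        return "\n".join(page_text_by_column(p, split) for p in pages)
```

With the file from the question, `pdf_text_by_column("4.pdf", 2, 14)` would cover the same pages as `range(2, 14)`. If the gutter between the columns is not centred, or a heading spans both columns, the boxes would need adjusting, for example by changing `split` or cropping the heading area separately.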
