M
M
max_bu2018-04-04 15:44:37
Python
max_bu, 2018-04-04 15:44:37

How to process multiple exel files with python?

Hello, please tell me, I'm confused:
There are 20 excel files in each of them 50,000 rows and 10 columns.
1. I need to extract lines from each file that contain the keywords: lookFor1 and lookFor2 in column "F".
2. Write all these lines in a single file.
I found the following solution in Google + my heaps:

from openpyxl import Workbook,load_workbook
import re
import os
for root, dirs, files in os.walk("."):  
    files = filter(lambda x: x.endswith('.xlsx'), files)
    for filename in files:

        wayBook = filename

        wbSearch = Workbook()
        wbSearch = load_workbook(wayBook)
        wsSearch = wbSearch.active

        wbResult = Workbook()
        wsResult = wbResult.active
        resultRow = 1

        lookFor1 = 'Кино'
        lookFor2 = 'Книга'

        for i in range(1,50000):
            value=str(wsSearch.cell(row=i, column=6).value)
            mailvalue=wsSearch.cell(row=i, column=9).value
        if value in lookFor1 :
            wsResult.cell(row=resultRow, column=1).value=mailvalue
            wsResult.cell(row=resultRow, column=2).value=value
            resultRow += 1
        elif value in lookFor2 :
            wsResult.cell(row=resultRow, column=1).value=mailvalue
            wsResult.cell(row=resultRow, column=2).value=value
            resultRow += 1
        wbResult.save("result_" + str(wayBook))
        print('Все готово: ' + str(wayBook) + 'Удачи!')

This code creates empty files.
from openpyxl import Workbook,load_workbook
import re

wayBook = "search.xlsx"

wbSearch = Workbook()
wbSearch = load_workbook(wayBook)
wsSearch = wbSearch.active

wbResult = Workbook()
wsResult = wbResult.active
resultRow = 1

lookFor1 = 'Кино'
lookFor2 = 'Книга'

for i in range(1,50000):
   value=str(wsSearch.cell(row=i, column=6).value)
   mailvalue=wsSearch.cell(row=i, column=9).value
   if (' ' + value + ' ') in (' ' + lookFor1 + ' '):
       wsResult.cell(row=resultRow, column=1).value=mailvalue
       wsResult.cell(row=resultRow, column=2).value=value
       resultRow += 1
   elif (' ' + value + ' ') in (' ' + lookFor2 + ' '):
       wsResult.cell(row=resultRow, column=1).value=mailvalue
       wsResult.cell(row=resultRow, column=2).value=value
       resultRow += 1
wbResult.save("result_" + str(resultRow) + ".xlsx")
print('Все готово! Удачи')
print (wbSearch)

This code works, but there are two problems:
1: no automation, processes one file.
2: if there is one word, he will find it, but if there are several words in the column, he skips it.
How to solve these problems?
Thanks in advance

Answer the question

In order to leave comments, you need to log in

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question