Python
lemonlimelike, 2020-07-16 01:52:55

How to run a certain number of threads?

Hello everyone! I have a file with 10k records. How do I process them 10 records at a time, one thread per record, so that no more than 10 threads are running at once?

I have this code:

from threading import Thread
from openpyxl import load_workbook, Workbook
from fuzzywuzzy import process



def get_list_data(input_file_name):
    '''Get the list of products from the file'''
    list_data = []
    wb = load_workbook(filename=input_file_name)
    sheet = wb.active
    m_row = sheet.max_row
    with open('conf/alias.txt', 'r') as file_alias:
        list_alias = []
        for item in file_alias:
            # if ',' in item:
            #     list_alias.append(item.title().replace(
            #         '\n', '').split(',')[-1])
            # else:
            list_alias.append(item.title().replace('\n', ''))

    for i in range(1, m_row+1):
        if not sheet.cell(row=i, column=1).value:
            continue
        title = str(sheet.cell(row=i, column=1).value).replace("'", '')
        brand_s = str(sheet.cell(row=i, column=2).value).replace(
            "'", '').rstrip().title().split(' ')
        point_match = process.extractOne(brand_s[0], list_alias)
        alias = str(sheet.cell(row=i, column=2).value).replace("'", '').rstrip().title()
        if point_match is not None:
            if point_match[-1] >= 50:
                alias = point_match[0]

        brand = '+'.join(brand_s)
        min_count = str(sheet.cell(row=i, column=3).value).replace("'", '')
        max_time_delivery = str(sheet.cell(
            row=i, column=4).value).replace("'", '')
        count_sell = str(sheet.cell(
            row=i, column=5).value).replace("'", '')
        # Named `record` rather than `object` to avoid shadowing the built-in.
        record = {'idnp': title, 'brand': brand, 'min_count': min_count,
                  'max_time_delivery': max_time_delivery, 'count_sell': count_sell, 'alias': alias}
        list_data.append(record)

    return list_data


if __name__ == '__main__':
  list_data = get_list_data('Копия китай.xlsx')

  for index,item in enumerate(list_data):
    print(index)


One idea was to loop over the list with a step of 10 and create a thread for each of the records i, i-1, i-2, ..., i-10. But you can't do that with enumerate. Another idea was to keep a list of the indexes of records already handed to threads and check it each time, but that seems too clumsy. What is the right way to do this?


1 answer(s)
Dr. Bacon, 2020-07-16
@lemonlimelike

Have a single thread read the data from the file and put it on a queue, then let 10 threads take records from that queue. Alternatively, read everything from the file up front and pass each record to submit on a ThreadPoolExecutor(max_workers=10).
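
A minimal sketch of both options, using only the standard library. The names process_record, run_with_pool and run_with_queue are hypothetical and not from the question's code; process_record stands in for whatever has to happen to each record, and the sample data only mimics the 'idnp' key of the dictionaries built by get_list_data.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 10  # upper bound on simultaneously running threads


def process_record(record):
    """Hypothetical per-record work; replace with the real processing."""
    return record['idnp'].upper()


def run_with_pool(records, max_workers=MAX_WORKERS):
    """Submit every record to a pool that runs at most max_workers threads."""
    results = [None] * len(records)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the index of the record it was created for.
        futures = {executor.submit(process_record, record): index
                   for index, record in enumerate(records)}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


def run_with_queue(records, num_workers=MAX_WORKERS):
    """The main thread fills a queue; num_workers threads drain it."""
    tasks = queue.Queue()
    results = [None] * len(records)

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this thread
                break
            index, record = item
            results[index] = process_record(record)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for item in enumerate(records):  # producer side
        tasks.put(item)
    for _ in threads:  # one sentinel per worker
        tasks.put(None)
    for thread in threads:
        thread.join()
    return results


if __name__ == '__main__':
    sample = [{'idnp': f'item{i}'} for i in range(25)]
    print(run_with_pool(sample)[:3])
    print(run_with_queue(sample)[:3])
```

In both variants there is no need to slice the list into groups of 10 by hand: the pool (or the fixed set of worker threads) picks up the next record as soon as a thread becomes free. With the question's code, the call would presumably be run_with_pool(list_data). Note that this sketch does not handle exceptions raised inside a queue worker.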
