Python
Ravik Morozov, 2019-07-03 12:58:13

Why is Python's vk_api so slow when used with a standalone application?

The code exports information about a group's subscribers to a CSV file:

import vk_api, csv
token = "1234"

vk_session = vk_api.VkApi(token=token,app_id="1234")
vk = vk_session.get_api()
print(vk)
# Saves the group's user database to FILENAME
def bdGroupUsers(group_id=29534144):
    FILENAME = "out12.csv"

    count_members = vk.groups.getMembers(group_id=group_id, count=0)["count"]  # number of members in the group
    while_count= count_members//11000 + 1
    def code(offset=0):
        code = '''
        var i = 0;
        var members = [];
        var offset = '''+str(offset)+''';
        while(i < 11){
        members.push(API.groups.getMembers({"group_id": '''+str(group_id)+ ''', "offset": offset, "fields": "can_write_private_message, city"})["items"]);
        i = i + 1;
        offset = offset + 1000;
        }
        return members;
        '''
        return code
    offset = 0
    ku = 0
    df=[]
    columns = ["can_access_closed", "can_write_private_message", 'city', 'deactivated', 'first_name', 'id', 'is_closed', 'last_name']  # table headers
    with open(FILENAME, "w", newline="", encoding='utf8') as file:
        writer = csv.DictWriter(file, fieldnames=columns)
        writer.writeheader()
        # Keep writing to the same open file; reopening it with "w" would erase the header.
        # Run while_count batches so the final partial batch is fetched too.
        while ku < while_count:
            y = vk.execute(code=code(offset))
            for i in y:
                for j in i:
                    writer.writerow(j)
            print(ku)
            offset += 11000
            ku += 1
bdGroupUsers()

The export speed is about 1,200-1,800 users per second. Having read that it is possible to fetch up to 75 thousand users per second, I am not very satisfied with this result. Help me, gods of the VK API: where is my silly mistake?


1 answer(s)
Sergey Sokolov, 2019-07-03
@Yunow

At first glance, the flaw is in the number of requests per batch: up to 25 are allowed, and you use only 11.
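A minimal sketch of how the script text could be generated with a configurable number of calls per batch. The helper name build_members_script and its parameters are hypothetical; the VKScript body mirrors the one in the question, and the upper limit of 25 is the figure mentioned above.

```python
def build_members_script(group_id, offset=0, calls=25, fields=""):
    """Return VKScript text that makes `calls` groups.getMembers requests.

    Hypothetical helper: each request asks for 1000 members, so one
    execute call covers calls * 1000 members starting at `offset`.
    """
    if not 1 <= calls <= 25:
        raise ValueError("a script may contain at most 25 API calls")
    params = '"group_id": %d, "offset": offset, "count": 1000' % group_id
    if fields:
        params += ', "fields": "%s"' % fields
    return (
        "var i = 0;\n"
        "var members = [];\n"
        "var offset = %d;\n"
        "while (i < %d) {\n"
        '    members.push(API.groups.getMembers({%s})["items"]);\n'
        "    i = i + 1;\n"
        "    offset = offset + 1000;\n"
        "}\n"
        "return members;\n"
    ) % (offset, calls, params)
```

With 25 calls per script, the Python-side offset would advance by 25000 per execute call instead of 11000.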
I also suspect slowdowns caused by the request rate. The vk_api library, if that is indeed what is being used, handles rate-limit errors (more than 3 requests per second) in a simple way: it postpones the next attempt by half a second. See too_many_rps_handler.
It is more sensible to track the time yourself and stay exactly within 3 requests per second.
You mention 1,200-1,800 users per second, which is really quite slow, given that with the code in the example the first 33 thousand (3 requests of 11,000 each) should certainly have come back within the first second.
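Tracking the time yourself could look roughly like this. It is only a sketch: the RateLimiter class is hypothetical and not part of vk_api, and the clock and sleep parameters exist only so that it can be tested without real waiting.

```python
import collections
import time


class RateLimiter:
    """Allow at most max_calls calls within any window of `period` seconds."""

    def __init__(self, max_calls=3, period=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.stamps = collections.deque()  # times of the most recent calls

    def wait(self):
        """Block until one more call fits into the window, then record it."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.stamps and now - self.stamps[0] >= self.period:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.stamps[0]))
            now = self.clock()
            self.stamps.popleft()
        self.stamps.append(now)
```

In the question's loop, one would create limiter = RateLimiter() once and call limiter.wait() right before each vk.execute(...), so that the library's half-second delay on rate-limit errors is never triggered.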

Upd. I tried it without any Python: I created a stored procedure right on the VK application page and fetched the list of group members in it. Same picture. If only user ids are requested, it runs quickly.
As soon as the fields city and can_write_private_message are added, it thinks for a long time and fails with an error.
Apparently, one has to "understand and forgive" VK, which cannot cope with such a load within its time/resource limits, and either split the work into smaller batches or make additional requests.
For example, if such parsing is needed not once but again and again, it makes sense to cache user data (hoping that users do not change their city or open/close their messages every day) and, on repeated runs, first fetch only the ids and then request city and can_write_private_message only for new members.
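The caching idea could be sketched as follows. Everything here is an assumption made for illustration: refresh_member_cache, fetch_ids and fetch_details are hypothetical names, the two callables stand in for whatever VK requests you actually make (an ids-only member list, and a per-user request with the extra fields), and the batch size of 1000 is a guess that should be checked against the API limits.

```python
def refresh_member_cache(cache, fetch_ids, fetch_details, batch_size=1000):
    """Update `cache` (dict: user id -> user data) in place.

    fetch_ids() returns the current member ids (the cheap, ids-only request).
    fetch_details(batch) returns a list of user dicts, each with an "id" key,
    for the given ids (the expensive request with extra fields).
    Returns the list of ids that were not in the cache before.
    """
    ids = list(fetch_ids())
    new_ids = [uid for uid in ids if uid not in cache]
    # Request the expensive fields only for members not seen before.
    for start in range(0, len(new_ids), batch_size):
        batch = new_ids[start:start + batch_size]
        for user in fetch_details(batch):
            cache[user["id"]] = user
    return new_ids
```

The cache itself could be kept between runs in any convenient form, for example the CSV file the script already writes. Note that this sketch never refreshes or removes existing entries, which matches the "hope they do not change every day" assumption above.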
