Python
Ruslan Mordovanech, 2021-12-13 14:31:44

How can I make the code download the full information from the links?

import asyncio
import uuid
import aiohttp
import async_timeout
import requests

link = 'https://dsa.court.gov.ua/open_data_json.php?json=532'

response = requests.get(link).json()
urls = []
for item in response['Файли']:
    urls.append(list(item.values())[0])

async def get_url(url, session):
  file_name = str(uuid.uuid4())
  async with async_timeout.timeout(120):
    async with session.get(url) as response:
      with open(file_name, 'wb') as fd:
        async for data in response.content.iter_chunked(9000):
          fd.write(data)
          return 'Successfully downloaded ' + file_name

async def main(urls):
  async with aiohttp.ClientSession() as session:
    tasks = [get_url(url, session) for url in urls]
    return await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
results = loop.run_until_complete(main(urls))
print('\n'.join(results))

All the files start downloading, but none of them downloads to the end.
Only a small part of each file gets written, so I gather the download process is being interrupted. And for some reason the result isn't recognized as CSV.


2 answers
Vlad Grigoriev, 2021-12-13
@Hery1

async def get_url(url, session):
  file_name = str(uuid.uuid4())
  async with async_timeout.timeout(120):
    async with session.get(url) as response:
      with open(file_name, 'wb') as fd:
        async for data in response.content.iter_chunked(9000):
          fd.write(data)
          return 'Successfully downloaded ' + file_name  # <- here you exit both the loop and the function, having written only a single chunk

Like this — write everything first, and return only after the loop has finished:

        async for data in response.content.iter_chunked(9000):
          fd.write(data)
          print(data)
        return 'Successfully downloaded ' + file_name
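The indentation bug is easy to reproduce without any networking — below, a plain generator stands in for `response.content.iter_chunked` (the names `iter_chunks`, `save_buggy`, and `save_fixed` are illustrative, not from aiohttp):

```python
def iter_chunks():
    # stand-in for response.content.iter_chunked(9000)
    yield from (b"chunk1", b"chunk2", b"chunk3")

def save_buggy():
    written = b""
    for data in iter_chunks():
        written += data
        return written  # returns INSIDE the loop: only the first chunk survives

def save_fixed():
    written = b""
    for data in iter_chunks():
        written += data
    return written  # returns AFTER the loop: all chunks are written

print(len(save_buggy()), len(save_fixed()))  # 6 18
```

The only difference between the two functions is the indentation of `return` — exactly the fix shown above.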

Ruslan Mordovanech, 2021-12-13
@Hery1

import requests
from multiprocessing.pool import ThreadPool

link = 'https://dsa.court.gov.ua/open_data_json.php?json=532'

response = requests.get(link).json()
urls = []
for item in response['Файли']:
    urls.append(list(item.values())[0])

def download_url(url):
  print("downloading: ",url)
  file_name_start_pos = url.rfind("/") + 1
  file_name = url[file_name_start_pos:]

  r = requests.get(url, stream=True)
  if r.status_code == requests.codes.ok:
    with open(file_name, 'wb') as f:
      for data in r:
        f.write(data)
  return url


results = ThreadPool(5).imap_unordered(download_url, urls)
for r in results:
    print(r)

I found code that downloads all the files at once and without interruptions. Still need to test it properly.
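The core pattern in this answer — `ThreadPool.imap_unordered` feeding a blocking worker — can be sanity-checked with a stdlib-only sketch, where `work` is a hypothetical stand-in for `download_url`:

```python
from multiprocessing.pool import ThreadPool

def work(x):
    # stand-in for a blocking download; just squares the input
    return x * x

# imap_unordered yields each result as soon as its worker finishes,
# in arbitrary order, so we sort for a deterministic view
with ThreadPool(3) as pool:
    results = sorted(pool.imap_unordered(work, range(5)))

print(results)  # [0, 1, 4, 9, 16]
```

`imap_unordered` is what makes the downloads appear "at the same time": up to five threads (in the answer's code) block on `requests.get` concurrently, and each URL is printed the moment its download completes.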
