Perl
WebSpider, 2015-01-04 05:35:54

What is the best programming language for this task?

I need to organize a massively parallel (multi-threaded) download of a large number of relatively small files (up to 5 MB each) over HTTP and then, after each download completes, read and parse the server response (in the general case, HTML). Which programming languages are best suited for this task? So far I'm leaning towards Perl; can you recommend something better? Preferably with reasons.
Thanks in advance!


4 answer(s)
asd111, 2015-01-04
@WebSpider

Python.
Instead of multithreading, you can use grequests: it is the requests library running on top of gevent, i.e. non-blocking I/O.
And if you do go with multithreading, here is a single-threaded example to start from:

import requests

filename = 'test_file'
# Open in binary mode so the file is sent byte-for-byte
with open(filename, 'rb') as f:
    r = requests.post(url='http://upload.example.com', data={'title': 'test_file'}, files={'file': f})
print(r.status_code)
print(r.headers)

Manual for the requests library:
docs.python-requests.org/en/latest/index.html
An example of multithreading:
import threading
from random import randint
from time import sleep

def printNumber(number):
   # Sleeps for a random 1 to 10 seconds
   sleep(randint(1,10))
   print(number)

thread_list = []

for i in range(1,10):
   # Instantiates the thread
   # args must be a tuple: (i) is just i, so write (i,)
   t = threading.Thread(target=printNumber, args=(i,))
   # Sticks the thread in a list so that it remains accessible 
   thread_list.append(t)

# Starts threads
for thread in thread_list:
   thread.start()

# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
   thread.join()

# Demonstrates that the main thread waited for the worker threads to complete
print("Done")
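
The two examples above can be combined for the task in the question: download many pages in parallel, then parse each response. Below is a minimal sketch, assuming Python 3 and only the standard library (urllib instead of requests, a thread pool instead of hand-managed threads). The names `TitleParser`, `extract_title`, `fetch` and `download_and_parse` are hypothetical, and pulling out the page title is just a stand-in for whatever parsing is actually needed.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Parse an HTML string and return its title ('' if there is none)."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def fetch(url, timeout=30):
    """Download one URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def download_and_parse(urls, fetch=fetch, max_workers=8):
    """Fetch all URLs in a thread pool and map each URL to its page title."""
    def job(url):
        return url, extract_title(fetch(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(job, urls))
```

Usage would look like `titles = download_and_parse(['http://example.com/'])`. In this sketch an exception raised by any single download propagates out of `pool.map`, so a real crawler would probably catch errors inside `job` and record them per URL.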

maaGames, 2015-01-04
@maaGames

Any language that you know well enough to accomplish this task and that is supported by the server you are using. With any of the languages mentioned you will spend more time downloading the files than parsing them, so choose based on your own knowledge and ease of implementation.
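
One way to check the download-versus-parse claim for a particular workload is to time the two phases separately. A minimal sketch, assuming Python 3; `timed` is a hypothetical helper, and the sleep and `len` calls are stand-ins for a real download and a real parser:

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in for a real download: sleeping imitates network latency.
body, download_time = timed(lambda: time.sleep(0.05) or "<p>hello</p>")
# Stand-in for real parsing: a trivial string operation.
length, parse_time = timed(len, body)
print("download: %.3fs, parse: %.6fs" % (download_time, parse_time))
```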

OnYourLips, 2015-01-04
@OnYourLips

Any general-purpose language.

"So far I'm leaning towards Perl; can you recommend something better?"
And why not Cobol?
Perl is legacy only these days; there is no need to drag it into new projects. It did use to be used for tasks like this, though.
The closest modern analogues of Perl are Ruby (very similar in elements of its syntax) and Python.
I recommend one of those.

WebSpider, 2015-01-04
@WebSpider

What about Go?
