Java
Alexey Pomogaev, 2012-02-13 08:15:14

Java Futures vs Goroutines

I am writing a crawler bot that requests a URL and fetches its content for further processing. The processor can run bots in multiple threads, which raises the question: which will be faster, Java threads or green threads (lightweight threads, as in Go)?

Let's say I use HttpClient to fetch content from a given URL. I think it makes sense to create Future tasks and run them on 100-200 Java threads (numbers picked arbitrarily) through a shared thread pool. HttpClient would then run in those Java threads, requesting and receiving content by URL. With pings of roughly 100 ms, one such fetch can take up to 600 ms.

If I understand correctly, thanks to non-blocking IO the thread running the HttpClient code will sleep for at least 100 ms, so these 100-200 threads will be used efficiently: each sleeps while waiting for data and wakes up to receive it.

And of course there will be a separate thread that walks over the Futures in an infinite loop, checks which data has arrived, and hands it to another thread for processing.
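The pool-plus-Futures setup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch` is a hypothetical stand-in that sleeps instead of calling HttpClient, and the class and method names are made up. It also swaps the infinite polling loop for `ExecutorCompletionService`, which hands back Futures in the order they finish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CrawlSketch {

    // Hypothetical stand-in for the real HttpClient call: it only sleeps to
    // imitate network latency and returns a fake body.
    static String fetch(String url, long latencyMillis) throws InterruptedException {
        Thread.sleep(latencyMillis);
        return "content of " + url;
    }

    // Submits one task per URL to a fixed pool and collects the results in
    // completion order, so no thread has to poll the Futures in a loop.
    static List<String> crawl(List<String> urls, int poolSize, long latencyMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            CompletionService<String> done = new ExecutorCompletionService<>(pool);
            for (String url : urls) {
                done.submit(() -> fetch(url, latencyMillis));
            }
            List<String> results = new ArrayList<>();
            for (int i = 0; i < urls.size(); i++) {
                // take() blocks until the next task finishes, whichever it is.
                results.add(done.take().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            urls.add("http://example.invalid/page" + i); // placeholder URLs
        }
        long start = System.nanoTime();
        List<String> pages = crawl(urls, 10, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(pages.size() + " pages in about " + elapsedMs + " ms");
    }
}
```

Since the tasks spend nearly all their time waiting, the pool size rather than the CPU decides how long the batch takes in this sketch.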

Did I get it right? Or does it make sense to use Go with goroutines for this instead of Java with threads; would that be faster? Or is there a better way to do it in Java?

UPDATE
Pings can be as long as 1000 ms, which means the more threads, the better, so that each thread holds its own connection. With a thread pool of, say, 100 threads and 3000 bots, it will be slow: as soon as a thread is released, a new bot grabs it and sleeps for 1000 ms on IO, so the queue is 30 bots per thread. If instead each bot has its own thread, or at most 2-3 bots are queued per thread, it will be faster.
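As a back-of-the-envelope check of that reasoning, here is a small sketch. It assumes every bot blocks one pool thread for the full latency and ignores CPU and connection-setup costs, so it gives only a rough lower bound; the class and method names are made up for illustration.

```java
class PoolEstimate {

    // Rough lower bound on wall-clock time when each bot occupies one pool
    // thread for latencyMillis and everything else is treated as free.
    static long estimateMillis(int bots, int threads, long latencyMillis) {
        long rounds = (bots + threads - 1) / threads; // ceiling division
        return rounds * latencyMillis;
    }

    public static void main(String[] args) {
        int bots = 3000;
        long latencyMillis = 1000;
        for (int threads : new int[] {100, 1000, 3000}) {
            System.out.println(threads + " threads: at least "
                    + estimateMillis(bots, threads, latencyMillis) + " ms");
        }
    }
}
```

Under these assumptions, 100 threads need 30 rounds of 1000 ms each, while one thread per bot needs a single round.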

The only question is what the limits are on, say, a simple Core 2 Duo processor. What will the performance difference be between lightweight Go threads and heavy Java ones? 10,000 Java threads vs 10,000 goroutines with IO waits of 100-1000 ms? Apparently no one here has tested this, so I'll figure it out myself.


3 answers
bald2b, 2012-02-14
@bald2b

Why do you think Java threads are so heavy? If anything, it is HttpClient that is heavy for simple download tasks :)
If I were you, instead of messing around with Go, I would write my own simple content downloader in place of HttpClient; that will pay off more. As for the number of Java threads: each thread reserves about 2 MB of memory for its stack by default when created (the exact default depends on the platform and JVM, and it can be reduced with the JVM -Xss option), so work out how many you can afford to run.
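To put numbers on that, here is a minimal sketch that takes 2 MB per thread as an assumption and multiplies it out. It also shows the `Thread` constructor that accepts a requested stack size; the JVM documents that value as a hint that some platforms may ignore. The class and method names are made up for illustration.

```java
class StackBudget {

    // Upper bound on address space reserved for thread stacks, in megabytes.
    static long reservedStackMb(int threads, long stackKb) {
        return threads * stackKb / 1024;
    }

    // Starts a thread with a requested stack size in bytes. The JVM treats
    // this as a hint and may round it or ignore it on some platforms.
    static Thread startWithStack(Runnable task, long stackBytes) {
        Thread t = new Thread(null, task, "small-stack-worker", stackBytes);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("10000 threads at 2048 KB: "
                + reservedStackMb(10_000, 2048) + " MB");
        System.out.println("10000 threads at 256 KB: "
                + reservedStackMb(10_000, 256) + " MB");

        Thread t = startWithStack(
                () -> System.out.println("running with a smaller stack request"),
                256 * 1024);
        t.join();
    }
}
```

The figure is reserved address space rather than memory that is necessarily in use, so it is a ceiling on the cost, not a measurement.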

Beholder, 2012-02-14
@Beholder

The slowest part will be the network, not the threads.

physics, 2014-04-16
@physics

For a similar problem, I would look at Akka (example on Habr: habrahabr.ru/post/125717/). Each request task would be wrapped in an actor class. You could handle the multithreading yourself, but then you have to take care of thread pools and so on.
