ruby
Alexv01, 2015-02-02 12:33:44

What should I write a site parser in: PHP or Ruby?

What should I write a site parser in?
I know PHP well; I don't know Ruby at all :)
The parser should be multi-threaded and fast,
so I think I would need to learn Ruby for this.
Or is PHP enough?


9 answer(s)
webbus, 2015-02-02
@Alexv01

Sensible people don't go looking for adventures: they take Python and Scrapy and get a multi-threaded parser out of the box.

Eugene Burmakin, 2015-02-02
@Freika

Use whatever you know best. You know PHP, so write it in PHP. If you want to pick up Ruby along the way, write it in Ruby. For this purpose the two languages are practically equally capable.

Sergey, 2015-02-02
@butteff

In general PHP, especially multi-threaded PHP, will run very slowly.
I would write it as a desktop program in some other language rather than in PHP.
But just in case, here is a link to a library that makes life much easier:
simplehtmldom.sourceforge.net

Crash, 2015-02-02
@Bandicoot

PHP is probably enough.

Evgeny Kalibrov, 2015-02-02
@rework

In my opinion it makes little difference which language you write it in, so I advise you to use whichever you like more.
For parallel requests in PHP you can use the cURL extension and its curl_multi_exec function. I think something similar can be used from Ruby.

Sergey Ivanov, 2015-02-02
@Writerim

A long time ago I wrote a parser like this: bash + curl to download the pages, then the same bash script cut out the piece I needed, and I piped that through the console to a PHP script. It worked very quickly, even on large volumes.
Now I would love to try something ready-made.

asd111, 2015-02-02
@asd111

Java + jsoup. If the site's content is generated by JavaScript, use Selenium instead of jsoup.
Multithreading is easy to do in Java, and Java is easy to write after PHP.

// Run parse() (your own scraping method) on a separate thread.
Thread t = new Thread(new Runnable() {
    @Override
    public void run() {
        parse();
    }
});
t.start();
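
Starting one thread per page gets unwieldy once there are many pages. A fixed-size thread pool is a common alternative, and below is a minimal sketch of it using only the standard library. The names ParallelParser, parseAll and parsePage are hypothetical: parsePage stands in for the parse() above and is passed in as a function, so the real download and extraction code (jsoup, for example) can be plugged in. The URLs in main are placeholders and nothing is actually downloaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelParser {

    // Runs parsePage on every URL using a pool of `threads` worker threads.
    // parsePage is a hypothetical stand-in for your own fetch-and-parse code.
    static List<String> parseAll(List<String> urls,
                                 Function<String, String> parsePage,
                                 int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per URL; the pool limits how many run at once.
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> parsePage.apply(url);
                futures.add(pool.submit(task));
            }
            // Collect results in the same order the URLs were given.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task finishes
            }
            return results;
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URLs and a fake parser, for illustration only.
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/b");
        List<String> parsed = parseAll(urls, url -> "parsed " + url, 4);
        for (String line : parsed) {
            System.out.println(line);
        }
    }
}
```

The pool size caps how many requests are in flight at once, which is usually kinder to the target site than an unbounded number of threads.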

Chups23, 2020-05-06
@Chups23

Good afternoon. Parsers can be written in practically any language, but some languages are better suited to the job. Of these, I would recommend: 1. Python, 2. PHP, 3. JavaScript, 4. Ruby, 5. Java and .NET.
You can choose any of these options!
