Computer networks
Sergei Cojocaru, 2013-04-25 21:23:16

Finding identical strings

In short, the task is to parse a couple of hundred thousand web addresses and find identical strings in them. And it should not take a couple of thousand years ;). Internet speed is not a problem. The sticking point is the algorithm for finding identical strings. Which direction should I look in?


3 answers
boodda, 2013-04-25
@boodda

What do you mean by identical strings?
Does letter case matter?
What are the length limits?
Have you already settled on the length of a string, the length of the words, and the number of words per string?
Or do you intend to search for strings as long as War and Peace?

Seter17, 2013-04-25
@Seter17

Well, it all comes down to the data structure you are going to use. Hash tables should help, I guess.
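
To make the hash table idea concrete, here is a minimal sketch in Python. It assumes a "string" is one line of page text and that case and surrounding whitespace do not matter; the question does not settle either point. The name find_shared_strings and the sample addresses are made up for illustration.

```python
from collections import defaultdict


def find_shared_strings(pages):
    """Return the strings that occur on two or more addresses.

    pages maps an address to the text already downloaded from it.
    A dict (hash table) maps every string to the set of addresses
    it was seen on, so each string is processed once on average.
    """
    seen = defaultdict(set)
    for url, text in pages.items():
        for line in text.splitlines():
            # Assumption: ignore case and surrounding whitespace.
            key = line.strip().lower()
            if key:
                seen[key].add(url)
    # Keep only the strings found on more than one address.
    return {s: urls for s, urls in seen.items() if len(urls) > 1}


# Hypothetical sample data standing in for the downloaded pages.
pages = {
    "http://a.example/": "Hello world\nOnly on A",
    "http://b.example/": "hello world\nOnly on B",
}
shared = find_shared_strings(pages)
```

The work grows roughly linearly with the total amount of text, because each lookup and insert in the dict takes constant time on average. If the strings do not fit in memory, the same idea works with a stored digest of each string instead of the string itself.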

boodda, 2013-04-25
@boodda

There are already open-source search bot engines; try using them.
Or build a dictionary of words (ID | word),
then a dictionary of word forms,
and then convert the sentences into a stream of IDs (ID1 ID2 ID3) and search in the database.
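
A rough sketch of the dictionary approach, in Python. Everything here is an assumption made for illustration: the class and function names, the tiny hand-written word_forms table, and the in-memory dict that stands in for the database.

```python
from collections import defaultdict


class WordDictionary:
    """Assigns a numeric ID to every distinct word (the ID | word table)."""

    def __init__(self):
        self.ids = {}

    def word_id(self, word):
        # Existing words keep their ID; new words get the next free one.
        return self.ids.setdefault(word, len(self.ids) + 1)

    def encode(self, sentence, word_forms=None):
        """Turn a sentence into a tuple of word IDs (ID1 ID2 ID3 ...)."""
        forms = word_forms or {}
        words = sentence.lower().split()
        # Map each word form to its base word before looking up the ID.
        return tuple(self.word_id(forms.get(w, w)) for w in words)


def find_duplicates(sentences, word_forms=None):
    """Group the positions of sentences that encode to the same ID stream."""
    dictionary = WordDictionary()
    index = defaultdict(list)  # stand-in for the database lookup
    for pos, sentence in enumerate(sentences):
        index[dictionary.encode(sentence, word_forms)].append(pos)
    return [positions for positions in index.values() if len(positions) > 1]


# Hypothetical sample data.
sentences = ["The cats sleep", "the cat sleeps", "Dogs bark"]
word_forms = {"cats": "cat", "sleeps": "sleep"}
groups = find_duplicates(sentences, word_forms)
```

Comparing short tuples of integers is cheaper than comparing the full text, and the word forms table lets sentences that differ only in inflection count as the same.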
