Algorithms
Pavel, 2015-12-19 09:24:51

How to count the number of distinct substrings of a string and list one copy of each of those substrings?

Hello. How can the problem in the title be solved in time linear in the length of the input string? I assume a suffix tree is needed, but a suffix tree alone is clearly not enough here.


2 answers
AM5800, 2015-12-19
@AM5800

The total number of substrings of a string of length n is the binomial coefficient C(n + 1, 2), i.e. n(n + 1)/2.
That is, in the worst case you have on the order of n^2 distinct substrings, and in the best case n. But I wouldn't get too excited about the best case, because best-case strings look something like this: "aaaaaaaaaaaaaaaa". That is 16 characters and 16 unique substrings.
What I'm getting at is this: with up to about n^2 substrings to output, nothing meaningful will come of a linear-time approach.
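To make the counts above concrete, here is a small brute-force sketch. The function names are made up for illustration, and the approach is deliberately naive (it materializes every substring), so it is only meant for checking small inputs.

```python
def total_substrings(n: int) -> int:
    """Number of (start, end) substring positions in a string of length n."""
    return n * (n + 1) // 2


def distinct_substrings(s: str) -> set:
    """One copy of each distinct non-empty substring, by brute force."""
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


if __name__ == "__main__":
    s = "a" * 16
    print(total_substrings(len(s)))      # all positions, counted with repeats
    print(len(distinct_substrings(s)))   # distinct substrings only
```

For the string of 16 letters "a" this gives 136 substring positions but only 16 distinct substrings, which matches the best case described above.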
The number of unique substrings can be calculated as follows:
https://www.quora.com/Given-a-string-how-do-I-find...
Unfortunately, the author did not analyze the complexity of his algorithm. But it appears to run in O(n log n), since the suffix array is built in O(n log n) and the LCP array is computed in O(n).
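A rough sketch of the suffix array plus LCP idea follows. It is not taken from the linked post. It assumes the usual identity that the number of distinct substrings equals n(n + 1)/2 minus the sum of the LCP array. For brevity the suffix array is built by prefix doubling with an ordinary sort, which is O(n log^2 n) rather than O(n log n); the LCP array uses Kasai's linear-time method. All function names are illustrative.

```python
def suffix_array(s: str) -> list:
    """Suffix array by prefix doubling (O(n log^2 n) with a comparison sort)."""
    n = len(s)
    rank = [ord(c) for c in s]
    sa = list(range(n))
    k = 1
    while True:
        # Sort suffixes by their first 2k characters, using ranks of k-prefixes.
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            bump = 1 if key(sa[j]) != key(sa[j - 1]) else 0
            new_rank[sa[j]] = new_rank[sa[j - 1]] + bump
        rank = new_rank
        if n == 0 or rank[sa[-1]] == n - 1:  # all ranks are distinct
            return sa
        k *= 2


def lcp_array(s: str, sa: list) -> list:
    """Kasai's algorithm: lcp[p] is the LCP of suffixes sa[p-1] and sa[p]."""
    n = len(s)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and s[i + h] == s[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp


def count_distinct_substrings(s: str) -> int:
    """Each suffix contributes its length minus the LCP with its predecessor."""
    n = len(s)
    return n * (n + 1) // 2 - sum(lcp_array(s, suffix_array(s)))
```

Each suffix in sorted order contributes only those prefixes that its predecessor did not already cover, which is where the subtraction of the LCP values comes from.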

Pavel, 2015-12-20
@PavelG94

I think this can still be done in linear time. It requires a suffix tree for the input string; how to build one is described here:
habrahabr.ru/post/258121
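As a hedged illustration of linear-time counting, the sketch below uses a suffix automaton rather than the suffix tree mentioned above; it is a related structure that is shorter to write down. It is not the method from the linked article. With dictionary transitions, construction takes expected O(n) time for a fixed alphabet. Each state v accounts for length[v] - length[link[v]] distinct substrings. This only counts the substrings; listing them takes time proportional to the size of the output.

```python
def count_distinct_substrings(s: str) -> int:
    """Count distinct non-empty substrings of s with a suffix automaton."""
    nxt = [{}]      # transitions of each state
    link = [-1]     # suffix links
    length = [0]    # length of the longest string in each state
    last = 0        # state that corresponds to the whole current prefix

    for ch in s:
        cur = len(nxt)
        nxt.append({})
        length.append(length[last] + 1)
        link.append(-1)

        p = last
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]

        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps the shorter strings of q.
                clone = len(nxt)
                nxt.append(dict(nxt[q]))
                length.append(length[p] + 1)
                link.append(link[q])
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur

    # State 0 is the initial state and represents only the empty string.
    return sum(length[v] - length[link[v]] for v in range(1, len(nxt)))
```

In a suffix tree the same number is the total length of all edge labels (excluding any terminator symbol), so either structure gives the count in linear time.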
