A
A
AndreyRaf132019-12-07 18:01:27
Python
AndreyRaf13, 2019-12-07 18:01:27

Check for overlapped part in "Longest Repeated Substring" algorithm?

I am solving the problem:

You have to return a substring length. You need to find a substring that repeats more than once in a given string. This reiteration shouldn't have overlaps. For example, in a string "abcab" the longest substring that repeats more than once is "ab", so the answer should be 2 (length of "ab")
Input: String.
Output: Int.
Example:
double_substring('aaaa') == 2
double_substring('abc') == 0
double_substring('aghtfghkofgh') == 3 # fgh

Here is the solution I wrote based on the algorithm I found:
"For each two adjacent suffixes on the sorted suffix array, count the length of the common prefix (Note: should avoid counting the overlapped part). The longest repeated prefix appearing first on the original input string is the result."
The code:
def function(string):
    suffixes = sorted([string[i:] for i in range(len(string))])
    print(suffixes)

    longest_prefix = 0

    for i in range(0, len(suffixes) - 1):
        first_suffix = suffixes[i]
        second_suffix = suffixes[i+1]
        counter = 0
        for j in range(0, min(len(first_suffix), len(second_suffix))):
            if first_suffix[j] == second_suffix[j]:
                counter += 1
            else:
                break

        if counter > longest_prefix:
            longest_prefix = counter
    return longest_prefix

print(function('aghtfghkofgh'))

This code is bad because it does not take into account overlapped parts. That is, for the string "aaaa" it will return 3, but these strings intersect. How to add this code so that it checks if substrings intersect?

Answer the question

In order to leave comments, you need to log in

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question