Answer the question
In order to leave comments, you need to log in
How to find the largest sequence of characters that is repeated in the text more than once?
It is necessary to take the text with a change of type String and find the largest sequence of characters that is repeated in the text more than once.
Spaces should also be taken into account.
For example, there is text:
Toster site , blah blah blah Toster site
Site toster
aboi booolsa b booolsa df
booolsa
Answer the question
In order to leave comments, you need to log in
Solution :
The LongestSubstringFinder class that performs this operation:
import java.util.Arrays;
public final class LongestSubstringFinder {
private LongestSubstringFinder() {}
public static String findLongestSubstringAtString(String string) {
string = string.replaceAll("\\s+", " ");
int quantityOfSuffixes = string.length();
String[] suffixes = new String[quantityOfSuffixes];
for (int i = 0; i < quantityOfSuffixes; i++) {
suffixes[i] = string.substring(i, quantityOfSuffixes);
}
Arrays.sort(suffixes);
String longestRepeatedSubstring = "";
for (int i = 0; i < quantityOfSuffixes - 1; i++) {
String x = findLongestCommonPrefixOfTwoStrings(suffixes[i], suffixes[i + 1]);
if (x.length() > longestRepeatedSubstring.length()) {
longestRepeatedSubstring = x;
}
}
return longestRepeatedSubstring;
}
private static String findLongestCommonPrefixOfTwoStrings(String first, String second) {
int quantity = Math.min(first.length(), second.length());
for (int i = 0; i < quantity; i++) {
if (first.charAt(i) != second.charAt(i))
return first.substring(0, i);
}
return first.substring(0, quantity);
}
}
@Test
public void testFindLongestSubstringAtStringFirst() throws Exception {
String testString = "Сайт toster, бла бла бла Сайт toster";
String neededResult = "Сайт toster";
String longestSubstring = LongestSubstringFinder
.findLongestSubstringAtString(testString);
assertEquals(longestSubstring, neededResult);
}
@Test
public void testFindLongestSubstringAtStringSecond() throws Exception {
String testString = "aboibooolsabbooolsadf";
String neededResult = "booolsa";
String longestSubstring = LongestSubstringFinder
.findLongestSubstringAtString(testString);
assertEquals(longestSubstring, neededResult);
}
Idea:
We take N sequences of length 1, starting at elements 0..N-1, write out the position of their 1st character (0..N-1) to the array, sort by the character in this position. For groups with the same character, we call recursively, considering the 2nd character from the position recorded in the array, etc.
Pseudocode:
find(N,A[])
p[N]={0..N-1} // позиции начала подстрок
l=0,ml=0,s // длина, макс длина подстроки, позиция подстроки макс длины
finda(0,N)
// искать среди позиций p[i..j-1]
def finda(i,j)
if l>ml && j-i>1
ml=l
s=p[i]
++l
// оставим в p[i..j-1] позиции <= N-l
m=j,k=j=i
while k<m
if p[k]<=N-l
p[j++]=p[k]
++k
sort p[i..j-1]):A[p[i]+l]
k=m=i
while ++k<j
if(A[p[k]+l]!=A[p[k-1]+l]) // конец группы подстрок с равным символом на позиции l от начала
finda(m,k)
m=k
finda(m,j) // последняя группа подстрок с равным символом на позиции l от начала
--l
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question