A
A
artshelom2017-07-10 08:48:03
Java
artshelom, 2017-07-10 08:48:03

Is it possible to further optimize the text task?

Is it possible to increase the speed of execution of this code even more ?
391b33ff5b2e47289ad7c4e154e7a2ab.png
Challenge:
The Japanese are endlessly in love with the technology that surrounds them. They closely follow all the technical innovations and try to use the most modern and "smart" of them. Den and Sergey have a brilliant plan: they want to create a text editor that will conquer the Japanese. The most important uberintelligent editor feature should be auto-completion. If the user has typed the first few letters of a word, the editor should suggest the most plausible endings.
Dan and Sergey have already collected a huge amount of Japanese texts. For each Japanese word, they counted the number of times it occurs in the texts. If the user has already entered several letters, then the editor should show no more than ten of the most frequently used words starting with the letters entered by the user, sorted in descending order of frequency of mention.
Help Sergey and Dan revolutionize the text editor market.
Initial data
The first line contains a single integer N (1 ≤ N ≤ 105) — the number of words in the found texts. Each of the next N lines contains the word wi (a non-empty sequence of lowercase Latin letters no longer than 15) and an integer ni (1 ≤ ni ≤ 106) — the number of times this word occurs in the texts. The word and number are separated by a single space. No word is repeated more than once. The (N + 2)th line contains the number M (1 ≤ M ≤ 15000). The next M lines contain the words ui (a non-empty sequence of lowercase Latin letters no longer than 15) — the beginning of the words entered by the user.
Can this code be further optimized?

private List<StringIntObj> sort(List<StringIntObj> list){//StringIntObj- это объект который содержит слово и мощность этого слова
        List<StringIntObj> sortList = new ArrayList<>(10);
        if (list==null||list.size()==0)
            return null;
        StringIntObj bottom = list.get(0);
        for (int i=0;i<list.size();i++){
            if (sortList.size()<10){
                sortList.add(list.get(i));
                if (bottom.compareTo(list.get(i))>0)
                    bottom = list.get(i);//Находит минимум из 1 десяти!
            }else{
                if (bottom.compareTo(list.get(i))<0)
                    continue;
                Collections.sort(sortList);
                if (sortList.get(9).compareTo(list.get(i))>0)
                    sortList.set(9, list.get(i));
                for (int a=0;a<sortList.size()-1;a++){
                    if (sortList.get(a).compareTo(list.get(i))>0&&sortList.get(a+1).compareTo(list.get(i))<0){
                        sortList.set(a, list.get(i));
                        bottom=sortList.get(0);
                        break;
                    }
                }
            }
        }
        return sortList;
    }


    private String parsText(String word){//Ищет подходящие слова, если их не находит возвращает пустой массив.
        List<StringIntObj> list = sort(tree.search(word));
//        List<StringIntObj> list = (tree.search(word));
        //Если использовать в сортировки TreeSet то время увеличиться до 25 секунд. Так что нужно дальше думать!
        StringBuffer buffer = new StringBuffer();
        if (list == null||list.size()==0)
            return null;
        Collections.sort(list);
        for (int i = 0;i<10&&i<list.size();i++){
            buffer.append(list.get(i).get()).append(" ");
        }
        return String.valueOf(buffer);
    }

Answer the question

In order to leave comments, you need to log in

1 answer(s)
L
longclaps, 2017-07-11
@artshelom

I'm not a Javist, so I decided to practice.
Here is a java implementation on HashMap, rewritten from python:

import java.io.*;
import java.util.*;

class CountedWord {
    int c;
    String w;

    public CountedWord(String s) {
        String[] l = s.split(" ", 2);
        this.c = Integer.parseInt(l[1]);
        this.w = l[0];
    }
}

public class Main {
    public static int TEN = 10;

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        HashMap<String, CountedWord[]> prefixMap = new HashMap<>();
        BufferedReader data = new BufferedReader(new FileReader("test.in"), 0x10000);
        for (int i = Integer.parseInt(data.readLine()); i > 0; i--) {
            CountedWord cw = new CountedWord(data.readLine());
            String prefix = cw.w;
            for (int le = prefix.length() - 1; le >= 0; le--) {
                CountedWord[] buf = prefixMap.get(prefix);
                if (buf == null) {
                    buf = new CountedWord[TEN];
                    buf[0] = cw;
                    prefixMap.put(prefix, buf);
                } else { // это даже не сортировка вставкой, а просто вставка
                    int j = TEN - 1;
                    if (buf[j] != null && cw.c <= buf[j].c)
                        break; // уже на этом префиксе cw.c оказался меньше всех
                    while (buf[--j] == null) ;
                    j++;
                    while (j-- > 0 && buf[j].c < cw.c) buf[j + 1] = buf[j];
                    buf[j + 1] = cw;
                }
                prefix = prefix.substring(0, le);
            }
        }

        for (int i = Integer.parseInt(data.readLine()); i > 0; i--) {
            String prefix = data.readLine();
            System.out.println(String.format("     ~ %s", prefix));
            CountedWord[] buf = prefixMap.get(prefix);
            if (buf == null) continue;
            for (int j = 0; j < TEN; j++) {
                CountedWord cw = buf[j];
                if (cw == null) break;
                System.out.println(String.format("%6d %s", cw.c, cw.w));
            }
        }
        System.out.println(String.format(
                "\nit takes %.3f sec", 1e-9 * (System.nanoTime() - start)));
    }
}

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question