elasticsearch
zdravnik, 2015-10-27 17:32:46

Elasticsearch fails after a server reboot?

After the server is restarted, Elasticsearch stops writing data. The log shows errors like this:
[2015-10-27 13:20:24,503][DEBUG][action.bulk ] [Torso] [logstash-2015.10.27][3] failed to execute bulk item (index) index.... ....
If I wipe everything with curator delete indices --all-indices and restart Elasticsearch, everything starts working again. Then I reboot the server and the same thing happens.
What is the problem, and how can I fix it?

2 answer(s)
Puma Thailand, 2015-10-27
@opium

How do you start Elasticsearch?
Which script shuts Elasticsearch down when the server reboots?
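
One way to check the second point on a SysV-init machine is sketched below. It assumes a RHEL/CentOS-style layout, where runlevel 6 (reboot) calls the K* links in /etc/rc.d/rc6.d with "stop", and where, as far as I know, the rc script skips services that have no lock file in /var/lock/subsys. The function name check_stop_hook and its optional root-prefix argument are hypothetical, introduced only so the check can be pointed at a test directory.

```shell
#!/bin/sh
# Sketch: report whether a SysV service looks set up to be stopped on reboot.
# Assumes RHEL/CentOS-style paths; check_stop_hook and its arguments are
# hypothetical names used only for this illustration.
check_stop_hook() {
    svc="$1"          # service name, e.g. elasticsearch
    root="${2:-}"     # optional filesystem root prefix (empty = real system)

    found_link=no
    # Runlevel 6 is reboot; K* links are the scripts that get called with "stop".
    for link in "$root"/etc/rc.d/rc6.d/K??"$svc"; do
        [ -e "$link" ] && found_link=yes
    done

    found_lock=no
    # An init script normally touches this file after a successful start.
    [ -f "$root/var/lock/subsys/$svc" ] && found_lock=yes

    echo "kill_link=$found_link lockfile=$found_lock"
}

check_stop_hook elasticsearch
```

If either value comes back as "no" while the service is running, the node is probably being killed at reboot rather than stopped cleanly, which would be worth ruling out first.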

zdravnik, 2015-10-27
@zdravnik

cat /etc/rc.d/init.d/elasticsearch
#!/bin/sh
#
# elasticsearch <summary>
#
# chkconfig:   2345 80 20
# description: Starts and stops a single elasticsearch instance on this system
#

### BEGIN INIT INFO
# Provides: Elasticsearch
# Required-Start: $network $named
# Required-Stop: $network $named
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: This service manages the elasticsearch daemon
# Description: Elasticsearch is a very scalable, schema-free and high-performance search solution supporting multi-tenancy and near realtime search.
### END INIT INFO

#
# init.d / servicectl compatibility (openSUSE)
#
if [ -f /etc/rc.status ]; then
    . /etc/rc.status
    rc_reset
fi

#
# Source function library.
#
if [ -f /etc/rc.d/init.d/functions ]; then
    . /etc/rc.d/init.d/functions
fi

exec="/usr/share/elasticsearch/bin/elasticsearch"
prog="elasticsearch"
pidfile=/var/run/elasticsearch/${prog}.pid

[ -e /etc/sysconfig/$prog ] && . /etc/sysconfig/$prog

export ES_HEAP_SIZE
export ES_HEAP_NEWSIZE
export ES_DIRECT_SIZE
export ES_JAVA_OPTS
export JAVA_HOME

lockfile=/var/lock/subsys/$prog

# backwards compatibility for old config sysconfig files, pre 0.90.1
if [ -n $USER ] && [ -z $ES_USER ] ; then
   ES_USER=$USER
fi

checkJava() {
    if [ -x "$JAVA_HOME/bin/java" ]; then
        JAVA="$JAVA_HOME/bin/java"
    else
        JAVA=`which java`
    fi

    if [ ! -x "$JAVA" ]; then
        echo "Could not find any executable java binary. Please install java in your PATH or set JAVA_HOME"
        exit 1
    fi
}

start() {
    checkJava
    [ -x $exec ] || exit 5
    [ -f $CONF_FILE ] || exit 6
    if [ -n "$MAX_LOCKED_MEMORY" -a -z "$ES_HEAP_SIZE" ]; then
        echo "MAX_LOCKED_MEMORY is set - ES_HEAP_SIZE must also be set"
        return 7
    fi
    if [ -n "$MAX_OPEN_FILES" ]; then
        ulimit -n $MAX_OPEN_FILES
    fi
    if [ -n "$MAX_LOCKED_MEMORY" ]; then
        ulimit -l $MAX_LOCKED_MEMORY
    fi
    if [ -n "$MAX_MAP_COUNT" ]; then
        sysctl -q -w vm.max_map_count=$MAX_MAP_COUNT
    fi
    if [ -n "$WORK_DIR" ]; then
        mkdir -p "$WORK_DIR"
        chown "$ES_USER":"$ES_GROUP" "$WORK_DIR"
    fi
    echo -n $"Starting $prog: "
    # if not running, start it up here, usually something like "daemon $exec"
    daemon --user $ES_USER --pidfile $pidfile $exec -p $pidfile -d -Des.default.path.home=$ES_HOME -Des.default.path.logs=$LOG_DIR -Des.default.path.data=$DATA_DIR -Des.default.path.work=$WORK_DIR -Des.default.path.conf=$CONF_DIR
    retval=$?
    echo
    [ $retval -eq 0 ] && touch $lockfile
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    # stop it here, often "killproc $prog"
    killproc -p $pidfile -d 20 $prog
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}

restart() {
    stop
    start
}

reload() {
    restart
}

force_reload() {
    restart
}

rh_status() {
    # run checks to determine if the service is running or use generic status
    status -p $pidfile $prog
}

rh_status_q() {
    rh_status >/dev/null 2>&1
}


case "$1" in
    start)
        rh_status_q && exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart)
        $1
        ;;
    reload)
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
exit $?

That is the init script.
I did a few more reboots, and on the third or fourth one the problem went away. But I don't like that. I really want to figure out what was wrong.
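
To narrow down a problem like this, it may help to look at the cluster state right after a reboot, before deleting anything. The following is only a sketch: it assumes the node answers on http://localhost:9200 (the default HTTP port) and that curl is installed. health_status and report_after_reboot are hypothetical helper names; the _cluster/health and _cat/shards endpoints are the standard Elasticsearch APIs.

```shell
#!/bin/sh
# Sketch: summarize cluster health and list shards that have not started.
# health_status and report_after_reboot are hypothetical helper names.

# Reads a _cluster/health JSON response on stdin and prints the value of
# its "status" field (green, yellow or red).
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Queries a node (default: localhost:9200, an assumption) and prints the
# cluster status followed by every shard that is not in the STARTED state.
report_after_reboot() {
    url="${1:-http://localhost:9200}"
    echo "cluster status: $(curl -s -m 5 "$url/_cluster/health" | health_status)"
    # UNASSIGNED or INITIALIZING shards here would explain failed bulk items.
    curl -s -m 5 "$url/_cat/shards" | grep -v STARTED
}
```

Calling report_after_reboot once the service is up shows whether the shards of the current logstash index were recovered. A red status or unassigned primaries would point at recovery after an unclean shutdown rather than at the init script itself.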
