Roman Krolikov, 2014-06-30 08:23:57
linux

Why doesn't PostgreSQL start on the Pacemaker master node?

There is a Pacemaker cluster with several resources. The configuration is as follows:

node sky01 \
        attributes standby="off"
node sky02
primitive drbd_fs ocf:heartbeat:Filesystem \
        params device="/dev/vg1/cluster" directory="/cluster" options="noatime,nodiratime" fstype="xfs" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="120"
primitive drbd_sky ocf:linbit:drbd \
        params drbd_resource="sky" \
        op monitor interval="15" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="120"
primitive lvm_vg1 ocf:heartbeat:LVM \
        params volgrpname="vg1" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="30" \
        meta target-role="Started"
primitive mon_sky ocf:pacemaker:ClusterMon \
        params user="root" update="5" extra_options="-E /usr/local/bin/mon_cluster.sh -e [email protected]" \
        op monitor on-fail="restart" interval="30" \
        meta target-role="Stopped"
primitive pub_ip ocf:kumina:hetzner-failover-ip \
        op start interval="0" timeout="360" \
        params ip="5.9.34.18" script="/usr/local/sbin/parse-hetzner-json.py"
group Application lvm_vg1 drbd_fs pub_ip
ms ms_drbd_sky drbd_sky \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
location cli-prefer-Application Application \
        rule $id="cli-prefer-rule-Application" inf: #uname eq sky01
location loc_ms_01 ms_drbd_sky 100: sky01
location loc_ms_02 ms_drbd_sky 10: sky02
location loc_sky_01 Application 100: sky01
location loc_sky_02 Application 10: sky02
colocation col_sky_drbd inf: Application ms_drbd_sky:Master
order ord_sky inf: ms_drbd_sky:promote Application:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1404105298" \
        start-failure-is-fatal="false" \
        stop-all-resources="false" \
        symmetric-cluster="false"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Everything works as expected: when a node fails, the resources migrate.
But as soon as I add a PostgreSQL resource:
crm configure primitive pgsql lsb:postgresql \
 op monitor interval="30" timeout="60" \
 op start interval="0" timeout="60" \
 op stop interval="0" timeout="60"

it immediately starts on the slave node, which results in an error. If the slave node is shut down, postgres migrates to the master and starts normally. As soon as the slave node is brought back up, postgres migrates back to it, and the error returns.
In other words, the group, the location constraints, and the stickiness are completely ignored. What is the problem?
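(Side note on the constraints: pgsql was created as a standalone primitive, so none of the existing group, colocation, or order constraints in the configuration above apply to it. A sketch of one way to tie it to the rest of the stack, assuming the crm shell syntax used elsewhere in this configuration; resource names are taken from the configuration above, and this was not the author's eventual fix:

```shell
# Sketch: make pgsql run where the Application group runs,
# and only after the group has started.
crm configure colocation col_pg inf: pgsql Application
crm configure order ord_pg inf: Application:start pgsql:start
```

Alternatively, pgsql could simply be appended as the last member of the Application group.)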


1 answer
Roman Krolikov, 2014-06-30
@r_krolikov

Solved. The problem was in the /etc/init.d/postgresql script: it contained "set +e", which I removed. Because of it, the cluster behaved strangely and ran PostgreSQL on both nodes.
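The likely mechanism: Pacemaker's lsb resource class relies on the init script returning LSB exit codes, where "status" must return 0 when the service is running and 3 when it is stopped. A stray "set +e" can let a failed check fall through so the script exits 0 everywhere, and Pacemaker then believes PostgreSQL is active on every node. A minimal sketch of a compliant status handler (the pidfile path is illustrative, not the real Debian script):

```shell
#!/bin/sh
# Minimal LSB-style "status" handler. The pidfile path is illustrative;
# a real init script checks the actual PostgreSQL cluster state.
PIDFILE=/var/run/postgresql.pid

status() {
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        return 0    # LSB: service is running
    else
        return 3    # LSB: service is not running
    fi
}

case "$1" in
    status) status ;;
esac
```

With codes like these, the monitor operation can tell the two nodes apart; if the script always exits 0, the cluster cannot.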
