Virtualization
unwrecker, 2019-03-14 13:14:41

What can I do to make automatic migration work in OpenNebula?

As far as I can tell, nobody has used the "OpenNebula" tag on Toaster before me, so the chances are slim, but perhaps someone can at least answer the second question.
1. I installed OpenNebula on 3 servers following the instructions on the official site and enabled the parameters responsible for HA: FEDERATION, RAFT_LEADER_HOOK, RAFT_FOLLOWER_HOOK, HOST_HOOK. The first part worked: the shared IP moves over when one of the hosts is disconnected. HOST_HOOK, however, does not seem to work at all.
In the config it looks like this:

HOST_HOOK = [
    NAME      = "error",
    ON        = "ERROR",
    COMMAND   = "ft/host_error.rb",
    ARGUMENTS = "$ID -m -p 5",
    REMOTE    = "no" ]

When I shut down one of the servers, the onezone utility reports the correct state:
# onezone show 0
ZONE 0 INFORMATION                                                              
ID                : 0                   
NAME              : OpenNebula          


ZONE SERVERS                                                                    
ID NAME            ENDPOINT                                                       
 0 server-0        http://10.0.0.50:2633/RPC2
 1 server-1        http://10.0.0.51:2633/RPC2
 2 server-2        http://10.0.0.52:2633/RPC2

HA & FEDERATION SYNC STATUS                                                     
ID NAME            STATE      TERM       INDEX      COMMIT     VOTE  FED_INDEX 
 0 server-0        leader     64         1345       1340       0     -1
 1 server-1        follower   64         1340       1340       -1    -1
 2 server-2        error      -          -          -          -     -

ZONE TEMPLATE                                                                   
ENDPOINT="http://localhost:2633/RPC2"

But onehost for some reason thinks that the host is alive:
# onehost list
  ID NAME            CLUSTER   TVM      ALLOCATED_CPU      ALLOCATED_MEM STAT  
   1 cl0             default     0      0 / 2400 (0%)   0K / 187.6G (0%) on    
   2 cl1             default     0      0 / 2400 (0%)   0K / 187.6G (0%) on    
   3 cl2             default     1    200 / 2400 (8%)   2G / 187.6G (1%) on

And onevm thinks that the virtual machine is running on this host:
# onevm list
    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME
     0 oneadmin oneadmin ubuntu 1        runn  0.0      2G cl2          0d 18h26

The oned.log is filled with information about replication errors:
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:1872 UID:0 one.zone.raftstatus invoked
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:1872 UID:0 one.zone.raftstatus result SUCCESS, "<RAFT><SERVER_ID>0</..."
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:4176 UID:0 one.vmpool.info invoked , -2, -1, -1, -1
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:4176 UID:0 one.vmpool.info result SUCCESS, "<VM_POOL><VM><ID>0</..."
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:7744 UID:0 one.vmpool.info invoked , -2, -1, -1, -1
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:7744 UID:0 one.vmpool.info result SUCCESS, "<VM_POOL><VM><ID>0</..."
Thu Mar 14 13:02:56 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:00 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:04 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:07 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:11 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:15 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:18 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:22 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted

But there is not a word about the hook being called.
As a result, OpenNebula believes the virtual machine is still running, and nothing gets migrated anywhere. What is going on, and how do I deal with it?
2. I spent a long time choosing a solution, but perhaps I chose wrong: OpenNebula has a very small community. Is there another similar solution? It has to be open source and not too monstrous (unlike OpenStack), since I will only have 3 servers. Most importantly, when one node goes down, nothing should break (or it should recover quickly), even if that node held the only instance of some essential virtual machine.
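
For anyone who wants to check the "no hook in the log" observation from part 1 on their own setup, here is a minimal Python sketch that tallies oned.log lines by component tag. It assumes the "[zone][component][level]:" prefix seen in the excerpt above; the sample lines are copied from that excerpt, and the names SAMPLE_LOG, TAG_RE and count_components are hypothetical. Any hook activity would presumably appear under a tag other than ReM and RCM, but which tag exactly is not something the excerpt shows.

```python
import re
from collections import Counter

# A few lines copied from the oned.log excerpt in the question.
SAMPLE_LOG = """\
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:1872 UID:0 one.zone.raftstatus invoked
Thu Mar 14 13:02:55 2019 [Z0][ReM][D]: Req:4176 UID:0 one.vmpool.info invoked , -2, -1, -1, -1
Thu Mar 14 13:02:56 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
Thu Mar 14 13:03:00 2019 [Z0][RCM][D]: Faild to replicate log record at index: 566 on follower: 2, error: Error replicating log entry 566 on follower 2: RPC call timed out and aborted
"""

# Assumed prefix layout: [zone][component][level]:
TAG_RE = re.compile(r"\[(Z\d+)\]\[(\w+)\]\[(\w)\]:")


def count_components(log_text):
    """Count log lines per component tag (the second bracketed field)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TAG_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    for component, n in count_components(SAMPLE_LOG).items():
        print(component, n)
```

To run it against a real log, read the file contents into a string and pass that to count_components instead of SAMPLE_LOG.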

2 answer(s)
Sergey Fadeev, 2019-04-19
@bartelby

onezone show 0
This command shows the status of the management servers, the ones where OpenNebula itself is running.
onehost list
This one shows the status of the hosts where the VMs run.
Usually these are physically different servers.
What are you using as a datastore?
If it is "ssh", then the VM's disk file stayed on the host that you turned off.
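
To make the difference between the two commands concrete, here is a minimal Python sketch that reads the two tables pasted in the question and reports the state of each management server and each VM host separately. It assumes the column layout shown in the question (name in the second column, Raft state in the third for onezone, STAT in the last column for onehost); the function names are hypothetical and this is not an official OpenNebula API.

```python
# Tables copied from the question's "onezone show 0" and "onehost list" output.
ONEZONE_SYNC = """\
ID NAME            STATE      TERM       INDEX      COMMIT     VOTE  FED_INDEX
 0 server-0        leader     64         1345       1340       0     -1
 1 server-1        follower   64         1340       1340       -1    -1
 2 server-2        error      -          -          -          -     -
"""

ONEHOST_LIST = """\
  ID NAME            CLUSTER   TVM      ALLOCATED_CPU      ALLOCATED_MEM STAT
   1 cl0             default     0      0 / 2400 (0%)   0K / 187.6G (0%) on
   2 cl1             default     0      0 / 2400 (0%)   0K / 187.6G (0%) on
   3 cl2             default     1    200 / 2400 (8%)   2G / 187.6G (1%) on
"""


def zone_server_states(table):
    """Map management server name -> Raft state (third column)."""
    states = {}
    for line in table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3:
            states[fields[1]] = fields[2]
    return states


def host_states(table):
    """Map VM host name -> STAT (last column)."""
    states = {}
    for line in table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3:
            states[fields[1]] = fields[-1]
    return states


if __name__ == "__main__":
    print("management servers:", zone_server_states(ONEZONE_SYNC))
    print("VM hosts:", host_states(ONEHOST_LIST))
```

The two dictionaries use different name sets (server-N versus clN), which is the point: a management server in the error state says nothing by itself about what the host monitoring recorded for the hypervisor.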

unwrecker, 2019-04-19
@unwrecker

There was already a long discussion thread here, but apparently its author deleted it. All the obvious problems were ruled out, and no solution was found.
Does everything work as it should for you? What did you do beyond following the OpenNebula instructions? What is your network configuration?
