How to perform a Disaster Recovery test

This article is a step-by-step guide to conducting a Disaster Recovery (DR) test by executing a manual failover.

Important

To successfully conduct this test, the Primary member must be abruptly shut down.

Requirements

Two Segura instances must be available.
Instances must be in the same cluster and operating correctly. For more information on cluster settings, refer to the article How to create a cluster.
Take a snapshot of the instances.

This test is intended for a cluster with two member instances. Before starting, check the tag at the bottom of each member.

Before conducting the test, it’s crucial to take a snapshot of the instances as a precaution since abrupt shutdowns can cause damage.

Important

Always take snapshots in the reverse of the cluster order. In this case, take a snapshot of Member B first, and then of Member A.

To take a snapshot, follow these steps:

sudo orbit shutdown

Access Orbit Server Manager > Replication > Elasticsearch.
In the Data search cluster and the Cluster members tables, confirm that the cluster size is 2.

Caution

Do not use wildcards (*).

Info

This list will make the Assume as Primary button visible to users.

Caution

When specifying subnets, use CIDR notation, for example, 192.168.1.0/24.
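The two cautions above (no wildcards, CIDR notation for subnets) can be checked before typing entries into the list. The following is a minimal sketch, not part of the Segura tooling: the function name valid_recovery_entry is hypothetical, and it assumes the list accepts only IPv4 addresses and IPv4 networks. It validates the syntax only and does not check that the host bits of a network address are zero.

```shell
# Hypothetical helper: return 0 when the entry is a plain IPv4 address
# or an IPv4 network in CIDR notation, and 1 otherwise.
valid_recovery_entry() {
    entry=$1
    case $entry in
        *'*'*) return 1 ;;   # wildcards are not allowed
    esac
    printf '%s\n' "$entry" | awk -F'/' '
        # More than one slash is never valid.
        NF > 2 { exit 1 }
        # A prefix length, when present, must be an integer from 0 to 32.
        NF == 2 && ($2 !~ /^[0-9]+$/ || $2 + 0 > 32) { exit 1 }
        {
            # The address part must be four decimal octets from 0 to 255.
            n = split($1, octet, ".")
            if (n != 4) exit 1
            for (i = 1; i <= 4; i++)
                if (octet[i] !~ /^[0-9]+$/ || octet[i] + 0 > 255) exit 1
            exit 0
        }
    '
}
```

For example, valid_recovery_entry '192.168.1.0/24' succeeds, while valid_recovery_entry '192.168.1.*' fails.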

Caution

Ensure that Member A is shut down abruptly; otherwise, the cluster will detect the deactivation, and Member B will not display the Recovery page.

Once Member A becomes inactive unexpectedly, Member B enters a split-brain state and blocks all database changes until it receives manual instructions.
The Recovery page is then displayed in the web application.
Click Assume as Primary.
Confirm by clicking Yes. This will set Member B as the new Primary member. This process may take a few minutes.

Info

If the button does not appear, refer to How to enable Recovery and check that the IPs are configured correctly.

Once the Orbit Web interface is available on Member B, check if the tag indicates that this instance is now the Primary member.
To access other Segura modules, you need to enable the application. Go to Orbit > Settings > Application, and toggle the Enable application button to the active position.
Click Save.

If the button is displayed in green, the application is active.

After these steps, all Segura functionalities will be available and operational on the DR Member B.

Activate Member A and wait for its database to synchronize with the other cluster member. This may take a few minutes.

Info

Member A will identify the issue, and Member B, currently Primary, will automatically synchronize new information between members.

After synchronization, the login page will be displayed on the main web application interface.
Log in to Member A's web application and click Assume as Primary to restore it as the Primary member.
On Member B, go to Orbit Server Manager > Settings > Application, and toggle the Enable application button to the inactive position.
Click Save.

Info

Make sure the green color is not displayed.

Initiate an SSH session on Member A using port 59022 with the user mt4adm.
Run the command sudo orbit application status to check the following information:

sudo orbit application status

Application: Active
Replication: Active
Instance:    Cluster
Primary:     memberB
Main:        No

Then, execute the command sudo orbit application primary to set Member A as Primary:

sudo orbit application primary

Application: Active
Replication: Active
Instance:    Cluster
Primary:     memberA
Main:        Yes
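To compare the two outputs above without reading them by eye, the relevant fields can be extracted with a small filter. This is a minimal sketch that assumes the status output keeps the "Label: value" layout shown in this article; the function names status_field and is_main are hypothetical and are not part of the orbit command.

```shell
# Hypothetical helper: print the value of one field from the
# `orbit application status` output read on standard input.
status_field() {
    # $1 is the label before the colon, for example "Primary" or "Main".
    awk -F':' -v key="$1" '
        {
            label = $1
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", label)   # trim the label
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim the value
            if (label == key) { print value; exit }
        }
    '
}

# Succeed only when the output reports "Main: Yes".
is_main() {
    [ "$(status_field Main)" = "Yes" ]
}
```

On Member A, the check could then be run as sudo orbit application status | status_field Primary, which prints memberA once the failback has completed, or as sudo orbit application status | is_main to use the result in a script.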