Schedule Backup And Restore Elasticsearch Data using Kibana and Curator

A Quick Way to Set Up an On-Premise Elasticsearch Disaster Recovery Process

A snapshot is a backup taken from a running Elasticsearch cluster. You can take a snapshot of individual indices or the entire cluster and store it in a repository on a shared file system, and some plugins support remote repositories on S3, HDFS, Azure, Google Cloud Storage, and more.

Snapshots are taken incrementally. This means that when it creates a snapshot of an index, Elasticsearch avoids copying any data that is already stored in the repository as part of an earlier snapshot of the same index. Therefore it can be efficient to take snapshots of your cluster quite frequently.

You can restore snapshots into any running cluster, including a newly created one. When you restore an index, you can alter the name of the restored index as well as some of its settings.

You must register a snapshot repository before you can perform snapshot and restore operations.

The shared file system repository (“type”: “fs”) uses a shared file system to store snapshots. To register a shared file system repository, you must mount the same shared file system at the same location on all master and data nodes. This location (or one of its parent directories) must be registered in the “path.repo” setting in the “elasticsearch.yml” file on every master and data node in the Elasticsearch cluster.

Configure path.repo identically on each of these nodes, and map the shared drive as a local drive on each of them. In this walkthrough, the path “D:/ESCluster_Backup” is mounted on each node as a network drive.

path.repo setting in elasticsearch.yml file for file system
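
As a minimal sketch of that setting, assuming the shared drive is mounted at D:/ESCluster_Backup as described above, the entry in elasticsearch.yml on each node might look like the following. path.repo is read at startup, so the node needs a restart after the file is edited.

```yaml
# elasticsearch.yml (repeat on every master and data node)
# Directory, or a parent of it, in which shared file system
# repositories are allowed to be registered.
path.repo: ["D:/ESCluster_Backup"]
```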

Steps for Taking Snapshots and Monitoring Them

  1. Download Curator from https://packages.elastic.co/curator/5/windows/elasticsearch-curator-5.8.1-amd64.zip and extract it.
  2. Copy the following files from the ElasticSearch_Curator_Config folder into the extracted Curator folder, i.e., ..\elasticsearch-curator-5.8.1-amd64\curator-5.8.1-amd64
  • CuratorConfig.yml
  • RESTORE_BACKUP.yml
  • TAKE_BACKUP.yml

3. In CuratorConfig.yml, set the IP address of the Elasticsearch server, for example:

  • hosts: 192.168.x.x
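
The contents of CuratorConfig.yml are not reproduced in this article. As a rough sketch only, assuming the standard Curator 5.x client and logging sections, the default HTTP port 9200, and no TLS or authentication, the file might look like this:

```yaml
# CuratorConfig.yml (illustrative sketch; adjust every value to your cluster)
client:
  hosts:
    - 192.168.x.x        # placeholder: IP address of the Elasticsearch server
  port: 9200             # assumption: default Elasticsearch HTTP port
  use_ssl: False         # assumption: plain HTTP, no TLS
  timeout: 30            # seconds to wait for a response
  master_only: False

logging:
  loglevel: INFO
  logfile: logging.log   # relative path, so it lands in the folder Curator runs from
  logformat: default
```

The logfile value here is chosen to match the logging.log file mentioned later in the steps; if it is left empty, Curator logs to the console instead.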

4. To register the repository “ESCluster_Backup”, stored in an “ESBackupRepo” folder under the path.repo location specified in elasticsearch.yml, execute the following command from cmd in the elasticsearch-curator-5.8.1-amd64\curator-5.8.1-amd64 folder:

es_repo_mgr --config .\CuratorConfig.YML create fs --repository ESCluster_Backup --location ESBackupRepo --compression true --chunk_size 1g

ESBackupRepo created

5. Download Kibana for your operating system from https://www.elastic.co/downloads/kibana and extract it.

6. Open the kibana-7.x.x-windows-x86_64\config\kibana.yml file.

7. Configure the Elasticsearch server IP and port in the property elasticsearch.hosts: ["http://192.168.x.x:9200"]
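
For illustration, the relevant line in kibana.yml might look like the following. The IP address is the same placeholder used earlier, and the quotes must be plain straight quotes, since typographic quotes copied from a web page would be read as part of the URL.

```yaml
# kibana.yml (only the setting changed in this step is shown)
# URL of the Elasticsearch instance that Kibana should query.
elasticsearch.hosts: ["http://192.168.x.x:9200"]
```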

8. Start Kibana by running kibana-7.x.x-windows-x86_64\bin\kibana.bat.

9. Open http://localhost:5601/ in a browser and log in if prompted.

10. Click the Expand link in the bottom-left corner and select Management.

Click on Management

11. Click on Snapshot Repositories in Kibana and see the created repository

Click on Snapshot Repositories
Click on ESCluster_Backup repository

12. Verify the repository by clicking the created repository ESCluster_Backup and scrolling down the properties.

Click on Verify repository

13. Verification status should be as follows:

Verification status of the Backup Repository

14. All Curator log output can be seen in the logging.log file created in the Curator home location.

15. The folder “ESBackupRepo”, which holds the repository's data, can also be seen in the file system under the path.repo location.

ESBackupRepo repository created inside ESCluster_Backup[path.repo] folder

17. To create the snapshot manually, execute the following command:

curator --config .\CuratorConfig.yml .\TAKE_BACKUP.yml
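
TAKE_BACKUP.yml is a Curator action file whose contents are not shown in this article. A minimal sketch of what such a file could contain, assuming every index should be snapshotted into the ESCluster_Backup repository and Curator's default snapshot naming is kept:

```yaml
# TAKE_BACKUP.yml (illustrative sketch, not the original file)
actions:
  1:
    action: snapshot
    description: Snapshot all indices into the ESCluster_Backup repository.
    options:
      repository: ESCluster_Backup      # repository registered with es_repo_mgr
      name: curator-%Y%m%d%H%M%S        # Curator's default snapshot name pattern
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True         # return only once the snapshot has finished
      skip_repo_fs_check: False
    filters:
    - filtertype: none                  # assumption: no filtering, include every index
```

To back up only some indices, the none filter could be replaced with, for example, a pattern filter on an index name prefix.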

18. Snapshots created can be viewed from Kibana.

curator-20210207220947 is the snapshot created

19. Details of the created snapshot (default name: ‘curator-%Y%m%d%H%M%S’) can be seen by clicking on it.

Properties of the snapshot curator-20210207220947

20. To schedule the snapshots, open Windows Task Scheduler.

21. Create a basic task, enter a suitable name and description for it, and click Next.

create scheduled task wizard- name and description

22. Select when and how often the task should run.

create scheduled task wizard- trigger time configuration

23. After selecting the appropriate values, enter the path to the downloaded curator.exe in the Program/script field:

Configuration of curator path

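24. Enter the arguments as follows. Because these paths are relative, either set the Start in (optional) field to the Curator folder or use absolute paths to both files: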

--config .\CuratorConfig.yml .\TAKE_BACKUP.yml

25. The final window of the wizard then appears:

Click on the checkbox

26. Select the checkbox so that the Properties dialog opens when you click Finish, allowing further settings to be configured.

27. A new window opens in which the triggers and security settings can be configured. Enter the appropriate values.

Further options for settings

28. Edit the trigger and set the required schedule time for taking snapshots.

Set the time interval required for taking snapshots

29. Click OK to save the task and activate the schedule.

30. The scheduled task should now run at the configured times, and the snapshots it creates can be viewed in Kibana under Snapshots for the created repository.

How to Restore Data into a Running Elasticsearch Cluster

  1. To restore the data into a new running Elasticsearch cluster, first carry out steps 1 to 15 of the snapshot section above against the new cluster.
  2. In RESTORE_BACKUP.yml, set indices: [“give the indices here separated by comma to restore”] in the action: restore section.
  3. Once the same repository has been registered on the new cluster, execute the command below to restore the latest snapshot taken by the scheduled task.

curator --config .\CuratorConfig.yml .\RESTORE_BACKUP.yml

To restore the elastic search index
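
RESTORE_BACKUP.yml is likewise not reproduced in this article. The sketch below shows one plausible shape for it, assuming the index to restore is called esindex (the example index shown in the next step), the repository is ESCluster_Backup, and the snapshots carry Curator's default curator- prefix. Leaving name empty makes Curator pick the most recent snapshot that passes the filters.

```yaml
# RESTORE_BACKUP.yml (illustrative sketch, not the original file)
actions:
  1:
    action: restore
    description: Restore selected indices from the latest successful snapshot.
    options:
      repository: ESCluster_Backup
      name:                       # empty: use the most recent matching snapshot
      indices: ["esindex"]        # assumption: example index name; list yours here
      include_aliases: False
      ignore_unavailable: False
      include_global_state: False
      partial: False
      rename_pattern:             # optional: regex matched against index names
      rename_replacement:         # optional: new name for indices that match
      extra_settings:             # optional: index settings to override on restore
      wait_for_completion: True
      skip_repo_fs_check: False
    filters:
    - filtertype: pattern         # only consider snapshots named curator-...
      kind: prefix
      value: curator-
    - filtertype: state           # only consider snapshots that completed
      state: SUCCESS
```

A restore fails if an open index with the same name already exists in the target cluster, so on a cluster that is not empty the rename options or a prior close or delete of that index may be needed.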

4. When the command finishes executing, check whether the index was successfully restored by clicking on Index management in Kibana.

“esindex” Elasticsearch index restored

5. The restored indices in the cluster should be visible there.

6. The snapshot cannot contain data indexed after it was taken, so Curator does not restore that delta. To index it, open the last snapshot in the repository and note its creation timestamp.

Time of creation of snapshot can be seen here

7. Re-crawl the data that was indexed after this time to complete the restoration of the Elasticsearch cluster.

8. The Elasticsearch index is now ready to be searched and crawled for both new and old data.

Thanks for reading the article!!!

References

  1. https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html
  2. https://www.elastic.co/guide/en/elasticsearch/client/curator/5.8/snapshot.html
  3. https://www.elastic.co/guide/en/elasticsearch/client/curator/5.8/restore.html
  4. https://www.windowscentral.com/how-create-automated-task-using-task-scheduler-windows-10

An individual contributor with more than 5 years of experience in both IT service and product companies being involved in product features development