An important part of setting up an Elasticsearch cluster is to configure snapshots. I recently set up snapshots to Backblaze B2, an affordable alternative to AWS S3. This is a complete example of configuring them.
Setup
Steps:
- Configure “/etc/elasticsearch/elasticsearch.yml”. Add the following lines to the end of the file on all nodes:
s3.client.backblaze.protocol: https
s3.client.backblaze.path_style_access: true
- Restart Elasticsearch on each node so the new client settings take effect.
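- Configure your Backblaze account secrets. Use the Elasticsearch keystore to set “s3.client.backblaze.access_key” and “s3.client.backblaze.secret_key”. If the nodes are already running, reload the secure settings afterwards (“POST _nodes/reload_secure_settings”) or restart again so the keys are picked up. Each node has its own keystore, so run this on every node (on package installs the tool usually lives in “/usr/share/elasticsearch/bin”):
elasticsearch-keystore add s3.client.backblaze.access_key
<Enter the Backblaze access key>
elasticsearch-keystore add s3.client.backblaze.secret_key
<Enter the Backblaze secret key>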
- Set up “/etc/elasticsearch/backblaze_repository” on all nodes. Put in the correct values for your bucket and endpoint.
{
  "type": "s3",
  "settings": {
    "bucket": "YOUR_BUCKET_NAME",
    "endpoint": "ENDPOINT (may be s3.us-west-001.backblazeb2.com)",
    "client": "backblaze",
    "protocol": "https"
  }
}
- Load the above repository on one node (replace “<PASSWORD>” in the command with the password of the “elastic” user):
curl -k -u 'elastic:<PASSWORD>' -H'Content-Type: application/json' -XPUT 'https://localhost:9200/_snapshot/backblaze' --data '@/etc/elasticsearch/backblaze_repository'
At this point you should be able to go to Kibana and in the menu select “Stack Management” -> “Data” -> “Snapshot and Restore” and click on the “Repositories” tab, and see “backblaze”. Click on the “backblaze” repository and click “Verify repository” to ensure it is working.
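If you would rather script this check than click through Kibana, Elasticsearch exposes the same verification as a REST call (POST to “_snapshot/backblaze/_verify”). The following is a minimal Python sketch, not a definitive implementation. The helper names (“build_verify_request”, “verified_nodes”, “verify_repository”) are made up for illustration, and it assumes the same host, user and self-signed certificate as the curl command above.

```python
import base64
import json
import ssl
import urllib.request


def build_verify_request(base_url, repository, user, password):
    """Build a POST /_snapshot/<repository>/_verify request with basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_snapshot/{repository}/_verify",
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


def verified_nodes(response_body):
    """Return the sorted names of the nodes listed in a verify response."""
    nodes = json.loads(response_body).get("nodes", {})
    return sorted(info["name"] for info in nodes.values())


def verify_repository(base_url, repository, user, password, insecure=False):
    """Send the verify request and return the node names that answered."""
    request = build_verify_request(base_url, repository, user, password)
    context = ssl.create_default_context()
    if insecure:
        # Same effect as curl's -k: skip certificate checks (assumed self-signed cert).
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return verified_nodes(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder credentials; replace <PASSWORD> as in the curl command.
    print(verify_repository("https://localhost:9200", "backblaze",
                            "elastic", "<PASSWORD>", insecure=True))
```

A successful call should list every node in the cluster. If a node is missing or the call returns an error, that node most likely cannot reach the bucket or is missing the keystore entries.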
Use the “Policies” tab of the same page to set up snapshot policies, for example one that runs on a daily schedule.
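The same kind of policy can also be created through the snapshot lifecycle management (SLM) API instead of the Kibana form. Here is a rough Python sketch that only builds the request body. The policy ID “daily-snapshots”, the 01:30 schedule and the retention numbers are illustrative assumptions, not values from my setup, and “build_daily_policy” is a made-up helper name.

```python
import json


def build_daily_policy(repository="backblaze", keep_days=30):
    """Build an SLM policy body that snapshots all indices once a day."""
    return {
        # Cron fields: second minute hour day-of-month month day-of-week.
        # This runs at 01:30 every day (assumed time; cron is evaluated in UTC).
        "schedule": "0 30 1 * * ?",
        # Date math in the name gives each snapshot a dated name.
        "name": "<daily-snap-{now/d}>",
        "repository": repository,
        "config": {"indices": ["*"], "include_global_state": True},
        # Assumed retention: drop snapshots after keep_days,
        # but always keep between 5 and 50 of them.
        "retention": {
            "expire_after": f"{keep_days}d",
            "min_count": 5,
            "max_count": 50,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_daily_policy(), indent=2))
```

Save the printed JSON to a file and send it with curl -XPUT to “https://localhost:9200/_slm/policy/daily-snapshots”, the same way the repository definition was loaded.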
A word on pricing
AWS pricing can be tricky to compare. Standard S3 storage is $23/TB per month, where B2 is $6. AWS Glacier can drop that to $4 or even $1, but Glacier adds per-object overhead that you are charged for, so putting small files in Glacier can be more expensive than standard S3. Elasticsearch snapshots are a mix of small and large files, so Glacier may still be a good choice, but to get to $1/TB you need to be OK with it taking 12 hours to retrieve snapshots, and you’ll need to know which files you need to get. If you are storing large amounts of data, it may be worth the savings, but for a couple of bucks a month, we just use Backblaze.
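To make the per-object overhead concrete, here is a back-of-the-envelope Python sketch using the per-TB rates quoted above. The overhead figures are an assumption on my part (32 KB per object billed at the archive rate plus 8 KB billed at the standard rate), and the function names are made up. Retrieval and request fees are ignored, so check current AWS pricing before relying on the numbers.

```python
KB_PER_TB = 1000 ** 3  # decimal units: 1 TB = 10^9 KB


def monthly_cost(data_tb, rate_per_tb):
    """Plain storage cost: data volume times the per-TB monthly rate."""
    return data_tb * rate_per_tb


def archive_monthly_cost(data_tb, object_count, archive_rate_per_tb,
                         standard_rate_per_tb=23.0,
                         archive_overhead_kb=32, standard_overhead_kb=8):
    """Archive-tier cost including assumed per-object metadata overhead."""
    overhead_archive_tb = object_count * archive_overhead_kb / KB_PER_TB
    overhead_standard_tb = object_count * standard_overhead_kb / KB_PER_TB
    return ((data_tb + overhead_archive_tb) * archive_rate_per_tb
            + overhead_standard_tb * standard_rate_per_tb)


if __name__ == "__main__":
    # 1 TB stored as a few large files versus many tiny files.
    for label, objects in (("1 MB average", 1_000_000),
                           ("10 KB average", 100_000_000)):
        print(label,
              "standard:", monthly_cost(1, 23.0),
              "b2:", monthly_cost(1, 6.0),
              "archive at $4:", round(archive_monthly_cost(1, objects, 4.0), 2),
              "archive at $1:", round(archive_monthly_cost(1, objects, 1.0), 2))
```

Under these assumptions, 1 TB split into 10 KB objects costs more in the $4 tier than in standard S3, while the same terabyte in 1 MB objects stays close to the advertised rate.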