Leaistic: A Library and Microservice for Managing Elasticsearch Content


ElasticSearch is a great technology for a wide variety of use cases from autocompletion to log management, and is likely to be part of your stack for many complex projects.

In ElasticSearch, you put your data in indices. An index is a collection of documents that share the same mapping and settings.

We have found that indices can be hard to manage over time:

  1. ElasticSearch is not truly schemaless but it is smart enough to figure out the schema (aka datatype) most of the time. Thanks to that, you can play with ElasticSearch without configuring anything at first.
  2. In a typical project, you will start with small documents, then add, delete, and alter field content. You’ll also need to add some mappings to control the datatype of the fields. At some point you’ll probably need to update a field mapping, only to discover that you can’t without re-indexing the entire index. You’ll hit the same issue if you need to change your index’s static settings. Unfortunately, if you did not prepare for it, this means production downtime.
  3. ElasticSearch will figure out the datatype of a given field using the dynamic mapping rules it has. The first document you index containing a given field is its only context to figure out the right datatype.
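To make the third point concrete, here is a toy model of type guessing. It is not ElasticSearch's actual implementation: `guess_type` and `build_mapping` are hypothetical names, and the model deliberately ignores date detection, arrays, and coercion. It only illustrates that the first value seen for a field decides its datatype.

```python
def guess_type(value):
    """Rough stand-in for dynamic mapping rules (simplified assumption)."""
    # bool is checked first because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value in this toy model: {value!r}")


def build_mapping(documents):
    """Return (mapping, mismatches); the first value seen for a field wins."""
    mapping = {}
    mismatches = []
    for doc in documents:
        for field, value in doc.items():
            guessed = guess_type(value)
            if field not in mapping:
                # First occurrence: the datatype is now fixed for this field
                mapping[field] = guessed
            elif mapping[field] != guessed:
                # Later values no longer influence the mapping
                mismatches.append((field, mapping[field], guessed))
    return mapping, mismatches


docs = [{"price": 10}, {"price": 10.5}, {"price": "N/A"}]
mapping, mismatches = build_mapping(docs)
print(mapping)
print(mismatches)
```

In this model, `price` is fixed as `long` by the first document. A real cluster may coerce or reject the later values, depending on the field's settings.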

Over the life of a project, an unmanaged index will almost certainly break.

A Workflow to Manage These Problems

In order to have the flexibility to change your index mappings or static settings whenever you want, you need to be able to create a new index with the same content and the new mappings and settings. Once reindexed, you need to use this new index instead of the original one. Ideally, this should happen with no downtime.

Keeping track of changing index names over the course of a project can be hard. ElasticSearch provides a solution to this with aliases: you can search and write to aliases like you do with indices. Aliases are just redirects to one or more indices, possibly with some filtering. They also allow you to redirect all of the queries from an index to a new one, with no downtime.
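As a minimal sketch of such a redirect, the helper below builds the body for a `POST /_aliases` request, which applies its `remove` and `add` actions together. The function name `alias_swap_actions` and the index names `foo-v1` and `foo-v2` are assumptions made for illustration; only the request body format comes from ElasticSearch.

```python
import json


def alias_swap_actions(alias, old_index, new_index, filter_query=None):
    """Build the body for POST /_aliases that moves `alias` to `new_index`."""
    add = {"index": new_index, "alias": alias}
    if filter_query is not None:
        # Optional: the alias only exposes documents matching this query
        add["filter"] = filter_query
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": add},
        ]
    }


# Hypothetical names: alias "foo" moves from index "foo-v1" to "foo-v2"
body = alias_swap_actions("foo", "foo-v1", "foo-v2")
print(json.dumps(body, indent=2))
```

Clients that only ever query `foo` never need to know which concrete index is behind it.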

Here are the simplified, nominal operations you need in order to manipulate an index-like alias foo:

  • create a new index structure
  • update an existing index structure
  • delete an existing index structure
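One possible way to express these three operations as sequences of ElasticSearch REST calls is sketched below. This is an assumption about a reasonable ordering, not necessarily the exact steps Leaistic performs. The function names and the `foo-v1`/`foo-v2` index names are hypothetical, and the functions only build `(method, path, body)` tuples without sending anything.

```python
def _index_body(mappings=None, settings=None):
    """Body for PUT /<index>; empty when nothing is specified."""
    body = {}
    if mappings:
        body["mappings"] = mappings
    if settings:
        body["settings"] = settings
    return body


def create_plan(alias, index, mappings=None, settings=None):
    """Create a concrete index, then point the alias at it."""
    return [
        ("PUT", f"/{index}", _index_body(mappings, settings)),
        ("POST", "/_aliases",
         {"actions": [{"add": {"index": index, "alias": alias}}]}),
    ]


def update_plan(alias, old_index, new_index, mappings=None, settings=None):
    """Create the new index, copy the data, swap the alias, drop the old index."""
    return [
        ("PUT", f"/{new_index}", _index_body(mappings, settings)),
        ("POST", "/_reindex",
         {"source": {"index": old_index}, "dest": {"index": new_index}}),
        ("POST", "/_aliases",
         {"actions": [
             {"remove": {"index": old_index, "alias": alias}},
             {"add": {"index": new_index, "alias": alias}},
         ]}),
        ("DELETE", f"/{old_index}", None),
    ]


def delete_plan(index):
    """Deleting the index also removes the aliases that point to it."""
    return [("DELETE", f"/{index}", None)]


new_mappings = {"properties": {"price": {"type": "float"}}}
for method, path, body in update_plan("foo", "foo-v1", "foo-v2", new_mappings):
    print(method, path, body)
```

The old index is only deleted after the alias has moved, so a failure before that step leaves the original data untouched.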

 

One of the best ways to allow both machines and humans (for testing) to create indices that work in the long run is to deploy index templates before any creation or update performed with this workflow.
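For illustration, an index template body could be built as follows. This sketch assumes the composable template format sent with `PUT /_index_template/<name>` (ElasticSearch 7.8 and later); older versions use `PUT /_template/<name>` with `settings` and `mappings` at the top level of the body. The helper name, the `foo-*` pattern, and the `price` field are hypothetical.

```python
import json


def index_template(pattern, mappings, settings=None):
    """Build a composable index template body for indices matching `pattern`."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": settings or {},
            "mappings": mappings,
        },
    }


# Every index named foo-* would get these settings and mappings at creation
template = index_template(
    "foo-*",
    {"properties": {"price": {"type": "float"}}},
    {"number_of_shards": 1},
)
print(json.dumps(template, indent=2))
```

With such a template deployed first, any index created under a matching name gets the explicit mapping instead of relying on dynamic mapping.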

Introducing Leaistic

Leaistic is both a library you can integrate, and a standalone REST microservice with a nice SwaggerUI.

Leaistic provides high-level primitives for creating, updating, and deleting indices, index templates, and aliases working together to help with the problems listed above. It does not cover all ElasticSearch possibilities like rollover index, multi-indices strategies, etc. If you start a new project with ElasticSearch, it will help you get your indices right.

Leaistic tries to be resilient against external changes made in parallel with its operations, and has a per-alias locking mechanism to compensate for ElasticSearch’s lack of transactions for these operations. It is committed to providing helpful errors and, when necessary, to rolling back to the best possible state.

There is detailed usage information in the repo. Be sure to also read ‘Why use Leaistic’ and let us know what you think.

Image: Gabriel Sollmann (Unsplash)
