Log in

No account? Create an account
entries friends calendar profile Previous Previous Next Next
An Elk. With scales! - Ed's journal
An Elk. With scales!
This week I have mostly been fiddling with Docker and Elasticsearch/Logstash/Kibana. (Known as 'ELK').

The basics are something I've fiddled with before - elasticsearch is a NoSQL database that's built to shard and scale. Logstash is a log parsing tool, which extracts log metadata and inserts it into ... well, a variety of databases, but in this case I'm using elasticsearch.

And Kibana is a visualisation tool, that - amongst other things - has a configuration for doing logstash parsed logs out of an elasticsearch back end.

I've tried this before - and it worked fine - but what I wanted to try this time is making a scalable system. And thus docker containers. If you haven't encountered them, they're ... sort of like a mini virtual machine. You create a docker image - which is essentially an application, but bundled with all it's dependencies.

And from the image, you create containers - runnable instances of an application. But the key point is, each container is ... well, self contained. All the dependencies are bundled up together, which makes them particularly portable - relocate and start wherever you need/want. (Well, provided you have at least a basic docker build - the whole point is you don't actually need to install much else).

But the thing I was trying to do here is use a private docker network, and create a set of containers that would basically auto-configure - allowing you to 'spin up' extra nodes as you need to.

With the elasticsearch database this is working nicely - because you're instantiating containers off images, you need to think in terms of persistence. You can therefore create and attach a 'storage' container, that _is_ persistent - and just attach to that with your current elasticsearch image.

But the base 'discovery' mechanism is an IP unicast, which allows you to specify a set of 'discovery' nodes to find the initial cluster. It works well enough, but it does require you have a particular set of IP addresses active.

Logstash/Kibana is a bit less good at the dynamic discovery, so I'm still working on it. Logstash, given it's near-real-time nature it shouldn't be too hard to start/stop and do node discovery as part of the startup script, but Kibana it's a bit less easy.

So I'm thinking I might try looking at haproxy next, or some other discovery mechanism.

But otherwise, as it stands - I've got a container 'set' that it took me about 10 minutes to start up an extra 'node' in my cluster, to add storage/compute resources. (And most of that was installing the updates I needed for docker-engine to do the multi-host network).

So all good so far.
Leave a comment