What is the best practice for deploying a database in a microservices architecture, more precisely in a distributed environment such as Docker Swarm? Microservice principles state that each service should be stateless to enable scaling. Since a database obviously has state, should it live at a fixed location outside the cluster, deployed and configured before the cluster is initialized?

I'm confused because all Docker Compose examples include a database container in the service definition. But things aren't that simple: a database often needs a lot of configuration before it's ready to use, and Docker is poor at coordinating the order in which services start.

If it really is good practice to deploy the database alongside the services in Docker Swarm, how do I ensure the consistency and persistence of critical data?

1 Answer

This is a good question and one I think a lot of people are still thinking through as far as best practices are concerned. The answer really depends on your needs. There are several ways to crack this nut but these are the two I'm using right now:

  • Running the database in the typical manner on dedicated machine(s), with replication, etc.
  • Running the database as a service on a Docker Swarm cluster, with the data persisted across the cluster by GlusterFS. I am currently experimenting with this:
    • Three machines in the cluster are labeled as database machines.
    • Each of these machines runs a GlusterFS container that provides the GlusterFS capabilities.
    • When the database service is started, I map the GlusterFS share into the container and specify that the service should only run on a machine labeled as a database node.

With the second setup it doesn't matter which node the database service starts on, and if a machine fails the database service is automatically migrated to another node labeled as a database node. The GlusterFS replication of the data is what preserves the integrity of the persisted data.
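As a rough illustration of the Swarm + GlusterFS setup described above, here is a minimal stack file sketch. It is not my exact configuration. Everything named in it is an assumption made for the example: the `postgres:16` image, the node label `role=database`, the host path `/mnt/gluster/pgdata` (where the GlusterFS share is assumed to be mounted on every database node), and the secret name `db_password`.

```yaml
# Sketch only: names, paths, and the choice of PostgreSQL are assumptions.
#
# Label each database machine first (run on a manager node):
#   docker node update --label-add role=database <node-name>
#
# Create the secret referenced below (run on a manager node):
#   printf '<password>' | docker secret create db_password -
version: "3.8"

services:
  db:
    image: postgres:16
    environment:
      # The official postgres image reads the password from this file.
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
    volumes:
      # Host path where the GlusterFS share is assumed to be mounted
      # on every node carrying the role=database label.
      - /mnt/gluster/pgdata:/var/lib/postgresql/data
    deploy:
      # A single instance; Swarm reschedules it if its node fails.
      replicas: 1
      placement:
        constraints:
          # Only schedule on nodes labeled as database machines.
          - node.labels.role == database

secrets:
  db_password:
    external: true
```

It would be deployed with something like `docker stack deploy -c stack.yml mystack` (file and stack names are placeholders). Keeping `replicas: 1` matters here: several database instances writing to the same data directory on a shared file system would likely corrupt it.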

As mentioned, it is my understanding that there is still a lot of experimentation going on with this and 'best practices' are not entirely established. Those best practices will ultimately depend on your needs and risk tolerances.

Comments

The biggest concern with using GlusterFS for a database is network traffic and the maximum capacity of Gluster. Last year I had a Gluster distributed data store for a PHP application that included logs. We got to a point where the logs alone overloaded Gluster over and over again, just from the load of distributed reads and writes to the log files.
@Dockstar, agreed and thank you for posting your experience. That is why I'm testing it at the moment. How did you have your GlusterFS configured? Distributed? Striped, distributed + striped, replicate, replicate + distribute?
Ours was striped and not replicated, since we had multiple endpoints. Honestly, unless you have dedicated flash (not disks, but flash cards), I can't see the IO keeping up with striping data at that pace as well as replication. What I've been looking at is a way to do automatic service discovery for active-active Galera clusters. That would replicate at the DB level rather than the storage level, but even then you're hitting the storage array over and over for the same blocks.
@Dockstar, that sounds like yet another way that would be worth investigating. At what size database and log files did you start noticing performance issues and what was the network environment like with your GlusterFS setup?
In our environment, the databases were actually on their own servers. These were literally just application logs. We had 22 queue runners working through jobs that would generate about 100 MB of logs a day. The issue comes when you have something like 8 hosts striping that way. The sheer volume of reads and writes required to keep up kept causing Gluster to fail, so we cut back a lot of our application logging and decided not to move in that direction this year. If you throw a database at it, it would all depend on your loads and replication strategies.