Why Did We Start Using AWS Secrets Manager To Store Sensitive Data?
👋 Hi. I'm Sunil.
I work at a startup during the day and as an indie developer at night.
In this article, I'm going to talk about why we started using AWS Secrets Manager to manage sensitive data such as passwords, keys, and tokens used in our applications.
This is what is covered in this article:
- What the hell are you talking about?
- What did our previous setup look like?
- Problems with the old setup
- How did we use secrets manager to solve the problems?
If you're not familiar with configuration servers: a configuration server is a service that stores the configuration data your applications need. Configuration data is any data required to run an application that differs between environments such as staging and production. For example, if you're using Stripe to charge your customers, you'll need to use different API keys in the staging and production environments.
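To make this concrete, here's a minimal sketch of environment-specific configuration in Python. The key names and host values are purely hypothetical placeholders, not the configuration described in the post:

```python
import os

# Hypothetical per-environment configuration (illustrative values only).
CONFIG = {
    "staging": {"stripe_api_key": "sk_test_xxx", "db_host": "db.staging.internal"},
    "production": {"stripe_api_key": "sk_live_xxx", "db_host": "db.prod.internal"},
}

def load_config(environment: str) -> dict:
    """Return the configuration for the given environment."""
    try:
        return CONFIG[environment]
    except KeyError:
        raise ValueError(f"Unknown environment: {environment}")

# Pick the environment from an (assumed) APP_ENV variable, defaulting to staging.
config = load_config(os.environ.get("APP_ENV", "staging"))
```

A configuration server plays the role of `CONFIG` here, except the data lives in a central service instead of in the code.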
We keep separate configuration servers for each environment: a staging server for staging configuration data and, similarly, a production server for prod configuration data. At runtime, applications pull this data from these servers and use it.
The next question is: how do we get this data into a configuration server?
There are multiple ways to store configuration data depending on the service you're using. In our case it's Consul, and this is how we use it:
- We store each service's configuration data as a JSON file in the staging and prod branches of a GitHub repository, one branch per environment.
- Every time we make changes to this repository, we run a Jenkins job that updates the data on the Consul server.
- Our microservices, which are orchestrated with Docker, pull the configuration data into their Docker images at build time and use it at runtime.
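The upload step above can be sketched conceptually. Assuming the Jenkins job flattens each service's JSON file into Consul-style KV paths before writing them (the actual job surely differs in detail), it might look something like this:

```python
import json

def flatten(prefix: str, data: dict) -> dict:
    """Flatten nested JSON into Consul-style KV paths, e.g. config/svc/db/host."""
    out = {}
    for key, value in data.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            out.update(flatten(path, value))
        else:
            # Consul stores values as strings; serialize non-string values.
            out[path] = value if isinstance(value, str) else json.dumps(value)
    return out

# Hypothetical service config; each entry could then be written with
# `consul kv put <path> <value>` or Consul's HTTP KV API.
kv = flatten("config/payments", {"db": {"host": "db.staging.internal", "port": 5432}})
```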
There are many problems with storing sensitive data as part of your code in a GitHub repository.
- Secrets don't belong in source control, even in a private repository.
- It isn't secure: if attackers gain access to your GitHub account, every secret is compromised at once.
There are many services that provide encrypt/decrypt capabilities for storing sensitive data. The idea is that you provide data in plaintext, and the service encrypts it using a key. When you request the data, the service decrypts it and sends it back to you in plaintext. There are many details involved in this process, such as the encryption algorithm used, controlling read and write access, and limiting access to specific keys, but those are out of the scope of this blog.
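To illustrate the encrypt/decrypt idea, here is a toy sketch in which the same key both encrypts and decrypts. This is a deliberately insecure teaching construction (real services like KMS use vetted algorithms such as AES-GCM with envelope encryption); never use anything like it for real secrets:

```python
import hashlib
from itertools import count

def keystream(key: bytes):
    # Derive a repeatable stream of bytes from the key (toy construction only).
    for i in count():
        yield from hashlib.sha256(key + i.to_bytes(8, "big")).digest()

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with a key-derived stream; applying it twice decrypts."""
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

ciphertext = toy_encrypt(b"master-key", b"sk_live_xxx")
plaintext = toy_encrypt(b"master-key", ciphertext)  # same operation decrypts
```

The point is only the shape of the workflow: the service holds the key, stores ciphertext, and hands you back plaintext on an authorized request.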
AWS Secrets Manager is one of them and works well with the AWS ecosystem of services.
This is what we did to solve the problems discussed above:
- Removed sensitive configuration data from our private GitHub repositories.
- Used KMS and Secrets Manager to store sensitive data. We add the values manually, one by one, in the AWS console after we log in.
- Granted our applications/microservices read access to the data stored in Secrets Manager by creating IAM policies and attaching them to ECS task roles.
- Updated the Dockerfile and Consul template files of our services to pull the data stored in Secrets Manager when building the Docker image.
- Note that we still store all the non-sensitive data in a Consul server. When we build Docker images, we pull data from both Consul and AWS Secrets Manager, merge the two, and use the result.
- Our microservices now read sensitive data from AWS Secrets Manager instead of from a Consul server. The data is encrypted at rest and access is controlled through IAM, so a compromised GitHub account no longer exposes our secrets.
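The task-role step above might use an IAM policy along these lines. The region, account ID, and secret path are hypothetical placeholders; also note that if a secret is encrypted with a customer-managed KMS key, the task role additionally needs `kms:Decrypt` on that key:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:payments/*"
    }
  ]
}
```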
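The build-time pull could be sketched roughly like this. The base image, secret ID, and paths are hypothetical; this is an illustration of the idea, not our exact Dockerfile:

```dockerfile
# Hypothetical build step: fetch the service's secret during the image build.
FROM python:3.11-slim
RUN pip install --no-cache-dir awscli
# Write the secret next to the non-sensitive config pulled from Consul.
RUN aws secretsmanager get-secret-value \
      --secret-id payments/production \
      --query SecretString --output text > /app/secrets.json
```

One caveat worth noting: anything fetched in a `RUN` step is baked into an image layer, so many teams instead fetch secrets at container start, for example via ECS's native Secrets Manager integration in the task definition.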
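The merge of Consul and Secrets Manager data could look like the sketch below. The collision rule (secrets win) and the key names are assumptions for illustration; the post doesn't specify them:

```python
def merge_config(consul_config: dict, secrets: dict) -> dict:
    """Merge non-sensitive Consul data with Secrets Manager data.

    On key collisions the Secrets Manager value wins (an assumed policy).
    """
    merged = dict(consul_config)
    merged.update(secrets)
    return merged

consul_config = {"db_host": "db.prod.internal", "stripe_api_key": "placeholder"}
secrets = {"stripe_api_key": "sk_live_xxx"}
final_config = merge_config(consul_config, secrets)
```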
I hope this article has given some insights into how sensitive data should be stored and accessed during the lifecycle of an application.
I've left out many details involved in each step on purpose. Let me know in comments if some parts are not explained clearly. I would be happy to answer your questions 🙂
If you're a beginner to cloud computing and want to learn AWS concepts, here's a great course by Daniel Vassallo, who worked on Amazon's AWS team for 10+ years.
I highly recommend buying this course if you find the documentation overwhelming.
Here is the link if someone is interested.
Thanks for sharing your experience with storing secrets. I have a question about configs (not secrets): why do you store configuration data as files in a repo and then sync it to Consul?
Well, you don't need to. You just need a way to upload your configuration data to the Consul server before deploying your services.
In our case, each team keeps this data in its own GitHub repository and runs a Jenkins job which in turn uploads the data to the Consul server.
This process works best for us, considering there are multiple teams and each team doesn't need access to other teams' configurations.
Curious as to why you're not using Parameter Store for all the other config values? These can be implemented as infrastructure-as-code, and if they aren't secret they can be committed alongside the application (similar to how you're currently doing it).
- Wouldn't you need to update code anyway when there are new values?
- Not sure I see the distinction between the two; if it's in code and you manually change it in Consul, you could just do the same with Parameter Store. But even if a config value change results in a restart, it's still a fully automated and auditable change.
- Yes, you can define the secret, but manually add in the value as a side process. If the secret needed (or would benefit) regular rotation then you can enable it.
Pete, as I mentioned earlier, we currently add each secret to Secrets Manager manually. Adding all the configuration data like this wasn't realistic, since we generally have a lot of config data for each service.
Also, this can be error-prone because of the interface AWS currently provides for adding secrets (my opinion).