Today, I resized a DigitalOcean droplet holding about 40GB of data through their dashboard. I'm documenting some of the things I learned along the way.
A few months back we released Hyvor Talk v2, which was rewritten from scratch with a new database schema, hence a new database. I chose to self-host the database rather than use DigitalOcean's managed database solution, because our database gets heavy traffic and we often need to tune the MySQL configuration (InnoDB settings in particular), which managed services don't expose. So, in our case, MySQL was installed on a plain droplet.
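As an illustration, these are the kinds of InnoDB settings you can tune on a self-hosted MySQL server but typically not on a managed one. The values below are placeholders for illustration, not our production configuration:

```ini
# /etc/mysql/mysql.conf.d/mysqld.cnf (illustrative values only)
[mysqld]
# In-memory cache for InnoDB data and indexes; often set to
# 60-70% of available RAM on a dedicated database server
innodb_buffer_pool_size = 10G

# Larger redo logs reduce checkpoint pressure under heavy writes
innodb_log_file_size = 1G

# Trade a little durability for write throughput (flush once per second)
innodb_flush_log_at_trx_commit = 2

# Skip the OS page cache for data files to avoid double buffering
innodb_flush_method = O_DIRECT
```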
For this new database, we chose a 16GB Memory Optimized droplet. This was a bad decision. We received high CPU usage alerts more than 10 times in the last two months. This caused the system to slow down. One thing I learned is that a general-purpose database needs a good balance between CPU and RAM.
Before Hyvor Talk v2, we had used an 8GB Shared droplet, which worked great. Surprisingly, it was faster than this new 16GB Memory-Optimized droplet. So, we had two options: migrate to a new Shared droplet, or resize the current one to a Shared plan.
My first idea was dump and restore: dump the database with mysqldump, import it into a new droplet, and then point the application at the new server.
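Sketched as commands (hostnames, credentials, and the `hyvor_talk` database name are placeholders):

```shell
# 1. Dump the database on the old droplet
#    --single-transaction avoids locking InnoDB tables during the dump
mysqldump -u root -p --single-transaction --quick hyvor_talk > dump.sql

# 2. Copy the dump to the new droplet
scp dump.sql root@new-db-droplet:/root/

# 3. Import it on the new droplet
mysql -u root -p hyvor_talk < dump.sql
```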
However, our database was around 40GB (data + indexes), so mysqldump took almost 15 minutes to complete. Then, I tried importing that data into the new database. After 45 minutes, the import had only reached our pages table (which has 22 million rows). The next one was the pageviews table, with more than 100 million rows, so the import would probably take hours. We cannot afford that much downtime. So, I thought of another plan: replication.
I could set up MySQL replication. This was possible, but it's involved, and a mistake could cause a lot of downtime. I had no previous experience with MySQL replication, so even though it might have worked, I didn't go for it. I had some other concerns too.
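For reference, even a minimal source/replica setup looks something like this. This is only a sketch using MySQL 5.7-era syntax; the IP, credentials, and binlog coordinates are placeholders:

```sql
-- On the source (after setting server-id=1 and log_bin in my.cnf
-- and restarting MySQL):
CREATE USER 'repl'@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- On the replica (server-id=2 in my.cnf), point it at the source
-- using the coordinates from SHOW MASTER STATUS on the source:
CHANGE MASTER TO
  MASTER_HOST = '10.0.0.1',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'password',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS = 154;
START SLAVE;
```

Getting the initial data copy consistent with those binlog coordinates is the hard part, which is why I was wary of doing this for the first time on a production database.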
I set the replication idea aside for a few days and looked at other options.
I created a test droplet with 40GB of dummy data. Then, I turned it off and started making a snapshot through the DO dashboard. Unfortunately, the snapshot took more than 40 minutes, so this option wouldn't work either. We cannot afford 40 minutes of downtime.
Resizing through the dashboard was actually the first option I considered. However, the DigitalOcean documentation says, "Allow for about one minute of downtime per GB of used disk space, though the actual time necessary is typically shorter." This was discouraging: 40 minutes of downtime for 40GB of data is just not acceptable, and they don't specify how much shorter the actual time can be. Only because of that sentence, I went and checked the other options I explained earlier.
However, after realizing those options were either not going to work or too hard to pull off, I came back to this one. I created a new droplet of the same type as the current one (same RAM/CPU/disk) and added dummy data to fill the disk up to 90% (close to 50GB). If the documentation was right, the resize should take about 50 minutes. I gave it a try, and surprisingly, it finished in under a minute. I tried it again on a fresh droplet to make sure the first result wasn't a fluke. It wasn't; the second resize was just as fast.
So, today, I resized the production database, and the downtime was just 2 minutes (shutting down + resizing + turning it on again).
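For the record, the same shutdown → resize → power-on sequence can also be scripted with doctl, DigitalOcean's official CLI, instead of clicking through the dashboard. The droplet ID and size slug below are placeholders:

```shell
# Shut the droplet down cleanly first
doctl compute droplet-action shutdown 123456789 --wait

# Resize CPU/RAM/disk; --resize-disk makes it a permanent resize
# (without it, only CPU and RAM change and the resize is reversible)
doctl compute droplet-action resize 123456789 \
    --size s-4vcpu-8gb --resize-disk --wait

# Power it back on
doctl compute droplet-action power-on 123456789 --wait
```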
If you want to resize a droplet, resizing through the DigitalOcean dashboard seems to be the best option. Even though the documentation says it can take 1 minute per 1GB, in my case, it took less than 1 minute for 40GB. However, do not take this as a fact. Do your own testing before resizing a production droplet (such as a database) that cannot afford long downtime.
If you have any questions, feel free to comment below.