On Percona Live! Amsterdam 2015 we had a talk with Peter Boros about GTID replication.
Here are the slides.
Well, it was ended a week ago, but I had too many errands to run so I couldn’t post anything about it.
It was really great, again.This was the third time I attended (2013 London, 2015 Santa Clara) so now I have met with a lot of familiar people – it is true that MySQL has a great community. The chosen city was great, Amsterdam is one of the coolest places in Europe, the hotel was neat, and the programs were also astounding.
The conference sessions were great too, I really enjoyed, them all, and because they are running on 8 thread parallel it is not that bad that there are some recurring sessions; if you missed one in spring you can watch it on autumn.
So, everything was comfy and neat. I hope I’ll attend on the next one too …
There were a few topics where I plan to dig deeper in the next weeks
Last time I was checked, how can TokuDB be used as a drop-in replacement for InnoDB. The first impressions were jolly good; way less disk space usage, and the TokuDB host can be a part of the current replication cluster.
So far so good.
After TokuDB was announced as a new storage engine for MySQL, it made me very curious, but I didn’t try it out until now.
I try to check it from different aspects and I’ll be the blog it steps by step. I don’t do any serious benchmarking, just play with it, and see if it could be fit into Kinja’s MySQL ecosystem.
I use one of our development servers as a TokuDB playground. Sadly that hardware is not the same as the database masters nor as the slaves, so performance tests couldn’t be made on that piece of metal but many other ways are open to doing this.
I’ve installed the tokudb plugin from the Percona repository. The setup was quite easy and fast, the documentation is nice.
I showed in an earlier post how to drop a whole database in a very safe way (no replication lag at all) and that technique could be used to drop a single table too, but cleaning up a table can take hours if not days to finish, so this is not the most comfortable way to do that. We also don’t want to have even a small spike of replication lag, so we need to find another solution.
How to remove database in a safe way
When you have to drop a large database, you’ll encounter some problems, mainly replication…
MySQL replication is great, and kind of reliable, but sometimes it could be messed up. The good news is that we can handle this.
Let’s see how replication happens when everything is fine!
When you have to drop a large database, you’ll encounter some problems, mainly replication lag. Now I’ll show you how to avoid this.
What can cause replication lag when you drop a database? First, it takes some disk I/O to unlink the files, and secondly, MySQL will scan through the buffer pool to see if there are pages from that database or not. On a huge (or at least big) database this could take seconds or even minutes, what means your slaves will collect lag for seconds or (of course) even minutes.
We love graphs.
Really love them. I think everyone likes graphs to collect data about the current state of their system, and needless to say, why.
Of course, sometimes it is painful to create graphs, but graphite could make this process easier, so we use them.
The graphite ecosystem makes data collection simple, when you send some data to statsd via simple UDP packets it will put them to a carbon database, and graphite will draw the lines. The only thing that can make this hard, is the question of ‘How can I collect my data to send?’
Well, I’ve checked many ways to solve this problem, but I didn’t find anything simple enough.
So, I wrote a daemon called ‘Mambocollector’ to deal with this problem. It is far from perfect, it has bugs, and I am not sure if it isn’t used too many resources when we collect a too many data, but it is working fine for my current needs. The project can be found on GitHub, feel free to use it, or participate in that.
UPDATE: This version was bugging badly, I rewrote it in Go.
In my last post, I showed how to fix replication errors on slaves, but I’ve made a mistake: my current example wasn’t good, after skipping the command or inserting an empty transaction the dataset was different because of a timestamp holding date column which is the CURRENT_TIMESTAMP default. Fixing the error solved the problem of the running replication thread, but the data wasn’t same on the hosts. I decided to leave this as-is, and instead of recreating the test, I rather show how to sync the databases.
For this, we can use the ‘pt-table-checksum’ and ‘pt-table-sync’ utilities from Percona toolkit.
Every MySQL DBA should deal with the situation, when there were an accidental write on one of the slaves. Changing replication to GTID will change the way how we should deal with that problem.
Let’s check out!