Planet MySQL

MySQL High Availability Framework Explained – Part II

In Part I, we introduced a High Availability (HA) framework for MySQL hosting and discussed various components and their functionality. Now in Part II, we will discuss the details of MySQL semisynchronous replication and the related configuration settings that help us ensure redundancy and consistency of the data in our HA setup. Make sure to check back in for Part III where we will review various failure scenarios that could arise and the way the framework responds and recovers from these conditions.

What is MySQL Semisynchronous Replication?

Simply put, in a MySQL semisynchronous replication configuration, the master commits transactions to the storage engine only after receiving acknowledgement from at least one of the slaves. The slaves provide acknowledgement only after the events are received, copied to the relay logs, and flushed to disk. This guarantees that for all transactions committed and returned to the client, the data exists on at least 2 nodes. The term ‘semi’ in semisynchronous (replication) is due to the fact that the master commits the transactions once the events are received and flushed to the relay log, but not necessarily committed to the data files on the slave. This is in contrast to fully synchronous replication, where the transaction would have been committed on both the slave and the master before the session returns to the client.

Semisynchronous replication, which is natively available in MySQL, helps the HA framework to ensure data consistency and redundancy for committed transactions. In the event of a master failure, all transactions committed on the master would have been replicated to at least one of the slaves (saved to the relay logs). As a result, failover to that slave would be lossless because the slave is up to date (after the relay logs of the slave are fully drained).
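For reference, this is roughly what enabling semisynchronous replication looks like. The sketch below uses the standard MySQL 5.7 plugin names documented in the reference manual (use the .dll extension instead of .so on Windows):

```sql
-- Sketch: enabling semisynchronous replication (MySQL 5.7 plugin names)
-- On the master:
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;

-- On each slave:
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;  -- restart the I/O thread so it reconnects semisynchronously
```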

Replication and Semisynchronous Related Settings

Let’s discuss some of the key MySQL settings used to ensure optimal behavior for high availability and data consistency in our framework.

Managing The Execution Speed of The Slaves

The first consideration is handling the ‘semi’ behavior of semisynchronous replication, which only guarantees that the data has been received and flushed to the relay logs by the I/O thread of the slave, but not necessarily committed by the SQL thread. By default, the SQL thread in a MySQL slave is single-threaded and cannot keep pace with the multi-threaded master. The obvious impact is that, in the event of a master failure, the slave will not be up to date because its SQL thread is still processing the events in the relay log. This will delay the failover process, as our framework expects the slave to be fully up to date before it can be promoted, which is necessary to preserve data consistency. To address this issue, we enable multi-threaded slaves with the option slave_parallel_workers, which sets the number of parallel SQL threads that process events in the relay logs.

In addition, we configure the below settings, which ensure that the slave does not enter any state that the master was not in:
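The settings referred to here are most likely the standard MySQL 5.7 in-order parallel replication options; a hedged sketch (the worker count is illustrative):

```ini
# Sketch: MySQL 5.7 option names; values are illustrative
slave_parallel_workers      = 4              # number of parallel applier (SQL) threads
slave_parallel_type         = LOGICAL_CLOCK  # parallelize using the master's group-commit info
slave_preserve_commit_order = 1              # commit on the slave in the master's commit order
```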

This provides us with stronger data consistency. With these settings, we will be able to get better parallelization and speed on the slave, but if there are too many parallel threads, the overhead involved in coordinating between the threads will also increase and can unfortunately offset the benefits.

Another configuration we can use to increase the efficiency of parallel execution on the slaves is to tune binlog_group_commit_sync_delay on the master. Setting this on the master causes the binary log entries on the master, and hence the relay log entries on the slave, to contain batches of transactions that can be processed in parallel by the SQL threads. This is explained in detail in J-F Gagné’s blog, where he refers to this behavior as ‘slowing down the master to speed up the slave’.
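A hedged sketch of what such tuning could look like on the master; both options delay commits, so the illustrative values below must be weighed against your commit-latency budget:

```ini
# Illustrative master-side group-commit tuning
binlog_group_commit_sync_delay          = 10000  # wait up to 10 ms (in microseconds) to grow a commit group
binlog_group_commit_sync_no_delay_count = 50     # ...or stop waiting once 50 transactions have queued
```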

If you’re managing your MySQL deployments through the ScaleGrid console, you have the ability to continuously monitor and receive real-time alerts on the replication lag of the slaves. It also allows you to dynamically tune the above parameters to ensure the slaves are working hand in hand with the master, thereby minimizing your time involved in a failover process.

Important Semisynchronous Replication Options

MySQL semisynchronous replication, by design, can fall back to asynchronous mode based on the slave acknowledgement timeout settings or based on the number of semisynchronous-capable slaves available at any point in time. Asynchronous mode, by definition, does not provide guarantees that committed transactions are replicated to the slave and hence a master loss would lead to losing the data that has not been replicated. The default design of the ScaleGrid HA framework is to avoid falling back to asynchronous mode. Let’s review the configurations that influence this behavior.

  • rpl_semi_sync_master_wait_for_slave_count

    This option is used to configure the number of slaves that must send an acknowledgment before a semisynchronous master can commit the transaction. In the 3-node master-slave configuration, we set this to 1 so we always have an assurance that the data is available in at least one slave while avoiding any performance impact involved in waiting for acknowledgment from both of the slaves.

  • rpl_semi_sync_master_timeout

    This option is used to configure the amount of time that a semisynchronous master waits for acknowledgment from a slave before switching back to asynchronous mode. We set this to a relatively high timeout value so there is no fallback to asynchronous mode.

    Since we are operating with 2 slaves and the rpl_semi_sync_master_wait_for_slave_count is set to 1, we’ve noticed that at least one of the slaves does acknowledge within a reasonable amount of time and the master does not switch over to asynchronous mode during temporary network disruptions.

  • rpl_semi_sync_master_wait_no_slave

    This controls whether the master waits for the timeout period configured by rpl_semi_sync_master_timeout to expire, even if the slave count drops to less than the number of slaves configured by rpl_semi_sync_master_wait_for_slave_count during the timeout period. We retain the default value of ON so that the master does not fall back to asynchronous replication.
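Putting the three options together, here is a sketch of how they could be set on the master; the timeout value is illustrative, and any "large enough" value achieves the same effect:

```sql
-- Illustrative values for the three options discussed above (master side)
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000000000;  -- milliseconds; large enough to never fire
SET GLOBAL rpl_semi_sync_master_wait_no_slave = ON;
```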

Impact of Losing All The Semisynchronous Slaves

As we saw above, our framework prevents the master from switching to asynchronous replication if all the slaves go down or become unreachable from the master. The direct impact of this is that writes get stalled on the master impacting the availability of the service. This is essentially as described by the CAP theorem about the limitations of any distributed system. The theorem states that, in the presence of a network partition, we will have to choose either availability or consistency, but not both. Network partition, in this case, can be considered as MySQL slaves disconnected from the master because they are either down or unreachable.

Our consistency goal is to ensure that for all committed transactions, the data is available on at least 2 nodes. As a result, in such cases, the ScaleGrid HA framework favors consistency over availability. Further writes will not be accepted from clients, though the MySQL master will still serve read requests. This is a conscious design decision we have made as the default behavior which is, of course, configurable based on the application requirements.

Make sure to subscribe to the ScaleGrid blog so you don’t miss Part III where we will discuss more failure scenarios and recovery abilities of the MySQL HA framework. Stay tuned!!

Care when changing MASTER_DELAY with CHANGE MASTER TO (delayed replication)

A few days ago, I stepped into a trap! This made me lose time fixing things (and even more writing this post...). In the hope that you will avoid my mistake, I am sharing this war story. I also obviously opened a bug, more about this below. TL;DR: be careful when using CHANGE MASTER TO MASTER_DELAY = N: it might wipe your relay logs! As written in the TL;DR, running CHANGE MASTER

ProxySQL Series : Query Cache with ProxySQL

We know that the MySQL query cache is deprecated as of MySQL 5.7.20 and removed in MySQL 8.0, and it is recommended to use ProxySQL (or another external query cache) instead.

In this blog post, we’ll present the ProxySQL query cache functionality and how it helps us.

How to setup ProxySQL between client and database servers is out of the scope of this article.

If you are looking for other articles in the ProxySQL series:

ProxySQL Query Cache:

The ProxySQL query cache stores the text of a SELECT statement together with the corresponding result sent to the client. If the application re-executes the same query, the result set will be returned by the embedded query cache.

It is a quite common scenario to identify database load caused by non-optimal SELECT statements that generate a result set which should be cached for a few seconds. Implementing a code change can be a long process (developers have to write new code, build it, test it in staging, then deploy to production), and this is often not a suitable option during an emergency. Since the configuration of the database proxy layer (ProxySQL in this case) falls under the responsibility of DBAs, enabling caching does not require developers to make changes to the application.

Therefore this is a feature that empowers the DBAs.

How does the ProxySQL query cache get enabled?

With the ProxySQL query cache, we have to define which traffic gets cached. This is done by defining query rules that match the traffic to be cached and setting a “cache_ttl” for the cached results.

Below is a scenario where the query cache was put to optimal use.

For one of our clients, we were getting thread spike alerts from the master node.

After debugging, we saw that ProxySQL was opening a number of connections to the master server to execute a verification from code, checking whether a table has all the correct columns.

The application was sending `DESCRIBE table` queries very often, and the developers did not have a quick fix to control or modify that code because it was coming from an ORM.

Per the 10-minute stats, we observed around 346K DESCRIBE queries coming in on the DB, i.e. 578 DESCRIBE queries per second.

mysql> select sum(count_star) from stats_mysql_query_digest where digest_text like 'DESCRIBE%';
+-----------------+
| sum(count_star) |
+-----------------+
|          346741 |
+-----------------+

We decided to create a query rule to forward all queries starting with DESCRIBE to the replica nodes, and to cache the result set for 60 minutes.

We chose a 60-minute cache because we learned from the client that DB schema changes happen mostly on weekends, and every time they introduce changes they modify the database first and deploy the new code ~60 minutes later.

Taking this into consideration, we added the following rules:

1 : Writer hostgroup

2 : Reader hostgroup

INSERT INTO mysql_query_rules (rule_id,active,match_digest,destination_hostgroup,cache_ttl,apply,comment)
VALUES (100,1,'^DESCRIBE',2,3600000,1,'Cache and reroute DESCRIBE Table queries on readers HG');
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

mysql> select rule_id,active,username,schemaname,flagIN,digest,match_digest,match_pattern,destination_hostgroup,cache_ttl,apply,comment from mysql_query_rules where rule_id=100\G
*************************** 1. row ***************************
              rule_id: 100
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
               digest: NULL
         match_digest: ^DESCRIBE
        match_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 3600000
                apply: 1
              comment: Cache and reroute DESCRIBE Table queries on readers HG

Execute the below command to reset the stats_mysql_query_digest results:

SELECT 1 FROM stats_mysql_query_digest_reset LIMIT 1;

Monitoring the stats:

Cached queries are the ones marked with the special hostgroup -1 (meaning these queries were not sent to any hostgroup), and the total execution time for cached queries is 0 (meaning the request was served within the same event loop).

mysql> select hostgroup,digest_text,count_star,FROM_UNIXTIME(last_seen) from stats_mysql_query_digest where digest_text like '%DESCRIBE%';
+-----------+------------------------------------+------------+--------------------------+
| hostgroup | digest_text                        | count_star | FROM_UNIXTIME(last_seen) |
+-----------+------------------------------------+------------+--------------------------+
|        -1 | DESCRIBE `agent`                   |        178 | 2018-10-11 11:35:43      |
|         2 | DESCRIBE `integration`.            |          1 | 2018-10-11 11:22:37      |
|         2 | DESCRIBE `options_available`       |          1 | 2018-10-11 11:22:37      |
|         2 | DESCRIBE `campaignsettings`        |          1 | 2018-10-11 11:22:37      |
|         2 | DESCRIBE `integration`             |          1 | 2018-10-11 11:22:38      |
|         2 | DESCRIBE `clients`                 |          1 | 2018-10-11 11:22:57      |
|         2 | DESCRIBE `companies`               |          1 | 2018-10-11 11:22:37      |
|         2 | DESCRIBE `vsts_options`            |          1 | 2018-10-11 11:22:37      |
|        -1 | DESCRIBE `users_details`           |     190211 | 2018-10-11 11:35:43      |
|        -1 | DESCRIBE `ticket`                  |       9081 | 2018-10-11 11:35:43      |
|         2 | DESCRIBE `ticket`                  |          1 | 2018-10-11 11:22:38      |
|         2 | DESCRIBE `stats`                   |          1 | 2018-10-11 11:22:37      |
|        -1 | DESCRIBE `stats`                   |      19856 | 2018-10-11 11:35:43      |
|        -1 | DESCRIBE `companies`               |      40079 | 2018-10-11 11:35:43      |
|        -1 | DESCRIBE `options`                 |      20590 | 2018-10-11 11:35:43      |
|        -1 | DESCRIBE `com_settings`            |      20215 | 2018-10-11 11:35:43      |
|         2 | DESCRIBE `users_details`           |          3 | 2018-10-11 11:23:00      |
|        -1 | DESCRIBE `visit`                   |      20166 | 2018-10-11 11:35:43      |
|         2 | DESCRIBE `integration_many`        |          1 | 2018-10-11 11:22:37      |
|        -1 | DESCRIBE `integration_many`        |      21875 | 2018-10-11 11:35:43      |
|        -1 | DESCRIBE `integration`             |      20767 | 2018-10-11 11:35:43      |
|        -1 | DESCRIBE `task_integration`        |       9032 | 2018-10-11 11:35:43      |
+-----------+------------------------------------+------------+--------------------------+
22 rows in set (0.01 sec)

As per the last 30 minutes of stats, only 13 DESCRIBE queries were served by MySQL, while 676,980 were served from the ProxySQL cache:

mysql> select hostgroup, sum(count_star) from stats_mysql_query_digest where digest_text like 'DESCRIBE%' group by hostgroup;
+-----------+-----------------+
| hostgroup | sum(count_star) |
+-----------+-----------------+
|        -1 |          676980 |
|         2 |              13 |
+-----------+-----------------+
2 rows in set (0.00 sec)

We observed another use case where the client sends many queries against INFORMATION_SCHEMA to perform pre-validation checks; we can use similar query rules to address that situation.

Conclusion: For some application logic, the ProxySQL query cache can achieve at least a ~2X performance boost, and even more if the ProxySQL instance is closer to the application.

The ProxySQL query cache may serve stale result sets due to the way it invalidates cached data (cached data is valid for the interval specified by “cache_ttl”, whereas MySQL’s cached data was invalidated each time the underlying data changed). Every cached query may return stale data, and this may or may not be acceptable to the application, so make use of this feature according to your requirements.

Some FAQ answers on the query cache:

1. Does the ProxySQL QC support prepared statements?

– The query cache doesn’t work with prepared statements.

2. Does the query cache pre-allocate memory based on query_cache_size_MB? Should we reduce the value if not in use?

– No, the query cache memory is not preallocated, so there is no need to reduce it.

3. How do you purge the content of the query cache?

– There is no command to immediately delete the whole content of the query cache. Cache invalidation can only be controlled via “cache_ttl”.
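Beyond the per-query digests shown earlier, ProxySQL also exposes global query-cache counters. A sketch; these counter names exist in recent ProxySQL releases, but verify against your version:

```sql
-- Run on the ProxySQL admin interface (port 6032 by default)
SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'Query_Cache%';
-- e.g. Query_Cache_Memory_bytes, Query_Cache_Entries, Query_Cache_Purged,
--      Query_Cache_count_GET, Query_Cache_count_GET_OK, Query_Cache_count_SET
```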

Some benchmarks have been performed to showcase how the ProxySQL Query Cache boosts performance!



Define "better"

Welcome to my first rant of 2019, although I have written about this before. While I enjoy benchmarketing from a distance, it is not much fun to be in the middle of it. The RocksDB project has been successful and thus becomes the base case for products and research claiming that something else is better. While I have no doubt that other things can be better, I am wary about the definition of better.

There are at least 3 ways to define better when evaluating database performance. The first, faster is better, ignores efficiency; the last two do not. I'd rather not ignore efficiency. The marginal return of X more QPS eventually becomes zero, while the benefit of using less hardware is usually greater than zero.
  1. Optimize for throughput and ignore efficiency (faster is better)
  2. Get good enough performance and then optimize for efficiency
  3. Get good enough efficiency and then optimize for throughput
Call to action
I forgot to include this before publishing. Whether #1, #2 or #3 is followed I hope that more performance results include details on the HW consumed to create that performance. How much memory and disk space were used? What was the CPU utilization? How many bytes were read from and written to storage? How much random IO was used? I try to report both absolute and relative values where relative values are normalized by the transaction rate.
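The kind of normalized reporting advocated here can be sketched as follows; the metric names and numbers are purely illustrative, not from any specific benchmark tool:

```python
# Sketch: report absolute HW consumption alongside values normalized
# by the transaction rate, as suggested above. All inputs are illustrative.

def efficiency_report(queries: int, seconds: float, cpu_seconds: float,
                      bytes_read: int, bytes_written: int) -> dict:
    """Absolute throughput plus per-query (rate-normalized) efficiency metrics."""
    return {
        "qps": queries / seconds,                  # absolute throughput
        "cpu_per_query": cpu_seconds / queries,    # CPU seconds per query
        "read_per_query": bytes_read / queries,    # storage bytes read per query
        "write_per_query": bytes_written / queries,
    }

report = efficiency_report(queries=1_000_000, seconds=100.0,
                           cpu_seconds=400.0,
                           bytes_read=8 << 30,      # 8 GiB read
                           bytes_written=2 << 30)   # 2 GiB written
print(report["qps"])  # 10000.0
```

Two systems with the same QPS can then still be ranked by their per-query CPU and I/O cost, which is exactly the distinction between definitions #1 and #2/#3 above.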

Understanding MySQL X (All Flavors)

Since 5.7.12, MySQL includes what is called the X Plugin, and with it the X Protocol and X DevAPI. But what is all this, and how does it work? Let me share a personal short story on how I found myself investigating this feature. In a previous post I wrote about the MySQL Router tool, and our colleague Mr. Lefred pointed out that I was wrong about the X Protocol, because I mentioned it was created to be used with JSON docs. Given this input, I wanted to investigate in a little more depth what all this “X” means and how it can be used in our day to day operations.

The first problem I found is that the documentation is pretty extensive on the how’s, but it was really hard to find the what’s. This is a bit strange, because for people trying to research this new feature the documentation is not very helpful. In fact, I had to go to different websites to get a sense of what X means, how it works, and what it was created for.

Let’s start from the very beginning: what does the X stand for? Basically, it’s a way to name the crossover between relational and document models with extended capabilities, and the X is used for naming the three components we are describing: the plugin, the protocol and the DevApi.

X Plugin

This is the actual interface between the MySQL server and clients, and by clients we can consider a variety of clients, not only the MySQL Shell. It has to be installed in MySQL 5.7 via the INSTALL PLUGIN command, but comes installed by default in MySQL 8. The plugin adds all the functionality, configuration variables, and status counters we need to use it.

It has the ability to work with both traditional SQL and Document objects, and also supports CRUD (Create, Read, Update, Delete) operations,  asynchronous query execution and so on – this provides a great capacity to extend the current way we work with MySQL.

X Protocol

This is a new client protocol created to ‘talk’ between the X Plugin and Clients.  I think it is fair to say this is an eXtended version of the MySQL protocol.
It was designed with the idea of having the capacity for asynchronous calls, meaning that you can send more than one query to server from same client without the need of waiting for first query to finish before sending the second and so. This improves the overall execution time by saving network round trips between clients and server.

Additionally, the protocol accepts CRUD operations and, of course, the handling of JSON documents and plain SQL. The protocol is fully implemented in MySQLShell and has several connectors for popular languages (Java and .Net for example)


X DevAPI

The last piece of this package is the X DevAPI. Probably the best documented of the three pieces, it is the API implemented in the MySQL Shell and in the connectors that support the X Protocol. This API is designed to make it easy to write programs from a given client in some popular languages. For example, we can easily create/test a program from the MySQL Shell using Python or JavaScript.

The API defines a few interesting concepts to handle sessions. A session can handle several connections to a server, so in a single session we can encapsulate more than one MySQL connection. You can define a basic session connection as follows (in JavaScript) using the MySQL Shell:

MySQL localhost:33060+ ssl JS > var mysqlx = require('mysqlx');
MySQL localhost:33060+ ssl JS > var session = mysqlx.getSession({host: 'localhost', user: 'root', password: 'root', port: 33060});

So what’s new here? How does it help, and how I can make use of it? First let’s try to illustrate the architecture:


As you may notice, the X Plugin adds a new interface that talks the X Protocol, and the protocol in turn interacts with the connectors that support it (as mentioned above). The classic functionality is still present, so we have just extended it. The good part is that the protocol is capable of operating on both relational data and the document store.

So now let’s check out the fun part by putting all the pieces together in a simple example using the MySQL Shell:

[root@data1 ~]# mysqlsh
MySQL Shell 8.0.13

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.

MySQL JS > var test_conn = require('mysqlx');
MySQL JS > var session = mysqlx.getSession({host: 'localhost', user: 'root', password: 'root', port: 33060}); #creating session, notice X protocol listen port 33060 by default
MySQL JS > test_collection = session.getSchema('test').createCollection("people");
<Collection:people>
MySQL JS > test_collection.add({birth:"1988-06-12", Name: "Francisco"});
Query OK, 1 item affected (0.0456 sec)
MySQL JS > test_collection.add({birth:"2001-11-03", Name: "Maria", Nickname: "Mary"});
Query OK, 1 item affected (0.0255 sec)
MySQL JS > test_collection.find();
[
    {
        "Name": "Francisco",
        "_id": "00005c19099f0000000000000004",
        "birth": "1988-06-12"
    },
    {
        "Name": "Maria",
        "Nickname": "Mary",
        "_id": "00005c19099f0000000000000005",
        "birth": "2001-11-03"
    }
]
2 documents in set (0.0005 sec)
MySQL JS > \sql  #simple command to switch between modes
Switching to SQL mode... Commands end with ;
MySQL SQL > \connect root@localhost
Creating a session to 'root@localhost'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 36 (X protocol)
Server version: 8.0.11 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
MySQL localhost:33060+ ssl SQL > use test
Default schema set to `test`.
Fetching table and column names from `test` for auto-completion... Press ^C to stop.
MySQL localhost:33060+ ssl test SQL > CREATE TABLE `people2` (
                                   ->   `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
                                   ->   `birth` datetime NOT NULL,
                                   ->   `name` varchar(45) NOT NULL DEFAULT '',
                                   ->   `nickname` varchar(45) NULL DEFAULT '',
                                   ->   PRIMARY KEY (`id`)
                                   -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.1056 sec)
MySQL localhost:33060+ ssl test SQL > insert into people2(birth, name, nickname) values('2010-05-01', 'Peter', null), ('1999-10-14','Joseph', 'Joe');
Query OK, 2 rows affected (0.0326 sec)
MySQL localhost:33060+ ssl test SQL > select * from people2;
+----+---------------------+--------+----------+
| id | birth               | name   | nickname |
+----+---------------------+--------+----------+
|  1 | 2010-05-01 00:00:00 | Peter  | NULL     |
|  2 | 1999-10-14 00:00:00 | Joseph | Joe      |
+----+---------------------+--------+----------+
2 rows in set (0.0004 sec)
MySQL localhost:33060+ ssl test SQL > select * from people;
+-----------------------------------------------------------------------------------------------------+------------------------------+
| doc                                                                                                 | _id                          |
+-----------------------------------------------------------------------------------------------------+------------------------------+
| {"_id": "00005c19099f0000000000000004", "Name": "Francisco", "birth": "1988-06-12"}                 | 00005c19099f0000000000000004 |
| {"_id": "00005c19099f0000000000000005", "Name": "Maria", "birth": "2001-11-03", "Nickname": "Mary"} | 00005c19099f0000000000000005 |
+-----------------------------------------------------------------------------------------------------+------------------------------+
2 rows in set (0.0028 sec)

Interesting, right? Within the same shell, I’ve created a session running over the X Protocol and handled both document and relational objects, all without quitting the shell.

Is this all? Of course not! We are just scratching the surface; we haven’t used asynchronous calls or CRUD operations yet. In fact, each of these topics is enough for a blog post of its own. Hopefully, though, the what’s are answered for now, at least a little, and if that’s the case, I’ll be very happy!

Photo by Deva Darshan on Unsplash

MariaDB 10.3.12 and MariaDB Connector/C 3.0.8 now available

The MariaDB Foundation is pleased to announce the availability of MariaDB 10.3.12, the latest stable release in the MariaDB 10.3 series, as well as MariaDB Connector/C 3.0.8, the latest stable release in the MariaDB Connector/C series. See the release notes and changelogs for details. Download MariaDB 10.3.12 Release Notes Changelog What is MariaDB 10.3? MariaDB […]


Why we've chosen Snowflake ❄️ as our Data Warehouse

In the first of this series of blog posts about data warehousing, I talked about how we use and manage our Amazon Redshift cluster at Drivy.

One of the most significant issues we had at the time was how to isolate compute from storage, to ensure maximum read concurrency so we could do more and more data analysis and on-board more people onto the team.

I briefly introduced Amazon Spectrum and promised to talk about how we were going to use it in a second blog post… But that turned out not to be the case, because we ultimately decided to adopt another data-warehousing technology (Snowflake Computing), which addresses the issue mentioned above, among other things that I’ll cover here.

Why are we changing our Data Warehouse?

In Redshift, as in most Massively Parallel Processing (MPP) SQL DBMSs, the underlying data architecture is a mix of two paradigms:

  • Shared nothing: chunks of a table are spread across the worker nodes with no overlaps;
  • Shared everything: a full copy of a table is available on every worker node.

This approach is only convenient for homogeneous workloads: a system configuration that is ideal for bulk loading (high I/O, light compute) is a poor fit for complex analytical queries (low I/O, heavy compute), and vice versa.

When you deal with many consumers with different volumes and treatments you usually tend towards a multi-cluster organization of your data warehouse, where each cluster is dedicated to a workload category: I/O intensive, storage-intensive or compute-intensive.

This design gives the teams more velocity. You can decide to have one cluster per team: one for finance, one for marketing, one for product, etc. They generally no longer have resource-related issues, but new kinds of problems can emerge: data freshness and consistency across clusters.

Indeed, multi-clustering involves synchronization between clusters to ensure that the same complete data is available on every cluster on time. It complicates the overall system, and thus results in a loss of agility.

In our case we have thousands of queries running on a single Redshift cluster, so very different workloads can occur concurrently:

  • a Drivy fraud application frequently requires the voluminous web and mobile app tracking data to detect fraudulent devices,
  • the main business-reporting runs a large computation on multiple tables,
  • the ETL pipeline of production DB dump and enrichment is running,
  • the ETL pipeline responsible for the tracking is running,
  • an exploration software extracts millions of records.

In order to improve the overall performance, to reduce our SLAs and make room for every analyst who wants to sandbox a complex analysis, we were looking for a solution that would increase the current capabilities of the system without adding new struggles.

It has to ensure the following:

  • ANSI SQL support and ACID transactions.
  • Peta-byte scale.
  • A fully managed solution.
  • Seamless scaling capability, ideally ability to scale independently compute and storage.
  • Cost effective.

Snowflake Computing meets all those requirements: it has a cloud-agnostic (Azure or AWS) shared-data architecture and elastic, on-demand virtual warehouses that access the same data layer.

The Snowflake Elastic data warehouse

Snowflake is a pure software as a service, which supports ANSI SQL and ACID transactions. It also supports semi-structured data such as JSON and AVRO.

The most important aspect is its elasticity.

Storage and computing resources can be scaled independently in seconds. To achieve that, virtual warehouses can be created and decommissioned on the fly. Each virtual warehouse has access to the shared tables directly on S3, without the need to physically copy the data.

Multi-Cluster, Shared Data Architecture. Source:

They also have two really interesting features: auto-suspend and auto-scale. Every time a cluster is not used for more than 10 minutes, it is automatically put in sleep mode with no additional fees. The “Enterprise” plan also gives the auto-scale feature that adapts the size of the virtual warehouse according to the workload (horizontal scaling). I haven’t tested this feature yet since we have the lower “Premier” plan.

From Redshift to Snowflake

The data engineering team at Drivy is composed of two engineers. We dedicated a full quarter to the migration on top of the day-to-day operations, and it’s not finished yet. During this migration, we took the opportunity to pay some of our technical debt and modernize some of our ETL processes.

One of the greatest improvements we addressed was the versioning on S3 of every data involved prior and post a transformation. At every run of every ETL pipeline, for instance, if we consider the bulk loading of the production DB, a copy of the raw data and the transformed data is stored on S3.

That gives us many new capabilities: reproducibility, auditing and easier operations (when backfilling or when updating a table schema).

The biggest blocks of the migration were:

  • MySQL to Snowflake: production DB bulk loading and transformations, with three kinds of ingestion: incremental append-only, incremental upsert, and full dump. We made a questionable choice here: our intermediate format is CSV, and we had many formatting issues.
  • Captur: our internal tracking framework. It’s a pipeline that loads raw events from S3 (sent by the web and mobile apps through a Kinesis stream) and splits them into a backend and a frontend schema holding different tables (one for each event). It also automatically detects changes and adapts the schema (new columns, new tables) when needed.
  • API integrations: spreadsheets, 3rd parties APIs… straightforward but numerous.
  • Security and Grants management.
Virtual Warehouses mix

We want to group similar workloads in the same warehouses and tailor the resources to the complexity of the computations. We made the following choices in our first iteration:

quantity | size | users                     | description                                                                          | usage per day | usage per week
1        | S    | ETL + Viz                 | Main warehouse for bulk loading, ETL and visualizations software.                    | ∞             | 7d/7
1        | L    | Exploration               | Used early in the morning for ~100 high I/O extractions for an exploration software. | 0 - 4h        | 7d/7
1        | XS   | Analysts + Business users | Main warehouse for analysts, ~200 daily complex analytical queries.                  | 0 - 10h       | 5d/7
1        | L    | Machine Learning + Ops    | Compute intensive warehouse for punctual heavy computations.                         | 0 - 2h        | N.A.

Every warehouse has the default auto-suspend set to 10min of inactivity.
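As an illustrative sketch (the warehouse name and size are hypothetical, not our actual configuration), a virtual warehouse with these settings can be created in Snowflake like this:

```sql
-- Hypothetical warehouse for an ETL + Viz workload; AUTO_SUSPEND is in seconds
CREATE WAREHOUSE etl_wh WITH
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND   = 600    -- suspend after 10 minutes of inactivity
  AUTO_RESUME    = TRUE;  -- wake up transparently on the next query
```

With AUTO_RESUME enabled, a suspended warehouse incurs no compute cost and restarts on demand.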

What’s next

Once we finish our migration, I’ll share my thoughts with you about the overall performance of the new system. I’ll also iterate on the mix of strategies presented above to ensure maximum velocity and convenience while minimizing the costs. Also, I’ll tell you more about how we do grant management.

Meanwhile, don’t hesitate of course to reach out to me if you have any feedback!

Fun with Bugs #76 - On MySQL Bug Reports I am Subscribed to, Part XIII

The holiday season is almost over here, so it's time to get back to my main topic of MySQL bugs. Proper MySQL bug reporting will be the topic of my FOSDEM 2019 talk in less than 4 weeks (and a few slides with recent examples of bugs are not yet ready), so I have to concentrate on bugs.

Last time in this series I reviewed some interesting bug reports filed in November, 2018. Time to move on and proceed with bugs reported in December, 2018, as I've subscribed to 27 or so of them. As usual, I'll review them briefly, starting from the oldest, and try to check whether MariaDB 10.3 is also affected when the bug report is about common features:
  • Bug #93440 - "Noop UPDATE query is logged to binlog after read_only flag is set". Nice corner case found by Artem Danilov. super_read_only, even if set to ON successfully, may not prevent a no-op UPDATE from committing and advancing the GTID value.
  • Bug #93450 - "mysqldump does not wrap SET NAMES into mysql-extension comment". This is a regression bug in MySQL 8.0 that may break compatibility with 3rd party tools not aware of MySQL extensions. This bug was reported by Mattias Jonsson.
  • Bug #93451 - "The table comment is cut down on selecting with ORDER BY". Nice regression in MySQL 8. As one can easily check, MariaDB 10.3.x and older MySQL versions are not affected.
  • Bug #93491 - "Optimizer does not correctly consider attached conditions in planning". Clear and useful bug report from Morgan Tocker.
  • Bug #93544 - "SHOW BINLOG EVENTS FROM <bad offset> is not diagnosed". Yet another regression bug in MySQL 8 found by Laurynas Biveinis from Percona. MariaDB 10.3 does not accept bad offsets:
    MariaDB [test]> show binlog events from 14 limit 1;
    ERROR 1220 (HY000): Error when executing command SHOW BINLOG EVENTS: Wrong offset or I/O error
    MariaDB [test]> show binlog events limit 4;
    | Log_name         | Pos | Event_type        | Server_id | End_log_pos | Info
    | pc-PC-bin.000001 |   4 | Format_desc       |         1 |         256 | Server
    ver: 10.3.7-MariaDB-log, Binlog ver: 4 |
    | pc-PC-bin.000001 | 256 | Gtid_list         |         1 |         285 | []
    | pc-PC-bin.000001 | 285 | Binlog_checkpoint |         1 |         328 | pc-PC-b
    in.000001                              |
    | pc-PC-bin.000001 | 328 | Gtid              |         1 |         370 | GTID 0-
    1-1                                    |
    4 rows in set (0.002 sec)

    MariaDB [test]> show binlog events from 256 limit 1;
    | Log_name         | Pos | Event_type | Server_id | End_log_pos | Info |
    | pc-PC-bin.000001 | 256 | Gtid_list  |         1 |         285 | []   |
    1 row in set (0.002 sec)
  • Bug #93572 - "parallel workers+slave_preserve_commit_order+flushtables with read lock deadlock". I subscribed to it as it's just yet another example of improper handling of useful bug reports, as already discussed in my post "Problems with Oracle's Way of MySQL Bugs Database Maintenance". I think Ashe Sun's point is clear and suggestions like "don't do it" have nothing to do with proper bugs processing.
  • Bug #93587 - "Error when creating a table with long partition names". Nice regression bug in MySQL 8 comparing to 5.7 was found by Sergei Glushchenko from Percona.

    MariaDB 10.3.7 on Windows also fails, with an error message that is not clear:
    ERROR 1005 (HY000): Can't create table `mc5noglq9ofy7ym76z1t758ztptj6iplvsldhmse
    fyw4zvocfpp` (errno: 168 "Unknown (generic) error from engine")

    In the error log I see:
    2019-01-06 19:36:46 10 [ERROR] InnoDB: Operating system error number 3 in a file operation.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: The error means the system cannot find the path specified.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: File .\mc5noglq9ofy7ym76z1t758ztptj6iplvs
    kf#sp#ywkq987ztkdj33zbmlw526153x86vxl4x44r15spf8jqs92665mt0qi6bsnkazy5.ibd: 'cre
    ate' returned OS error 203.
    2019-01-06 19:36:46 10 [ERROR] InnoDB: Cannot create file '.\mc5noglq9ofy7ym76z1
  • Bug #93600 - "Setting out of range fractional part produces incorrect timestamps". After some arguing, this bug reported by Evgeny Firsov was "Verified". In MariaDB 10.3, truncation happens:
    MariaDB [test]> SET SESSION TIMESTAMP=1.9999996;
    Query OK, 0 rows affected (0.039 sec)

    | CURRENT_TIMESTAMP(6)       |
    | 1970-01-01 02:00:01.999999 |
    1 row in set (0.010 sec)

    MariaDB [test]> CREATE TABLE t1( ts TIMESTAMP(6), dt DATETIME(6) );
    Query OK, 0 rows affected (0.387 sec)

    Query OK, 1 row affected (0.079 sec)

    MariaDB [test]> select * from t1;
    | ts                         | dt                         |
    | 1970-01-01 02:00:01.999999 | 1970-01-01 02:00:01.999999 |
    1 row in set (0.016 sec)
  • Bug #93603 - "Memory access error with alter table character change." This bug was reported by Ramesh Sivaraman from Percona QA. I've subscribed mostly to find out how bug reports with new severity level (S6) are going to be processed and fixed. See also his Bug #93701 - "Assertion `maybe_null' failed |Item_func_concat::val_str(String*)".

    I've subscribed to S7 Bug #93617 - "Conditional jump or depends on uninitialized value(s) in Field_num::Field_num" from Laurynas Biveinis for similar reason.
  • Bug #93649 - "STOP SLAVE SQL_THREAD deadlocks if done while holding LOCK INSTANCE FOR BACKUP". New MySQL 8 feature, LOCK INSTANCE FOR BACKUP statement, is an attempt to introduce backup locks to MySQL. Sergei Glushchenko found that it may cause deadlocks. I am surprised that the bug is still "Open", since December 18, 2018.
  • Bug #93683 - "Got error 155 when reading table './test/t1'". I am not sure if this message in the error log noted by Roel Van de Paar from Percona is a bug or a problem. MariaDB 10.3.7 produces the same error message. Let's find out; so far this report is "Verified".
  • Bug #93684 - "mysql innodb dump restore slows down after upgrade mysql 5.7 to 8.0". Florian Kopp reported this potentially notable performance regression of MySQL 8.0.13 vs 5.7, but it is still not verified. I am not sure how this report will end up, but it's not the first report about performance regressions in MySQL 8 :)

One can hardly find any bugs in this winter forest. But they are hiding and will affect everyone there one day, in spring... To summarize:
  1. Some regression bugs are still not marked with "regression" tag.
  2. Some MySQL bug reports are still handled wrongly, with a trend of wasting bug reporter time on irrelevant clarifications and claims the problem is not clear, when there is a good enough explanation of the test case.
  3. Percona engineers still contribute a lot to MySQL 8 QA by reporting numerous bugs. No wonder, with their first Percona Server for MySQL 8 GA release having happened in December...
I still have a dozen or so December 2018 bug reports (mostly "Open" at the moment) to review one day, so stay tuned!

My 2019 Database Wishlist

Last year I published my 2018 Database Wishlist, which I recently revisited to check what happened and what didn’t. Time for a 2019 wishlist.

I am not going to list items from my 2018 list, even if they didn’t happen or they partially happened. Not because I changed my mind about their importance. Just because I wrote about them recently, and I don’t want to be more boring than I usually am.

External languages for MySQL and MariaDB

MariaDB 10.3 implemented a parser for PL/SQL stored procedures. This could be good for their business, as it facilitates the migration from Oracle. But it isn’t an answer to the community request of supporting external languages, like C or Python.

Oracle theoretically allows languages supported by GraalVM to be used in MySQL. But unspecified legal problems seem to stop them from releasing this. In any case, unfortunately, this feature would only be available on GraalVM.

External languages are really desirable. Antony Curtis wrote a patch for this years ago, but neither MySQL nor MariaDB included it. Roland Bouman wrote mysqlv8udfs, a UDF to run JavaScript code via Google's V8 JavaScript engine, but it was never included in MySQL or MariaDB either.

Please, Oracle and MariaDB: do it. Don’t just start another discussion in a mailing list, do it for real.

ClickHouse without ZooKeeper

ClickHouse needs ZooKeeper to setup a cluster. ZooKeeper is based on the JVM and it’s hard to use. Having to use it because there is no alternative is always quite annoying.

In their repo, there is a feature request to support Consul instead of ZooKeeper. To recap: the main problem is that they use some ZooKeeper-specific features, but they plan to refactor the ZooKeeper client library, and after that it is possible that they will implement this feature.

I wrote about Consul some time ago, when I was a Percona consultant. I consider it a very good solution. The other alternative is etcd. It’s worth noting that, despite it being widely used, the CNCF still considers it as an incubating project.

Sphinx: implement JOINs

Sphinx seems to be forgotten by many. It is not trendy anymore. There have not been cool headlines on the tech sites involving Sphinx for years. So, is it slowly dying? No, it’s simply stable.

There are more trendy alternatives nowadays, yes. Like ElasticSearch and Solr. So why do I care about Sphinx? Well, first, it’s quite KISS (keep it simple, stupid!). It doesn’t have lots of features just because it was theoretically possible to implement them. Sorry for repeating myself, but basically… it’s stable. Second, it supports a subset of SQL, it’s kind of relational, and its language is compatible with MySQL. It could have invented a new model and a new language supported by nothing else in the world, but they (unlike many other modern database vendors) realised that it wouldn’t be a good thing.

But it misses JOINs. I don’t want to run complex JOINs on it, it will never be the right tool for that. Still, this would open more opportunities for Sphinx users. They always mentioned JOINs as something “currently missing”, so I still hope to see them implemented.

PostgreSQL: setup a real bug tracker

As I already mentioned, PostgreSQL doesn’t have a bug tracker. There is a mailing list. A problem that this surely causes is that it’s impossible to do structured searches – for example: get a list of confirmed bugs in version 11 involving foreign keys. I suspect there is another major problem: this could prevent some people from reporting bugs.

PostgreSQL is a great project, but please take bugs seriously.

Percona, be more conservative about your unique features

Every major version removes some features from the previous one. I understand why you do that, and I appreciate it. You want to keep the delta between Percona Server and MySQL as small as possible. If a feature is not used by enough customers, it implies some unjustified difference to maintain. I know that this policy is important to maintain your fork’s quality high.

Yet, I feel that you tend to remove too much. Suppose I start to use a feature now and after some time I have to stop: in hindsight, I’d wish you hadn’t implemented it in the first place. So my point is not necessarily “maintain more features”; it could be “develop fewer features” as well.

Open source databases and cloud providers

We have read about some open source databases going proprietary because they spend money to innovate, while cloud providers simply take their customers away without giving anything in return. In particular, Amazon modifies open source software to create its own products, and sells them without paying the original vendors.

What can I say about Amazon… nothing, just a sarcastic “thank you”.

But I’ve something to ask database vendors. Do you know what MaxScale is? If not, it’s because this strategy of going proprietary is not as smart as you may think. Users will simply move away and forget you. Some of those users are paying customers. But even many of those who never paid you did contribute to your product – and indirectly, to your company’s income. How? By reporting bugs, by writing technical content about your products, by talking about how good your software is, etc. I don’t know of any estimation of how much these things contribute to a company’s economy, but I would be very interested in reading such a thing.


Common Table Expressions (CTE) in MySQL 8.0

1.0. Introduction:

MySQL 8.0 was released with awesome features. One of its most prominent features is CTE (Common Table Expression).

The Common Table Expression can be used to construct complex queries in a more readable manner. In this blog, I have described how the CTE works with some useful examples.

1.1 What is CTE?

A Common Table Expression (CTE) is a named result set that exists temporarily and is used only within the context of a larger query.

The CTE provides better readability and performance in comparison with a derived table.

In a normal query, a derived table is a temporary result set that exists only within the execution scope of a single SQL statement and cannot be referenced again.

Example:

select city.ID,city.Name,city.Population from city
where city.Population between '10000' and '500000'
group by city.ID
UNION
select country.Code,country.Name,country.Population from country
join city on city.ID=country.Code
group by country.Capital
order by Population;

Graphical View: Visual Explain using MySQL Workbench

When Using CTE:
  • The CTE can be self-referencing or it can be referenced multiple times in the same query.
  • Compared with the equivalent normal query, it can give better performance.
Example:

with cte_ex (ID,Name,Population) as (
  select ID,Name,Population from city
  where Population between '10000' and '500000'
  group by ID
  UNION
  select country.Code,country.Name,country.Population from country
  join city on city.ID=country.Code
)
select * from cte_ex group by ID order by Population;

Graphical View:


Just like database views and derived tables, CTE enables users to easily write and maintain complex queries with better readability and simplification. This reduction in complexity is achieved by deconstructing ordinarily complex queries into simple blocks to be used and reused if necessary.

1.2 CTE Syntax:

WITH cte_name (column_list) AS (
  -- Sub Query --
)
SELECT * FROM cte_name;
  • Initiate a CTE using “WITH”.
  • Provide a name for the soon-to-be-defined query’s result set.
  • After assigning a name, follow with “AS”.
  • Specify column names (optional step).
  • Define the query to produce the desired result set.
  • If multiple CTEs are required, initiate each subsequent expression with a comma.
  • Reference the above-defined CTE(s) in a subsequent query.
Thumb Rules :
  • The number of columns in the query must be the same as the number of columns in the column_list.
  • If you omit the column_list, the CTE will use the column list of the query that defines the CTE.
1.3 Use Cases:

  • Needing to reference a derived table multiple times in a single query.
  • An alternative to creating a view in the database.
  • Performing the same calculation multiple times across multiple query components.
Example:

WITH country_surface AS (
  select Region, HeadOfState, SurfaceArea from country
  WHERE Region like 'A%'
)
SELECT Region,SurfaceArea FROM country_surface order by SurfaceArea;

CTE Structure:

Output:

+---------------------------+-------------+
| Region                    | SurfaceArea |
+---------------------------+-------------+
| Australia and New Zealand |       14.00 |
| Australia and New Zealand |       36.00 |
| Antarctica                |       59.00 |
| Australia and New Zealand |      135.00 |
| Antarctica                |      359.00 |
| Antarctica                |     3903.00 |
| Antarctica                |     7780.00 |
| Australia and New Zealand |   270534.00 |
| Australia and New Zealand |  7741220.00 |
| Antarctica                | 13120000.00 |
+---------------------------+-------------+
10 rows in set (0.00 sec)

In this example the name of the CTE is country_surface, and the query that defines it returns two columns, Region and SurfaceArea.

1.4 WITH clause usages :

There are three ways to use the WITH clause to make common table expressions.

1) A WITH clause can be used at the beginning of SELECT, UPDATE, and DELETE statements.

WITH ... select ...
WITH ... update ...
WITH ... delete ...

Source to update other tables:

The CTE provides a simple solution to update the other tables.

Example:

with cte as (
  select store_id from inventory as a
)
update store as b, cte
set b.last_update='2018-10-15 00:00:00'
where b.store_id = cte.store_id;

Query OK, 2 rows affected (0.05 sec)
Rows matched: 2  Changed: 2  Warnings: 0

2) A WITH clause can be used at the beginning of a subquery or a derived table subquery.

select ... where id IN (WITH ... select ...);
select * from (WITH ... select ...) AS derived_table;
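As a concrete sketch of the derived-table form (the CTE name and the population filter are made up for illustration, reusing the `city` table from the examples above):

```sql
-- A WITH clause at the beginning of a derived table subquery
SELECT *
FROM (
  WITH big_cities AS (
    SELECT Name, Population FROM city WHERE Population > 1000000
  )
  SELECT * FROM big_cities
) AS derived_table;
```

The CTE is visible only inside the parenthesized subquery, not to the outer SELECT.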

3) A WITH clause can be used immediately preceding SELECT of the statements that include a SELECT clause.

create table ... WITH ... select ...
create view ... WITH ... select ...
insert ... WITH ... select ...
replace ... WITH ... select ...
declare cursor ... WITH ... select ...
explain ... WITH ... select ...

Example:

mysql> create table mydbops_test (id int(11) DEFAULT NULL,name varchar(30),address varchar(30),city varchar(15),country varchar(25));
Query OK, 0 rows affected (0.14 sec)

mysql> insert mydbops_test (id,name,address,city,country) with RECURSIVE staff as ( select id,name,address,city,country from staff_list ) select * from staff;
Query OK, 2 rows affected (0.06 sec)
Records: 2 Duplicates: 0 Warnings: 0

1.5 Limiting CTE:
  • It is important for recursive CTEs that the recursive SELECT part include a condition to terminate the recursion.
  • You can force termination by placing a limit on CTE execution time.
  • max_execution_time enforces an execution timeout for SELECT statements executed within the current session.
  • The default max_execution_time is 0. The variable only applies to read-only SELECT statements.
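To illustrate the first point, here is a minimal self-contained recursive CTE whose recursive SELECT carries its own termination condition (the `counter` name is made up for illustration):

```sql
-- The WHERE n < 5 condition in the recursive SELECT terminates the recursion;
-- without it, only max_execution_time (or cte_max_recursion_depth) would stop it.
WITH RECURSIVE counter (n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM counter WHERE n < 5
)
SELECT * FROM counter;  -- returns the numbers 1 through 5
```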
1.6 CTE Optimizer Hints :

We can include max_execution_time as an optimizer hint within the CTE statement.

with RECURSIVE full_table_scan as (
  select actors,title from film_list
  UNION ALL
  select actors,title from film_list where title='ACADEMY DINOSAUR'
)
SELECT /*+ MAX_EXECUTION_TIME(10) */ * FROM full_table_scan;
ERROR 3024 (HY000): Query execution was interrupted, maximum statement execution time exceeded

1.7 Conclusion:
  • Common Table Expressions are one of the long-awaited features in MySQL 8.
  • They reduce complexity: the purpose of CTEs is to simplify the writing of complex SQL queries.
  • You can always recognize them by the “WITH” keyword at the start of the SQL statement.

Featured Image Courtesy : Photo by Jeremy Thomas on Unsplash

MySQL 8.0 and keywords

As you know, MySQL uses some keywords and some of them are also reserved.

Let’s have a look how to deal with that:

mysql> create table WRITE (id int auto_increment primary key, varying varchar(10), than int);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near 'WRITE (id int auto_increment primary key,
varying varchar(10), than int)' at line 1

OK, it seems WRITE is a keyword I cannot use as a table name. I then have two choices:

  • rename the table to something else like WRITE_TBL
  • use back-ticks (`) around the table name, like `WRITE`

Let’s use the first option:

mysql> create table WRITE_TBL (id int auto_increment primary key, varying varchar(10), than int);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near 'varying varchar(10), than int)' at line 1

We have a second error, this time MySQL is not happy with “varying“.

Let’s modify it, using the second option:

mysql> create table WRITE_TBL (id int auto_increment primary key, `varying` varchar(10), than int);
Query OK, 0 rows affected (2.34 sec)

It worked. However, I am sure that “than” is also a keyword, but it’s not reserved!
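For completeness, the second option (back-ticks) also works for the table name itself; a sketch, though avoiding keywords entirely remains the better practice:

```sql
CREATE TABLE `WRITE` (
  id int auto_increment primary key,
  `varying` varchar(10),
  than int   -- THAN is a keyword but not reserved, so no back-ticks needed
);
```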

Of course, it’s not convenient to check the manual each time you want to check for keywords. Additionally, more keywords appear with new releases. This was the case with MySQL 8.0, where 70 new keywords were added!

That’s why MySQL also provides an Information_Schema table listing all the keywords.

mysql> select count(*) from information_Schema.keywords;
+----------+
| count(*) |
+----------+
|      679 |
+----------+
1 row in set (0.10 sec)

And we can check the amount of reserved keywords:

mysql> select count(*) from information_schema.keywords where reserved;
+----------+
| count(*) |
+----------+
|      262 |
+----------+
1 row in set (0.01 sec)

And of course, we can verify for “than“:

mysql> select * from information_schema.keywords where word like 'than';
+------+----------+
| WORD | RESERVED |
+------+----------+
| THAN |        0 |
+------+----------+
1 row in set (0.03 sec)

Indeed, it’s a keyword but not reserved.

In summary, yes, there are many keywords in MySQL and almost 40% of them are reserved. But it’s very easy to verify them using Information_Schema, or to work around them using back-ticks (though I don’t recommend that; I encourage you to avoid keywords in your schemas).

MySQL Performance Cheat Sheet

MySQL is extensive and has lots of areas to optimize and tweak for the desired performance. Some changes can be performed dynamically, others require a server restart. It is pretty common to find a MySQL installation with a default configuration, although the defaults may not be appropriate for your workload and setup.

Here are the key areas in MySQL which I have taken from different expert sources in the MySQL world, as well as our own experiences here at Severalnines. This blog will serve as your cheat sheet to tune performance and make your MySQL great again :-)

Let’s take a look at these by outlining the key areas in MySQL.

System Variables

MySQL has lots of variables that you can consider changing. Some variables are dynamic, which means they can be set using the SET statement. Others require a server restart after they are set in the configuration file (e.g. /etc/my.cnf, /etc/mysql/my.cnf). Here, I’ll go over the ones that are most commonly tuned to optimize the server.


sort_buffer_size

This variable controls how large your filesort buffer is: whenever a query needs to sort rows, the value of this variable limits the size of the buffer that is allocated. Note that this buffer is allocated on a per-query (or per-connection) basis, which means it can become memory hungry if you set it high and have multiple connections that require sorting. You can monitor your needs by checking the global status variable Sort_merge_passes; if this value is large, you should consider increasing the value of the sort_buffer_size system variable. Otherwise, keep it at the moderate limit that you need. If you set it too low, or if you have large queries to process, sorting can be slower than expected because data is retrieved randomly, doing disk dives, which can cause performance degradation. It is best, however, to fix your queries. If your application is designed to pull large result sets that require sorting, it is more efficient to use a caching tool such as Redis. By default, in MySQL 8.0, the value is 256 KiB. Increase it only when you have queries that heavily use sorts.
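The check and a session-level adjustment might look like this (the 4 MiB value is purely illustrative):

```sql
-- Inspect filesort pressure first
SHOW GLOBAL STATUS LIKE 'Sort_merge_passes';

-- If the counter keeps growing, raise the buffer only in the session that needs it
SET SESSION sort_buffer_size = 4 * 1024 * 1024;  -- 4 MiB, illustrative value
```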


read_buffer_size

The MySQL documentation mentions that for each request that performs a sequential scan of a table, the server allocates a read buffer; the read_buffer_size system variable determines its size. It is particularly relevant for MyISAM, but the variable affects all storage engines. For MEMORY tables, it is used to determine the memory block size.

Basically, each thread that does a sequential scan of a MyISAM table allocates a buffer of this size (in bytes) for each table it scans. It applies to all storage engines (including InnoDB) as well, so it’s helpful for queries that sort rows using ORDER BY and cache their indexes in a temporary file. If you do many sequential scans, bulk inserts into partitioned tables, or caching of results of nested queries, then consider increasing its value. The value of this variable should be a multiple of 4KB; if it is set to a value that is not, it will be rounded down to the nearest multiple of 4KB. Take into account that setting this to a higher value will consume a large chunk of your server’s memory per thread. I suggest not changing it without proper benchmarking and monitoring of your environment.


read_rnd_buffer_size

This variable deals with reading rows in sorted order following a key-sorting operation. The documentation says that when reading rows in an arbitrary sequence, or from a MyISAM table in sorted order following a key-sorting operation, the rows are read through this buffer (whose size this variable determines) to avoid disk seeks. Setting the variable to a large value can improve ORDER BY performance by quite a lot. However, this buffer is allocated for each client, so you should not set the global variable to a large value; instead, change the session variable only from within those clients that need to run large queries. Take into account that this does not apply to MariaDB, especially when taking advantage of MRR: MariaDB uses mrr_buffer_size while MySQL uses read_rnd_buffer_size.


join_buffer_size

The default value is 256KB. This is the minimum size of the buffer used for plain index scans, range index scans, and joins that do not use indexes and thus perform full table scans. It is also used by the BKA optimization (which is disabled by default). Increase its value to get faster full joins when adding indexes is not possible. The caveat is memory pressure if you set it too high: remember that one join buffer is allocated for each full join between two tables, so a complex join between several tables for which indexes are not used may require multiple join buffers. It is best left low globally and set high in the sessions (using SET SESSION syntax) that require large full joins. On 64-bit platforms, Windows truncates values above 4GB to 4GB-1 with a warning.
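Following the advice above — low globally, high only where needed — a session-scoped sketch (the 16MB value is illustrative):

```sql
-- Raise the buffer only in the session about to run an index-less join
SET SESSION join_buffer_size = 16 * 1024 * 1024;

-- ... run the large full join here ...

-- Return to the (small) global default afterwards
SET SESSION join_buffer_size = DEFAULT;
```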


max_heap_table_size

This is the maximum size, in bytes, to which user-created MEMORY tables are permitted to grow, which matters when your application uses MEMORY storage engine tables. Setting the variable while the server is active has no effect on existing tables unless they are recreated or altered. This variable also works in conjunction with tmp_table_size to limit the size of internal in-memory tables (as opposed to tables created explicitly with ENGINE=MEMORY, to which only max_heap_table_size applies): the smaller of the two values is applied.


tmp_table_size

This is the maximum size of internal in-memory temporary tables (not MEMORY tables), although if max_heap_table_size is smaller, the lower limit applies. If an in-memory temporary table exceeds the limit, MySQL automatically converts it to an on-disk temporary table. Increase the value of tmp_table_size (and max_heap_table_size if necessary) if you do many advanced GROUP BY queries and have large available memory. You can compare the number of internal on-disk temporary tables to the total number of internal temporary tables created by comparing the values of the Created_tmp_disk_tables and Created_tmp_tables status variables. In ClusterControl, you can monitor this via Dashboard -> Temporary Objects graph.
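The on-disk conversion ratio described above can be checked directly from the status counters; a sketch with illustrative values:

```sql
-- A high Created_tmp_disk_tables / Created_tmp_tables ratio suggests
-- raising both limits together (the smaller of the two is applied)
SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';

SET GLOBAL tmp_table_size      = 64 * 1024 * 1024;  -- example value
SET GLOBAL max_heap_table_size = 64 * 1024 * 1024;  -- keep in sync
```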


table_open_cache

You can increase the value of this variable if you have a large number of frequently accessed tables in your data set. It applies to all threads, i.e., on a per-connection basis. The value indicates the maximum number of tables the server can keep open in any one table cache instance. Increasing this value increases the number of file descriptors that mysqld requires, so you should also check your open_files_limit value and the SOFT and HARD limits set in your *nix operating system. You can tell whether you need to increase the table cache by checking the Opened_tables status variable. If the value of Opened_tables is large and you do not use FLUSH TABLES often (which just forces all tables to be closed and reopened), then you should increase the value of the table_open_cache variable. If you have a small value for table_open_cache and a high number of frequently accessed tables, this can affect the performance of your server. If you notice many entries in the MySQL processlist with status "Opening tables" or "Closing tables", it is time to adjust the value of this variable, keeping in mind the caveat above. In ClusterControl, you can check this under Dashboards -> Table Open Cache Status or Dashboards -> Open Tables.
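A quick way to run the check described above (the 4000 figure is an arbitrary example, not a recommendation):

```sql
-- If Opened_tables keeps climbing while FLUSH TABLES is not used,
-- the cache is too small
SHOW GLOBAL STATUS LIKE 'Opened_tables';
SHOW VARIABLES LIKE 'table_open_cache';
SHOW VARIABLES LIKE 'open_files_limit';  -- must accommodate the increase

SET GLOBAL table_open_cache = 4000;  -- example value, benchmark first
```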


table_open_cache_instances

Setting this variable helps improve scalability and performance by reducing contention among sessions. The value you set limits the number of open tables cache instances: the open tables cache can be partitioned into several smaller cache instances of size table_open_cache / table_open_cache_instances. A session needs to lock only one instance to access it for DML statements. This segments cache access among instances, permitting higher performance for operations that use the cache when many sessions are accessing tables. (DDL statements still require a lock on the entire cache, but such statements are much less frequent than DML statements.) A value of 8 or 16 is recommended on systems that routinely use 16 or more cores.


table_definition_cache

This caches table definitions, i.e., the results of CREATE TABLE, to speed up the opening of tables, with one entry per table. It is reasonable to increase the value if you have a large number of tables. The table definition cache takes less space and does not use file descriptors, unlike the normal table cache. Peter Zaitsev of Percona suggests trying the formula below:

The number of user-defined tables + 10% unless 50K+ tables

But take note that the default value is based on the following formula capped to a limit of 2000.

MIN(400 + table_open_cache / 2, 2000)

So if you have a larger number of tables than the default accommodates, it is reasonable to increase the value. Take into account that with InnoDB, this variable is used as a soft limit on the number of open table instances in the data dictionary cache; the LRU mechanism kicks in once the number exceeds the current value of this variable. The limit helps address situations in which significant amounts of memory would otherwise be used to cache rarely used table instances until the next server restart. Note that parent and child table instances with foreign key relationships are not placed on the LRU list, can push the total above the limit defined by table_definition_cache, and are not subject to LRU eviction. Additionally, table_definition_cache defines a soft limit on the number of InnoDB file-per-table tablespaces that can be open at one time, which is also controlled by innodb_open_files; if both are set, the higher of the two is used, and if neither is set, table_definition_cache, which has the higher default value, is used. If the number of open tablespace file handles exceeds this limit, the LRU mechanism searches the tablespace file LRU list for files that are fully flushed and not currently being extended. This process is performed each time a new tablespace is opened; if there are no "inactive" tablespaces, no tablespace files are closed. So keep this in mind.
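To make the default formula concrete: with table_open_cache = 2000 it evaluates to MIN(400 + 1000, 2000) = 1400. A hedged sketch, where 11000 is an illustrative value following the rule of thumb above for roughly 10,000 user tables:

```sql
-- Current (possibly formula-derived) value
SHOW VARIABLES LIKE 'table_definition_cache';

-- Number of user-defined tables + 10%, per the suggestion above
SET GLOBAL table_definition_cache = 11000;
```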


max_allowed_packet

This is the per-connection maximum size of an SQL statement or row returned. In MySQL 8.0 (at least since 8.0.3), the default value is 64MiB; in older versions the default is 4MiB. You might consider adjusting this if you have large BLOB rows that need to be pulled out (or read); otherwise you can leave the 8.0 default, but in older versions take care in case you encounter the ER_NET_PACKET_TOO_LARGE error. The largest possible packet that can be transmitted to or from a MySQL 8.0 server or client is 1GB.

skip_name_resolve

The MySQL server handles incoming connections via hostname resolution. By default, MySQL does not disable hostname resolution, which means it performs DNS lookups, and if DNS is slow, it can be the cause of awful database performance. Consider turning this option on if you do not need DNS resolution, to avoid that lookup cost. Take into account that this variable is not dynamic, so a server restart is required if you set it in your MySQL config file. Alternatively, you can start the mysqld daemon with the --skip-name-resolve option.


max_connections

This is the number of permitted connections to your MySQL server. If you run into the MySQL error 'Too many connections', consider setting this higher. The default value of 151 often isn't enough, especially on a production database, and considering that you have greater server resources (do not waste your server resources, especially if it's a dedicated MySQL server). However, you must have enough file descriptors, otherwise you will run out of them; in that case, consider adjusting the SOFT and HARD limits of your *nix operating system and setting a higher value for open_files_limit in MySQL (5000 is the default limit). Take into account that it is very common for applications not to close connections to the database correctly, and setting a high max_connections can result in unresponsive servers or high server load. Using a connection pool at the application level can help resolve this.
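A sketch of the checks described above (500 is an example value; size it to your workload and file descriptor limits):

```sql
SHOW VARIABLES LIKE 'max_connections';
SHOW GLOBAL STATUS LIKE 'Max_used_connections';  -- observed peak

-- Verify open_files_limit and the OS limits can accommodate this first
SET GLOBAL max_connections = 500;
```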


thread_cache_size

This is the cache that prevents excessive thread creation. When a client disconnects, the client's thread is put in the cache if there are fewer than thread_cache_size threads there. Requests for threads are satisfied by reusing threads taken from the cache if possible, and a new thread is created only when the cache is empty. This variable can be increased to improve performance if you have a lot of new connections. Normally this does not provide a notable performance improvement if you have a good thread implementation; however, if your server sees hundreds of connections per second, you should set thread_cache_size high enough so that most new connections use cached threads. By examining the difference between the Connections and Threads_created status variables, you can see how efficient the thread cache is. Using the formula stated in the documentation, 8 + (max_connections / 100) is good enough.
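Applying the documentation formula above with max_connections = 500 gives 8 + (500 / 100) = 13; a sketch:

```sql
-- Thread cache efficiency: Threads_created should stay small
-- relative to Connections
SHOW GLOBAL STATUS WHERE Variable_name IN ('Connections', 'Threads_created');

SET GLOBAL thread_cache_size = 13;  -- 8 + (max_connections / 100)
```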


query_cache_size

For some setups, this variable is their worst enemy. On systems experiencing high load and busy with heavy reads, this variable will bog you down. There have been well-documented benchmarks, e.g., by Percona. To turn the query cache off, this variable must be set to 0 along with query_cache_type = 0. The good news in MySQL 8.0 is that the MySQL team has stopped supporting it, as this variable can cause real performance issues. I have to agree with their blog that it is unlikely to improve predictability of performance. If you need query caching, I suggest using Redis or ProxySQL.

Storage Engine - InnoDB

InnoDB is an ACID-compliant storage engine with many features to offer, along with foreign key support (declarative referential integrity). There is a lot to say here, but these are the key variables to consider for tuning:


innodb_buffer_pool_size

This variable acts like MyISAM's key buffer, but it has much more to offer. Since InnoDB relies heavily on the buffer pool, consider setting this value to typically 70%-80% of your server's memory. It is also favorable to have more memory than the size of your data set, so you can set a larger buffer pool, but do not overdo it. In ClusterControl, this can be monitored using our Dashboards -> InnoDB Metrics -> InnoDB Buffer Pool Pages graph. You may also monitor this with SHOW GLOBAL STATUS using the Innodb_buffer_pool_pages* variables.
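Since MySQL 5.7.5 the buffer pool can be resized online; a hedged sketch assuming a dedicated 32GB server, where ~75% of RAM is 24GB (purely illustrative):

```sql
SET GLOBAL innodb_buffer_pool_size = 24 * 1024 * 1024 * 1024;

-- The resize happens in chunks; watch its progress and the pool counters
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_resize_status';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';
```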


innodb_buffer_pool_instances

For concurrent workloads, setting this variable can improve concurrency and reduce contention, as different threads read from and write to cached pages. The value of innodb_buffer_pool_instances must lie between 1 (minimum) and 64 (maximum). Each page that is stored in or read from the buffer pool is assigned to one of the buffer pool instances randomly, using a hashing function. Each buffer pool manages its own free lists, flush lists, LRUs, and all other data structures connected to a buffer pool, and is protected by its own buffer pool mutex. Take note that this option takes effect only when innodb_buffer_pool_size >= 1GiB, and the total size is divided among the buffer pool instances.


innodb_log_file_size

This variable sets the size of each log file in a log group. The combined size of the log files (innodb_log_file_size * innodb_log_files_in_group) cannot exceed a maximum value that is slightly less than 512GB. According to Vadim, a bigger log file size is better for performance, but it has a significant drawback that you need to worry about: the recovery time after a crash. You need to balance recovery time in the rare event of crash recovery against maximizing throughput during peak operations; a larger log file can translate into a 20x longer crash recovery process.

To elaborate, a larger value is good for the InnoDB transaction logs and is crucial for good and stable write performance. The larger the value, the less checkpoint flush activity is required in the buffer pool, saving disk I/O. However, the recovery process is pretty slow once your database was shut down abnormally (crashed or killed, whether by OOM or accident). Ideally, 1-2GiB works in production, but of course you can adjust this. Benchmarking this change can be a great advantage to see how it performs, especially after a crash.


innodb_log_buffer_size

To save disk I/O, InnoDB writes change data into its log buffer, whose size is set by innodb_log_buffer_size (8MiB by default). This is beneficial especially for large transactions, as InnoDB does not need to write the log of changes to disk before transaction commit. If your write traffic is very high (inserts, deletes, updates), making the buffer larger saves disk I/O.


innodb_flush_log_at_trx_commit

When innodb_flush_log_at_trx_commit is set to 1, the log buffer is flushed to the log file on disk at every transaction commit; this provides maximum data integrity but also has a performance impact. Setting it to 2 means the log buffer is written to the OS file cache on every transaction commit. A value of 2 improves performance if you can relax your ACID requirements and can afford to lose the transactions of the last second or two in case of an OS crash.
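A sketch of relaxing durability as described above, e.g. on a replica that can be rebuilt (the variable is dynamic):

```sql
-- Flush to the OS cache at commit; up to ~1s of transactions can be
-- lost on an OS crash
SET GLOBAL innodb_flush_log_at_trx_commit = 2;

-- Restore full per-commit durability
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
```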


innodb_thread_concurrency

With improvements to the InnoDB engine, it is recommended to let the engine control the concurrency by keeping this at its default value of zero. If you see concurrency issues, you can tune this variable; a recommended value is 2 times the number of CPUs plus the number of disks. It is a dynamic variable, which means it can be set without restarting the MySQL server.


innodb_flush_method

This variable must be tried and tested to find what fits your hardware best. If you are using a RAID controller with a battery-backed cache, O_DIRECT helps relieve I/O pressure: direct I/O is not cached, so it avoids double buffering between the buffer pool and the filesystem cache. If your disks are on a SAN, O_DSYNC might be faster for a read-heavy workload with mostly SELECT statements.


innodb_file_per_table

innodb_file_per_table is ON by default since MySQL 5.6. This is usually recommended, as it avoids a huge shared tablespace and allows you to reclaim space when you drop or truncate a table. Separate tablespaces also benefit Xtrabackup's partial backup scheme.


innodb_max_dirty_pages_pct

This attempts to keep the percentage of dirty pages under control; before the InnoDB plugin, this was really the only way to tune dirty buffer flushing. However, I have seen servers with 3% dirty buffers that were still hitting their max checkpoint age. The way this increases dirty buffer flushing also doesn't scale well on high-I/O subsystems: it effectively just doubles the dirty buffer flushing per second when the percentage of dirty pages exceeds this amount.


innodb_io_capacity

This setting, in spite of all our grand hopes that it would allow InnoDB to make better use of our I/O in all operations, simply controls the amount of dirty page flushing per second (and other background tasks like read-ahead). Make this bigger, and you flush more per second. It does not adapt; it simply performs that many IOPS every second if there are dirty buffers to flush. It will effectively eliminate any optimization from I/O consolidation if your write workload is low enough (that is, dirty pages get flushed almost immediately; we might be better off without a transaction log in this case). It can also quickly starve data reads and writes to the transaction log if you set it too high.


innodb_write_io_threads

This controls how many threads will have writes in progress to the disk. I'm not sure why it is still useful if you can use Linux native AIO. It can also be rendered useless by filesystems that don't allow parallel writing to the same file by more than one thread (particularly if you have relatively few tables and/or use the global tablespaces).


innodb_adaptive_flushing

This specifies whether to dynamically adjust the rate of flushing dirty pages in the InnoDB buffer pool based on the workload. Adjusting the flush rate dynamically is intended to avoid bursts of I/O activity. It is enabled by default. When enabled, InnoDB tries to be smarter about flushing more aggressively based on the number of dirty pages and the rate of transaction log growth.


innodb_dedicated_server

This variable is new in MySQL 8.0; it is applied globally and requires a MySQL restart, since it is not dynamic. As the documentation states, it should be enabled only if your MySQL runs on a dedicated server; do not enable it on a shared host or a system that shares resources with other applications. When enabled, InnoDB automatically configures innodb_buffer_pool_size, innodb_log_file_size, and innodb_flush_method based on the amount of memory detected. The downside is that you cannot apply your own desired values for those variables.

MyISAM

key_buffer_size

Since InnoDB is now the default storage engine of MySQL, the value of key_buffer_size can probably be decreased unless you actively use MyISAM as part of your application (but who uses MyISAM in production now?). I would suggest setting it to perhaps 1% of RAM, or 256MiB to start if you have larger memory, and dedicating the remaining memory to your OS cache and the InnoDB buffer pool.

Other Provisions For Performance

slow_query_log

Of course, this variable does not help boost your MySQL server, but it can help you analyze slow-performing queries. Set it to 0 or OFF to disable logging, or to 1 or ON to enable it. The default value depends on whether the --slow_query_log option is given. The destination for log output is controlled by the log_output system variable; if that value is NONE, no log entries are written even if the log is enabled. You can set the filename or destination of the slow query log file with the slow_query_log_file variable.


long_query_time

If a query takes longer than this many seconds, the server increments the Slow_queries status variable, and if the slow query log is enabled, the query is logged to the slow query log file. This value is measured in real time, not CPU time, so a query that is under the threshold on a lightly loaded system might be above it on a heavily loaded one. The minimum and default values of long_query_time are 0 and 10, respectively. Take note also that if min_examined_row_limit is set > 0, queries that examine fewer rows than its value are not logged even if they take too long.
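These settings are dynamic, so slow query logging can be enabled at runtime; a sketch (the file path and 2-second threshold are illustrative):

```sql
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2;  -- affects connections opened after this
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- example path
SET GLOBAL min_examined_row_limit = 0;  -- do not suppress by rows examined
```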

For more info on tuning your slow query logging, check the documentation here.


sync_binlog

This variable controls how often MySQL syncs the binlog to disk. By default (>= 5.7.7), it is set to 1, which means the binlog is synced to disk before transactions are committed. This imposes a negative impact on performance due to the increased number of writes, but it is the safest setting if you want strict ACID compliance along with your slaves. Alternatively, you can set it to 0 to disable disk synchronization and rely on the OS to flush the binary log to disk from time to time. Setting it to N > 1 means the binlog is synced to disk after N binary log commit groups have been collected.
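A sketch of the three modes described above (the value 100 is illustrative):

```sql
SET GLOBAL sync_binlog = 1;    -- fsync at every commit group (safest, default >= 5.7.7)
SET GLOBAL sync_binlog = 0;    -- leave flushing to the OS (fastest, least safe)
SET GLOBAL sync_binlog = 100;  -- fsync after every 100 commit groups
```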

Dump/Restore Buffer Pool

It is a pretty common thing that your production database needs to warm up after a cold start/restart. By dumping the contents of the buffer pool before a restart and loading them back once the server is up, you avoid the need to warm your database caches from scratch. Take note that this feature was introduced in MySQL 5.6, but Percona Server 5.5 already had it available, just in case you wonder. To enable it, set both innodb_buffer_pool_dump_at_shutdown = ON and innodb_buffer_pool_load_at_startup = ON.
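Of the two variables, innodb_buffer_pool_dump_at_shutdown is dynamic, while innodb_buffer_pool_load_at_startup is read-only and must be set in the configuration file before startup. You can also trigger a dump on demand; a sketch:

```sql
SET GLOBAL innodb_buffer_pool_dump_at_shutdown = ON;

-- On-demand dump, useful before a planned restart
SET GLOBAL innodb_buffer_pool_dump_now = ON;
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_dump_status';
```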


Hardware

We're now in 2019, and there have been a lot of new hardware improvements. Typically, there's no hard requirement that MySQL needs specific hardware; it depends on what you need the database to do. I would expect that you are not reading this blog to test whether it runs on an Intel Pentium 200 MHz.

For the CPU, faster processors with multiple cores are optimal for MySQL, at least in recent versions since 5.6. Intel's Xeon/Itanium processors can be expensive, but are proven, scalable, and reliable computing platforms. Amazon has been shipping EC2 instances running on the ARM architecture. Though I personally haven't tried running MySQL on ARM, there are benchmarks that were made years ago. Modern CPUs can scale their frequencies up and down based on temperature, load, and OS power saving policies; however, there's a chance that the CPU frequency settings in your Linux OS are set to a different governor. You can check, or set the "performance" governor, by doing the following:

echo performance | sudo tee /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor

For memory, it is very important that it is large enough to hold your working data set. Ensure that you have swappiness = 1. You can check it via sysctl or by reading the corresponding file in procfs:

$ sysctl -e vm.swappiness
vm.swappiness = 1

Or set it to a value of 1 as follows:

$ sudo sysctl vm.swappiness=1
vm.swappiness = 1

Another great thing to consider for your memory management is turning off THP (Transparent Huge Pages). In the past, I recall we encountered some weird issues with CPU utilization and thought they were due to disk I/O. It turned out the problem was with the kernel khugepaged thread, which allocates memory dynamically during runtime. Not only that, but while the kernel performs defragmentation, your memory is quickly allocated as it is passed to THP. Standard HugePages memory is pre-allocated at startup and does not change during runtime. You can verify and disable THP by doing the following:

$ cat /sys/kernel/mm/transparent_hugepage/enabled
$ echo "never" > /sys/kernel/mm/transparent_hugepage/enabled

For disk, it is important that you have good throughput. Using RAID10 with a battery backup unit is the best setup for a database. With the advent of flash drives that offer high throughput and high read/write I/O, it is important that your setup can handle high disk utilization and disk I/O.

Operating System

Most production systems running MySQL run on Linux. That is because MySQL has been tested and benchmarked on Linux, and it is the de facto standard platform for a MySQL installation. However, of course, there's nothing stopping you from using it on Unix or Windows platforms. It is easier if your platform has been well tested and has a wide community to help in case you experience trouble. Most setups run on RHEL/CentOS/Fedora and Debian/Ubuntu systems. In AWS, Amazon has Amazon Linux, which I also see being used in production by some.

The most important thing to consider with your setup is that your file system is either XFS or Ext4. There are certainly pros and cons between these two file systems, but I won't go into the details here. Some say XFS outperforms Ext4, but there are also reports that Ext4 outperforms XFS. ZFS is also coming into the picture as a good candidate for an alternative file system. Jervin Real (from Percona) has a great resource on this one; you can check this presentation from the ZFS conference.


Amazon RDS Aurora MySQL – Differences Among Editions

Amazon Aurora with MySQL compatibility comes in three editions which, at the time of writing, have quite a few differences in the features they support. Make sure you don't assume the newer Aurora 2.x supports everything in Aurora 1.x. On the contrary, right now Aurora 1.x (MySQL 5.6 based) supports most Aurora features. The serverless option was launched for this version, and it's not based on the later MySQL 5.7. However, the serverless option, too, has its own set of limitations.

I found a concise comparison of what is available in which Amazon Aurora edition hard to come by so I’ve created one.  The table was compiled based mostly on documentation research, so if you spot some mistakes please let me know and I’ll make a correction.

Please keep in mind that this is expected to change over time. For example, Amazon Aurora 2.x was initially released without Performance Schema support, which was enabled in later versions.

There seems to be a lag in porting Aurora features from the MySQL 5.6 compatible edition to the MySQL 5.7 compatible one: the current 2.x release does not include features introduced in Aurora 1.16 or later, as per this document.

A comparison table

Feature                            | MySQL 5.6 Based | MySQL 5.7 Based | Serverless (MySQL 5.6 Based)
Compatible to MySQL                | MySQL 5.6.10a   | MySQL 5.7.12    | MySQL 5.6.10a
Aurora Engine Version              | 1.18.0          | 2.03.01         | 1.18.0
Parallel Query                     | Yes             | No              | No
Backtrack                          | Yes             | No              | No
Aurora Global Database             | Yes             | No              | No
Performance Insights               | Yes             | No              | No
SELECT INTO OUTFILE S3             | Yes             | Yes             | Yes
Amazon Lambda – Native Function    | Yes             | No              | No
Amazon Lambda – Stored Procedure   | Yes             | Yes             | Yes
Hash Joins                         | Yes             | No              | Yes
Fast DDL                           | Yes             | Yes             | Yes
LOAD DATA FROM S3                  | Yes             | Yes             | No
Spatial Indexing                   | Yes             | Yes             | Yes
Asynchronous Key Prefetch (AKP)    | Yes             | No              | Yes
Scan Batching                      | Yes             | No              | Yes
S3 Backed Based Migration          | Yes             | No              | No
Advanced Auditing                  | Yes             | Yes             | No
Aurora Replicas                    | Yes             | Yes             | No
Database Cloning                   | Yes             | Yes             | No
IAM Database Authentication        | Yes             | Yes             | No
Cross-Region Read Replicas         | Yes             | Yes             | No
Restoring Snapshot from MySQL DB   | Yes             | Yes             | No
Enhanced Monitoring                | Yes             | Yes             | No
Log Export to CloudWatch           | Yes             | Yes             | No
Minor Version Upgrade Control      | Yes             | Yes             | Always On
Data Encryption Configuration      | Yes             | Yes             | Always On
Maintenance Window Configuration   | Yes             | Yes             | No

Hope this helps with selecting which Amazon Aurora edition is right for you when it comes to supported features.


Percona XtraDB Cluster 5.6.42-28.30 Is Now Available

Percona announces the release of Percona XtraDB Cluster 5.6.42-28.30 (PXC) on January 4, 2019. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.42-28.30 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs
  • PXC-2281: Debug symbols were missing in Debian dbg packages.
  • PXC-2220: Starting two instances of Percona XtraDB Cluster on the same node could cause writing transactions to a page store instead of a galera.cache ring buffer, resulting in huge memory consumption because of retaining already applied write-sets.
  • PXC-2230: gcs.fc_limit=0, which is not allowed as a dynamic setting (to avoid generating flow control on every message), could still be set in my.cnf due to an inconsistent check.
  • PXC-2238: setting read_only=1 caused race condition.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Percona XtraDB Cluster 5.7.24-31.33 Is Now Available

Percona is glad to announce the release of Percona XtraDB Cluster 5.7.24-31.33 (PXC) on January 4, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.7.24-31.33 is now the current release, based on the following:


The following variables are deprecated starting from this release:

  • wsrep_preordered was used to turn on transparent handling of preordered replication events, applied locally first before being replicated to other nodes in the cluster. It is no longer needed due to a performance fix that eliminated the lag between the asynchronous replication channel and cluster replication.
  • innodb_disallow_writes, used to make InnoDB avoid writes during SST, was deprecated in favor of the innodb_read_only variable.
  • wsrep_drupal_282555_workaround avoided the duplicate value creation caused by buggy auto-increment logic, but the corresponding bug has since been fixed.
  • The session-level variable binlog_format=STATEMENT was enabled only for pt-table-checksum; this will be addressed in upcoming releases of Percona Toolkit.
Fixed Bugs
  • PXC-2220: Starting two instances of Percona XtraDB Cluster on the same node could cause writing transactions to a page store instead of a galera.cache ring buffer, resulting in huge memory consumption because of retaining already applied write-sets.
  • PXC-2230: gcs.fc_limit=0, which is not allowed as a dynamic setting (to avoid generating flow control on every message), could still be set in my.cnf due to an inconsistent check.
  • PXC-2238: setting read_only=1 caused race condition.
  • PXC-1131: mysqld-systemd threw an error at MySQL restart in case of non-existing error-log in Centos/RHEL7.
  • PXC-2269: being not dynamic, the pxc_encrypt_cluster_traffic variable was erroneously allowed to be changed by a SET GLOBAL statement.
  • PXC-2275: checking wsrep_node_address value in the wsrep_sst_common command line parser caused parsing the wrong variable.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!


Managing GitHub with Terraform

If a service can be managed with an API, you will most probably find it in the impressive list of Terraform providers. Yes, GitHub is there, too. TwinDB hosts its software on GitHub, and it felt wrong that I didn't manage it with Terraform yet, so I decided to give it a go. Prerequisites Directory layout I keep all […]

The post Managing GitHub with Terraform appeared first on TwinDB.

TasksMax: Another Setting That Can Cause MySQL Error Messages

Recently, I encountered a situation where MySQL gave error messages that I had never seen before:

2018-12-12T14:36:45.571440Z 0 [ERROR] Error log throttle: 150 'Can't create thread to handle new connection' error(s) suppressed
2018-12-12T14:36:45.571456Z 0 [ERROR] Can't create thread to handle new connection(errno= 11)
2018-12-12T14:37:47.748575Z 0 [ERROR] Error log throttle: 940 'Can't create thread to handle new connection' error(s) suppressed
2018-12-12T14:37:47.748595Z 0 [ERROR] Can't create thread to handle new connection(errno= 11)

I was thinking maybe we hit some ulimit limitations or similar, but all the usual suspects were set high enough, and we were not even close to them.

After googling and discussing with the customer, I found they had had similar issues in the past, and I learned something new. Actually it is relatively new, as it has been around for a few years but is not that well known. It is called TasksMax:

Specify the maximum number of tasks that may be created in the unit. This ensures that the number of tasks accounted for the unit (see above) stays below a specific limit. This either takes an absolute number of tasks or a percentage value that is taken relative to the configured maximum number of tasks on the system. If assigned the special value “infinity“, no tasks limit is applied. This controls the “pids.max” control group attribute. For details about this control group attribute, see pids.txt.

Source Manual.

It was introduced to systemd in 2015:

I’d like to introduce DefaultTasksMax= that controls the default
value of the per-unit TasksMax= by default, and would like it to
set to some value such 1024 out-of-the-box. This will mean that any
service or scope created will by default be limited to 1024
tasks. This of course is a change from before that has the
potential to break some daemons that maintain an excessive number
of processes or threads. However, I think it’s a much better choice
to raise the limit for them, rather than stay unlimited for all
services by default. I think 1024 is not particularly low, but also
not particularly high. Note that the kernel by default limits the
number of processes to 32K in total anyway.

In the end, we can see in this commit that they chose 512 as the default setting for TasksMax, which means services that are not explicitly configured otherwise will only be able to create at most 512 processes or threads.

Why 512? I have read through the email list and there was some discussion about what should be the default. Eventually, I found this comment from one of the developers:

Anyway, for now I settled for the default TasksMax= setting of 512 for
all units, plus 4096 for the per-user slices and 8192 for each nspawn
instance. Let’s see how this will work out.

So this is how 512 became the default, and no one has touched it since. MySQL can easily reach that limit, which causes error messages like those we saw above.
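To see how close a process is to that limit, ps can report its thread count (systemd counts every thread as a task). Shown here against the current shell's PID as a stand-in; for MySQL, substitute the server's PID:

```shell
# nlwp = number of light-weight processes (threads) of a process.
# Replace $$ with $(pidof mysqld) to check a running MySQL server.
ps -o nlwp= -p $$
```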

You can increase this limit by creating a file called /etc/systemd/system/mysqld.service with the following contents:

[Service]
TasksMax=infinity

Instead of infinity, which means MySQL can start as many processes as it wants, you can use a specific number like 4096 (or any other number based on your workload).
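On systems where the stock unit file should be left untouched, a drop-in override achieves the same thing and survives package upgrades. A sketch, assuming the unit is named mysqld.service (on some distributions it is mysql.service):

```ini
# /etc/systemd/system/mysqld.service.d/override.conf
# (create the directory if needed, then run
#  `systemctl daemon-reload` and restart MySQL)
[Service]
TasksMax=8192
```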


Not everyone will reach this limit, but if MySQL is giving error messages like this, check TasksMax in addition to the other usual suspects. The easiest way to verify the current setting is:

#> systemctl show -p TasksMax mysql
TasksMax=512



These days several kinds and forks of MySQL are widely used, and while I promised not to write about MySQL bugs till the end of 2018, I think it makes sense to explain basic details about bug reporting for at least one of the vendors that use JIRA instances as public bug tracking systems. I work for MariaDB Corporation, and it would be natural for me to write about MariaDB's JIRA that I use every day.

As a side note, Percona also switched to JIRA some time ago, and many of the JIRA-specific details described below (that differ from the good old MySQL bugs database) apply to Percona's bug tracking system as well.

Why would MariaDB bugs be interesting to an average MySQL community member who does not use MariaDB at all most of the time? One of the reasons is that some MySQL bugs are also reported (as "upstream") to MariaDB, and they may be fixed there well before they are fixed in MySQL. Consider MDEV-15953 - "Alter InnoDB Partitioned Table Moves Files (which were originally not in the datadir) to the datadir" (reported by Chris Calender) as an example. It was fixed in MariaDB 5 months ago, while the corresponding Bug #78164 is still "Verified" and has received no visible attention for more than 3 years. The fix is 12 lines added in two files (test case aside), so theoretically it can easily be used to patch upstream MySQL by an interested and determined MySQL user who already compiles MySQL from source (for whatever reason), if the problem is important in their environment.

Another reason is related to the fact that work on new features of MariaDB server and connectors is performed in the open at all stages. You can see current plans, discussions, and decision making on new features happening in real time, in JIRA. Existing problems (that often affect both MySQL and MariaDB) are presented and analyzed, and reading the related comments may be useful for understanding current limitations of MySQL and deciding whether, at some stage, switching to MariaDB or using some related patches may help in your production environment. There is no need to wait for some lab preview release. You can also comment on design decisions and question them before it's too late. A great example of such a feature request and work in progress, useful reading for anyone interested in InnoDB, is MDEV-11424 - "Instant ALTER TABLE of failure-free record format changes".

Yet another reason to use MariaDB JIRA and follow some bug reports and feature requests there is to find insights on how MySQL, its components (like the optimizer), and storage engines (like InnoDB) really work. Consider my Bug #82127 - "Deadlock with 3 concurrent DELETEs by UNIQUE key". This bug was originally reported to Percona as lp:1598822 (as it was first noticed with Percona's XtraDB engine) and ended up in their JIRA as PS-3479 (still "New"). In the MySQL bugs database it got "Verified" after some discussions. Eventually I gave up waiting for "upstream" to make any progress on it and reported it as MDEV-10962. In that MariaDB bug report you can find explanations of the behavior noticed, multiple comments and ideas on the root cause and on how to improve locking behavior in this case, links to other related bugs, etc. It's useful reading. Moreover, we can see that there are plans to improve/fix this in MariaDB 10.4.

I also like to check some problematic and interesting test cases, no matter in which bugs database they were reported, on MariaDB Server, Percona Server, and MySQL Server alike, as long as they involve some common feature. But maybe that's because I work with all of these as a support engineer.

Anyway, following MariaDB Server bugs may one day help a MySQL DBA to do the job better. So I suggest that all MySQL users check MariaDB's JIRA from time to time. Some basic details about the differences compared to MySQL's bugs database are presented below.

The first thing to notice about MariaDB's JIRA is the domain name: it is hosted under mariadb.org, so the bug tracking system formally "belongs" to the MariaDB Foundation - the non-profit entity that supports continuity and open collaboration in the MariaDB ecosystem. MariaDB Foundation employees, MariaDB Corporation employees, developers working for partners (like Codership), and community members (like Olivier Bertrand, author of the CONNECT storage engine I have written about here) work on the source code (and on bug processing and fixing) together, at GitHub. Different users have different roles and privileges in JIRA, surely. But there is no other, "internal" bugs database in MariaDB Corporation. All work on bugs and features, time reporting, the code review process, as well as release planning happens (or at least is visible) in an open manner, in JIRA.

Even if you do not have a JIRA account, you can still see the JIRA road map, release plans, and statuses. You can see all public comments and the history of changes for each bug. If you create an account and log in (this is needed to report new bugs, vote for them, or watch them and get email notifications about any changes, obviously), you'll also see more details on bugs, like links to GitHub commits and pull requests related to the bug.

Unlike the MySQL bugs database, where bugs are split into "Categories" (with both "MySQL Server: Information schema" and "MySQL Workbench" being categories of more or less the same level) but are numbered sequentially across all categories, JIRA instances usually support "Projects", each with its own key and sequential numbering of bugs per project.

At the moment there are 17 or so projects in MariaDB JIRA; of them, several public ones (most notably MDEV, used for MariaDB Server bugs and feature requests) are the most interesting for MySQL community users, I think.
Let's consider one MariaDB Server bug as an example.

Unlike in the MySQL bugs database, JIRA issues have a "Type". For our purposes it's important that feature requests usually end up with the "Task" type, vs "Bug" for a bug. Some projects in MariaDB JIRA may also support a separate "New Feature" type to differentiate new features from tasks not related to creating new code. In MySQL, a separate severity (S4, "Feature request") is used instead.

MariaDB JIRA issues have priorities from the following list:
  • Trivial
  • Minor
  • Major
  • Critical
  • Blocker
By default MariaDB bugs are filed with the intermediate "Major" priority. Priority may be changed by the bug reporter or by JIRA users (mostly developers) who work on the bug, and it often changes with time (priority may increase if more users are affected or if the fix does not happen for long enough, or decrease when the problem can somehow be worked around by affected users). Usually a bug with "Blocker" priority means there should be no next minor release for any major version listed in "Fix Version/s" without the fix.

There are many fields in the MySQL bugs database to define the priority of the fix (including "Priority" itself), but only "Severity" is visible to the public. Usually the "Severity" of a MySQL bug does NOT change with time (except perhaps before it's "Verified").

It is normal to list all/many versions affected by the bug in JIRA in "Affected Version/s". If the bug is fixed, in "Fix Version/s" you can find the exact list of all minor MariaDB Server versions that got the fix.

Each JIRA issue has a "Status" and a "Resolution". In the MySQL bugs database there is just a single "Status" covering both. Valid statuses are:
  • OPEN - this is usually a bug that is just reported or is not yet in the process of fixing.
  • CONFIRMED - this status means that some developer checked the bug report and confirmed that it's really a bug and that it's clear how to reproduce it based on the information already present in the report. This status more or less matches a "Verified" MySQL bug. But unlike in MySQL, even an "OPEN" bug may be assigned to a developer for further work.
  • CLOSED - the bug is resolved somehow. See the content of the "Resolution" field for details on how it was resolved.
  • STALLED - this is a real bug and some work on it was performed, but nobody actively works on it now.
  • IN PROGRESS - assignee is currently working on the fix for the bug.
  • IN REVIEW - assignee is currently reviewing the fix for the bug.
The following values are possible for "Resolution" field:
  • Unresolved - every bug that is not "CLOSED" is "Unresolved".
  • Fixed - every bug that was fixed with some change to the source code. If you log in to JIRA, you should be able to find links to the GitHub commit(s) with the fix in a "Fixed" JIRA issue.
  • Won't Fix - the problem is real, but it was decided not to fix it (as it's expected behavior or may be too hard to fix). Consider my MDEV-15213 - "UPDATEs are slow after instant ADD COLUMN" as one example.
  • Duplicate - there is another bug report about the same problem. You can find link to it in the JIRA issue.
  • Incomplete - there is no way to reproduce or understand the problem based on the information provided. See MDEV-17808 for example.
  • Cannot Reproduce - the bug reporter can no longer reproduce the problem, even after following the same steps that caused it before. See MDEV-17667 for example.
  • Not a Bug - the problem described is not a result of any bug. Everything works as designed, and probably some misunderstanding caused the bug reporter to think it was a bug. See MDEV-17790 as a typical example.
  • Done - this is used for completed tasks (like MDEV-17886) or bugs related to some 3rd-party storage engine where the fix is done, but it's up to MariaDB to merge/use the fixed version of the engine (like MDEV-17212).
  • Won't Do - it was decided NOT to do the task. See MDEV-16418 as one example.
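These "Status" and "Resolution" values can be combined in JIRA's query language (JQL) when searching. For example, a hypothetical search for confirmed but still unresolved MariaDB Server bugs, newest activity first, would look like this:

```
project = MDEV
  AND resolution = Unresolved
  AND status = CONFIRMED
ORDER BY priority DESC, updated DESC
```

Such queries can be saved as filters and even subscribed to by email, which is handy for tracking a component you care about.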
In MySQL there are separate bug statuses for (most of) these. There are some tiny differences in the way some statuses like "Cannot Reproduce" are applied by those who process bugs in MySQL vs MariaDB, though.

The explanations above should be enough for any MySQL bugs database user to start using MariaDB's JIRA efficiently, I think. But I am open to any follow-up questions, and I am considering separate blog posts explaining the life cycle of a MariaDB Server bug and some tips on efficient search in MariaDB JIRA.