MySQL Cluster (NDB Storage Engine) is a shared-nothing distributed in-memory database with support for disk-data tables as well. It provides a DB environment with high availability, near-linear scalability, realtime performance, automatic and user defined data partitioning, load balancing and transparent failover.
The data is distributed using a partitioning key (hash calculated on primary key as default) so the performance is especially great for primary key lookups and parallel data access.
In a few steps we will install, configure and run a simple 2 Data Node MySQL Cluster setup.
First an introduction to the different node types that makes a MySQL Cluster.
MySQL Cluster Node Types
A node in MySQL Cluster terminology is a software process. There are primarily 4 node types.
- SQL Node (mysqld)
is the SQL interface provided by the MySQL Server.
- API Node
is an application that uses the native NDB API. The SQL Node is "basically" an API Node.
- Data Node (ndbd)
is responsible for storing data and handling transactions.
- Management Node (ndb_mgmd)
is the management server which provides the cluster configuration and the cluster log
Below is picture of a very simple typical 2 Data Node deployment.
MySQL Cluster Installation
For now I will only cover installation of MySQL Cluster. Personally I prefer installing from tar files and not using rpm or other packaging tools because it gives you better control over the installation, so that's what we are going to use.
- Download the .tar.gz (non RPM version) from http://dev.mysql.com/downloads/cluster
I'm on OS X so I'll use 'mysql-cluster-gpl-6.3.20-osx10.5-x86_64.tar.gz'
- Create separate folders for the installation files and for the data storage
$ mkdir ~/db $ mkdir ~/db/etc # this is where we'll put our config files later $ mkdir -p ~/db/data/mysql $ mkdir ~/db/data/mysql-cluster
- Extract the tar file into the installation directory
$ cd ~/db $ tar zxvf mysql-cluster-gpl-6.3.20-osx10.5-x86_64.tar.gz
- Create a symbolic link 'mysql' (easy to upgrade or revert versions)
$ ln -s mysql-cluster-gpl-6.3.20-osx10.5-x86_64 mysql
- Initialize the MySQL data directory and system tables
$ cd mysql $ scripts/mysql_install_db --basedir=<homedir>/db/mysql --datadir=<homedir>/db/data/mysql
- Define a MYSQL_HOME and set paths
$ export MYSQL_HOME=~/db/mysql $ export PATH=$MYSQL_HOME/bin:$PATH $ export PATH=$MYSQL_HOME/libexec:$PATH # if installing using own source builds!
MySQL Cluster 7.0 is just around the corner. One new feature is better support for multi-core/threaded environments such as Sun's CMT servers. It will initially support up to 8 cores. There will be an additional binary called ndbmtd, which is the multithreaded version that is intended for use on multi-core systems.
The only change to the cluster configuration file is to add 'MaxNoOfExecutionThreads' to specify the maximum number of threads to use. If omitted ndbmtd will act as "single" threaded.
Next we will create two configuration files, one for the SQL Node, i.e., the MySQL Server and one for MySQL Cluster a.k.a NDB storage engine. The cluster configuration consist of 2 Data Nodes, 1 Management Node, 1 SQL Node and 7 API Nodes. The total amount of memory allocated for the data is 2GB, i.e,. 1GB per Data Node however change the configuration to match you environment. All running on localhost.
There are loads of cluster parameters to tune so check out the online manual for those.
- Minimal MySQL server configuration file, my.cnf
# The MySQL server [mysqld] port = 3306 socket = /tmp/mysql.sock basedir=/Users/alyu/Projects/db/mysql datadir = <homedir>/data/mysql # Enable NDB storage engine ndbcluster # Location of the management server ndb-connectstring=localhost
- Minimal MySQL Cluster configuration file, 2-node-config.ini (edit DataDir and FileSystemPath)
[NDBD DEFAULT] # Number of copies of the data (2 is the default and recommended setting) NoOfReplicas: 2 # trace files, logs, pid files DataDir:
<homedir>/db/data/mysql-cluster# Defining 8 API nodes
# Metadata, redo/undo(for disk) logs and data files
# must be created before the data nodes are started
# Data Memory for data contents and ordered indexes
# Index memory, usually 1/6 or 1/8 of DataMemory is enough but depends on
# number of unique hash indexes
# Size of each REDO log fragment,
# there are 4 log fragments per log file
# Bigger is better for high write load and should be change from default
# when DataMemory > 512MB
# Set to 6xDataMemory/4*FragmentLogFileSize (in MB)
# Good heuristics recommendation
# NoOfFragmentLogFiles = 6*DataMemory/4*FragmentLogFileSize
# Redo log buffer is used to log activities
# If you receive REDO log buffers overloaded try increase this value
Redouffer = 32MB
# Amount of memory used for logs buffers, disk operations, undo files etc
# Change this from default 20M to a larger value when using Disk Data Tables
#SharedGlobalMemory = 384M
# Amount of memory used for caching pages on disk, default 64M
# Set this as high as possible when using lots of disk data tables
# DiskPageBufferMemory =
# 0.8 x (total memory – (OS memory + [buffer memory + DataMemory + IndexMemory])
#DiskPageBufferMemory = 500M
# Management Node/Server
# Default arbitrator
# Data Nodes
# API nodes like the SQL Node, C++/Java applications using NDB API
# [MYSQLD] and [API] are interchangeable
# HostName defaults to localhost if omitted
# Id is automatically assigned if omitted
- Put the configuration files 'my.cnf' and '2-node-config.ini' in the 'db/etc' folder
If you are planning to use MySQL Cluster 7.0's multithreaded version 'ndbmtd' then you need to add 'MaxNoOfExecutionThreads' to the [NDBD DEFAULT] section in the cluster configuration.
The recommended values are:
|8 or more||8|
Starting MySQL Cluster
The first node that needs to be started is the Management Node because it provides the cluster configuration which all the other nodes need and ask for when they start up.
- Start the Management Node
$ cd ~/db $ ndb_mgmd -f etc/2-node-config.ini
- Start the Data Nodes, as many as there are defined in the cluster configuration
$ ndbd # or ndbmtd for MySQL Cluster 7.0 multithreaded version $ ndbd # or ndbmtd for MySQL Cluster 7.0 multithreaded version
- Start the SQL Node/MySQL server
$ mysqld --defaults-file=etc/my.cnf & # or use the mysqld_safe script
- Use the Management client and check the Data Nodes startup progress, wait until all Data Nodes are in started state
$ ndb_mgm -e "all status" Connected to Management Server at: 127.0.0.1:1186 Node 2: started (mysql-5.1.30 ndb-6.3.20) Node 3: started (mysql-5.1.30 ndb-6.3.20)
- Create a cluster table (NDB storage engine)
$ mysql -uroot mysql> create table t1 (id int not null auto_increment, city varchar(30) > primary key(id)) engine=ndb; mysql> insert into t1 values (null, 'Stockholm'); mysql> insert into t1 values (null, 'Tokyo'); mysql> select * from t1; +----+-----------+ | id | city | +----+-----------+ | 1 | Stockholm | | 2 | Tokyo | +----+-----------+ 2 rows in set (0.00 sec)
And that's all. Now you have a 2 data node cluster installed, running and ready to play with!
Command Line Tools
Finally some ndb tools that can be useful. They are all located in $MYSQL_HOME/bin directory.
Cluster management client which you use to monitor the cluster, set cluster log filters, start backups etc.
Lists all the tables in the cluster
- ndb_desc <table name>
Shows cluster table descriptions and partition info
Restore data from backups
select * from <table name>
select count(*) from <table name>