Tuesday, December 11, 2007

MySQL Administration

Overview of Administrative Duties

The MySQL database system consists of several components. You should be familiar with what these components are and the purpose of each to understand both the nature of the system you're administrating and the tools available to help you do your job. If you take the time to understand what you're overseeing, your work will be much easier. To that end, you should acquaint yourself with the following aspects of MySQL:

  • The MySQL server. The server, mysqld, is the hub of a MySQL installation; it performs all manipulation of databases and tables. mysqld_safe is a related program used to start up the server, monitor it, and restart it in case it goes down. (Prior to MySQL 4, mysqld_safe is named safe_mysqld.) If you run multiple servers on a single host, mysqld_multi can help you manage them more easily.

  • The MySQL clients and utilities. Several MySQL programs are available to help you communicate with the server. For administrative tasks, the most important of these are:

    • mysql— An interactive program that allows you to send SQL statements to the server and to view the results

    • mysqladmin— An administrative program that lets you perform tasks such as shutting down the server or checking its status if it appears not to be functioning properly

    • mysqlcheck, isamchk, and myisamchk—, Utilities that help you perform table analysis and optimization, as well as crash recovery if tables become damaged

    • mysqldump and mysqlhotcopy— Tools for backing up your databases or copying databases to another server

  • The server's language, SQL. Some administrative duties can be performed using only the mysqladmin command-line utility, but you're better off if you're also able to talk to the server in its own language. As a simple example, you may need to find out why a user's privileges aren't working the way you expect them to work. There is no substitute for being able to go in and communicate with the server directly, which you can do by using the mysql client program to issue SQL queries that let you examine the grant tables. And if your version of MySQL predates the introduction of the GRANT statement, mysql can be used to set up each user's privileges by manipulating the grant tables directly.

    If you don't know any SQL, be sure to acquire at least a basic understanding of it. A lack of SQL fluency will only hinder you, whereas the time you take to learn will be repaid many times over. A real mastery of SQL takes some time, but the basic skills can be attained quickly. For instruction in SQL and the use of the mysql command-line client, "Getting Started with MySQL and SQL."

  • The MySQL data directory. The data directory is where the server stores its databases and status files. It's important to understand the structure and contents of the data directory so that you know how the server uses the file system to represent databases and tables, as well as where files, such as the logs, are located and what's in them. You should also know your options for managing allocation of disk space across file systems should you find that the file system on which the data directory is located is becoming too full.


General Administration

General administration deals primarily with the operation of mysqld, the MySQL server, and with providing your users access to the server. The following duties are most important in carrying out this responsibility:

  • Server startup and shutdown. You should know how to start and stop the server manually from the command line and how to arrange for automatic startup and shutdown when your system starts up and shuts down. It's also important to know what to do to get the server going again if it crashes or will not start properly.

  • User account maintenance. You should understand the difference between MySQL user accounts and UNIX or Windows login accounts. You should know how to set up MySQL accounts by specifying which users can connect to the server and from where they can connect. New users should also be advised on the proper connection parameters that they will need to use to connect to the server successfully. It's not their job to figure out how you've set up their accounts! You'll also need to know how to reset forgotten passwords.

  • Log file maintenance. You should understand what types of log files you can maintain, as well as when and how to perform log file maintenance. Log rotation and expiration are essential to prevent the logs from filling up your file system.

  • Database backup and copying. Database backups are of crucial importance in the event of a severe system crash. You want to be able to restore your databases to the state they were in at the time of the crash with as little data loss as possible. Note that backing up your databases is not the same thing as performing general system backups (as is done, for example, by using the UNIX dump program). The files corresponding to your database tables may be in flux due to server activity when system backups take place, so restoring those files will not give you internally consistent tables. The mysqldump program generates backup files that are more useful for database restoration, and it allows you to create backups without taking down the server. You may also need to move databases in the event of a full disk.

    If you decide to run a database on a faster host, you'll need to copy its contents to a different machine. You should understand the procedure for doing this, should the need arise. Database files may be system dependent, so you can't necessarily just copy the files.

  • Database replication. Making a backup or a copy of a database takes a snapshot of its state at one point in time. Another option available to you is to use replication, which involves setting up two servers in cooperative fashion such that changes to databases managed by one server are propagated on a continuing basis to the corresponding databases managed by the other server.

  • Server configuration and tuning. Your users want the server to perform at its best. The quick-and-dirty method for improving how well your server runs is to buy more memory or to get faster disks. But those brute-force techniques are no substitute for understanding how the server works. You should know what parameters are available for tuning the server's operation and how they apply to your situation. At some sites, queries tend to be mostly retrievals. At others, inserts and updates dominate. The choice of which parameters to change will be influenced by the query mix that you observe at your own site.

    Configuration issues also include localizing the server (for example, to make sure that it uses the proper character set and time zone).

  • Multiple servers. It's useful to run multiple servers under some circumstances. You can test a new MySQL release while leaving your current production installation in place, or provide better privacy for different groups of users by giving each group its own server. (The latter scenario is particularly relevant to ISPs.) For such situations, you should know how to set up multiple simultaneous installations.

  • MySQL software updates. New MySQL releases appear frequently. You should know how to keep up to date with these releases to take advantage of bug fixes and new features. Understand the circumstances under which it's more reasonable to hold off on upgrading, and know how to choose between the stable and development releases.


Security

When you run a MySQL installation, it's important to make sure that the information your users entrust to their databases is kept secure. The MySQL administrator is responsible for controlling access to the data directory and the server and should understand the following issues:

  • File system security. A UNIX machine may host several user accounts that have no MySQL-related administrative duties. It's important to ensure that these accounts have no access to the data directory. This prevents them from compromising data on a file system level by copying database tables or removing them, or by being able to read log files that may contain sensitive information. You should know how to set up a UNIX user account to be used for running the MySQL server, how to set up the data directory so that it is owned by that user, and how to start up the server to run with that user's privileges.

  • Server security. You must understand how the MySQL security system works so that when you set up user accounts, you grant the proper privileges. Users connecting to the server over the network should have permission to do only what they are supposed to be able to do. You don't want to inadvertently grant superuser access to anonymous users due to faulty understanding of the security system!


Location of the Data Directory

A default data directory location is compiled into the server. Under UNIX, typical defaults are /usr/local/mysql/var if you install MySQL from a source distribution, /usr/local/mysql/data if you install from a binary distribution, and /var/lib/mysql if you install from an RPM file. Under Windows, the default data directory is C:\mysql\data.

The data directory location can be specified explicitly when you start up the server by using a --datadir=dir_name option. This is useful if you want to place the directory somewhere other than its default location. Another way to specify the location is to list it in an option file that the server reads at startup time. Then you don't need to include it on the command line each time you start the server. Data directory relocation is covered later in the chapter.

As a MySQL administrator, you should know where your server's data directory is located. If you run multiple servers, you should know where each one's data directory is. But if you don't know the location (perhaps you are taking over for a previous administrator who left poor notes), there are several ways to find out:

  • Ask the server for the location. The server maintains a number of variables pertaining to its operation, and it can report any of their values. The data directory location is indicated by the datadir variable, which you can obtain using a mysqladmin variables commandor a SHOW VARIABLES statement. From the command line, use mysqladmin. On UNIX, the output might look like this:

    % mysqladmin variables
    +---------------+-----------------------+
    | Variable_name | Value |
    +---------------+-----------------------+
    ...
    | datadir | /usr/local/mysql/var/ |
    ...

    On Windows, the output might look like the following instead:

    C:\> mysqladmin variables
    +---------------+-----------------------+
    | Variable_name | Value |
    +---------------+-----------------------+
    ...
    | datadir | c:\mysql\data\ |
    ...

    From within mysql, check the variable's value like this:

    mysql> SHOW VARIABLES LIKE 'datadir';
    +---------------+-----------------------+
    | Variable_name | Value |
    +---------------+-----------------------+
    | datadir | /usr/local/mysql/var/ |
    +---------------+-----------------------+

    If you have multiple servers running, they will be listening on different TCP/IP port numbers, sockets, or named pipes. You can get data directory information from each of them in turn by supplying appropriate --port or --socket options to connect to the port or socket on which each server is listening. Specifying a host of 127.0.0.1 explicitly tells mysqladmin to connect to the server on the local host using a TCP/IP connection:

    % mysqladmin --host=127.0.0.1 --port=port_num variables

    Under UNIX, specifying a value of localhost causes a UNIX socket connection to be used. You can also specify a --socket option if necessary to indicate the socket file pathname:

    % mysqladmin --host=localhost --socket=/path/to/socket variables

    Under Windows NT-based systems, a named pipe connection can be specified by giving '.' as a hostname, perhaps with a --socket option to indicate the pipe name:

    C:\> mysqladmin --host=. --socket=pipe_name variables

    For any platform, to connect via TCP/IP to a remote server running on another host, specify a --host option that indicates the name of the server host:

    % mysqladmin --host=host_name variables

    Specify a --port option as well if you need to connect to a port number other than the default.

  • Under UNIX, use the ps command to see the command line of any currently executing mysqld process or processes. By looking for a --datadir option, you may be able to determine the data directory location. If you have a BSD-style ps, try the following command:

    % ps axww | grep mysqld

    For a System V-style ps, try this instead:

    % ps -ef | grep mysqld

    The ps command can be especially useful if your system runs multiple servers because you can discover multiple data directory locations at once. The drawbacks are that ps must be run on the server host and that no useful information is produced unless the --datadir option was specified explicitly on the mysqld command line. (On the other hand, some of the startup scripts that invoke mysqld for you attempt to determine the data directory pathname and put it in the mysqld command line, which makes that information available to ps.)

  • Look in an option file that the server reads when it starts up. For example, if you look in /etc/my.cnf under UNIX or C:\my.cnf under Windows, you may find a datadir line in the [mysqld] option group:

    [mysqld]
    datadir=/path/to/data/directory

    The pathname indicates the location of the data directory.

  • The server's help message includes an indication of the default data directory location that is compiled in. This will tell you the directory that the server actually uses when it runs, if the location is not overridden at startup time. To see this output, issue the following command:

    % mysqld --help
    ...
    datadir /usr/local/mysql/var/
    ...
  • If you installed MySQL from a source distribution, you can examine its configuration information to determine the data directory location. For example, the location is available in the top-level Makefile. But be careful: The location is the value of the localstatedir variable in the Makefile, not the value of the datadir variable, as you might expect. Also, if the distribution is located on an NFS-mounted file system that is used to build MySQL for several hosts, the configuration information will be accurate only for the host for which the distribution was most recently built. That may not show you the data directory for the server in which you're interested.

  • Failing any of the previous methods, you can use find to search for database files. The following command searches for .frm (description) files:

    % find / -name "*.frm" -print

    The .frm files store the definitions of the tables managed by the server, so they are part of any MySQL installation. These files normally will be found in directories that all have a common parent directory; that parent should be the data directory.

In the examples that follow throughout this chapter where I denote the location of the MySQL data directory as DATADIR, you should interpret that as the location of the data directory for the server on your own machine.

Structure of the Data Directory

The MySQL data directory contains all of the databases and tables managed by the server. In general, these are organized into a tree structure that is implemented in straightforward fashion by taking advantage of the hierarchical structure of the UNIX or Windows file systems:

  • Each database corresponds to a directory under the data directory.

  • Tables within a database correspond to files in the database directory.

The exception to this hierarchical implementation of databases and tables as directories and files is that the InnoDB table handler stores all InnoDB tables from all databases within a single common tablespace. This tablespace is implemented using one or more large files that are treated as a single unified data structure within which tables and indexes are represented. The InnoDB tablespace files are stored in the data directory by default.

The data directory also may contain other files:

  • An option file, my.cnf.

  • The server's process ID (PID) file. When it starts up, the server writes its process ID to this file so that other programs can discover the value if they need to send signals to it. (This file is not used on Windows or by the embedded server.)

  • Status and log files that are generated by the server. These files provide important information about the server's operation and are valuable for administrators, especially when something goes wrong and you're trying to determine the cause of the problem. If some particular query kills the server, for example, you may be able to identify the offending query by examining the log files.

  • It's common to store files in the data directory such as the DES key file or the server's SSL certificate and key files.


Relocating Data Directory Contents

The preceding part of this chapter discusses the data directory structure in its default configuration, which is that all databases, status, and log files are located within it. However, you have some latitude in determining the placement of the data directory's contents. MySQL allows you to relocate the data directory itself or certain elements within it. There are several reasons why you might want to do this:

  • You can put the data directory on a file system that has a capacity greater than the file system where it's located by default.

  • If your data directory is on a busy disk, you can put it on a less active drive to balance disk activity across physical devices. You can put databases and log files on different drives or distribute databases across drives for the same reasons. Similarly, the InnoDB tablespace is conceptually a single large block of storage, but you can put its individual component files on different drives to improve performance.

  • Putting databases and logs on different disks helps minimize the damage that can be caused by a failure of a single disk.

  • You might want to run multiple servers, each with its own data directory. This is one way to work around problems with per-process file descriptor limits, especially if you cannot reconfigure the kernel for your system to allow higher limits.

  • Some systems keep server PID files in a specific directory, such as /var/run. You might want to put the MySQL PID file there, too, for consistency of system operation. In similar fashion, if your system uses /var/log for log files, you can also put the MySQL logs there. (However, many systems allow only root to write to these directories. That means you'd need to run the server as root, which for security reasons is not a good idea.)

The rest of this section discusses which parts of the data directory can be moved and how you go about making such changes.

Relocation Methods

There are two ways to relocate the data directory or elements within it:

  • You can specify an option at server startup time, either on the command line or in an option file. For example, if you want to specify the data dir ectory location, you can start the server with a --datadir=dir_name option on the command line or you can put the following lines in an op tion file:

    [mysqld]
    datadir=dir_name

    Typically, the option file group name for server options is [mysqld], as shown in the example. However, depending on your circumstances, other option group names may be more appropriate. For example, the [embedded] group applies to the embedded server. Or if you're running multiple servers using mysqld_multi, the group names will be of the form [mysqldn], where n is some integer associated with a particular server instance. discusses which option groups apply to different server startup methods and also provides instructions for running multiple servers.

  • You can move the thing to be relocated, and then make a symlink (symbolic link) in the original location that points to the new location.

Neither of these methods works universally for everything that you can relocate. summarizes what can be relocated and which relocation methods can be used. If you use an option file, it is possible to specify options in the global option file (such as /etc/my.cnf under UNIX or C:\my.cnf or my.ini in the system directory under Windows).

It's also possible to use the option file my.cnf in the default data directory (the directory compiled into the server). This is a good option file to use for server-specific options if you run multiple servers, but because the server looks for it only in the compiled-in data directory location, the file won't be found if you relocate that directory. (One workaround for this problem is to move the data directory and then make its original location a symlink that points to the new location.)

Entity to Relocate Applicable Relocation Methods
Entire data directory Startup option or symlink
Individual database directories Symlink
Individual database tables Symlink
InnoDB tablespace files Startup option
PID file Startup option
Log files Startup option

Assessing the Effect of Relocation

Before attempting to relocate anything, it's a good idea to verify that the operation will have the desired effect. I tend to favor the du, df, and ls-l commands for obtaining disk space information, but all of these depend on correctly understanding the layout of your file system.

The following example illustrates a subtle trap to watch out for when assessing a data directory relocation. Suppose your data directory is /usr/local/mysql/data and you want to move it to /var/mysql because df indicates the /var file system has more free space (as shown by the following example):

% df /usr /var
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/wd0s3e 396895 292126 73018 80% /usr
/dev/wd0s3f 1189359 1111924 162287 15% /var

How much space will relocating the data directory free up on the /usr file system? To find out, use du-s to see how much space that directory uses:

% cd /usr/local/mysql/data
% du -s
133426 .

That's about 130MB, which should make quite a difference on /usr. But will it really? Try df in the data directory:

% df /usr/local/mysql/data
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/wd0s3f 1189359 1111924 162287 15% /var

That's odd. If we're requesting the free space for the file system containing the data directory (that is, /usr), why does df report the space on the /var file system? The following ls-l command provides the answer:

% ls -l /usr/local/mysql/data
...
lrwxrwxr-x 1 mysqladm mysqlgrp 10 Dec 11 23:46 data -> /var/mysql
...

This output shows that /usr/local/mysql/data is a symlink to /var/mysql. In other words, the data directory already has been relocated to the /var file system and replaced with a symlink that points there. So much for freeing up a lot of space on /usr by moving the data directory to /var!

Moral: A few minutes spent assessing the effect of relocation is a worthwhile investment. It doesn't take long, and it can keep you from wasting a lot of time moving things around only to find that you've failed to achieve your objective.