Tuesday, June 16, 2009

Review: MySQL Administrator's Bible



On the front cover of MySQL Administrator’s Bible is a sentence that reads:

“The book you need to succeed!”

I must say, I do agree. Authored by two very experienced DBAs, Sheeri Cabral and Keith Murphy, they’ve combined their talents to cover what you really need to know to succeed. This book is very versatile. If you’re new to MySQL, or experienced in another database and have to start administrating MySQL, you need this book. I can honestly say, even if you have years of MySQL experience, you will learn something new. I did. Divided into four parts, MySQL Administrator’s Bible covers your First Steps with MySQL, Developing with MySQL, Core MySQL Administration and Extending Your Skills.

First Steps with MySQL starts with a gentle introduction to MySQL with company information, which seems to be changing annually, and most importantly, the MySQL community itself. What makes MySQL so fantastic is the community. After that, you’ll be lead into installing and configuring MySQL on various platforms including Linux, Windows and Solaris while touching on post installation configuration too. Basic security is covered as well as some tips on troubleshooting and accessing your new MySQL installation using tools included with MySQL or using third party software.

Developing with MySQL covers the MySQL Language Structure and if you’re coming from another RDBMS, it covers how MySQL deviates from the SQL standard by extending that standard to make MySQL the number one open source database used on the Internet. After that, this section covers the same type of topics covering just about any other mainstream databases such as using stored procedures, cursors, events, views and transactions.

The Core MySQL Administration is the heart of this book. It covers MySQL server tuning, covering all major storage engines including MyISAM, InnoDB, Falcon, PBXT, and NDB engines including the first time I’ve seen in print, the Maria storage engine. An entire chapter is devoted to implementing cache tables and using the query cache. Memcached is also mentioned, and mentioned again in the final section. Continuing on with what I consider the most important job of a DBA, backup and recovery. Databases are very central to running a business, any data loss could put a company out of business. Be prepared.

This section gives a solid introduction to the topic of dealing with users, and how they are managed within MySQL. Count on covering GRANT/REVOKE, using SHOW GRANTS and mk-show-grants MaatKit tool. Partitioning, logging and replication and measuring performance rounds out this section.

If you have experience with another RDBMS, plan on spending a significant amount of time in this section. Not that the other sections aren’t important, they are, but this is the bread and butter of what a MySQL DBA does on a daily basis.

Extending You Skills section can be considered getting your Masters in Database Administration. Just about every DBA will have to tackle improving queries and the tuning of indexes. The second most important job of a DBA is monitoring performance of your MySQL server. Don’t let your users be your first line of monitoring! Be proactive, there are plenty of open source monitoring tools available. The most popular are discussed, as well as MySQL Enterprise and third party companies too. MySQL Data Dictionary is covered in in detail over 58 pages. This is the most I’ve read in any book about the data dictionary. 

Last but not least, most high performance MySQL systems involve scaling up or out. It covers the usual suspects of replication, MySQL Cluster, and memcached. MySQL Proxy is initially covered and has an appendix to expand on that information. MySQL Proxy itself is worthy of its own book. (hint, hint :) ) Two more appendices cover MySQL Functions and Operators, and additional resources.

Even though this book targets MySQL 5.1/6.0, there is plenty of information that will apply to 5.0. If you’re still on 5.0, don’t hesitate to pick up a copy. This will be a book that can stay with you as your upgrade to 5.1 and beyond. The companion website – http://www.wiley.com/go/mysqladminbible contains all the code from the book too, rounding out this fine tome.

What didn’t I like about the book? There are only a couple of things, all personal I’m sure. First, I really don’t care too much for tables of options from the various tools. Most open source tools are developed rather quickly and options change. This could render portions of the book out of date quickly.

The other thing I noticed that wasn’t mentioned in the book was the community versions of MySQL supported by Open Query and Percona. The latter has their own storage engine, XtraDB and backup solution, XtraBackup.

All in all, this is a very solid book on administering MySQL. This book digs deeper, the experience of the authors really show. Well done Sheeri and Keith!

Thursday, April 23, 2009

Did You Get Your MySQL Certification ??

Did you get your certification during the User Conference? If so, be sure to stop by the MySQL Certified Professionals LinkedIn page and request membership! Congratulations on your certification!!

Wednesday, April 22, 2009

Thoughts on Oracle's Purchase of Sun

With a first day announcement at the MySQL 2009 User Conference,  Oracle had purchased Sun! Reading the tweets and blogs, it's understandable that some think it's the end for MySQL, some think it's not - I think it could be both.

MySQL itself is going through a second corporate purchase inside of 18 months. When Sun purchased MySQL, some of the internal talent left. Just recently we saw the exit of Marten, and just before that Monty. What made MySQL what it was in 2008 is the people behind the database, from corporate to the community. 

MySQL the database server itself is safe since it's GPLed. Anyone can have access to the code and create their own distribution. We've already seen this happen within the community, Jeremy Cole, Percona, OurDelta to name a few. The main concern that could slow or stop MySQL progress is all the programming talent leaving en masse. Put this together with the departure of all the original corporate management and MySQL is no longer MySQL the company we remember. The part we don't know for sure is Oracle's plans for MySQL. Just seems to be getting worse.

Drizzle could also be impacted since Brian Aker is a Sun employee, so is Jay Pipes, and I'm sure there are others. Some companies support Open Source Software by paying their employees to work on these projects. We don't know if Oracle will allow this to continue.

Who could be impacted by Oracle's purchase? That huge computer company in Redmond. For years, their database team has had access to the operating system source code. Gives a huge advantage to support and performance. Oracle not only gains the source code to Solaris, but also their hardware too. No company in my 22 years in IT has had access to everything - hardware, operating system and the database. It's going to be a wild ride....

The other company I think could be greatly impacted is Kickfire. Now that Oracle has all that Sun hardware talent on tap, they could also design a database appliance as well.

It's not all gloom and doom.

I see great opportunity for 3rd party companies that support MySQL. Some of these companies have former MySQL employees on staff too, and have already started contributing code to MySQL with their own patches and enhancements. This could be an opportunity for a consortium of companies to come together and support MySQL. Something like the Eclipse Foundation might work. I could also see David, Monty and Marten coming together again under a new company. Plenty of opportunities exist.

What advice can I give? First thing - don't panic. Your MySQL database servers are not going to stop working tomorrow. I'd research 3rd party companies like Percona and Pythian. Be diligent, don't wait until Oracle announces something. Like any great DBA, be proactive.

Thursday, March 12, 2009

Errata: Improving Drupal Performance

Back in November 2008, I wrote a 4 part series on improving Drupal’s MySQL Performance. In that series of posts, I used the BENCHMARK() function to demonstrate the improvement of some of the SQL that Drupal uses. 

Note: Displaying commands and SQL looks horrible in Firefox, but just fine in Opera and IE. Not sure why exactly. I’ve not found a clean way to display code on blogspot.com. If anyone knows how to format code so it looks great, maybe even has line numbers, could you drop me an email? Thanks.

Baron Schwartz, AKA xparb opines: 

“ BENCHMARK() doesn't work the way you think it does. The second parameter is an expression, not a string to be treated as a statement.

Try this:

set @sql := 'select sleep(5)';
select benchmark(1, @sql);


If it works as you think, it'll take 5 seconds. “

Since I’ve had a break in my studies, I have some time to do further research and respond to xparb. When I wrote up this series, BENCHMARK() was new to me, I hadn’t used it before. I definitely learned to do more diligent research prior to posting! The results of his comment?

mysql> SET @sql := 'select sleep(5)'; 
Query OK, 0 rows affected (0.00 sec) 

mysql> SELECT BENCHMARK(1,@sql); 

+-------------------+ 
| BENCHMARK(1,@sql) | 
+-------------------+ 
| 0                 | 
+-------------------+ 
1 row in set (0.00 sec) 

mysql> 



Sure enough, he’s correct, and that was no surprise to me actually. It’s clear BENCHMARK() isn’t working like I thought, the results should have been 5 seconds. Mostly what surprises me is that he’s the only person to catch my mistake. In thinking about this further, even if it had worked, I don’t think it’s that handy at all. I mean, it’s giving only a single result for the entire process of querying MySQL, which really isn’t all that helpful. It won’t provide the detailed information such as opening closing tables, CPU utilization, sending data, etc. What can be used?

If you’re using the Community Edition of MySQL 5.0.37 or greater, there’s hope. Jeremy Cole from Proven Scaling provided a patch to enable profiling queries within MySQL. I’ll demonstrate how to use the profile function with one of the Drupal queries I’ve used in a previous article.

By default, profiling isn’t enabled. To enable profiling, issue the following in your favorite MySQL client:

mysql> SET PROFILING=1; 
Query OK, 0 rows affected (0.00 sec) 

mysql>
 

Now, run a query just like normal:

mysql> SELECT * FROM system WHERE type = 'theme' ;

mysql>

mysql> SHOW PROFILES; 
+----------+------------+-------------------------------------------+ 
| Query_ID | Duration   | Query                                     | 
+----------+------------+-------------------------------------------+ 
| 1        | 0.00059500 | SELECT * FROM system WHERE type = 'theme' | 
+----------+------------+-------------------------------------------+ 
1 row in set (0.00 sec) 


SHOW PROFILES is listing the query we just ran. Looking at the Duration column, it took 0.000595 seconds to run this query. To see greater details, use the Query_ID like this:

mysql> SHOW PROFILE FOR QUERY 1; 

+--------------------+----------+ 
| Status             | Duration |

+--------------------+----------+ 
| starting           | 0.000078 | 
| Opening tables     | 0.000212 
|
| System lock        | 0.000007 | 
| Table lock         | 0.000010 | 
| init               | 0.000045 | 
| optimizing         | 0.000014 | 
| statistics         | 0.000016 | 
| preparing          | 0.000016 | 
| executing          | 0.000005 | 
| Sending data       | 0.000151 | 
| end                | 0.000006 | 
| end                | 0.000004 | 
| query end          | 0.000005 | 
| freeing items      | 0.000010 | 
| closing tables     | 0.000006 | 
| logging slow query | 0.000005 | 
| cleaning up        | 0.000005 | 
+--------------------+----------+ 
17 rows in set (0.00 sec) 

mysql>


This returned detailed information on how this query is performing. We can clearly see Opening tables and Sending data are two of the most time consuming parts of this query. Running a query a single time doesn’t provide enough data to judge the performance of a query. Putting together a Perl script to run the query multiple times, or during times of heavy use could provide some insight to make improvements.

Just a quick post to fix my error, and to introduce query profiling that’s available in MySQL Community Edition 5.0.37 or greater. Thanks to xparb for pointing this error out, and to Jeremy for creating this profiling patch.

Now, back to my studies.

Tuesday, November 25, 2008

Improving Drupal Performance Part IV

I'm not planning on a Part IV! :) I wrote these articles last summer and decided to post them to the blog so they didn't get lost.

Improving Drupal Performance Part III

Introduction

Using the tools and techniques from Part II, we'll look at the areas of Drupal that can be improved. We'll go over queries, indexes and mysql configuration. This will be based upon loading the home page only.

Initial Load

Looking at a single loading of the home page produces a total of 122 SELECT queries, 3 SET commands, and 2 UPDATES. Only 30 SELECTS are unique, so there's plenty of duplicated queries. The slow query log contains 5 queries when the system is loaded with 10 concurrent connections for a 30 second period. The queries in the slow query log could also be queries that are not using indexes. We'll start with these queries first.

Slow Query Log

mysql» SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC;
+----------------+-----------------------------------------------+----------+
| name           | filename                                      | throttle |
+----------------+-----------------------------------------------+----------+
| block          | modules/block/block.module                    |        0 | 
| color          | modules/color/color.module                    |        0 | 
| comment        | modules/comment/comment.module                |        0 | 
| filter         | modules/filter/filter.module                  |        0 | 
| help           | modules/help/help.module                      |        0 | 
| menu           | modules/menu/menu.module                      |        0 | 
| node           | modules/node/node.module                      |        0 | 
| system         | modules/system/system.module                  |        0 | 
| taxonomy       | modules/taxonomy/taxonomy.module              |        0 | 
| user           | modules/user/user.module                      |        0 | 
| watchdog       | modules/watchdog/watchdog.module              |        0 | 
| devel_generate | sites/all/modules/devel/devel_generate.module |        0 | 
+----------------+-----------------------------------------------+----------+
12 rows in set (0.00 sec)

mysql» EXPLAIN SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC\G
*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: system
  type: ALL
possible_keys: NULL
  key: NULL
  key_len: NULL
  ref: NULL
  rows: 35
  Extra: Using where; Using filesort
1 row in set (0.00 sec)

mysql» SHOW CREATE TABLE system;
CREATE TABLE `system` (
  `filename` varchar(255) NOT NULL default '',
  `name` varchar(255) NOT NULL default '',
  `type` varchar(255) NOT NULL default '',
  `description` varchar(255) NOT NULL default '',
  `status` int(11) NOT NULL default '0',
  `throttle` tinyint(4) NOT NULL default '0',
  `bootstrap` int(11) NOT NULL default '0',
  `schema_version` smallint(6) NOT NULL default '-1',
  `weight` int(11) NOT NULL default '0',
  PRIMARY KEY (`filename`),
  KEY `weight` (`weight`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8


mysql»

We can see that this query isn't using any indexes. The EXPLAIN output shows NULL for possible_keys, key, key_len and ref. This table contains 35 rows, so it's doing a full table scan to return the result. Benchmarking this query produces:

mysql» SET @sql := "SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC;" ;
Query OK, 0 rows affected (0.01 sec)

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.76 sec)

mysql» 

mysql» CREATE INDEX idx_type_status on system (type, status);
Query OK, 35 rows affected (0.01 sec)
Records: 35 Duplicates: 0 Warnings: 0

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.66 sec) 


Creating an index on the type and status column helped by 13% . Instead of a full table scan, EXPLAIN reports only 12 rows accessed:

mysql» EXPLAIN SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC\G

*************************** 1. row ***************************

  id: 1
  select_type: SIMPLE
  table: system
  type: ref
possible_keys: idx_type_status
  key: idx_type_status
  key_len: 771
  ref: const,const
  rows: 12
  Extra: Using where; Using filesort
1 row in set (0.00 sec)

mysql»


Another speedup can be realized by removing the ORDER BY so MySQL doesn't have to create a temporary table, then having PHP sort the data. There would be a trade off with this approach that overall benchmarking would need to be used.

mysql» set @sql1 = "SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1;";
Query OK, 0 rows affected (0.00 sec)

mysql» SELECT BENCHMARK(1000000, @sql1);
+---------------------------+
| BENCHMARK(1000000, @sql1) |
+---------------------------+
| 0 | 
+---------------------------+
1 row in set (0.52 sec)

mysql»


Covering indexes can't be created due to MySQL's index limit of 1000 bytes. In looking at the table, there could be ways to change the table schema to fit all the columns needed for a covering index.

The next query:

mysql» SELECT * FROM system WHERE type = 'theme'\G
*************************** 1. row ***************************
  filename: themes/garland/page.tpl.php
  name: garland
  type: theme
  description: themes/engines/phptemplate/phptemplate.engine
  status: 1
  throttle: 0
  bootstrap: 0
schema_version: 0
  weight: 0
1 row in set (0.00 sec)

mysql» EXPLAIN SELECT * FROM system WHERE type = 'theme'\G

*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: system
  type: ALL
possible_keys: NULL
  key: NULL
  key_len: NULL
  ref: NULL
  rows: 35
  Extra: Using where
1 row in set (0.00 sec)

mysql» set @sql := "EXPLAIN SELECT * FROM system WHERE type = 'theme';";
Query OK, 0 rows affected (0.00 sec)

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.65 sec)



Since this query has the type column listed first, it can use the exact same index that was created for the first query:

mysql» EXPLAIN SELECT * FROM system WHERE type = 'theme'\G
*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: system
  type: ref
possible_keys: idx_type_status
  key: idx_type_status
  key_len: 767
  ref: const
  rows: 1
  Extra: Using where
1 row in set (0.00 sec)

mysql» 


There's also another query that's very close to the two above:

mysql» SELECT * FROM system WHERE type = 'theme' AND name = 'garland'\G
*************************** 1. row ***************************
  filename: themes/garland/page.tpl.php
  name: garland
  type: theme
  description: themes/engines/phptemplate/phptemplate.engine
  status: 1
  throttle: 0
  bootstrap: 0
schema_version: 0
  weight: 0
1 row in set (0.00 sec)

mysql»

mysql» EXPLAIN SELECT * FROM system WHERE type = 'theme' AND name = 'garland'\G
*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: system
  type: ref
possible_keys: idx_type_status
  key: idx_type_status
  key_len: 767
  ref: const
  rows: 1
  Extra: Using where
1 row in set (0.00 sec)

mysql»


By careful query design, it's possible to have a single index support more than one query type. MySQL supports more than one index on a table, but only one index can be used at a time. Having many indexes on a table can impact UPDATES and INSERTS, so minimizing the number of indexes per table will help the performance of writing to tables.

Another approach to these queries would be to use a single query that returns the most data. All the queries above have the exact same result set returned showing that the other two queries are duplicated and may be eliminated.

Next query:

mysql» SELECT nt.type, nt.* FROM node_type nt ORDER BY nt.type ASC\G

*************************** 1. row ***************************
  type: page
  type: page
  name: Page
  module: node
  description: If you want to add a static page, like a contact page or an about page, use a page.
  help: 
  has_title: 1
  title_label: Title
  has_body: 1
  body_label: Body
min_word_count: 0
  custom: 1
  modified: 1
  locked: 0
  orig_type: page

*************************** 2. row ***************************
  type: story
  type: story
  name: Story
  module: node
  description: Stories are articles in their simplest form: they have a title, a teaser and a body, but can be extended by other modules. The teaser is part of the body too. Stories may be used as a personal blog or for news articles.
  help: 
  has_title: 1
  title_label: Title
  has_body: 1
  body_label: Body
min_word_count: 0
  custom: 1
  modified: 1
  locked: 0
  orig_type: story
2 rows in set (0.00 sec)

 

This particular query will always do a full table scan. It's returning the same column twice, type. That could also be removed to minimize the amount of data that needs to get sent back to the client. The nt.* already has the type column in it. In general, when I see any SELECT * type queries, I always wonder if all columns are really needed. If not, columns should be specified so the correct indexes can be created for that table.

Final query in the slow query log:

mysql» SELECT DISTINCT b.* FROM blocks b LEFT JOIN blocks_roles r ON b.module = r.module AND b.delta = r.delta WHERE b.theme = 'garland' AND b.status = 1 AND (r.rid IN (2) OR r.rid IS NULL) ORDER BY b.region, b.weight, b.module\G

*************************** 1. row ***************************
  module: user
  delta: 0
  theme: garland
  status: 1
  weight: 0
  region: left
  custom: 0
  throttle: 0
visibility: 0
  pages: 
  title: 
*************************** 2. row ***************************
  module: user
  delta: 1
  theme: garland
  status: 1
  weight: 0
  region: left
  custom: 0
  throttle: 0
visibility: 0
  pages: 
  title: 
2 rows in set (0.01 sec)

mysql» EXPLAIN SELECT DISTINCT b.* FROM blocks b LEFT JOIN blocks_roles r ON b.module = r.module AND b.delta = r.delta WHERE b.theme = 'garland' AND b.status = 1 AND (r.rid IN (2) OR r.rid IS NULL) ORDER BY b.region, b.weight, b.module\G

*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: r
  type: system
possible_keys: PRIMARY
  key: NULL
  key_len: NULL
  ref: NULL
  rows: 0
  Extra: const row not found
*************************** 2. row ***************************
  id: 1
  select_type: SIMPLE
  table: b
  type: ALL
possible_keys: NULL
  key: NULL
  key_len: NULL
  ref: NULL
  rows: 7
  Extra: Using where; Using temporary; Using filesort
2 rows in set (0.00 sec)

mysql» set @sql := "SELECT DISTINCT b.* FROM blocks b LEFT JOIN blocks_roles r ON b.module = r.module 
» AND b.delta = r.delta WHERE b.theme = 'garland' AND b.status = 1 
» AND (r.rid IN (2) OR r.rid IS NULL) ORDER BY b.region, b.weight, b.module;";

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0 | 
+--------------------------+
1 row in set (0.72 sec)

 
Putting an index on blocks.theme and blocks.status gives a slight improvement:

mysql» CREATE INDEX idx on blocks (theme, status);  
Query OK, 7 rows affected (0.00 sec)
Records: 7 Duplicates: 0 Warnings: 0

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0 | 
+--------------------------+
1 row in set (0.69 sec)

mysql» EXPLAIN SELECT DISTINCT b.* FROM blocks b LEFT JOIN blocks_roles r ON b.module = r.module AND b.delta = r.delta WHERE b.theme = 'garland' AND b.status = 1 AND (r.rid IN (2) OR r.rid IS NULL) ORDER BY b.region, b.weight, b.module\G 

*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: r
  type: system
possible_keys: PRIMARY
  key: NULL
  key_len: NULL
  ref: NULL
  rows: 0
  Extra: const row not found
*************************** 2. row ***************************
  id: 1
  select_type: SIMPLE
  table: b
  type: ref
possible_keys: idx
  key: idx
  key_len: 768
  ref: const,const
  rows: 2
  Extra: Using where; Using temporary; Using filesort

2 rows in set (0.00 sec)

mysql»


Eliminating the ORDER BY and having PHP do the sorting could also make an improvement. This query is rather interesting because it's joining two tables together but it's not using any columns from the second table. This query returns the same result set:

mysql» SELECT DISTINCT b.* FROM blocks b WHERE b.theme = 'garland' AND b.status = 1 ORDER BY b.region, b.weight, b.module\G

*************************** 1. row ***************************
  module: user
  delta: 0
  theme: garland
  status: 1
  weight: 0
  region: left
  custom: 0
  throttle: 0
visibility: 0
  pages: 
  title: 
*************************** 2. row ***************************
  module: user
  delta: 1
  theme: garland
  status: 1
  weight: 0
  region: left
  custom: 0
  throttle: 0
visibility: 0
  pages: 
  title: 
2 rows in set (0.01 sec)


By eliminating the JOIN to the blocks_role table a 14% increase in performance is seen:

mysql» SELECT BENCHMARK(1000000, @sql);

+--------------------------+

| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.59 sec)


That's the end for the slow query log.

General Log

Starting with the General log, the query that's run the most is:

select dst from url_alias where src = 'S', where S is a string. This query is run 42 times in a single connection, and for some reason Drupal is creating queries for data that doesn't exist. This could be due to my test data being incomplete.

Benchmarking:

mysql» set @sql := "SELECT dst FROM url_alias WHERE src = 'node/3705';";
Query OK, 0 rows affected (0.00 sec)

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.74 sec)


What's going on:

mysql» EXPLAIN SELECT dst FROM url_alias WHERE src = 'node/3705'\G
*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: url_alias
  type: ref
possible_keys: src
  key: src
  key_len: 386
  ref: const
  rows: 1
  Extra: Using where
1 row in set (0.00 sec)



Using a covering index is possible on this table:

mysql» CREATE INDEX idx on url_alias (src,dst);
Query OK, 3706 rows affected (0.11 sec)
Records: 3706 Duplicates: 0 Warnings: 0

mysql» EXPLAIN SELECT dst FROM url_alias WHERE src = 'node/3705'\G
*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: url_alias
  type: ref
possible_keys: idx
  key: idx
  key_len: 386
  ref: const
  rows: 1
  Extra: Using where; Using index
1 row in set (0.00 sec)

mysql»


The Using index in the Extra column means the dataset can be returned from the index alone, eliminating the need to use a table. It benchmarks:

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.63 sec)

mysql»



Showing almost 15% improvement. The biggest improvement would be to consolidate into a single query using IN. As an example:

mysql» SELECT dst FROM url_alias WHERE src IN ('node/3705','node/3706','node/3673','node/3672');

It Benchmarks:

mysql» SELECT BENCHMARK(1000000, @sql);
+--------------------------+
| BENCHMARK(1000000, @sql) |
+--------------------------+
| 0                        | 
+--------------------------+
1 row in set (0.66 sec)



This is a little slower than just the WHERE version, but if you multiply the 4 seperate queries using WHERE:

0.63x4 = 2.52 seconds

The improvement would be about 71% increase in performance in this example. 


The next frequently called query:

select data, created, headers, expire from cache_filter where cid = 'S' .

mysql» EXPLAIN SELECT data, created, headers, expire FROM cache_filter WHERE cid='1:16a45438cd9716a31fa9fbf24badbb8a'\G
*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: cache_filter
  type: const
possible_keys: PRIMARY
  key: PRIMARY
  key_len: 767
  ref: const
  rows: 1
  Extra: 
1 row in set (0.00 sec) 

mysql» show create table cache_filter\G
*************************** 1. row ***************************
  Table: cache_filter
Create Table: CREATE TABLE `cache_filter` (
  `cid` varchar(255) NOT NULL default '',
  `data` longblob,
  `expire` int(11) NOT NULL default '0',
  `created` int(11) NOT NULL default '0',
  `headers` text,
  PRIMARY KEY (`cid`),
  KEY `expire` (`expire`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql»


On the surface, this query looks pretty good. Upon inspecting the table schema, the only improvement I can see is to reduce the size of the cid field. This field is a MD5 checksum, with some additional data prepended consuming 35 CHARS. The cid field is VARCHAR(255), but should be CHAR(36) instead. With the longblobs and text fields, covering indexes can't be created. Since this query doesn't require these fields in the WHERE clause, changing the table to move those fields to a separate table might improve performance, but at the cost of rewriting the SQL statement and PHP code to reflect those changes.


Conclusion

I've picked the top queries, and have shown a few schema optimizations to improve Drupal performance. The same process is used for the remainder of the queries, and since this paper is already about 13 pages long I thought this was enough to demonstrate what can be done. I'd say a conservative performance increase of around 10% can be realized for the opening anonymous homepage.

Tools Used for Benchmarking and Profiling Drupal Part II

Introduction

  This paper will outline my test environment, software used and procedures for benchmarking and profiling. This process can apply to both scaling up and scaling out architectures. Scaling out will require a slightly different approach due to the fact that the entire webserver suite and/or MySQL will be split across multiple servers. 

Test System

  My test system is an AMD 2100 with 2GB RAM and single PATA harddrive. (Stop laughing.) It has XAMPP 1.6.6 installed which contains Apache 2.2.8, PHP 5.2.5 and MySQL 5.0.51a with eAccelerator turned off. Drupal 5.7 and the devel module have been installed. The test Drupal site contains 5000 nodes, with 50,000 comments, 5000 vocabulary and terms with a max length of 12. These first attempts at improving Drupal performance will be done with Drupal core only. The only nonstandard module that has been installed is the devel module.

Devel Module

  The devel module was used to generate all the random test data for the site. After working with it, I noticed it causes a large amount of overhead to the site. During initial familiarization with the module, it created all the test data just fine, but using the module to analyze queries and performance was another matter. Usually the first top 3 longest running queries were from the devel module itself. I'm going to develop a better way to collect this data without the high impact of using the devel module.

  Replacing the devel modules ability to capture the queries can be done within MySQL itself by adding the following to my.cnf file and restarting mysqld:

[mysqld]
 
log = /opt/lampp/var/mysql/drupal.log
log-queries-not-using-indexes
long_query_time = 1
log-slow-queries = /var/log/mysql/drupal-slow.log


  With these mysqld settings, the devel module can be turned off. These settings will also provide more information like logging queries that don't use indexes, something the devel module can't do, plus a greater depth of analysis.

Preparation

  Before we start baselining, we need to make sure MySQL caching is turned off:

mysql» SHOW VARIABLES LIKE 'query_cache%';
+------------------------------+---------+
| Variable_name | Value |
+------------------------------+---------+
| query_cache_limit            | 1048576 | 
| query_cache_min_res_unit     |    4096 | 
| query_cache_size             |       0 | «- query cache is turned off. 
| query_cache_type             |      ON | 
| query_cache_wlock_invalidate |     OFF | 
+------------------------------+---------+
5 rows in set (0.02 sec)

mysql»


  We don't want MySQL caching any of our queries. If it does, then it'll store the results of a query in the cache. The results of subsequent queries will be in the cache and returned much faster than reading tables or indexes. We don't want this right now, but I'll turn it back on later to see how it improves performance.

  On the PHP side, there shouldn't be any caching, nor any other web caching being used.

http_load Baseline

  We're going to start off with baselining the opening homepage. http_load can have multiple URLs in a file to benchmark more than one URL on a given site. I created a directory to hold all the project data.

$ echo 'http://localhost/drupal/' »» urls

  Running http_load with a single connection:

$ http_load -parallel 1 -seconds 20 urls
53 fetches, 1 max parallel, 1.05491e+06 bytes, in 20 seconds
19904 mean bytes/connection
2.65 fetches/sec, 52745.6 bytes/sec
msecs/connect: 0.158604 mean, 0.186 max, 0.148 min
msecs/first-response: 355.007 mean, 520.553 max, 332.182 min
HTTP response codes:
  code 200 -- 53

  This first baseline was taken with the devel module enabled. Turning it off produced this http_load results:

$ http_load -parallel 1 -seconds 20 urls
60 fetches, 1 max parallel, 1.1748e+06 bytes, in 20.0011 seconds
19580 mean bytes/connection
2.99984 fetches/sec, 58736.9 bytes/sec
msecs/connect: 0.163783 mean, 0.241 max, 0.152 min
msecs/first-response: 317.243 mean, 453.476 max, 283.361 min
HTTP response codes:
  code 200 -- 60


  Turning off the devel module resulted in a 12% increase in fetches and a first-response improvement of 11%. These two numbers are pretty much the most important numbers to consider. Probably not a good idea to use the devel module on a production system.

  By increasing the -parallel option, we can increase the concurrent requests to the Drupal system:

$ http_load -parallel 10 -seconds 20 urls
60 fetches, 10 max parallel, 1.1748e+06 bytes, in 20 seconds
19580 mean bytes/connection
3 fetches/sec, 58740 bytes/sec
msecs/connect: 0.1933 mean, 0.549 max, 0.07 min
msecs/first-response: 2867.04 mean, 3269.07 max, 2664.81 min
HTTP response codes:
  code 200 -- 60


  As you can see, my little workstation pretty much detonates with 10 concurrent requests with a first response time of nearly 3 seconds!

MySQL Configuration

  We need to log all the queries that are used when the homepage is requested.

[mysqld]

log = /opt/lampp/var/mysql/drupal.log
log-queries-not-using-indexes
long_query_time = 1
log-slow-queries = /var/log/mysql/drupal-slow.log


  All queries from Drupal will get logged in order in the drupal.log file. The partial contents look like this:


080706 23:56:33 8 Connect root@localhost on 
  8 Init DB Drupal
  8 Query SET NAMES "utf8"
  8 Query SELECT CASE WHEN status=1 THEN 0 ELSE 1 END FROM access WHERE type = 'host' AND LOWER('127.0.0.1') LIKE LOWER(mask) ORDER BY status DESC LIMIT 0, 1
  8 Query SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid = s.uid WHERE s.sid = 'feceaae02b66389ab7d19505b9a9f0a4'
  8 Query SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur ON ur.rid = r.rid WHERE ur.uid = 1
  8 Query SELECT data, created, headers, expire FROM cache WHERE cid = 'variables'
  8 Query SELECT COUNT(pid) FROM url_alias
  8 Query SELECT src FROM url_alias WHERE dst = 'node'
  8 Query SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC
  8 Query SELECT data, created, headers, expire FROM cache_menu WHERE cid = '1:en'
  8 Query SELECT COUNT(*) FROM node_access WHERE nid = 0 AND ((gid = 0 AND realm = 'all')) AND grant_view »= 1
  8 Query SELECT COUNT(*) FROM node n WHERE n.promote = 1 AND n.status = 1
  8 Query SELECT n.nid, n.sticky, n.created FROM node n WHERE n.promote = 1 AND n.status = 1 ORDER BY n.sticky DESC, n.created DESC LIMIT 0, 10
  8 Query SELECT dst FROM url_alias WHERE src = 'rss.xml'
  8 Query SELECT * FROM system WHERE type = 'theme'
  8 Query SELECT dst FROM url_alias WHERE src = 'misc/feed.png'


  These queries were generated by the homepage URL: http://localhost/drupal/

  The drupal-slow.log will contain any queries that take 1 second or longer to run. It's partial content for the same URL:

/opt/lampp/sbin/mysqld, Version: 5.0.51a-log (Source distribution). started with:
Tcp port: 3306 Unix socket: /opt/lampp/var/mysql/mysql.sock
Time Id Command Argument
# Time: 080706 23:50:04
# User@Host: root[root] @ localhost []
# Query_time: 0 Lock_time: 0 Rows_sent: 11 Rows_examined: 46
use Drupal;
SELECT name, filename, throttle FROM system WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC;
# User@Host: root[root] @ localhost []
# Query_time: 0 Lock_time: 0 Rows_sent: 1 Rows_examined: 35
SELECT * FROM system WHERE type = 'theme';


  Important lines are in bold. For the first SELECT statement it had a query time of 0, lock time of 0, the rows returned were 11, and the number of rows that were searched was 46 out of the system table. You'll notice that the query run time is 0, but we've configured MySQL to log all queries that take 1 second or longer to run, and log them in the slow query log. Why are 0 time queries in the slow query log? Remember we've also configured MySQL to log all queries not using an index. The slow query log will contain both, queries that took 1 second to run and queries that didn't use indexes. A slow running query looks like this:

# Time: 080709 17:04:02
# User@Host: root[root] @ localhost []
# Query_time: 1 Lock_time: 0 Rows_sent: 2 Rows_examined: 11
SELECT DISTINCT b.* FROM blocks b LEFT JOIN blocks_roles r ON b.module = r.module AND b.delta = r.delta WHERE b.theme = 'garland' AND b.status = 1 AND (r.rid IN (1) OR r.rid IS NULL) ORDER BY b.region, b.weight, b.module;


  It's important to note that queries listed in drupal-slow.log may not be the only queries that need optimization. It's entirely possible that a query that runs for less than 1 second, but are called several hundred times could also be a candidate for optimization. Drupal-slow.log is just one place to start in the optimization process. Both log files will get rather large, the drupal.log file contains 172 queries just to render the homepage! There are several tools available to help analyze all this data. We'll need to determine which queries on a running system consume the most resources. We'll combine the logging and http_load to generate some http traffic to fill the logs with results.

Logfile Analysis

  There are a few tools available to analyze the logfiles. We'll be using the mysql statement log analyzer to analyze both log files. Using http_load to create some http traffic:

$ http_load -parallel 10 -seconds 120 urls
389 fetches, 10 max parallel, 7.61662e+06 bytes, in 120.002 seconds
19580 mean bytes/connection
3.24162 fetches/sec, 63471 bytes/sec
msecs/connect: 0.181763 mean, 0.511 max, 0.046 min
msecs/first-response: 2855.27 mean, 3271.22 max, 2710.4 min
HTTP response codes:
  code 200 -- 389


The command generated a general logfile almost 6MB, and slow query logfile of 602K. This is too much information to manually process. An easy to use log analyzer is mysqlsla, and can be downloaded from http://hackmysql.com. This is a Perl script that requires the Perl DBI and DBD::mysql installed. It can be used to process both the general log and the slow query log. The command:

$ mysqlsla --explain --general drupal.log --flat --sort ce --socket mysql.sock -u root

The output from mysqlsla is rather lengthy, even simply trying to analyze the homepage. Here's the number one query:

$ mysqlsla --explain --general drupal.log --flat --sort ce --socket mysql.sock -u root
Reading general log 'drupal.log'.
38519 total queries, 33 unique.
Sorting by 'ce'.

__ 001 _______________________________________________________________________

Count : 11693 (30%)
Database : Drupal
Execution time: 1.000 ms
Count * Exec : 11.693 s
Rows (EXPLAIN): 1 produced, 1 read
EXPLAIN : 
  id: 1
  select_type: SIMPLE
  table: url_alias
  type: ref
  possible_keys: src
  key: src
  key_len: 386
  ref: const
  rows: 1
  Extra: Using where


select dst from url_alias where src = 'S' 


Going over the output:

Count: How many times the query was run.
Database: Self explanitory
Execution Time: How long the query took to run
Count * Exec: Total running time for this query, even with different WHERE clause.
Rows: From EXPLAIN
EXPLAIN: Output from EXPLAIN command.
Note: The WHERE clause can be confusing. The 'S' means the src is a string, 'N' would be integer/number.

EXPLAIN output:

id: identifies the SELECT statement
select_type: simple, subquery, derived, union or result
table: table or tables used in SELECT
type: join type
possible_keys: indexes which COULD be used
key_length: number of bytes used in index
ref: columns/constants being used from table(s)
rows: number of estimated scanned table rows
extra: additional info like using where, using filesort, using temporary

This is one query that doesn't show up in the slow query log. Looking at the number of times the query is run should be a red flag to investigate ways to make it faster. Mysqlsla is a great tool to help analyze queries. It combines EXPLAIN with other metrics to help organize large logs.

Benchmarking MySQL

The easiest way to benchmark the performance of a single query is using the BENCHMARK() function. This function can be used to test execution speed of a query. It's really great if you have a query that normally runs too fast to determine runtime. This query runs in 0.00 seconds:

mysql» SELECT dst FROM url_alias WHERE src = 'node/3705';
+-----------------+
| dst             |
+-----------------+
| node-3705-story | 
+-----------------+
1 row in set (0.00 sec)

mysql» 


Running it 10000000 times gives us:

mysql» set @sql := "SELECT dst FROM url_alias WHERE src = 'node/3705';";
Query OK, 0 rows affected (0.00 sec)

mysql» SELECT @sql;
+----------------------------------------------------+
| @sql |
+----------------------------------------------------+
| SELECT dst FROM url_alias WHERE src = 'node/3705'; | 
+----------------------------------------------------+
1 row in set (0.00 sec)

mysql» SELECT BENCHMARK(10000000, @sql);
+---------------------------+
| BENCHMARK(10000000, @sql) |
+---------------------------+
| 0                         | 
+---------------------------+
1 row in set (3.98 sec)

mysql»


It's important to be sure query caching is turned off or the results will be incorrect. BENCHMARK runs the query one after another, not in parallel. 

MySQL also includes an overall benchmarking suite. This suite doesn't target Drupal directly, but it's a great way to compare server systems, or general configuration changes to mysqld.  

Profiling MySQL

MySQL profiling is a technique used to see what's happening internally within MySQL. Having the ability to see what's happening within mysqld and the storage engines can help with understanding how queries impact the server. We'll analyze this query from the drupal.log:

mysql» SELECT t.* FROM term_node r INNER JOIN term_data t ON r.tid = t.tid INNER JOIN vocabulary v ON t.vid = v.vid WHERE r.nid = 3705 ORDER BY v.weight, t.weight, t.name;
+-----+-----+------------+---------------------------+--------+
| tid | vid | name | description | weight |
+-----+-----+------------+---------------------------+--------+
| 5 | 2 | dastaruhit | description of dastaruhit | 0 | 
| 24 | 2 | rora | description of rora | 0 | 
| 49 | 2 | vab | description of vab | 0 | 
| 20 | 2 | thavu | description of thavu | 1 | 
| 7 | 2 | mo | description of mo | 2 | 
+-----+-----+------------+---------------------------+--------+
5 rows in set (0.00 sec)

mysql» EXPLAIN SELECT t.* FROM term_node r INNER JOIN term_data t ON r.tid = t.tid INNER JOIN vocabulary v ON t.vid = v.vid WHERE r.nid = 3705 ORDER BY v.weight, t.weight, t.name\G

*************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: r
  type: ref
possible_keys: PRIMARY,nid,tid
  key: nid
  key_len: 4
  ref: const
  rows: 12
  Extra: Using temporary; Using filesort
*************************** 2. row ***************************
  id: 1
  select_type: SIMPLE
  table: t
  type: eq_ref
possible_keys: PRIMARY,vid
  key: PRIMARY
  key_len: 4
  ref: Drupal.r.tid
  rows: 1
  Extra: 
*************************** 3. row ***************************
  id: 1
  select_type: SIMPLE
  table: v
  type: eq_ref
possible_keys: PRIMARY
  key: PRIMARY
  key_len: 4
  ref: Drupal.t.vid
  rows: 1
  Extra: 
3 rows in set (0.00 sec)


The query used a temporary table and a filesort to produce the results. What kind of table? Was it in memory or on disk? What about sorting and indexes? Profiling will help answer questions like these.

To get started, flush all the STATUS variables:

mysql» FLUSH STATUS;
Query OK, 0 rows affected (0.00 sec)

mysql»


Run the query:

mysql» SELECT t.* FROM term_node r INNER JOIN term_data t ON r.tid = t.tid INNER JOIN vocabulary v ON t.vid = v.vid WHERE r.nid = 3705 ORDER BY v.weight, t.weight, t.name;

Determine what kind of temporary tables were created:

mysql» SHOW GLOBAL STATUS LIKE 'Created%';

+-------------------------+-------+

| Variable_name           | Value |

+-------------------------+-------+
| Created_tmp_disk_tables |     1 | 
| Created_tmp_files       |     0 | 
| Created_tmp_tables      |     4 | 
+-------------------------+-------+
3 rows in set (0.00 sec)

mysql»


This query created 1 temporary on disk table and 2 temporary in memory tables. How about sorting?

mysql» SHOW GLOBAL STATUS LIKE 'Sort%';

+-------------------+-------+

| Variable_name | Value |

+-------------------+-------+
| Sort_merge_passes |     0 | 
| Sort_range        |     0 | 
| Sort_rows         |     5 | 
| Sort_scan         |     1 | 
+-------------------+-------+
4 rows in set (0.00 sec)

mysql»


MySQL scanned a single table with 5 rows to produce the results. What about indexes?

mysql» SHOW GLOBAL STATUS LIKE 'Key%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| Key_blocks_not_flushed |     0 | 
| Key_blocks_unused      | 13820 | 
| Key_blocks_used        |   678 | 
| Key_read_requests      |    21 | 
| Key_reads              |     0 | 
| Key_write_requests     |     0 | 
| Key_writes             |     0 | 
+------------------------+-------+
7 rows in set (0.01 sec)

mysql»


This query made 21 Key_read_requests but didn't use any indexes from the key cache, so it had to read the .MYI file to search through the index.

The are also Handler* status variables that show what the storage engine is doing too. This can become very involved, especially with the InnoDB storage engine.


Conclusion

The techniques of benchmarking and profiling MySQL helps in figuring out the bottlenecks in a system. The process is to create traffic, log queries, then analyze the logfiles to determine which queries need attention. Take those queries one by one, use EXPLAIN and "replay" them into the system and look at the various STATUS variables to see how they are impacting the server. In Part III, we'll go over some of the queries and tune the queries and mysqld. One thing that also needs to be kept in mind is Drupal profiling. There's the distinct possibility that we'll have some good queries, but Drupals interaction with MySQL is where the improvement may need to happen. Any server optimization will be left out.

An Introduction to Drupal & MySQL Performance Tuning Part I

Introduction


Improving Drupal's MySQL performance is more than just optimizing Drupal and MySQL themselves. I wanted to mention that the server hardware, operating system, webserver, and PHP also play important roles in deploying the fastest system possible. Benchmarking and profiling a system is a very methodical step-by-step process starting from the hardware, the operating system, Apache, PHP, MySQL then finally Drupal itself.

Server Hardware

Everything starts at the hardware level and having a way to benchmark CPU, memory, and harddrives is very important. Not all motherboards are created equal either. If at all possible, developing a benchmarking plan to test different CPUs, memory and harddrives will go along way to help creating a high performance Drupal/MySQL system. Using sysbench from http://sysbench.sourceforge.net/, it's possible to show which CPU, memory and harddrive combinations that will provide the best performance at the hardware level. Armed with this information, the correct hardware can be purchased. Even if a client already has a Drupal site in production, knowing this benchmark information can help in pointing out which areas should be investigated.

Scaling up versus scaling out. Scaling up is the process of building single bigger servers for a given Drupal/MySQL installation. Scaling out is the process of spreading the webserver and/or Drupal/MySQL stack across several servers.

OS Tuning

Since kernel 2.6.x, the Linux kernel does a very good job with automatic tuning. On a few systems, I've increased the size of the network buffers I realized an improvement to warrant the effort in benchmarking, making the configuration changes, and documentation. If the system requires NFS, that also needs to be investigated via a thorough benchmarking plan geared towards that end.

Webserver

Some general Apache parameters like SendBufferSize can be increased up to the largest HTML page size, up to the kernel network write limit. Another approach is to split different media across different servers, video on one, images on another, audio on yet another webserver based upon site useage statistics. A caching front end to the webserver stack could also be beneficial. The effort to investigate memcached may be worth the effort too. There are many ways to approach this problem, each with pros and cons.

PHP

PHP can benefit from output buffering, and using an gzip output handler. This will send the output of a the completed script and compress it before handing it off to the webserver. PHP code profiling should be included.

There are also PHP accelerators available like pecl at http://pecl.php.net/package/memcache and eAccelerator. Commercial PHP accelerators also exist.

Benchmarking, Documentation and Configuration Management

The central task for all of the above is to always benchmark single changes and to document them thoroughly. Automating as much of the benchmarking process as possible will help in making it easy to do. Difficult tasks generally don't get done for very long. The documentation should be created online to facilitate sharing, archiving and searching. Using a version control system to manage configuration files, database schemas and test data, and source code is highly recommended.

Conclusion

Even though this project doesn't necessarily involve benchmarking and profiling the entire stack, it's important to keep in mind all these factors when working on improving the performance of Drupal. In Part 2, I'll write about how to put together a benchmarking and profiling environment concentrating on Drupal queries and MySQL only. If time permits, I can also work on benchmarking and profiling Apache & PHP.

Tuesday, November 11, 2008

MySQL Certified Professionals LinkedIn Group

I'm passing the torch to Dave Stokes to manage the MySQL Certified Professionals LinkedIn Group. I'm just not that involved in doing DBA work these days. My career is moving further away from MySQL, plus San Diego isn't a big MySQL town like some other parts of the country. I didn't want the group to grow stagnant and Dave has been very gracious to take it over.

Thanks Dave!

Sunday, August 24, 2008

Last Week in Drizzle - Vol 3

This is the third post in the weekly series "Last Week in Drizzle" where we summarize the efforts of various folks in the Drizzle community over the past week. This edition covers Aug 18th through the 24th. As with the week before, a number of developers and community advocates continue to refactor the code base, come together in discussions on the mailing list, and brainstorm on how to solve the tough problems that Drizzle is trying to address. It sounds like many in the community have been swamped this week, but there's still plenty to report. Jay Pipes and myself are tag teaming Last Week in Drizzle.

Continued Growth in the Drizzle Community

Drizzle mailing list has 191 members, up from 148 last week As I type this, there are 42 folks hanging out on the #drizzle Freenode channel. The Drizzle wiki has been steadily growing in its content, there have been a number of entries added or updated:

* Updated FAQ
* Simple Replication
* Table Definitions

Ongoing Conversations on the Mailing List

We've had less Bikeshed stuff this past week. Biggest topics discussed this week:



Community Blogs

Jay Pipes has written up two guides about
Getting a Working C/C++ Development Environment for Developing Drizzle and A Contributor's Guide to Launchpad.net - Part 1 - Getting Started

Domas Mituzas blogs about his thoughts on Drizzle and modularity.

Brian "krow" Aker writes: Removal of code for views, some code removed for unireg and definitions. Table objects also created.

Drizzle Build Farm

Still seeking additional Build Bots. You can also find out realtime status about our buildfarm on irc #Drizzle. Details at http://ronaldbradford.com/blog/interacting-with-buildbot-using-irc-2008-08-18/

From last week:

Ronald has continued to champion the Drizzle Build Farm, and writes the following update and request to the community:

The Build Farm has helped us in identifying problems but we are still seeking people to contribute different platforms to our automated process become better.

Currently we cover the following platforms

- Ubuntu 8.04 - 32 & 64 bit - Debian 5 - 32 & 64 bit - Gentoo 8 - 32 & 64 bit - CentOS 5 - 64 bit - Fedora 8 - 32 & 64 bit - SUSE 11 - 32 bit - Mac OS/X 10.5 - 64 bit

See the wiki for how to contribute to the Drizzle Build Farm.


Final Words

That wraps up this week's entry. My apologies to anyone I missed in this edition. Feel free to add errata and additions to the comments of the entry and I will update the blog post accordingly.