Day 29 of 60: What are the single queue directory bottlenecks?

Earlier posts have shown that using a single queue directory imposes a significant bottleneck when processing concurrent connections with Sendmail. Yesterday I posed some questions, and today I've started work on answering the first one.

The first question was:

What is responsible for the dramatic slow down in the single-queue case (test 4)?

To start answering this I wrote queue-create-syscall.d. This starts tracing syscalls when the queue-create-start probe fires. As each syscall returns the script collects data on the number of times the syscall fired, the total execution time for this syscall, and the average execution time for this syscall.

For ease of post-processing the results are printed in CSV format. This demonstrate a DTrace feature that's not clear from the documentation -- the DTrace printa() function can print details from multiple aggregations in one call.

To gather the data I run two tests, each consisting of five runs. The first test (test9 in the repository) consisted of one queue directory, and one sending process, the second (test10 in the repository) consisted of one queue directory and 10 sending processes. In other words, the same as tests 1 and 5, but instrumented differently.

Test 9 - 1 queue directory, 1 sender

View raw data

The count column in the raw data is interesting. We already know that for 30,000 messages Sendmail is going to create 60,000 files. That explains why there are 60,000 calls to open(). But there are only 30,000 corresponding calls to close(). I wonder why that is?

However, in this instance these concerns are dwarfed by the amount of time taken by the fdsync() calls. This chart makes this a bit more obvious.

Test 9, average syscall exec time

That's pretty understandable. Sendmail is exceptionally paranoid about making sure that file contents have been committed to stable storage, and makes liberal use of the fsync() C library call, which eventually calls fdsync. Given that fsync() does not return until the data is safely on the disk it's unsurprising that it accounts for much of the system call time.

Charting the total execution time spent in each system call is also illustrative. Here the increased number of fdsync() calls is apparent, accounting for a full minute of the execution time, with open(), rename(), and write() accouting for a few seconds each.

Test 9, total syscall exec time

Test 10 - 1 queue directory, 10 senders

View raw data

This test showed dramatically different results, as expected by the almost-50-times-slower run time.

If you examine the raw data you'll notice that the count column matches those from test 9, which is exactly as expected. Again, ministat confirms that there's no statistically significant difference between the results from the different runs. The average execution time for the system calls is dramatically different however.

Test 10, average syscall exec time

Not only is fdsync() taking approximately 19 times longer to run, open(), rename(), and write() have all become more time consuming.

This difference is reflected in the total amount of time spent in each system call. The fewer calls to write() mean that it's jump is not quite so pronounced. However, this chart makes it clear where much of the system call time is being spent.

Test 10, total syscall exec time

The next step is to instrument open(), and see where that's spending its time.

No comments:

Post a Comment