Today I've been looking at how multiple queue directories can help when processing concurrent connections.
The methodology was identical to the previous tests. The only change was to the smtp-source(1) command line. The previous tests were run with -s 1, indicating one concurrent connection. These tests were run with -s 10, to force 10 concurrent connections.
Test 5 - 1 queue directory
View raw data
ministat showed no difference at 99.5% confidence for any run except the fourth. The fourth run did differ, but only by a very small amount.
Difference at 99.5% confidence
-2308.38 +/- 2274.66
-4.08402% +/- 4.02435%
So I'll continue to use the first run for the remaining figures.
It was obvious that these tests were taking considerably longer to run than the originals, but it wasn't clear quite how much slower they were until I compared the results from test1 with these results. The plot's not very interesting, but the differences are. The first result is from 1 queue / 1 connection, the second is from 1 queue / 10 connections.
# | N | Min | Max | Median | Mean | Stddev |
---|---|---|---|---|---|---|
x | 61 | 953 | 1398 | 1143 | 1143.7705 | 88.056496 |
+ | 60 | 43499 | 68081 | 55929 | 56522.367 | 4354.2006 |
Difference at 99.5% confidence
55378.6 +/- 1722.91
4841.76% +/- 150.634%
Clearly, a single queue directory is a liability when processing many concurrent connections: queueing is almost 50 times slower than in the single connection case.
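As a quick check on the "almost 50 times" figure, the two means reported by ministat can be turned into both a percentage difference and a slowdown factor. This is only a sketch of the arithmetic. The function names are mine, and the means are the rounded values from the table above.

```python
# Means as reported by ministat in the table above (rounded values).
mean_test1 = 1143.7705   # 1 queue directory, 1 connection
mean_test5 = 56522.367   # 1 queue directory, 10 connections


def percent_difference(baseline: float, measured: float) -> float:
    """Difference between the means, as a percentage of the baseline mean."""
    return (measured - baseline) / baseline * 100.0


def slowdown_factor(baseline: float, measured: float) -> float:
    """How many times larger the measured mean is than the baseline mean."""
    return measured / baseline


pct = percent_difference(mean_test1, mean_test5)
factor = slowdown_factor(mean_test1, mean_test5)
print(f"{pct:.2f}% difference, {factor:.1f} times slower")
```

A difference of N% corresponds to a factor of 1 + N/100, so the 4841.76% reported by ministat works out to a factor of roughly 49.4.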
This is especially apparent if I plot these results and overlay the results from test1.
In this plot the row of squares at the very bottom, almost bumping into the X axis, is the data from test 1. The data from test 5 is the much more widely spaced points towards the top of the graph.
The "Slopes" table shows the data from test 5 and test 1 respectively. As you can see, not only is there much more 'jitter' in the results from test 5 when compared like this, the cost per directory entry (the slope of the trend line) is much steeper too.
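For readers who want to reproduce a slope like this from the raw data, here is a minimal sketch. It assumes the trend line is an ordinary least-squares fit, which I have not stated anywhere above. The function name `trend_line` is mine, and the sample points are made up purely to exercise the code. They are not measurements from these tests.

```python
def trend_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustration data: x = entries already in the directory,
# y = time taken to create the next entry. NOT results from these tests.
entries = [0, 1000, 2000, 3000, 4000]
times = [100.0, 150.0, 210.0, 240.0, 300.0]

slope, intercept = trend_line(entries, times)
print(f"cost per additional directory entry: {slope:.3f}")
```

The slope is the extra time added per existing directory entry, which is the number being compared between the two tests.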
Test 6 - 5 queue directories
View raw data
These tests felt subjectively much faster when I ran them. The ministat results bear this out. These ministat results compare all the runs from this test, showing no significant differences between the runs.
# | N | Min | Max | Median | Mean | Stddev |
---|---|---|---|---|---|---|
x | 60 | 9645 | 14158 | 11747.5 | 11781.067 | 1156.3799 |
+ | 60 | 9592 | 13722 | 11701.5 | 11771.4 | 1068.0186 |
No difference proven at 99.5% confidence
* | 60 | 9492 | 16093 | 11637.5 | 11719.633 | 1127.3325 |
No difference proven at 99.5% confidence
% | 60 | 9842 | 14877 | 11934.5 | 11897 | 971.37713 |
No difference proven at 99.5% confidence
- | 60 | 9832 | 13630 | 11756 | 11595.317 | 903.07967 |
No difference proven at 99.5% confidence
This plot compares the first run from test 5 with the first run from this test.
x test5-run1
+ test6-run1
: = Mean
M = Median
+----------------------------------------------------------+
| + |
| + |
| + |
| ++ |
| ++ |
| ++ |
| +++ |
| +++ |
| +++ |
| +++ |
| +++ x |
| ++++ x |
| ++++ xx |
| ++++ xx |
| ++++ xxx xx |
|+++++ xxxxxxxx x |
|+++++ x xxxxxxxx x |
|+++++ xx xxxxxxxx xxx |
|+++++ x xxx xxxxxxxxxxxx x|
| |:| |___M:___| |
+----------------------------------------------------------+
# | N | Min | Max | Median | Mean | Stddev |
---|---|---|---|---|---|---|
x | 60 | 43499 | 68081 | 55929 | 56522.367 | 4354.2006 |
+ | 60 | 9645 | 14158 | 11747.5 | 11781.067 | 1156.3799 |
Difference at 99.5% confidence
-44741.3 +/- 1797.18
-79.1568% +/- 3.17959%
It's apparent that adding even 4 additional queue directories significantly reduces the time taken to queue messages.
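The "Difference at 99.5% confidence" lines can be reproduced from the summary rows alone. The sketch below uses a pooled-variance Student's t interval, which is my understanding of what ministat does. The t value of 3.090 (99.5% confidence, large number of degrees of freedom) is an assumption on my part, chosen because it reproduces the figures above. The function name is mine.

```python
import math

# Assumed Student's t value for 99.5% confidence with many degrees of
# freedom. It is not stated in the post; it reproduces the output above.
T_995_LARGE_DF = 3.090


def difference_of_means(n1, mean1, sd1, n2, mean2, sd2, t=T_995_LARGE_DF):
    """Return (difference, half-width of interval, difference proven?)."""
    # Pooled variance of the two samples.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    half_width = t * math.sqrt(pooled_var) * math.sqrt(1.0 / n1 + 1.0 / n2)
    diff = mean2 - mean1
    # A difference counts as proven only if it exceeds the interval half-width.
    return diff, half_width, abs(diff) > half_width


# Summary rows from the table above: test5-run1 (x) and test6-run1 (+).
x_mean = 56522.367
diff, err, proven = difference_of_means(60, x_mean, 4354.2006,
                                        60, 11781.067, 1156.3799)
print(f"{diff:.1f} +/- {err:.2f}")
print(f"{diff / x_mean * 100:.4f}% +/- {err / x_mean * 100:.5f}%")
print("difference proven" if proven else "no difference proven")
```

The percentages are relative to the mean of the first (x) data set, which is why a drop from about 56522 to about 11781 shows up as roughly -79%.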
This chart makes the differences even more apparent. The green plot points are from test 5, the red plot points from test 6. The figures above show that test 6's standard deviation is less than that of test 5, and this is obvious from this chart in the reduced 'jitter' between the points.
Note that although this is a big improvement over test 5, these results are still roughly 10 times slower than the results from the first 4 tests. Using ministat to compare the results from test 1 and test 6, test 6 is 930.02% +/- 40.1145% slower.
Test 7 - 10 queue directories
View raw data
Having seen the significant effect that adding 4 additional queue directories made, I was hoping to see a greater benefit with a total of 10 directories.
Again, these tests felt subjectively faster when I ran them, although not significantly faster than test 6.
The ministat results confirm that the test runs are equivalent, with no obvious outlier data points, so I set about comparing tests 6 and 7. ministat indicated a small improvement.
x test6-run1
+ test7-run1
: = Mean
M = Median
+----------------------------------------------------------+
| + + |
| + + + x |
| + + ++ + + ++ x x x x xx x |
|+ + + ++ ++++ +x++++x++x x+xx xx xxx xx x x |
|+ + ++++++++++ ++++ ++++++++++xx+xx +x+xxxxxxx +xx x xxx|
| |__________:__________| |
| |_________:_________| |
+----------------------------------------------------------+
# | N | Min | Max | Median | Mean | Stddev |
---|---|---|---|---|---|---|
x | 60 | 9645 | 14158 | 11747.5 | 11781.067 | 1156.3799 |
+ | 60 | 7967 | 13184 | 9782 | 9841.6 | 1070.5916 |
Difference at 99.5% confidence
-1939.47 +/- 628.644
-16.4626% +/- 5.33605%
This improvement is also obvious if the results from test 6 and test 7 are charted and overlaid, like so:
Green points are from test 6, red points are from test 7. 10 queue directories is clearly superior to 5 in this test, although by nothing like the margin of difference going from 1 to 5 queue directories.
Test 8 - 20 queue directories
View raw data
This final test involved 20 queue directories. The ministat results show no significant differences between the results of each run, so I compared run 1 from this test with run 1 from test 7.
This showed no significant differences.
# | N | Min | Max | Median | Mean | Stddev |
---|---|---|---|---|---|---|
x | 60 | 7967 | 13184 | 9782 | 9841.6 | 1070.5916 |
+ | 60 | 7700 | 12850 | 9631 | 9691.6833 | 1099.0576 |
No difference proven at 99.5% confidence
This suggests that, at least in the environment in which I'm conducting the tests, going from 10 to 20 queue directories brings no significant benefit in the time required to create a new queue entry.
The chart bears this out. This shows results from test 7 (green) and test 8 (red) overlaid.
I suspect this is because when the number of queue directories equals the number of simultaneous connections other factors start contributing with more weight.
Conclusion
Even with a single disk, multiple queue directories have a dramatic effect on the time taken to queue messages as soon as Sendmail is processing concurrent connections.
That effect appears to tail off as the number of queue directories approaches the number of concurrent connections.
Further work
These results pose some questions that I intend to investigate and answer in the next post.
First, what is responsible for the dramatic slowdown in the single-queue case (test 5)? I suspect that the process of creating a new directory entry is serialised in the filesystem, and with 1 directory and 10 connections all the processes meet at this bottleneck. I should be able to use DTrace to instrument the single queue case in more detail, and get metrics on exactly which part of the process is taking the most time.
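Before reaching for DTrace, a crude way to probe this suspicion is to take Sendmail out of the picture and time concurrent file creation directly. The sketch below is not the test harness used for these results. It uses threads rather than Sendmail's separate processes, the file names, contents, and counts are all hypothetical, and it says nothing about which part of the filesystem is doing the serialising.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def create_entries(directory, worker_id, count):
    """Create `count` small files in `directory`, syncing each to disk."""
    for i in range(count):
        path = os.path.join(directory, f"qf{worker_id:02d}{i:06d}")
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        try:
            os.write(fd, b"placeholder\n")
            os.fsync(fd)
        finally:
            os.close(fd)


def time_run(workers, directories, per_worker):
    """Return (elapsed seconds, files created) for one configuration."""
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for d in range(directories):
            path = os.path.join(root, f"q{d}")
            os.mkdir(path)
            dirs.append(path)
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Workers are spread round-robin over the directories.
            futures = [
                pool.submit(create_entries, dirs[w % directories], w, per_worker)
                for w in range(workers)
            ]
            for future in futures:
                future.result()  # re-raise any error from a worker
        elapsed = time.perf_counter() - start
        created = sum(len(os.listdir(path)) for path in dirs)
    return elapsed, created


if __name__ == "__main__":
    for n_dirs in (1, 5, 10):
        elapsed, created = time_run(workers=10, directories=n_dirs, per_worker=50)
        print(f"{n_dirs:2d} directories: {created} files in {elapsed:.3f}s")
```

If the single-directory run is markedly slower than the others on the same filesystem, that would be consistent with the serialisation theory, but only proper instrumentation can confirm where the time goes.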
Second, with these results the benefit of multiple queue directories tails off as the number of concurrent connections approaches the number of queue directories. This can be tested (e.g., 20 connections with 5, 10, and 20 queue directories, 30 connections with 5, 10, and 20 queue directories, and so on), and I'll do that to see if anything interesting falls out of the data.
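The test matrix described above can be written down mechanically. This sketch only builds the list of runs and the matching smtp-source(1) argument lists. It does not execute anything. The -s option is the one used in these tests. The -m message count, the message total, and the target address are placeholders of mine, and the number of queue directories has to be configured on the Sendmail side, not by smtp-source.

```python
from itertools import product

connection_counts = [20, 30]      # concurrent connections to test
queue_dir_counts = [5, 10, 20]    # queue directories to test

MESSAGES = 1000            # hypothetical message count per run
TARGET = "localhost:25"    # hypothetical host:port of the test server


def smtp_source_args(connections, messages, target):
    """Argument list for one smtp-source run (built, not executed)."""
    return ["smtp-source", "-s", str(connections), "-m", str(messages), target]


plan = [
    {
        "connections": connections,
        "queue_dirs": queue_dirs,  # set in the Sendmail configuration
        "argv": smtp_source_args(connections, MESSAGES, TARGET),
    }
    for connections, queue_dirs in product(connection_counts, queue_dir_counts)
]

for run in plan:
    print(run["connections"], run["queue_dirs"], " ".join(run["argv"]))
```

Extending `connection_counts` adds a full row of queue directory settings for each new connection count, which keeps the comparison grid complete.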
Since you clearly need yet another variable in the mix, I'd be interested in what difference ufs vs zfs makes on the queue performance.
Heh. It's on the TODO list, as I'd like to get some experience with ZFS. Not sure that I'll get the chance to do so in the next 30 days though. A lot depends on how rapidly I can get through this round of data gathering and reporting.