Try before you buy: Day 27 of 60: Instrumenting Sendmail queue file creation (pt 3)

I've committed the first set of results to the repository, in the aptly named results/ directory.

To refresh your memory, the question I intended to answer was:

does the number of queue directories (on a single disk) make a significant impact on the time taken to create new entries in the queue?

The results are quite surprising.

First, let me point you to the disclaimer.

With that out of the way... I've run a total of four tests. Each test consisted of five runs.

The first test was run with one queue directory (/var/spool/mqueue/qdir00). The second with five (qdir00 through qdir04), the third with ten (... through qdir09), and the last with 20 (... through qdir19).

After each test run Sendmail was stopped, the queue directories removed and recreated, and Sendmail was restarted. Nothing else changed between tests except the number of queue directories.
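The directory reset between runs could be sketched roughly as follows. This is an illustration, not the script used for these tests: the function name recreate_queue_dirs is hypothetical, and stopping and restarting Sendmail is left out because that command depends on the system. Only the qdirNN naming comes from the description above.

```python
import shutil
import tempfile
from pathlib import Path


def recreate_queue_dirs(queue_root, count):
    """Remove any existing qdir* directories under queue_root and create
    `count` empty ones named qdir00, qdir01, and so on."""
    root = Path(queue_root)
    root.mkdir(parents=True, exist_ok=True)
    # Remove whatever the previous run left behind.
    for old in root.glob("qdir*"):
        if old.is_dir():
            shutil.rmtree(old)
    created = []
    for i in range(count):
        qdir = root / f"qdir{i:02d}"
        qdir.mkdir()
        created.append(qdir.name)
    return created


if __name__ == "__main__":
    # Demonstrate against a scratch directory rather than /var/spool/mqueue.
    with tempfile.TemporaryDirectory() as scratch:
        print(recreate_queue_dirs(scratch, 5))
```

Running it against a scratch directory first is a cheap way to confirm the naming before pointing it at a real spool.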

Test 1 - 1 queue directory

View raw data

The first thing I did was run the data through ministat to see if there were any glaring discrepancies. To get it into a format suitable for ministat I knocked together to_ministat.pl, which extracts the second column from each results file. This is the column that contains the numbers I'm interested in. This converted test1/results.1 to test1/m.1, results.2 to m.2, and so on.
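For readers who don't want to open the script, the conversion amounts to something like the sketch below. It is written in Python rather than Perl and is not the actual to_ministat.pl. It assumes the results files are whitespace-separated, which may not match the real layout, and the function name second_column and the sample rows are made up for illustration.

```python
def second_column(lines):
    """Return the second whitespace-separated field of each line.

    Lines with fewer than two fields (for example blank lines) are skipped.
    """
    values = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            values.append(fields[1])
    return values


if __name__ == "__main__":
    # Made-up sample rows; the real layout of results.N may differ.
    sample = ["0 1143", "500 1120", "", "1000 1141"]
    print("\n".join(second_column(sample)))
```

The output is one number per line, which is the input format ministat expects.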

The results are (omitting the plot, as it doesn't add much in this case, and converting to HTML):

#	N	Min	Max	Median	Mean	Stddev
x	61	953	1398	1143	1143.7705	88.056496
+	61	950	8729	1120	1245.5574	977.82579
No difference proven at 99.5% confidence
*	61	977	1624	1141	1148.1475	101.9063
No difference proven at 99.5% confidence
%	61	843	1319	1108	1111.1148	89.652495
No difference proven at 99.5% confidence
-	61	583	1310	1119	1111.3115	109.16952
No difference proven at 99.5% confidence

So ministat can't find a difference between the five test runs at the 99.5% confidence level, and I'm treating them as consistent. There's an outlier in the second run (a maximum of 8729), but that can be put down to a random factor in the environment. It didn't seriously perturb the test results.

Given that, here are the results after plotting the figures from results.1 and adding a trendline and confidence band.

Plot from test 1, result set 1

I'm quite surprised by that. I'd been expecting a much larger upward slope to the trend line, on the assumption that as the number of entries in the directory increased the time for creating a new entry would also increase. That's clearly not the case on this system, at least up to 60,000 entries in the directory.

Why 60,000 entries, when the testing generated 30,000 messages? Remember that each message in the queue is represented by at least two files, a qf* file and a df* file.

This does show that there's a penalty paid as the number of entries in the queue gets larger (every additional entry increases the amount of time required by roughly 1.94 µs), but it's a much smaller penalty than I had anticipated.
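A trendline slope like the one quoted above is typically an ordinary least-squares fit. The sketch below shows that calculation. It assumes a plain linear fit, which may not be exactly what the plotting tool did, and it leaves out the confidence band. The function name fit_line and the sample points are invented for illustration and are not taken from the results files.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Invented points: directory entries on x, measured time on y.
    entries = [0, 20000, 40000, 60000]
    timings = [1100.0, 1140.0, 1175.0, 1220.0]
    slope, intercept = fit_line(entries, timings)
    print(f"slope per entry: {slope:.6f}, intercept: {intercept:.2f}")
```

The slope is the figure of interest here: it is the extra time attributed to each additional directory entry, in whatever units the measurements use.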

Given this result, I don't expect the results from the 5, 10, and 20 queue directory tests to show a marked improvement.

Let's look at them.

Test 2 - 5 queue directories

View raw data

Here's the ministat comparison of the raw data:

#	N	Min	Max	Median	Mean	Stddev
x	61	548	1436	1112	1104.7869	123.63874
+	61	964	2252	1130	1155.9836	168.10389
No difference proven at 99.5% confidence
*	61	608	1384	1116	1128.623	111.73319
No difference proven at 99.5% confidence
%	61	906	1373	1129	1142.3607	110.06393
No difference proven at 99.5% confidence
-	61	756	1262	1111	1104.6393	91.55782
No difference proven at 99.5% confidence

Again, no significant differences between the datasets. So again, I've selected dataset 1 as the representative sample.

This time I can compare this against the first results and see if there is any significant difference.

x test1/results.m.1
+ test2/results.m.2
: = Mean
M = Median
+----------------------------------------------------------+
|                                  x   x +                 |
|                                  x  ++ +x  x             |
|                              +   x +++ ++ xxx            |
|                          +   +  x+++++ ++ +xx            |
|                          +++++++++++++x++x+xx+  +        |
|+                        ++++++++++++++x++++x++ x+     x +|
|                                 |____:_____|             |
|                            |_______:_______|             |
+----------------------------------------------------------+

#	N	Min	Max	Median	Mean	Stddev
x	61	953	1398	1143	1143.7705	88.056496
+	61	548	1436	1112	1104.7869	123.63874
No difference proven at 99.5% confidence

As you can see, there's no difference worth writing home about between 1 and 5 queue directories at the 99.5% confidence level. If you're prepared to let things slide to 95% confidence, then there is a difference:

Difference at 95.0% confidence
        -38.9836 +/- 38.0923
        -3.40834% +/- 3.33041%

but, as the figures show, it's very small. That difference is also visible in the slope of the trend line, which is increasing at approximately 25% of the rate of the "single queue directory" case.
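The +/- figure can be cross-checked from the summary statistics alone. The sketch below assumes ministat's comparison is a pooled-variance two-sample Student's t test, and that with 120 degrees of freedom the large-sample two-sided critical values (1.960 for 95%, 2.807 for 99.5%) are close enough. Both are assumptions, and the function name pooled_half_width is made up for illustration.

```python
import math


def pooled_half_width(n1, sd1, n2, sd2, critical):
    """Half-width of the interval for a difference of two means,
    using a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var) * math.sqrt(1.0 / n1 + 1.0 / n2)
    return critical * std_err


# Summary statistics copied from the ministat table above.
N = 61
MEAN_1DIR, SD_1DIR = 1143.7705, 88.056496
MEAN_5DIR, SD_5DIR = 1104.7869, 123.63874

# Assumed two-sided large-sample critical values.
CRITICAL = {"95%": 1.960, "99.5%": 2.807}

diff = MEAN_5DIR - MEAN_1DIR

if __name__ == "__main__":
    for level, crit in CRITICAL.items():
        half = pooled_half_width(N, SD_1DIR, N, SD_5DIR, crit)
        verdict = "difference" if abs(diff) > half else "no difference proven"
        print(f"{level}: {diff:.4f} +/- {half:.4f} -> {verdict}")
```

Under those assumptions the 95% half-width comes out at about 38.09, in line with the ministat output, while at 99.5% it grows to roughly 54.55 and swallows the 38.98 difference in means. That is why the verdict flips between the two confidence levels.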

Plot from test 2, result set 1

Test 3 - 10 queue directories

View raw data

Again, here's the ministat comparison of the data from this run.

#	N	Min	Max	Median	Mean	Stddev
x	61	817	1378	1124	1120.7049	91.530568
+	61	682	1435	1123	1135.6557	123.29475
No difference proven at 99.5% confidence
*	61	584	1303	1117	1124.1311	109.27938
No difference proven at 99.5% confidence
%	61	863	1337	1117	1119.623	93.458755
No difference proven at 99.5% confidence
-	61	606	1413	1115	1115.9016	119.9109
No difference proven at 99.5% confidence

As with the results from test 2, there's no difference at the 99.5% level between this test and the results from the first test. This time, though, there's no difference at the 95% level either.

As you can see, the graph shows that queuing time does increase at a slower rate than with one queue directory; the rate is actually slower than with 5 as well. However, these differences are so small as to be insignificant.

Plot from test 3, result set 1

Finally, the fourth set of data.

Test 4 - 20 queue directories

View raw data

#	N	Min	Max	Median	Mean	Stddev
x	61	539	1429	1112	1112.9344	128.68655
+	61	598	1369	1121	1116.9508	115.07308
No difference proven at 99.5% confidence
*	61	593	1455	1104	1118.9016	116.49445
No difference proven at 99.5% confidence
%	61	924	1689	1136	1141.8033	102.40684
No difference proven at 99.5% confidence
-	61	660	1354	1141	1133.1311	116.49857
No difference proven at 99.5% confidence

Again, no significant difference between the individual result sets. And also again, there's no significant difference between these results and the results in the "1 queue directory" test.

This is borne out by the chart. Yet again the trendline has a shallower slope than the single queue directory case, but the difference is minimal.

Plot from test 4, result set 1

Conclusion

I had expected that multiple queue directories would make a much larger difference than they have when Sendmail is queuing mail. What this demonstrates is that (at least in the situation where there is only one process queuing messages) it doesn't appear to make much difference whatsoever.

The next set of tests will experiment with varying the number of concurrent connections that are used, to see if that workload makes any difference.

Note, also, that these results have no bearing on the effectiveness of multiple queue directories when it comes to trying to deliver mail that has been queued. That's going to be yet another set of tests.
