Day 37 of 60: Instrumenting queue processing time

Previously I've written about variables that may affect how rapidly Sendmail can process the mail queue. I've now started working to gather data on exactly how much influence these variables have.

Basic methodology

The relay-zone is configured to accept messages from the internal-zone, and relay them.

The internal-zone runs smtp-source(1) to generate a large number of messages. These are addressed to a recipient in the external-zone and sent via the relay-zone.

The relay-zone is configured to queue the messages immediately, rather than attempting delivery as they arrive.

From the global zone I run a DTrace script that captures information about the elapsed time taken by each queue run.

In the relay-zone, sendmail -q is run to deliver the messages.

smtp-sink(1) is running in the external-zone to accept the messages and throw them away.

Test infrastructure

It's now that the multiple zones I created back on day 5 have started to come into their own, and I've made a number of changes to the configuration files in the repository to use them.

The first set of changes is largely cosmetic. Since the zones will no longer share a common Sendmail configuration, I've created a directory per zone and moved the Sendmail configuration files under it. This necessitated moving a few things in the repository, and this was commits 1127 through 1134.

The second set of changes is to the config files and build infrastructure for Sendmail in the relay-zone. These changes enable Sendmail's access_db feature, and configure Sendmail to allow the internal-zone to relay mail through the server. These changes are in commits 1141 and 1142.

Sendmail probes

I've wrapped all the DTrace-specific code in:

#if _FFR_DTRACE
...
#endif /* _FFR_DTRACE */

which follows the style in the rest of the Sendmail code, and will make it much easier for the Sendmail developers to incorporate these changes (if they so desire). You will need to add:

APPENDDEF(`conf_sendmail_ENVDEF', `-D_FFR_DTRACE=1')

to your site.config.m4 file if you're following along. This was commit 1126.

I've added a number of additional probes to Sendmail that fire at various points in the queue running process.

They are:

  • When a queue run starts

  • When a single queue runner starts

  • When the queue is sorted

  • When a single queue job (delivering a single queue entry) starts

There are corresponding probes for the end of each of those actions too.

I've also added two more detailed probes to the file locking code. These probes fire just before and just after an attempt to lock a file. As well as noting the attempt they also note the file descriptor, name, and the type of the lock (shared, exclusive, non-blocking, or an unlock). The 'return' probe also notes whether the lock request was successful.

This was commit 1143.

Creating the queue

I suspect that I'm going to be doing a lot of tests with different sized queues. Rather than recreate them from scratch each time, I've generated them once and saved a copy of each.

I'll be testing with 1, 5, 10, 20, 30, and 40 queue directories to see how much of a difference they make to delivery.

Accordingly, I ran the following command in the internal-zone.

smtp-source -d -l 13870 -m 30000 -s 1 -t nik@external-zone relay-zone

This created 30,000 messages in the queue. In the relay-zone I then ran:

tar cf mqueue.1q.tar /var/spool/mqueue

to copy the messages. I then removed the contents of /var/spool/mqueue, created 5 queue directories, restarted Sendmail, and repeated the smtp-source(1) command.

I repeated those steps for 10, 20, 30, and 40 queue directories, giving me a collection of tarballs, each containing 30,000 messages spread over a different number of queue directories. From now on I'll wipe the queue and extract the appropriate tarball to generate the messages.
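The wipe-and-restore cycle lends itself to a small script. What follows is a minimal sketch rather than anything from the repository: the function names and the q1..qN directory layout are assumptions made for illustration, and it assumes each tarball was created with paths relative to the spool directory (unlike the absolute-path tar command shown earlier), so that it unpacks wherever it is told to.

```shell
#!/bin/sh
# Sketch only: the helper names, the q1..qN layout, and the use of
# relative-path tarballs are assumptions, not taken from the repository.

# make_queue_dirs SPOOL COUNT
# Create COUNT queue directories (q1 .. qCOUNT) under SPOOL.
make_queue_dirs() {
    spool=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "$spool/q$i" || return 1
        i=$((i + 1))
    done
}

# snapshot_queue SPOOL TARBALL
# Archive everything under SPOOL, using paths relative to SPOOL.
snapshot_queue() {
    ( cd "$1" && tar cf - . ) > "$2"
}

# restore_queue SPOOL TARBALL
# Empty SPOOL (keeping the directory itself), then unpack TARBALL into it.
restore_queue() {
    spool=$1
    tarball=$2
    [ -d "$spool" ] && [ -f "$tarball" ] || return 1
    # find ... -exec rm {} + avoids "argument list too long" errors that a
    # shell glob could hit with tens of thousands of queue files.
    ( cd "$spool" && find . ! -name . -prune -exec rm -rf {} + ) || return 1
    ( cd "$spool" && tar xf - ) < "$tarball"
}
```

Assuming the 5-directory snapshot was saved under a name such as mqueue.5q.tar (a hypothetical name following the pattern above), running restore_queue /var/spool/mqueue mqueue.5q.tar as root would reset the queue before a test run.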

Configuring the external-zone

All the messages are going to be delivered from the relay-zone to the external-zone. There needs to be an SMTP service running in the external-zone to accept them.

I could have run Sendmail (perhaps configured in queue-only mode) but I decided that that was additional effort that I didn't want to expend. Instead, I've made use of smtp-sink(1). This is a companion program to smtp-source(1), and also ships with Postfix. As the name suggests, it acts as a sink for SMTP connections. It listens on port 25, accepts the message(s) it's sent, and then throws them away.

It can do considerably more than this (e.g., you can configure it to reject connections from certain hosts, or to return failure codes to certain SMTP commands) but for the time being I just need it to accept the messages and then drop them.

I've run this command in the external-zone:

smtp-sink -c external-zone:25 50

This tells it to listen on port 25 on the external-zone's IP address, with the accept backlog set to 50 pending connections. The -c option causes it to print a running counter of messages as it processes them, and acts as a useful visual cue during testing.

Verification and testing

With this configured, testing was relatively simple. After extracting a queue directory tarball, it was enough to run:

sendmail -q

and verify that the connection counter printed by smtp-sink(1) was increasing. That's what happened, and at the end of the process it showed that 30,000 messages had been accepted by smtp-sink(1).
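A second check, independent of the smtp-sink(1) counter, is to count the queue control files before and after the run. Sendmail keeps one qf file per queued message, so a sketch along the following lines (the function name is hypothetical) would be expected to report 30,000 before the run and 0 once every message has been delivered.

```shell
#!/bin/sh
# Sketch only: count_queued is an illustrative helper name.

# count_queued SPOOL
# Print the number of qf (queue control) files at any depth under SPOOL,
# so it works whether messages sit in one directory or in many.
count_queued() {
    find "$1" -type f -name 'qf*' | wc -l | tr -d ' '
}
```

For example, count_queued /var/spool/mqueue run in the relay-zone before and after sendmail -q gives the two numbers to compare.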

Instrumentation with DTrace

To instrument the queue running process I've written queue-run-duration.d, which produces CSV output showing how long each queue-running PID spent in each part of the queue running process.
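The exact column layout of that CSV isn't described here, so the following is only a sketch of how such output might be summarised. It assumes, purely for illustration, three columns (pid, phase, elapsed_ns) with an optional header row; the function name and the column names are hypothetical and would need adjusting to match what queue-run-duration.d really emits.

```shell
#!/bin/sh
# Sketch only: assumes CSV rows of the form pid,phase,elapsed_ns.
# That layout is a guess for illustration, not the script's real format.

# summarise_phases [FILE...]
# Read CSV from the named files (or stdin) and print one line per phase:
#   phase,number_of_samples,total_elapsed_ns
summarise_phases() {
    awk -F, '
        NR == 1 && $1 == "pid" { next }          # skip a header row if present
        NF == 3 { total[$2] += $3; n[$2]++ }     # accumulate per phase
        END {
            for (p in total)
                printf "%s,%d,%.0f\n", p, n[p], total[p]
        }
    ' "$@" | sort
}
```

Summing per phase across all PIDs is just one possible view; grouping on the first column instead would give a per-runner breakdown.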

The next step is to get and report on the data.
