Try before you buy: dtrace

Day 59 of 60: Final thoughts

This system will be going back to Sun soon, while I wait to find out whether or not they've decided to grant me the system. In the meantime, here are some final thoughts on the last 59 days.

Day 46 of 60: Queue sort strategies

I've been looking at different queue sort strategies to see what their overhead is. Since all the messages are going to be delivered to a single host, these results aren't necessarily indicative of what you would see on a production server. However, they should serve to illustrate any inherent speed advantages of one sort strategy over another.

Read on for the results.

Day 41 of 60: Multiple queues, multiple queue runners (pt 2)

That's odd.

Day 38 of 60: Multiple queues, multiple queue runners (pt 1)

I've started to get data about the effect of multiple queues with multiple queue runners.

As before I'm using 1, 5, 10, 20, 30, and 40 queue directories, and I'm instrumenting with queue-run-duration.d. This time I'm starting queue runners with the command sendmail -q30s. This will cause Sendmail to create a new queue runner to process the queue every 30 seconds.

The problem is that, even with 30,000 messages, Sendmail can process the whole lot in about 40 seconds, which doesn't give enough time for more than two queue runners to start.

So I'm using the -w option to smtp-sink(1) to insert a 1 second delay at the DATA stage. So (roughly) the first 30 messages go through at the rate of one per second. Then a second queue runner starts, and messages go at the rate of 2 a second, and so on.

But it's still slow going.
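
A rough back-of-the-envelope model shows why. This is only a sketch under simplifying assumptions that are mine, not measured: the first runner starts at time zero, another starts every 30 seconds, each runner delivers exactly one message per second (the smtp-sink delay dominates), and the runners never contend with each other. The function name and parameters are made up for illustration.

```python
def estimated_drain_time(messages, interval=30, delay=1.0):
    """Estimate the seconds needed to deliver `messages`.

    Assumes a new queue runner starts every `interval` seconds and that
    each runner delivers one message every `delay` seconds.
    """
    remaining = messages
    elapsed = 0.0
    runners = 0
    while True:
        runners += 1                 # a new runner starts with each interval
        rate = runners / delay       # combined messages per second
        capacity = rate * interval   # messages deliverable in this interval
        if capacity >= remaining:
            # The queue empties part-way through this interval.
            return elapsed + remaining / rate
        remaining -= capacity
        elapsed += interval


if __name__ == "__main__":
    seconds = estimated_drain_time(30_000)
    print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes")
```

Under those assumptions a single run of 30,000 messages takes roughly 22 minutes, compared with about 40 seconds without the delay, and that is before repeating it for each queue directory count.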

As I write this it occurs to me that I could use DTrace to induce this slowdown, by using chill() to have the process pause for a tenth of a second at the start of every job run. That's something I may look at later. As Thursday is my normal day in London, look for updates on Friday.

Day 38 of 60: Multiple queues, one queue runner

Today I'm looking at the results that I've obtained from the latest round of tests. These tests used sendmail -q to deliver 30,000 messages to a different zone. There were 10 runs to each test, and the different tests collected data on timings for 1, 5, 10, 20, 30, and 40 queue directories.

Day 37 of 60: Instrumenting queue processing time

Previously I've written about variables that may affect how rapidly Sendmail can process the mail queue. I've now started working to gather data on exactly how much influence these variables have.

Day 32 of 60: Complete instrumentation of queue creation

Or: "How do I use DTrace with programs that fork?"

With some help from the dtrace-discuss[1] mailing list I've now written a couple of D scripts that can trace what Sendmail is doing between probe points. There's a writeup, and sample output, below the fold.

[1] Note -- the forum archive doesn't seem to link to the discussion yet. When it does I'll update this link to point to the discussion. The subject was "Using pid provider when process forks".

Day 31 of 60: Queues and connections

Back on day 28 I looked at the effect of multiple queue directories with concurrent senders.

These results showed that there was considerable benefit with 10 senders and 10 queue directories. The benefit going to 20 queue directories with 10 senders was negligible.

At the time I wondered whether this was a general rule -- i.e., is anything more than 10 queue directories overkill? Or is there a correlation between the number of queue directories and the number of simultaneous sending systems?

Day 30 of 60: What are the single queue directory bottlenecks? (pt 2)

Having established that there's a significant increase in the amount of time taken by the fdsync() and open() system calls when Sendmail creates queue entries with a single queue directory, I've set about tracking down what that bottleneck is.

Day 29 of 60: What are the single queue directory bottlenecks?

Earlier posts have shown that using a single queue directory imposes a significant bottleneck when processing concurrent connections with Sendmail. Yesterday I posed some questions, and today I've started work on answering the first one.

The first question was:

What is responsible for the dramatic slow down in the single-queue case (test 4)?

Day 28 of 60: Instrumenting Sendmail queue file creation (pt 4)

Yesterday I looked at the effect of multiple queue directories when processing messages over a single connection.

Today I've been looking at how multiple queue directories can help when processing concurrent connections.

The methodology was identical to the previous tests. The only change was to the smtp-source(1) command line. The previous tests were run with -s 1, indicating one concurrent connection. These tests were run with -s 10, to force 10 concurrent connections.

Day 27 of 60: Instrumenting Sendmail queue file creation (pt 3)

I've committed the first sets of results to the repository in the aptly named results/ directory.

To refresh your memory, the question I intended to answer was:

does the number of queue directories (on a single disk) make a significant impact on the time taken to create new entries in the queue?

The results are quite surprising.

Day 27 of 60: Instrumenting Sendmail queue file creation (pt 2)

It's time to run an instrumented Sendmail, throw some messages at it, and see how it performs. Specifically, does the number of queue directories (on a single disk) make a significant impact on the time taken to create new entries in the queue?

Day 14 of 60: Minor updates

I've been a bit busy with other work over the past few days, and haven't made quite as much progress as I'd like.

There are a few things that have moved forward though.

Day 10 of 60: First probes added to Sendmail

Following Monday's info dump about queues, I've spent some time over the last few days reading the DTrace documentation in detail, in particular the Solaris Dynamic Tracing Guide. This is the DTrace handbook, with a great deal of information about how to use DTrace.

It also contains the information about how to add custom DTrace probes to user applications. I was a bit surprised when I first read that, as it's only a couple of pages long.

It turns out that adding DTrace probes really is that simple...

Day 8 of 60: Sendmail queues

The time has come to start adding DTrace functionality to Sendmail. Of course, there's no point in just diving in and adding code left, right, and centre, so over the last couple of days I've been thinking about what I should be instrumenting first.

Day 5 of 60: DTrace mode for Emacs?

I'm just starting to get my feet wet with DTrace. Does anyone know of a decent Emacs mode for editing .d files?

Day 2 of 60: Importing and branching Sendmail

Now that I've started to get a development environment that I feel comfortable with, I've imported the latest release of Sendmail into my Subversion repository. This is publicly accessible, so you can follow along at home if you've got a Subversion client installed.

raison d’être

It started when I read a number of posts at Jonathan Schwartz's blog (in order: here, here, here, here, and here).

Jonathan is Sun's CEO (although he wasn't at the time he started this series). The essence of it is that Sun are so stoked about their new hardware that:

So... here's an invitation to developers and customers that don't want to move to Solaris, want to stay on GNU/Linux, but still want to take advantage of Niagara's (or our Galaxy system's) energy efficiency - click here, we'll send you a Niagara or Galaxy system, free. Write a thorough*, public review (good or bad - we just care about the fidelity/integrity of what's written - to repeat, it can be a good review, or a poor review), we'll let you keep the system. Free.

That sounds like a good deal to me. So I started thinking about how I might take advantage of this offer.
