user warning: Duplicate entry '827686' for key 1
query: INSERT INTO dr_accesslog (title, path, url, hostname, uid, sid, timer, timestamp) values('capacity planning', 'taxonomy/term/15', '', '38.107.191.85', 0, '6embtud40on8btpkpfo7jdofd4', 250, 1268788546) in /usr/local/apache/htdocs/includes/database.mysql.inc on line 172.
Fri, 07/25/2008 - 21:23 by Jarod Jenson
I do my best to go "Green", but seeing as how I tend to get on a plane more frequently than I would like; I probably have a much larger carbon footprint than most (especially since I get on MD-80’s frequently). That being said, it is not exactly what I meant by the term "Environmentalist".
I was actually referring to environment in the UNIX process sense. If you are reading this blog there is a high probability you know to which environment I am referring, so a discussion of that is left as an exercise to the reader.
This is a pretty trivial example of what DTrace can do (in fact, almost embarrassingly so), but it is compelling in the sense that it helps to illustrate a point I make over and over - most of the performance pathologies we find with DTrace can be addressed with very few changes to an application. In this particular case, the change to the application is zero.
Here is the setup:
Recently I was looking at an application that processed market data from many different sources. The messages received by this system were formatted internally by other systems into a proprietary format for aggregation and (magic business competitive advantage) processing. The performance of this system was quite good, but as you can imagine – more was wanted.
The first thing I did (after running mpstat(1M) of course) was to grab a userland profile of the application:
(The stack has been truncated for obvious reasons)
Well, this certainly wasn’t what we expected. This is supposed to be sitting on CPU doing magic business competitive advantage processing. The fact that getenv(3C) was so prominent was clearly not what we wanted. Now, I know getenv(3C) should be pretty darn fast since I filed CR 4991763 with Bart way back when I first started falling in love with DTrace, and Phil had knocked out a pretty sweet fix. Since we weren’t seeing the same lock contention as in the bug report, it had to be something different.
One of the steps of the processing was to take a timestamp field and pass it localtime(3C) to format it. Normally, this isn't a big deal as this is a pretty fast operation. However, our wonderful “go-to” DTrace profiling technique had other things to say about it.
Even if you are not a developer, you should be more than willing to point your browser at src.opensolaris.org/source/ occasionally. You never know what you’ll find. If we do that, we see that getenv(3C) actually ends up calling findenv(). findenv() simply iterates over the _environ list looking for a match. Even without a developer background, it is pretty easy to tell that:
182 /* Slow comparison for rest of string. */ 183 while (*s1 == *s2 && *s2 != '=') { 184 s1++; 185 s2++; 186 }
is doing a character for character comparison for each item in the environment (and by the comments, probably slow). Well, let’s see what environment variable we are looking for:
“TZ” (aka Timezone) is all we are looking for in this case? Well, that seems harmless enough right? Why on Earth with all of the “real” processing we have to do for the application are we spending so much time in getenv(3C)? Based on the getenv(3C) code (and our understanding of getenv(3C)) the best hypothesis as to why we would spend so much time in this function I simple - our environment list must be huge.
Well, I can’t show it here, but yes indeed – the environment list was huge. Environment variables were being used to specify configuration options for the application. Oddly enough, they all had the same application specific prefix with just an _XXX suffix where XXX was the option. This meant that the short circuit in findenv():
178 /* Fast comparison for first char. */ 179 if (*s1 != *s2) 180 continue;
wouldn’t happen for (in this case) 18 characters. The TZ environment variable was being set (based on what geo the app was running in), but it was last in the list. This meant that we spent a whole bunch of time doing character comparison.
The solution was to set TZ first and everything else later. A whopping 28% gain for just changing environment variable order. A simple one-liner in a start script - pretty cool.
Like I said, this is a trivial example of DTrace, but no one would have thought to look at the environment variable order if the userland profile hadn’t taken us there. This is what I talk about repeatedly when I say DTrace will lead you to the answer. You don’t have to be an expert to know that if you are spending time in an area that you didn’t expect, that something is probably wrong.