Oh Teh Noes - Your Remote Sense of Impending Doom

Posted by Tom on 2010-10-24 18:01

"Could not load file or assembly 'OmgWeAreAllGoingToDie' or one of its dependencies. There is not enough space on the disk. (Exception from HRESULT: 0x80070070)"

We've all seen it. Alice was on holiday when it was her turn to clear out the IISLogs. Or Bob forgot to BACKUP LOG before he DBCC SHRINKFILEd. Or Chris didn't read that email about renewing the SSL certificate. Either way, the site's down and the clients are outside the office with pitchforks and burning brands, baying for your blood. Now, I only have two readers, so if one of you is violently exsanguinated by a marauding mob of users, that cuts my readership in half, which would be really bad. With that in mind, Oh Teh Noes is a small, light, extensible little app that warns you when terrible, terrible things are about to happen to your server.

Is that right?

Yarr.

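Oh Teh Noes is a console app that accepts an XML file as an argument. That XML file contains a list of tasks to be run. The tasks perform some fairly standard and roughly customisable checks, such as free disk space, SSL certificate expiry, whether a file has been modified recently, and whether a SQL query returns any rows.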

So how do I use it?

Oh Teh Noes is designed to be run as a scheduled task. If you want to run the task definitions in the dailyTasks.xml file, the command line simply looks like this:

OhTehNoes dailyTasks.xml

The specified XML file contains a list of task definitions, thus:

<tasks logName="Test Logger">
    <task type="TaskTypeA"
        name="Optional Name"
        parameterA="100"
        parameterB="foo" />
    <task type="TaskTypeB" parameterA="bar" />
</tasks>

Each task definition lives in its own <task> element. The type attribute is required and must correspond to the type of the task (see below for the list of tasks included as of now). The name attribute is optional, but it is included with alerts, so it may help unravel things if you have many tasks running on many boxen. Take a look below for the attributes each kind of task requires.
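For the curious, here is a rough idea of what reading that file involves. This is a Python sketch rather than the actual .NET code, and the function name read_task_definitions and the shape of the returned dictionaries are made up for illustration. It only assumes what is described above: a required type attribute, an optional name attribute, and everything else treated as task-specific parameters.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<tasks logName="Test Logger">
    <task type="TaskTypeA"
        name="Optional Name"
        parameterA="100"
        parameterB="foo" />
    <task type="TaskTypeB" parameterA="bar" />
</tasks>
"""


def read_task_definitions(xml_text):
    """Return (log_name, tasks) parsed from a task definition document.

    Hypothetical helper: each task is a dict holding its type, its
    optional name, and the remaining attributes as parameters.
    """
    root = ET.fromstring(xml_text.strip())
    tasks = []
    for element in root.findall("task"):
        attributes = dict(element.attrib)
        task_type = attributes.pop("type", None)
        if not task_type:
            # The type attribute is required.
            raise ValueError("every <task> element needs a type attribute")
        name = attributes.pop("name", None)  # the name is optional
        tasks.append({"type": task_type, "name": name, "params": attributes})
    return root.get("logName"), tasks


log_name, tasks = read_task_definitions(SAMPLE)
for task in tasks:
    print(task["type"], task["name"], task["params"])
```

Note that attribute values arrive as strings, so a task that wants a number (a threshold, say) has to convert it itself.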

Oh Teh Noes informs you of impending doom using a slightly tweaked version of log4net that contains an extra appender, which allows it to send unbuffered emails and add data into the email subject. I've included that as an assembly rather than source. As far as I know the log4net licence permits that, but if someone with a beard comes and punches me in the nuts then I guess I'll stand corrected. Until then: binaries it is. Besides that it's vanilla, so if you're familiar with log4net you'll be right at home. All of the log4net settings are tweakable in OhTehNoes.exe.config.

Tasks

The tasks themselves live in assemblies in the /plugins/ directory. When run, Oh Teh Noes scans this directory for assemblies, then scans those assemblies for valid tasks. What follows is a list of the tasks provided out of the box and the attributes you need to configure them.

DiskSpace

Checks all of the physical drives in the machine and throws a warning if any of them has less free space, in MB, than the value specified by warningThreshold.
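The check itself is about as simple as it sounds. Below is a Python sketch of the idea, not the plugin's real code: free_space_mb and low_space_warnings are hypothetical names, and the sketch takes a ready-made mapping of drive to free MB rather than enumerating physical drives, which is the platform-specific part.

```python
import shutil

BYTES_PER_MB = 1024 * 1024


def free_space_mb(path):
    """Free space, in whole MB, on the drive that contains path."""
    return shutil.disk_usage(path).free // BYTES_PER_MB


def low_space_warnings(free_mb_by_drive, warning_threshold_mb):
    """Return one warning message per drive below the threshold."""
    return [
        f"{drive} has {free_mb} MB free (threshold {warning_threshold_mb} MB)"
        for drive, free_mb in sorted(free_mb_by_drive.items())
        if free_mb < warning_threshold_mb
    ]


# Example: check the drive holding the current directory against 1024 MB.
current = {".": free_space_mb(".")}
for message in low_space_warnings(current, 1024):
    print(message)
```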

SslCertificate

Reads the SSL certificates for the local machine. When it finds one for a domain (or subdomain) in the comma-separated list in certsToCheck, it makes sure the certificate has more than the number of days specified in warningThreshold left until it expires; otherwise it throws a warning.
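To make the matching and the threshold a bit more concrete, here is a Python sketch of the decision logic only. It does not read the machine's certificate store; it takes a list of (domain, expiry) pairs instead. The function names are hypothetical, and treating "domain (or subdomain)" as "the certificate's domain equals a listed entry or ends with a dot plus that entry" is my assumption for the sketch.

```python
from datetime import datetime, timedelta


def domain_matches(cert_domain, wanted):
    """True if cert_domain is wanted itself or one of its subdomains."""
    cert_domain = cert_domain.lower()
    wanted = wanted.lower()
    return cert_domain == wanted or cert_domain.endswith("." + wanted)


def expiring_certificates(certificates, certs_to_check, warning_threshold_days, now):
    """Return the (domain, not_after) pairs that deserve a warning.

    certificates: iterable of (domain, not_after datetime) pairs.
    certs_to_check: comma-separated list of domains, as in the task config.
    """
    wanted = [d.strip() for d in certs_to_check.split(",") if d.strip()]
    cutoff = now + timedelta(days=warning_threshold_days)
    return [
        (domain, not_after)
        for domain, not_after in certificates
        if any(domain_matches(domain, w) for w in wanted)
        and not_after <= cutoff  # not more than the threshold left
    ]


certs = [
    ("www.example.com", datetime(2010, 11, 1)),
    ("example.org", datetime(2011, 10, 1)),
    ("other.net", datetime(2010, 10, 25)),
]
due = expiring_certificates(certs, "example.com, example.org", 30, datetime(2010, 10, 24))
for domain, not_after in due:
    print(domain, "expires", not_after.date())
```

Certificates for domains that are not in the list are ignored entirely, however close to expiry they are.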

FileUpToDate

If the file specified in filename hasn't been modified in the last thresholdInMinutes minutes, sirens sound. We tend to use this one to make sure files we should be receiving daily from third parties actually arrive.
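In sketch form, and in Python rather than the real plugin's .NET, the check boils down to comparing the file's last-modified time with the clock. The name file_is_stale is hypothetical, and counting a missing file as stale is an assumption on my part; the post doesn't say what the real task does in that case.

```python
import os
import time


def file_is_stale(filename, threshold_in_minutes, now=None):
    """True if filename was not modified within the last threshold_in_minutes.

    now is a Unix timestamp; it defaults to the current time and is only
    a parameter so the check can be exercised with a fixed clock.
    """
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(filename)
    except OSError:
        # Assumption: a file that is missing or unreadable counts as stale.
        return True
    return (now - modified) > threshold_in_minutes * 60


if file_is_stale("daily-feed.csv", 24 * 60):
    print("daily-feed.csv has not been updated in the last day")
```

The file name daily-feed.csv is just an example; for a daily delivery, a threshold a little over 24 hours leaves some slack for late arrivals.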

QueryReturnsRows

When this task is run, it connects to the SQL server specified by connectionString, runs the query in sqlQuery and spits its dummy if no rows are returned. We use this one on some of our clients' sites to make sure that we've had an order in the last hour during the normal working day, like so:

SELECT
    TOP 1 *
FROM
    [Orders]
WHERE
    [DateCreated] > DATEADD(hour, -1, GETDATE())
    OR DATEPART(hour, GETDATE()) < 10
    OR DATEPART(hour, GETDATE()) > 18
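The "complain if nothing comes back" part is easy to show in miniature. The Python sketch below uses an in-memory SQLite database purely as a stand-in for SQL Server, so the query is written in SQLite's dialect and leaves out the working-hours clauses; query_returns_rows is a hypothetical name, not the plugin's API.

```python
import sqlite3


def query_returns_rows(connection, sql_query):
    """True if sql_query yields at least one row."""
    cursor = connection.execute(sql_query)
    return cursor.fetchone() is not None


# Stand-in database: SQLite in memory instead of a real SQL Server.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Orders (Id INTEGER PRIMARY KEY, DateCreated TEXT)"
)

recent_orders = (
    "SELECT * FROM Orders "
    "WHERE DateCreated > datetime('now', '-1 hour') LIMIT 1"
)

# No orders yet, so the task would raise the alarm here.
quiet = not query_returns_rows(connection, recent_orders)

# Once an order arrives, the same query returns a row and all is well.
connection.execute("INSERT INTO Orders (DateCreated) VALUES (datetime('now'))")
busy = query_returns_rows(connection, recent_orders)

print("alarm before order:", quiet)
print("rows after order:", busy)
```

The trick in the real query above is the pair of OR clauses: outside working hours they guarantee a row whenever the table has any orders at all, so the alarm stays quiet overnight.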

And it's open source, right?

Yup.

I think that about covers it. If you can make use of it, please do. If you find a bug, let me know. If you can think of more features or plugins, drop me a line or have a go at implementing them yourself.

P.S. From a code perspective the main point of interest is the plugin system, and my next post will probably go into that in a bit of detail. Beyond that it's all fairly standard stuff, but it solves a problem we had and I can think of no better raison d'être for a piece of code.