And the expensive corporate system uptime management box missed the problem entirely.

The “Offersum.inbox” had over 6000 unprocessed records, and was growing at a rate of 13 records a minute.  Microsoft’s advice:

This folder receives summarization messages (.sum files) for advertisements from child sites and processes the files to the SMS site database. A backlog of files may indicate a performance problem that is caused by lots of messages. Examine status messages for the SMS Offer Status Summarizer for possible problems.
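For anyone wanting to put numbers on a "backlog of files", here is a minimal Python sketch of the idea, not the script mentioned later in this post. It assumes the backlog is simply the count of .sum files sitting directly in the folder, and that two counts taken a known number of seconds apart are enough to estimate the growth rate. The function names and the 1000-file limit are made up for illustration.

```python
from pathlib import Path


def count_backlog(folder, pattern="*.sum"):
    """Count the files matching pattern directly inside folder (no recursion)."""
    return sum(1 for p in Path(folder).glob(pattern) if p.is_file())


def growth_per_minute(count_before, count_after, seconds_between):
    """Average change in the backlog, in files per minute, between two samples."""
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    return (count_after - count_before) * 60.0 / seconds_between


def is_backlogged(count, rate_per_minute, max_files=1000):
    """Flag a folder that is over the (hypothetical) limit and not draining."""
    return count > max_files and rate_per_minute >= 0
```

Going from 6000 to 6065 files over five minutes, for example, works out to 13 files a minute, and a folder in that state would be flagged.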

So I looked at the Status Summarizer.

Nothing.

“Nothing either” was the result of checking the SMS log files.

“A backlog of files may indicate a performance problem” was the clue.  My offsider checked the CPU utilisation: 24% in total.  That looks harmless, but this box has four processors, and one process was using all of one CPU.
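The arithmetic behind that clue is worth spelling out. On a box with N processors, a single-threaded process stuck in a loop can only ever push total utilisation to about 100/N percent. A rough Python sketch of the check follows; the function names and the 5-point tolerance are assumptions for illustration only.

```python
def pegged_core_share(core_count):
    """Total CPU percentage contributed by one fully busy core."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return 100.0 / core_count


def looks_like_pegged_core(total_percent, core_count, tolerance=5.0):
    """True if total utilisation is close to exactly one core's worth."""
    return abs(total_percent - pegged_core_share(core_count)) <= tolerance
```

With four processors, one busy core is 25% of the total, so a reading of 24% fits a single runaway process rather than a lightly loaded server.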

Ah ha!  We have a culprit.

SQLagent.exe had gone wild and got itself stuck in a loop.  Stopping and starting the SQL Server service fixed that.  Within 20 seconds, all 6000+ records had been processed.

The utility that detected the original problem, the one the expensive corporate system missed?  A 150-line PowerShell script.

Which I’ll share next week.
