Automatically restarting services or processes based on resource usage

In the ideal world, every software we install on our servers and workstations uses as few resources as possible, doesn’t have memory or handle leaks and never crashes.

But in reality, Sysadmins often have to deal with temperamental business-critical third-party applications (or in-house developed) which exhibit a number of issues, including:

  • Memory Leak: The application keeps eating away at the available memory like a chubby caterpillar chewing on a leaf
  • Handle Leak: The application continuously increases its handle count, which takes away from kernel memory over time
  • CPU Spike: The application uses all CPU time of one or more cores

When one of these issues is encountered, a manual application (or service) restart, along with a potential bug report, is usually the only solution. Consequently, keeping a close eye on both Windows and third-party software – especially on servers – is considered good practice. But even better than looking is being proactive of course, for example by automatically restarting a service which uses too much memory or CPU.

Frozen Leak

This is where EventSentry comes in. EventSentry doesn’t just analyze metrics available through Windows performance counters (e.g. CPU usage, handle or memory count of a process.), it also allows you to take corrective action based on granular rule sets. This ensures that all active applications are behaving nicely by staying within pre-defined performance boundaries.

To get there, we utilize 3 features in EventSentry:

1. Performance Monitoring
2. Event Log monitoring
3. Service restart or process action

Since examples usually work best, I will outline the steps required to restart the printer spooler service if it uses more than 100 Mb of RAM. This is for illustration purposes only, I’m not suggesting that the printer spooler service should not use more than 100 Mb of RAM.

Performance Monitoring
Application performance monitoring is already setup out-of-the-box via the “Performance Applications” System Health package. This package, by default, is assigned to all hosts and collects key application metrics in the EventSentry database. Since this package is generic and captures all processes (without generating alerts), we’ll create a separate package that will only monitor the spooler service.

Unless you resort to scripting, it is unfortunately not easily possible to automatically link process names (as they are reported by the Windows performance monitoring subsystem) to a service name. As such, we will need to first find out the process of the service we are monitoring and then monitor only that instance of the performance counter. To determine the process for a given service, simply view the properties of the services in the “Services” or “View local services” application and look for the “Path to executable” field. New versions of Windows also show a list of all services in task manager and let you jump to the process by clicking on “Go to details”. The name of the instance is the process name without the .exe extension, spoolsv in this case.

The next step is to create a new System Health package and add a performance object. Select the System Health packages container, click “Add package” from the ribbon and enter a suitable name. Select the newly created package and add the performance object to the package. Now select the “Performance” object and click the “+” icon to add a new performance object to monitor. Every performance object in EventSentry requires at least a name (to describe the counter) as well as the actual Windows performance counter. The respective performance counter for monitoring the memory usage of a process is Process(*)\Working Set, and since we are only interested in the spooler process we will use of the Process(spoolsv)\Working Set performance counter. When you are done, the dialog should look similar to what is shown below:

Performance Counter Setup
Specifying the performance counter to monitor the memory usage of the spooler process

The default frequency is 10 seconds which works well for most counters, but you can increase this frequency for counters which change only minimally over the short term (as is usually the case for memory usage and handle count), so we will use 30 seconds in this case.

Now that we are successfully tracking the memory usage of the spooler service, we need to setup a hard limit in order to get an event when that limit is exceeded. Click on the “Alert” tab and configure the dialog as shown below:

Specifying the alert limit for the performance counter
Specifying the alert limit for the performance counter

We are only concerned with the top section of the dialog, please see the documentation for more details on the “Notify at most …” and below options.

The last step in this section is to assign the package: Select the package, click “Assign” in the ribbon and assign the package to a computer or group. EventSentry is now tracking the memory usage of the spoolsv process and will log a warning event if the memory usage exceeds 100 Mb.

Action
EventSentry uses actions to send emails, toggle services or start processes. Since we want to restart the spooler service, we’ll create a Service action. Select the “Actions” container and click the “Add” button. Select the “Service” action type and assign it a descriptive name, e.g. “Restart Print Spooler Service”.

The configuration of this action is probably the most simple in this tutorial – just specify the service name and the desired action as shown below:

Specifying the service to be restarted
Specifying the service to be restarted

Connecting the dots: Event Log Filter
We’re monitoring the memory usage of the spooler now and have an action which can restart the spooler service, but how do we connect the two? You probably guessed it – with an event log filter. Event Log filters allow you to connect an event (e.g. memory usage is too high) with an action (e.g. restart spooler service).

We’ll create an event log filter which will look for the exact event that is being logged when the memory usage of our performance counter exceeds 100 Mb, and trigger the service restart action.

Similar to what we did with the system health package, right-click the “Event Log Packages” container (or use the ribbon) to create a new event log package and assign it to the computer(s) and or group(s) in question.

Then, add a new INCLUDE filter to the package. Alternatively you can also click the “Alerts” button while the performance object is selected to go through a wizard. Either way, the filter should look like the screenshot below:

Specifying the event properties which will trigger the service restart action
Specifying the event properties which will trigger the service restart action

Now, when the performance monitor writes event id 12104 with the above properties, EventSentry will trigger the “Restart Print Spooler Service” action which should reset the memory usage of the process. As an added bonus, an email is also fired off so that the operator knows that EventSentry took the corrective action.

Note: Don’t forget to push the configuration to any remote hosts if necessary.

Now sit back and relax knowing that another thing is taken care of for you.