WSUS Monitoring

by Dave Breslin
September 4, 2012

WSUSDashboard

This dashboard combines event and patch audit data to monitor Windows Server Update Services (WSUS) patch deployments.

This post uses "patch" and "update" interchangeably. The report template titled “WSUS Monitoring Report” complements this dashboard; the report it generates can be given to someone without online access to SecurityCenter’s GUI for detailed offline analysis.

Summary

WSUS monitoring focused on patch deployments is greatly enhanced by complementing WSUS server events with WSUS client events as demonstrated by the dashboard template. WSUS client events provide a real-time view of update failures and successes. Nessus WSUS integration provides an additional measurement and understanding of overall patch deployment effectiveness by including an IT security risk perspective.

Capturing WSUS client update events along with WSUS patch installation data makes it possible, through SecurityCenter dashboards, reporting and real-time querying, to answer these questions:

  • Will a full internal enterprise vulnerability scan using SecurityCenter catch WSUS in the middle of a patch deployment?
  • How long does a successful patch deployment take?
  • Do we fail compliance standards like PCI DSS (30 days) for security patch deployments because of how long a successful client rollout takes?
  • Are we going to fail compliance vulnerability scans like those defined by PCI DSS because of delays in WSUS patch deployment cycles? 
  • Is network bandwidth an issue for WSUS patch deployments, with some network locations taking far longer than others, and do we need to consider additional WSUS servers?
  • Does the current patch deployment look typical or is it having unusually long delays?
  • Which WSUS clients have consistent issues with updates across multiple deployments?
  • What is the current risk due to an ongoing patch deployment?
  • Where on the network did the most patch installation failures occur?
  • What is the status of the current patch deployment (leveraging real-time metrics)?
  • How did previous WSUS patch deployments perform?
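
As a rough illustration of the rollout-duration and 30-day questions above, here is a minimal sketch in Python. It assumes the client success events have already been exported as (client, timestamp) pairs. The host names, timestamps and function names are hypothetical and are not part of SecurityCenter or LCE.

```python
from datetime import datetime, timedelta

# Hypothetical sample data: release time and (client, successful install time).
RELEASED = datetime(2012, 9, 1, 8, 0)
SUCCESS_EVENTS = [
    ("host-a", datetime(2012, 9, 1, 20, 15)),
    ("host-b", datetime(2012, 9, 2, 3, 40)),
    ("host-c", datetime(2012, 9, 6, 11, 5)),
]

def rollout_duration(released, success_events):
    """Time from release until the last client reported a successful install."""
    if not success_events:
        return None
    last_success = max(ts for _, ts in success_events)
    return last_success - released

def within_window(duration, max_days=30):
    """True when the rollout finished inside the allowed window (30 days here)."""
    return duration is not None and duration <= timedelta(days=max_days)

duration = rollout_duration(RELEASED, SUCCESS_EVENTS)
print(duration, within_window(duration))
```

The same calculation run per network location would give a first indication of whether some sites take far longer than others.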

 

Details

The dashboard primarily operates by using Windows events that have been collected in LCE for a WSUS server and its clients. It also includes a table component (bottom left) that leverages Nessus WSUS integration to indicate whether patches (Windows updates) have been successfully applied, focusing on unapplied updates for security bulletins that have known exploits.

The "WSUS Server and Client Event Counts" matrix component (top left) provides Windows event counts for a WSUS server and its clients accumulated over the past 7 days.

WSUSMatrix

Windows events can be collected remotely using LCE WMI monitoring agents or by installing an LCE client on a Windows host. The WSUS server event counts focus on the errors and warnings that Ron highlighted in the Tenable discussion post “Support for WSUS Logs”: WSUS server event IDs 13001 and 13021. The threshold for triggering event ID 13001 is configured using the WSUS health monitoring parameter “InstallUpdatesInPercent”. Two threshold values for triggering event ID 13021 are configured using the WSUS health monitoring parameters "SilentClientsInDays" and "SilentClientsInPercent". Here are the default values for these parameters:

HealthMonitoringParms

In short, the WSUS server event counts are for events that require special attention after updates have been released. For further details see the very informative and concise Microsoft TechNet article titled "Managing WSUS 3.0 from the Command Line".
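
To make the threshold idea concrete, here is a minimal sketch of how the two conditions might be evaluated. It is an interpretation based on the parameter names, not WSUS internals: the function names, the strict "greater than" comparison and the sample numbers are all assumptions, and the values used are not the WSUS defaults.

```python
def failure_threshold_exceeded(failed_installs, attempted_installs,
                               install_updates_in_percent):
    """Assumed 13001-style check: share of failed installs above the threshold."""
    if attempted_installs == 0:
        return False
    failed_percent = 100.0 * failed_installs / attempted_installs
    return failed_percent > install_updates_in_percent

def silent_threshold_exceeded(days_since_contact, silent_clients_in_days,
                              silent_clients_in_percent):
    """Assumed 13021-style check: share of clients silent for too many days."""
    if not days_since_contact:
        return False
    silent = sum(1 for days in days_since_contact if days > silent_clients_in_days)
    silent_percent = 100.0 * silent / len(days_since_contact)
    return silent_percent > silent_clients_in_percent

# Illustrative values only: 30 of 100 installs failed, threshold 25 percent.
print(failure_threshold_exceeded(30, 100, 25))
# Two of four clients silent for more than 30 days, threshold 40 percent.
print(silent_threshold_exceeded([1, 2, 45, 60], 30, 40))
```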

The WSUS client event counts provide accumulated totals for updates that have successfully installed or failed, based on WSUS client events. They should not be confused with similar host-based counts provided by the WSUS server. However, there should be a high level of correlation between the two.

Capturing WSUS client update events has significant advantages, including identifying problem WSUS clients that fail installations repeatedly across many WSUS update releases. Analyzing the events over time during an update release can highlight areas of the internal network(s) that lag unacceptably in applying updates. Client update events are captured in real time, whereas a WSUS server relies on periodic reports from its clients to provide similar information. This lag in WSUS server and client communication makes it challenging to report the current installation status of a high-priority patch released across an entire enterprise in response to an incident or immediate threat.
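
Identifying clients that fail across many releases can be sketched as below. This assumes the failure events have been exported and each one tagged with the release it belongs to. The host names, release labels and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical failure events: (client, update release label).
FAILURES = [
    ("host-a", "2012-07"), ("host-a", "2012-08"), ("host-a", "2012-09"),
    ("host-b", "2012-09"), ("host-b", "2012-09"),
    ("host-c", "2012-08"), ("host-c", "2012-09"),
]

def repeat_offenders(failures, min_releases=3):
    """Clients with at least one failure in min_releases or more distinct releases."""
    releases_by_client = defaultdict(set)
    for client, release in failures:
        releases_by_client[client].add(release)
    return sorted(
        client
        for client, releases in releases_by_client.items()
        if len(releases) >= min_releases
    )

print(repeat_offenders(FAILURES))
```

Counting distinct releases rather than raw failure events keeps a client that failed many times in one release (host-b here) from looking like a chronic problem.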

The matrix component displays accumulated event counts for the past 7 days. It can be configured for a longer or shorter time frame, or columns can be added to display accumulated counts for additional time frames, for example:

  WSUSMatrix2

The “WSUS Server and Client Failure Events (Past 7 Days)” line graph component (top right) plots WSUS failure events over time.

WSUSGraphFailures 

The graph shows the difference between reporting update installation failures using WSUS client events and waiting for the corresponding WSUS server event, which is triggered only when a failure threshold is reached. However, we do expect the two to correlate closely in time unless there is some breakdown or extreme time lag in WSUS server and client communication:

WSUSGraphFailures

Clicking the Browse Component Data icon allows the dashboard user to drill into the details of the client update failures to answer questions like:

  • Which clients had the most failed installations?
  • Where on the network did the most failures occur?
  • What are the detailed messages provided by the Windows events?
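
Outside the GUI, the first two questions amount to simple grouping. The sketch below assumes the failure events have been exported as a list of client IP addresses, one entry per failure. The addresses, the /24 grouping and the function names are illustrative assumptions.

```python
from collections import Counter
import ipaddress

# Hypothetical export: one client IP per failed installation event.
FAILED_IPS = ["10.1.1.5", "10.1.1.5", "10.1.1.9", "10.2.3.4", "10.1.1.5", "10.2.3.7"]

def failures_by_client(ips):
    """Clients ordered by number of failed installations, highest first."""
    return Counter(ips).most_common()

def failures_by_subnet(ips, prefix=24):
    """Failure counts grouped by the subnet each client belongs to."""
    counts = Counter(
        str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False)) for ip in ips
    )
    return counts.most_common()

print(failures_by_client(FAILED_IPS)[0])
print(failures_by_subnet(FAILED_IPS))
```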

WSUSGraphFailuresBrowseIcon

 

EventDrillIn1

Which clients had the most failed installations?

 

EventDrillIn2

Where on the network did the most failures occur?

 

EventDrillIn3

What are the detailed messages provided by the Windows update failed events?

 

The WSUS health monitoring parameter “IntervalsInMinutes” controls when the WSUS server will report events. Because event reporting can be disabled by modifying this parameter, it is wise to know how it has been configured when monitoring WSUS server events.

The “WSUS Clients – Windows Update Successful Events (Past 7 Days)” line graph (bottom right) was the main driver behind the entire dashboard design. It shows how updates released through WSUS propagate across the enterprise over time, giving a picture of how long a successful rollout takes.

WSUSClients

On releasing an update (patch) using a WSUS server we’d intuitively expect to see a skewed bell curve for its successful installation as clients download the patch and then install it.

WSUSClients

The example above runs counter to what we’d expect to see if there were no issues with patch distribution and installation. Understanding the two separate time frames of highly compressed update activity would require investigation using some of the drill-down analysis already presented in this post. Additionally, the client configuration dictates how automated the download and installation of updates is, which strongly influences the graph if manual intervention is required to download or install updates.
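
The shape of such a graph can also be examined numerically. The following minimal sketch buckets success events by hour and splits them into bursts of activity separated by quiet gaps. The timestamps, the six-hour gap and the function names are assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical successful-install timestamps showing two separated bursts.
SUCCESS_TIMES = [
    datetime(2012, 9, 1, 9, 5), datetime(2012, 9, 1, 9, 40),
    datetime(2012, 9, 1, 10, 10),
    datetime(2012, 9, 3, 14, 0), datetime(2012, 9, 3, 14, 30),
]

def hourly_counts(timestamps):
    """Number of events per clock hour, the raw data behind a line graph."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)

def activity_bursts(timestamps, max_gap=timedelta(hours=6)):
    """Group events into bursts; a gap longer than max_gap starts a new burst."""
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    # Summarize each burst as (first event, last event, event count).
    return [(burst[0], burst[-1], len(burst)) for burst in bursts]

for start, end, count in activity_bursts(SUCCESS_TIMES):
    print(start, end, count)
```

More than one burst for a single release is a cue to drill into which clients and network locations make up each one.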

ClientSettings

A good research article for information on the WSUS client (Windows Update) settings can be found at this link. For useful information on the WSUS client (Windows Update) utility “wuauclt” and its switches, which help in troubleshooting client update issues, see the Microsoft TechNet appendix titled “Appendix H: The wuauclt Utility”. Additionally, become familiar with the log file “C:\Windows\WindowsUpdate.log” that can be found on every WSUS client.

If the ratio of successful to failed installations stays roughly constant (acceptable or not), then failures will be most visible around time frames with large numbers of successful updates:
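
A small worked example of this effect, using made-up numbers: both days below have the same failure ratio, but the busier day produces ten times as many failure events. The event labels "success" and "failure" and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import date

DAY_1, DAY_2 = date(2012, 9, 1), date(2012, 9, 2)

# Hypothetical events: a busy day and a quiet day with the same failure ratio.
EVENTS = (
    [(DAY_1, "success")] * 40 + [(DAY_1, "failure")] * 10
    + [(DAY_2, "success")] * 4 + [(DAY_2, "failure")] * 1
)

def daily_failures(events):
    """Map each day to (failure count, failures as a fraction of all installs)."""
    totals = defaultdict(lambda: {"success": 0, "failure": 0})
    for day, outcome in events:
        totals[day][outcome] += 1
    return {
        day: (c["failure"], c["failure"] / (c["success"] + c["failure"]))
        for day, c in sorted(totals.items())
    }

for day, (failures, ratio) in daily_failures(EVENTS).items():
    print(day, failures, ratio)
```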

TimeCorrelationAcrossUpdatesFails

It's very easy to modify both line graph dashboard components to display much larger time frames.

Timeframe

Also, note in the GUI snippet above the ability to set an absolute time frame. The line graph components could be copied to create additional components, with their time frames modified to display a previous patch release for comparison.

The "Missing Security Updates (with Known Exploits) Reported by WSUS Using Nessus" table component (bottom left) primarily serves to indicate the vulnerability risk that update failures or delays are creating in terms of exploitability.

WSUSMissingSecurityUpdates

Clicking the Browse Component Data icon allows the dashboard user to drill into the details of the missing updates.

WSUSMissingSecurityUpdates2

 

DrillIn1

All the missing patches (dashboard component lists first 999)

 

DrillIn2

The list of hosts with missing patches

 

DrillIn3

First Discovered Date, Last Observed Date, Host Identifying Info, Synopsis, Description & Resolution

 

DrillIn4

Risk Factor, STIG Severity, CVSS Score, Plugin Output, CPE, CVE & BID

 

DrillIn5

Cross References, Vulnerability Publication Date, Patch Publication Date, Plugin Publication Date, Plugin Last Modified Date, Public Exploit Available and Exploitable With

 

There are advantages and disadvantages to using the Nessus WSUS integration to report missing security-related patches. One disadvantage is the time lag in WSUS server and client communication for reporting installed update status. As mentioned previously, this makes it challenging to report the current status of an enterprise patch rollout when an update has been released urgently to address an incident or immediate threat. The existing component (bottom left) could be modified to report the same information through Nessus credentialed scanning of each WSUS client (Windows host) instead.

The dashboard components do not attempt to distinguish WSUS clients (Windows hosts) that use a particular WSUS server from those that do not. In a large enterprise there may be one or more WSUS servers and a mixture of Microsoft and third-party patching systems. SecurityCenter’s asset grouping functionality can be used to group clients by various host and domain identifiers; the groups can then be used in component filters, or applied to an entire dashboard through SecurityCenter’s user access control. Across a very large enterprise, many distributed LCE systems might mirror the architecture of many distributed WSUS servers (perhaps because of bandwidth constraints) while still allowing results to be queried centrally through a single SecurityCenter.

For monitoring more than one WSUS server and its clients it is recommended that a unique dashboard per server be created using the template; however, with some customization and additional components there’s no reason why one dashboard could not be used to monitor multiple WSUS servers and clients.

As a final note, the LCE normalized event "Windows-UpdateClient_Installation_Ready" was not included in any of the dashboard components. It is generated by WSUS client events that indicate an update has been downloaded but has yet to be installed. This event will be very helpful when analyzing delays in deploying patches.
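
One way this event could feed such an analysis is sketched below. It pairs each "ready" event with the later successful install for the same client and update, then reports the delay. Only the "Windows-UpdateClient_Installation_Ready" name comes from LCE. The success label, host names, update identifier and function name are hypothetical.

```python
from datetime import datetime, timedelta

READY = "Windows-UpdateClient_Installation_Ready"
SUCCESS = "install_success"  # hypothetical label for the successful-install event

# Hypothetical exported events: (client, update, event name, timestamp).
EVENTS = [
    ("host-a", "KB0000001", READY, datetime(2012, 9, 1, 9, 0)),
    ("host-a", "KB0000001", SUCCESS, datetime(2012, 9, 1, 9, 30)),
    ("host-b", "KB0000001", READY, datetime(2012, 9, 1, 10, 0)),
    ("host-b", "KB0000001", SUCCESS, datetime(2012, 9, 3, 10, 0)),
    ("host-c", "KB0000001", READY, datetime(2012, 9, 1, 11, 0)),
]

def install_delays(events):
    """Return (delay per client/update, sorted list still waiting to install)."""
    ready_at, delays = {}, {}
    for client, update, name, ts in sorted(events, key=lambda e: e[3]):
        key = (client, update)
        if name == READY:
            ready_at.setdefault(key, ts)  # keep the first time it became ready
        elif name == SUCCESS and key in ready_at:
            delays[key] = ts - ready_at.pop(key)
    return delays, sorted(ready_at)

delays, pending = install_delays(EVENTS)
slow = {key: d for key, d in delays.items() if d > timedelta(days=1)}
print(slow, pending)
```

Clients that appear in the pending list have downloaded the update but not installed it, which may point to client settings that require manual intervention.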