Monitoring Windows Systems from Linux

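I recently came across an article by an author abroad on monitoring Windows systems. It describes several mechanisms for monitoring Windows from Linux.

At my current company I am mainly responsible for developing network monitoring products, and I strongly agree with the author's views. The article has also helped me a great deal. When monitoring an entire network, however, users are often unwilling to install listening processes that carry network security risks on every monitored host. That is a problem that we, as developers of network monitoring products, need to work hard to solve.

The monitoring techniques I used in the monitoring system I have already built are all mentioned in the article below, including RRD, WMIC, NRPE, SNMP and so on. As I said above, though, from the standpoint of earning users' trust, extending SNMP is the best approach: only SNMP is used on the network, and all other monitoring data can be gathered by having the SNMP extension run the relevant programs, libraries, or scripts on the remote machine. WMI and SSH are also fairly good methods.

Thanks to Eric A. Hall for the nice article. Eric A. Hall's original text follows: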

--------------------

For many people, the idea of using Linux as a low-cost network management platform can be highly seductive. As the argument goes, even the most rudimentary Linux distributions include the components that are needed to build a modest management console, with Net-SNMP extracting management information from devices on the network, RRDtool storing and graphing the collected data, and one of the many Linux-based network management packages providing a Web-based point-and-click interface to the system as a whole.

For the most part, this approach can even work fairly well for basic monitoring tasks, although it also has its limits. In particular, most network devices do not publish all of their available management data through SNMP, and getting to the additional data typically requires the use of an alternative management interface.

This can even be a challenge with Linux itself, since many of the system variables in the /proc filesystem are not published over SNMP by default, while many of the web and email application servers that are commonly used with Linux do not have any SNMP interfaces at all. If you need to capture any of that data, you’ll need to find a way to extract it from log files or process-management tools, and then import it into your management station through a secondary interface.

Quick Guide to WMI

WMI is the primary management interface for the Windows operating system and is also used by many Windows-based applications. However, the management capabilities of WMI go way beyond the features found in traditional network management technologies such as SNMP. Instead, WMI is essentially a systems management service that is capable of manipulating everything from partitions to user accounts, with the ability to report interesting statistics about the managed services as an incidental byproduct of that larger design.

More specifically, WMI is based on the Web-Based Enterprise Management (WBEM) specification, which was designed as an open, consistent, high-level system management service. In the WBEM model, conforming devices use standardized object-based schema to publish common management attributes and data (such as using a “ComputerSystem” class to define common attributes about the host operating system), thereby allowing management stations to develop consistent views of multiple different systems through the common interface. WBEM also allows objects to be edited, and some objects can also be activated through available methods. WMI and WBEM are essentially the same basic technologies for those purposes.

However, the WBEM standards also define common transfer protocols to use for these queries, which are HTTP on TCP port 5988 and HTTP/SSL on TCP port 5989. Windows does not use these protocols for WMI, but instead uses its own DCOM/RPC protocols. This difference also has multiple knock-on effects, such as the different protocols having different authentication mechanisms, which amplifies the basic incompatibility. As a result, implementing WMI on Linux requires implementing the SMB protocol and the DCOM/RPC messaging system before you can even think about implementing WBEM. This combination of technologies has not been available until recently, which is why administrators with mixed environments have had to jump through so many hoops to get at management data in WMI.

But while this can be tedious for systems like Linux, it’s a huge problem for the systems and services that use Windows Management Instrumentation (WMI) as their primary management subsystem, since there has not historically been any way to query WMI from Linux directly (see the Quick Guide to WMI sidebar above). Instead, administrators who have committed themselves to Linux-based management consoles have had to rely on gateway or proxy technologies that query Windows systems for the desired data on behalf of the management station. For Windows-heavy networks, the path of least resistance has simply been to use Windows-based management platforms that can access the data directly, and forgo the Linux management station altogether.

In practice, there are a variety of ways to pull WMI data into Linux-based management tools, many of which are discussed throughout the remainder of this article. However, while most of these tools are useful for extracting some degree of data from WMI, they also have unique operating considerations which can affect their utility in unexpected ways. For example, some of the solutions require new software to be installed at the management station, on a gateway device, or at each of the Windows hosts that will be monitored, and some of them may require modifications to the security permissions on some of those systems as well. Similarly, the different solutions can expose widely varying amounts of data, which not only determines the functionality of a particular package, but also introduces additional security considerations.

The Windows SNMP Agent

The simplest way to get data from WMI into Linux management stations is to go through the SNMP agent that is included with the Windows operating system, although there are some significant limits on the information that is available through this interface. More accurately, the principal restriction with the Windows SNMP agent is that it does not provide much in the way of Windows-specific data.

For example, the HOST-RESOURCES-MIB defines a basic CPU utilization metric that shows the average processor utilization level for the last minute of activity, but that is all it provides. The problem here is that the standardized data just isn’t very useful, especially compared to the CPU utilization data from the WMI performance counters, which tell us how many tasks are currently in the process queue, how much of the load is from system tasks versus user tasks, and much more. Unfortunately, none of the interesting and useful data is available through the native SNMP agent.
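To make the limitation concrete, here is a minimal Python sketch of reading that one-minute CPU figure from the stock agent. It assumes the Net-SNMP snmpwalk utility is installed on the management station and that the Windows agent answers SNMP v2c queries; the community string "public" and the helper function names are placeholders for illustration, not anything the article prescribes.

```python
import subprocess

# hrProcessorLoad from HOST-RESOURCES-MIB: one row per processor, each the
# average percentage of non-idle time over the last minute.
HR_PROCESSOR_LOAD = "1.3.6.1.2.1.25.3.3.1.2"


def snmpwalk_argv(host, community="public"):
    """Build the snmpwalk command line; -Oqv prints bare values only."""
    return ["snmpwalk", "-v2c", "-c", community, "-Oqv", host, HR_PROCESSOR_LOAD]


def parse_loads(output):
    """Turn value-only snmpwalk output (one integer per line) into a list."""
    return [int(line) for line in output.splitlines() if line.strip()]


def average_cpu_load(host, community="public"):
    """Walk the processor table and return the mean load, or None if empty."""
    result = subprocess.run(
        snmpwalk_argv(host, community),
        capture_output=True, text=True, check=True, timeout=10,
    )
    loads = parse_loads(result.stdout)
    return sum(loads) / len(loads) if loads else None
```

A single percentage per processor is everything this table yields, which is the gap the extensions described next try to fill.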

The good news is that some third parties have stepped up to fill this void, and it’s possible to get at the interesting data via SNMP by using aftermarket extensions. The biggest name in this space is SNMP Informant, which is the brand name for a series of products ranging from a basic freeware extension that exposes a limited amount of operating system data all the way up to a commercial multi-extension that exposes huge tracts of data from the operating system and add-on packages like Microsoft Exchange and SQL Server (among others).

SNMP Informant works by strongly data-typing the core performance counter objects, and then mapping predefined OIDs against those objects, thereby allowing administrators to make explicit and clean references to system-level objects in a consistent and reliable way. However, the downside to this approach is that every OID in the extension must be predefined, and SNMP Informant only focuses on several very important areas but does not attempt to expose the entire WMI subsystem to SNMP. This means that if you need to access performance counters or some other piece of WMI data that SNMP Informant does not already publish, you have to look elsewhere.

One interesting extension that has appeared in this space recently is the freeware SnmpTools, which purports to allow mapping administrator-defined OIDs to a variety of data sources, including hard-coded string values, dynamic performance counter objects in WMI, and even dynamic output from text-based scripts and programs. Although the WMI-specific part of the extension is limited to performance counter objects, it’s theoretically possible to query other parts of the subsystem through the use of a local script and return the data through the command interface.

Another angle on this space is that many of the server hardware vendors provide their own SNMP extensions for Windows as a way to publish management information about the server hardware. When added to the other extensions discussed above, this can really round out the data that is available through the stock SNMP agent.

The general downside to this class of tools is that they have to be installed on every Windows system, since they extend the SNMP agent on the local host. However, they do not require any new software on the Linux management station, since they tend to work seamlessly with existing SNMP interfaces. These technologies also do not typically require any significant changes to existing security models, since the native SNMP agent’s authentication services will continue to be used. On the other hand, the lack of encryption in SNMP means that adding extensions that publish more data will result in more data being available for eavesdroppers to discover.

WBEM and WMI

Just as SNMP is generally the preferred management technology for Linux systems to use when querying Windows devices, the other end of the spectrum has it that WMI is the most natural technology to use when managing Windows systems and services. Thus, if Windows cannot be made to speak SNMP adequately, another option is to make Linux speak WMI. But as stated in the introduction, WMI has not historically been available to Linux systems due to a variety of technological issues. However, it has long been possible to make Linux systems speak WBEM, and since the principal difference between WBEM and WMI is the transfer protocol in use (again, see the sidebar), all that’s really needed to expose WMI to Linux is a WBEM listener for Windows and a WBEM client for Linux.

There are a couple of options for running a WBEM listener on Windows. For one, the Open Group has an open-source WBEM implementation called OpenPegasus with a standalone WBEM-to-WMI gateway component called WMI Mapper, which listens for incoming HTTP/WBEM queries on the standardized ports, processes the requests as WMI queries on the destination system, and then returns the answer data to the original requester. Unfortunately, WMI Mapper is only available from the OpenPegasus web site as raw source code, which can be an issue for many organizations. However, the HP Systems Insight Manager server-management toolkit provides a prebuilt version of WMI Mapper as a separate downloadable add-on package.

Another option for adding WBEM capabilities to Windows comes from the IBM Systems Director server-management suite. Specifically, IBM Systems Director provides Windows agents with a comprehensive WBEM implementation that can also be used to query WMI on the local system. These agents can carry a lot of overhead, but they can also add a tremendous amount of manageability to their host systems. Furthermore, organizations that are thinking about using WBEM on their other platforms can look at the Director agents as a way to get a consistent WBEM interface on multiple platforms simultaneously.

Once one of these packages has been installed and configured, the only remaining requirement is a WBEM client for Linux that can generate queries and return formatted data to the management console. There are a handful of WBEM toolkits available for Linux (including tools that are included in the aforementioned WBEM-based server-management consoles), but one simple utility for this purpose is wbemcli from the Standards-Based Linux Instrumentation (SBLIM) suite, which allows you to generate requests for named resources and apply basic formatting to the response data using command-line options. Some of the common Linux distributions already include the sblim-wbemcli package, but it can also be downloaded from SourceForge.
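As a rough illustration, the following Python sketch assembles a wbemcli "enumerate instances" (ei) request against a WBEM listener on the standard HTTP port. The gateway hostname, the account, the root/cimv2 namespace, and the Win32_OperatingSystem class are assumptions chosen for the example; check what your listener actually exposes. The sketch does not escape reserved characters such as "@" or ":" in credentials.

```python
import subprocess


def wbem_object_path(host, cim_class, user, password,
                     namespace="root/cimv2", port=5988, scheme="http"):
    """Build the URL-style object path that wbemcli takes as its argument.

    Assumes the credentials contain no reserved URL characters.
    """
    return f"{scheme}://{user}:{password}@{host}:{port}/{namespace}:{cim_class}"


def wbemcli_argv(host, cim_class, user, password):
    """Command line for 'wbemcli ei', which enumerates instances of a class."""
    return ["wbemcli", "ei", wbem_object_path(host, cim_class, user, password)]


def enumerate_instances(host, cim_class, user, password):
    """Run wbemcli and return its raw text output for later parsing."""
    result = subprocess.run(
        wbemcli_argv(host, cim_class, user, password),
        capture_output=True, text=True, check=True, timeout=30,
    )
    return result.stdout
```

Switching to the HTTP/SSL port would mean passing scheme="https" and port=5989, plus whatever certificate options your wbemcli build requires.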

There is also the somewhat-recent option of running a WMI client directly on Linux, and bypassing all of the intermediate technologies altogether. Specifically, the Linux-based Zenoss management platform provides a utility called wmic, which uses the Samba4 libraries to access Windows and interact with WMI directly. Using the wmic utility, it’s possible to retrieve just about any object from anywhere in WMI (the author uses it to generate custom graphs from Everest sensor readings stored in WMI), although you have to write the queries in WQL (the WMI Query Language, which is very similar to SQL). wmic is packaged as a separate program in some Linux distributions, but is also included in the open-source version of Zenoss Core.
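A minimal Python sketch of driving wmic from a management script might look like this. It assumes the common invocation form (-U user%password, a //host target, and a WQL string) and the default output layout of a "CLASS:" line, a "|"-delimited header row, and one row per instance. The account name and the query are illustrative only, and a password passed on the command line is visible in the local process list.

```python
import subprocess


def wmic_argv(host, user, password, wql):
    """Build the Linux wmic command line for a single WQL query."""
    return ["wmic", "-U", f"{user}%{password}", f"//{host}", wql]


def parse_wmic(output, delimiter="|"):
    """Parse wmic output into a list of dicts, one per returned instance."""
    rows, header = [], None
    for line in output.splitlines():
        if not line.strip():
            continue
        if line.startswith("CLASS:"):
            header = None          # the next non-empty line is a header row
            continue
        fields = line.split(delimiter)
        if header is None:
            header = fields
        else:
            rows.append(dict(zip(header, fields)))
    return rows


def query(host, user, password, wql):
    """Run the query and return the parsed rows."""
    result = subprocess.run(
        wmic_argv(host, user, password, wql),
        capture_output=True, text=True, check=True, timeout=30,
    )
    return parse_wmic(result.stdout)
```

For example, query("win01", "monitor", "secret", "SELECT DeviceID, LoadPercentage FROM Win32_Processor") would return one dictionary per processor, with all values as strings.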

Of all the options for getting management data out of WMI, wmic is clearly the simplest and cleanest way to do it. However, it’s also important to recognize that wmic is only useful for querying WMI, and it is unable to engage or manipulate WMI objects. If you need that level of access then one of the WBEM approaches is going to be your only option for now. Another advantage that WBEM has over wmic is the fact that you can deploy WBEM servers on your Linux systems so that all of your devices are presenting fairly consistent high-level management interfaces. This is not something that should be embraced casually, but it is an important and compelling opportunity that should not be dismissed blithely either.

In terms of deployment, the Director Agent is the only technology mentioned that requires new software to be installed on the Windows systems for the data to be accessible to Linux management consoles. The OpenPegasus/HP WMI Mapper gateway is able to issue WMI requests for objects on local and remote Windows systems, so it only needs to be installed on a single gateway device. wmic does not need any new software on any Windows device. All of these solutions require client-side software of some kind on the Linux management console.

All of these solutions also require changes to Windows’ security permissions in order to function, particularly in the areas of WMI object permissions, and in the case of wmic there are likely to be changes to DCOM permissions as well (the default Windows permissions only allow administrative users to issue remote queries, which is not viable for most networks).

Remote Command Execution

If SNMP and WBEM/WMI all prove to be unsatisfactory for some reason, then it’s time to explore alternative management interfaces. Luckily, this tends to be fairly straightforward on Linux-based management systems; everything uses the command-line to move data around already, so calling an alternative management tool is theoretically as simple as replacing the snmpget command with an appropriate substitute, and ensuring that properly-formatted response strings are generated.

As was alluded to in the introduction, this kind of alternative command model is sometimes needed in order to pull interesting data from the local Linux system itself. For example, if you need to routinely extract management data from a local email server’s log file, then you will probably need to execute an arbitrary script that returns the required data and then pass the formatted results back to the management station. By extension, if you need to gather this data from all of the Linux hosts on your network, then one option worth considering is to simply distribute the script to each of them, and then use a local program to execute the remote scripts as needed. This same model can also be made to work with Windows, and in fact is actively embraced by some popular management toolkits.

However, this model can also be slow, resource-intensive, and even risky, depending on how it is implemented. Simply put, it’s expensive to spawn command processes, and twice as expensive to spawn them locally and remotely. It would be foolish to use this model when some other lighter option was available. On the other hand, if you need to perform a computationally- or I/O-intensive process in order to obtain the desired information (such as parsing through open log files, or gathering multiple pieces of data for comparison purposes) then it can often be faster and cheaper to execute the command on the remote host instead of issuing multiple discrete queries across the network and performing the calculations locally.

As for making this work, modern versions of Windows usually include most of the tools that are needed, although some assembly is often required, with greater amounts of work being needed for progressively older versions of the operating system. For example, modern versions of Windows include the Windows Script Host interpreter and the character-based cscript.exe front-end, which together allow you to execute VBScript files from the command line. More recently, Windows PowerShell also provides a scriptable environment that is accessible directly. Either of these tools can be used to query and even manipulate WMI objects, including WMI objects on other Windows hosts across a network.

There are also multiple options available for executing these scripts remotely. One option here is to install an SSH server on the Windows host, and then use ssh hostname remote-command from the Linux console to execute the desired script, just like you would on UNIX systems. Windows systems with Subsystem for UNIX Applications (SUA)/Services for UNIX (SFU) can download premade SSH servers from Interop Systems. Alternatively, there are multiple SSH servers for Windows available that are based on Cygwin if you prefer to use that environment.
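The ssh hostname remote-command pattern can be wrapped in a few lines of Python on the management station. This is only a sketch: the "monitor" account, the cpu.vbs script name, and the name=value output format are all hypothetical, and it assumes key-based authentication is already set up so that no password prompt appears.

```python
import subprocess


def ssh_argv(host, remote_command, user="monitor"):
    """Build the ssh command line for one remote command.

    BatchMode=yes makes ssh fail rather than prompt for a password,
    which suits unattended polling.
    """
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote_command]


def parse_key_values(output):
    """Parse 'name=value' lines; lines without '=' are ignored."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def poll(host, remote_command):
    """Run the remote script and return its output as a dict of strings."""
    result = subprocess.run(
        ssh_argv(host, remote_command),
        capture_output=True, text=True, check=True, timeout=30,
    )
    return parse_key_values(result.stdout)
```

A call such as poll("win01", "cscript //NoLogo cpu.vbs") would then hand the script's figures to whatever stores or graphs them; how the remote path must be quoted depends on the shell the SSH server provides.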

As another option, the Nagios management platform provides a remote-command interface called NRPE that is essentially a client-server protocol for executing predefined commands. In this model, target systems are set up as NRPE servers, while the management station runs an NRPE client. Commands are defined in a configuration file at each server, and the client connects to the server and instructs it to execute one of those commands, with any additional parameters being supplied in the request. If the command is known to the server, it executes the request and returns the response data, then closes the connection. NRPE is simpler to set up than SSH, and it is also arguably more secure given that the server cannot run arbitrary programs. There is also a prebuilt NRPE server for Windows available.
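On the client side, this exchange is usually driven through the check_nrpe plugin. The Python sketch below assumes check_nrpe is on the PATH (it often lives in a Nagios plugin directory instead) and that the server defines a command named check_cpu, which is a hypothetical name. It relies on the usual Nagios plugin conventions: an exit status of 0 to 3, and an optional "|" separating the status text from performance data.

```python
import subprocess

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def check_nrpe_argv(host, command, port=5666):
    """Build the check_nrpe command line; 5666 is the default NRPE port."""
    return ["check_nrpe", "-H", host, "-p", str(port), "-c", command]


def parse_plugin_result(returncode, stdout):
    """Split the first output line into (state, text, perfdata dict)."""
    first_line = stdout.splitlines()[0] if stdout.strip() else ""
    text, _, perf = first_line.partition("|")
    perfdata = {}
    for item in perf.split():
        label, sep, value = item.partition("=")
        if sep:
            # Keep only the value; drop the warn/crit/min/max fields after ';'.
            perfdata[label] = value.split(";")[0]
    return STATES.get(returncode, "UNKNOWN"), text.strip(), perfdata


def run_check(host, command):
    """Ask the NRPE server to run one predefined command."""
    result = subprocess.run(
        check_nrpe_argv(host, command),
        capture_output=True, text=True, timeout=30,
    )
    return parse_plugin_result(result.returncode, result.stdout)
```

Because the client only names a command, the server's configuration file remains the single place that decides what can actually be executed.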

Naturally, the ramifications for this class of technology can be immense. On the security front, their whole raison d’être is to facilitate remote command-level access to your Windows servers, which frankly should be enough to give anyone pause. As for deployment, these tools require additional software or scripts on the Windows hosts, but since the scripting tools can perform WMI queries over the network you really only need the executables to be installed on one specific server, which can then act as a proxy for all of the other Windows hosts that it has access to (this will require that the scripts have a hostname argument, obviously). Some software may also be required on the Linux management station, such as the NRPE client, or wrapper scripts that call the preferred tool and dispose of the response data.

Administrators who are interested in pursuing remote-command execution techniques should study some of the scripts hosted on monitoringexchange.org, which include a variety of VBScript files for extracting data from WMI. Some of these scripts can be useful for some of the other technologies mentioned earlier as well.
