Managing PC Servers Using TME 10 NetFinity

By Craig Elliott

'Ensuring that your PC servers are operating efficiently is very important to the client/server architecture. TME 10 NetFinity enables you to ensure efficient operation by monitoring key indicators such as CPU and memory utilization, system temperature and voltage, and disk usage. Additionally, you can monitor the system itself, as well as its critical applications.'

In the move toward client/server architecture, availability of the PC environment and local area networks becomes increasingly critical. Maintaining a high-availability network requires using the best components: the servers, network devices such as routers, and even the network itself. Using reliable hardware such as IBM PC Servers can minimize unexpected outages. Implementing TME 10 NetFinity's extensive management capabilities further enhances your system's availability, enabling you to move from a reactive to a proactive systems management mode.

Overview
Does the name NetFinity sound familiar? Perhaps, because it was originally released as IBM NetFinity. Developed by the IBM PC Company to manage IBM hardware, NetFinity performed extensive asset management (of hardware and software inventory) as well as other tasks such as monitoring critical system resources and automating responses to alerts. As IBM began renaming its products to be consistent with the SystemView brand, NetFinity became PC SystemView. After IBM's acquisition of Tivoli, the product has again become NetFinity, but now it is TME 10 NetFinity, to align with the new TME 10 brand.

NetFinity supports a variety of platforms to enable management in the majority of customers' environments. There are Clients, called Services in NetFinity terms, for Windows 3.x, Windows 95, Windows NT, OS/2, and Novell NetWare servers. There are also Managers on the Windows 3.x, Windows 95, Windows NT, and OS/2 platforms. This enables you to implement NetFinity on your existing platform rather than having to switch platforms to accommodate your management application. Another bonus is that the functionality is the same across all platforms. Therefore, you have the same management capabilities from a Windows 3.x system as you do from an OS/2 system. Also, the user interface is consistent across all platforms, eliminating the learning curve in a multi-platform environment.

When installing NetFinity, the installation program queries the hardware and only installs the components which are applicable to your system. Therefore, not all systems will have the same program icons in the graphical user interface. Some tools are only applicable to a Manager system. Therefore, you may not have the same icons present on the Client system as you do on the Manager system.

Common Management Tools
Although originally designed for IBM PCs, NetFinity can provide valuable management tools regardless of the hardware or operating system you use. These tools will allow you to monitor your remote systems to ensure they are operating adequately. Here are some of the common management tools and the benefits they provide:

Remote System Manager
The Remote System Manager is your window into managing the systems in your environment. It allows the grouping of systems into logical collections of like systems which will be managed similarly. For example, you may have one group for all servers, one for all workstations, and another for all systems within a specific department. These groups are created based on system keywords specified during configuration. Groups are created using any combination of the eight keywords specified for each system. You can also limit the systems added to each group based on the operating system and/or protocol being used on the system. This gives you the capability to create a group of not only systems in a specific department, but only those running a specific operating system and communicating via a specific protocol.
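
The grouping rule described above can be pictured with a short sketch. The Python below is purely illustrative and is not NetFinity code: the `System` record, the `in_group` function, the sample system names, and the decision to require every group keyword to be present are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Hypothetical record for one managed system."""
    name: str
    keywords: frozenset  # keywords assigned during configuration
    os: str
    protocol: str


def in_group(system, keywords, os=None, protocol=None):
    """Return True if the system carries every group keyword and
    passes the optional operating system and protocol filters."""
    if not set(keywords) <= system.keywords:
        return False
    if os is not None and system.os != os:
        return False
    if protocol is not None and system.protocol != protocol:
        return False
    return True


systems = [
    System("SRV01", frozenset({"SERVER", "ACCOUNTING"}), "OS/2", "NetBIOS"),
    System("WS17", frozenset({"WORKSTATION", "ACCOUNTING"}), "Windows 95", "TCP/IP"),
    System("SRV02", frozenset({"SERVER", "SALES"}), "Windows NT", "TCP/IP"),
]

# Accounting systems that run OS/2 and communicate over NetBIOS.
accounting_os2 = [
    s.name for s in systems
    if in_group(s, {"ACCOUNTING"}, os="OS/2", protocol="NetBIOS")
]
# All servers, regardless of operating system or protocol.
all_servers = [s.name for s in systems if in_group(s, {"SERVER"})]
```

Leaving the operating system and protocol filters unset makes a group depend on keywords alone, which mirrors the optional nature of those limits in the text.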

The availability of a system can be determined simply by viewing the group within the System Group Manager. (System Group Manager is the title of the window opened when you double-click on the Remote System Manager icon.) Systems which have gone off-line (i.e., have been stopped) are represented by a greyed-out icon. Operational systems have a normal icon which is also representative of the system type. Optionally, using the Alert Manager you can add an overlay to the system’s icon to indicate an operational problem or other condition which requires your attention.

In addition to just changing the icon, you can configure the Remote System Manager to notify you by generating an alert when a system comes on-line or goes off-line. You can specify how often you want to check the operational status, on a per-system or per-group basis. Also, using the Remote System Manager you can reboot a remote system or power one on using the Wake on LAN feature of the new IBM adapters.

Critical File Monitor
Regardless of the application or operating system, there are certain files which affect how it is configured. For critical applications, including network operating systems, changes to these configuration files can result in system outages.

The Critical File Monitor allows you to monitor these critical system files for changes and generate an alert when a change is detected. These critical files are selected by the user. Therefore, any file on any system can be monitored for changes. For example, on an IBM LAN Server system, the user could select to monitor the CONFIG.SYS, IBMLAN.INI, PROTOCOL.INI or any other file and receive a notification when it is changed. This would warn you before the system is rebooted, giving you the opportunity to verify the changes and prevent the system from becoming unusable.
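
One way to picture change detection is to compare a stored fingerprint of each watched file with a freshly computed one. The sketch below is an assumption about how such a check could work, not a description of NetFinity's internals: the `fingerprint` and `changed_files` helpers, the use of SHA-256, and the temporary stand-in for CONFIG.SYS are all hypothetical.

```python
import hashlib
import os
import tempfile


def fingerprint(path):
    """Return a SHA-256 digest of the file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def changed_files(baseline):
    """Return the paths whose current digest differs from the baseline.

    baseline maps each watched path to the digest recorded earlier.
    """
    return [path for path, digest in baseline.items()
            if fingerprint(path) != digest]


# Demonstration using a temporary stand-in for a file such as CONFIG.SYS.
with tempfile.TemporaryDirectory() as tmp:
    config = os.path.join(tmp, "CONFIG.SYS")
    with open(config, "w") as f:
        f.write("FILES=20\n")

    baseline = {config: fingerprint(config)}
    before = changed_files(baseline)      # nothing has changed yet

    with open(config, "a") as f:
        f.write("BUFFERS=30\n")           # simulate an edit to the file

    after = changed_files(baseline)
    alerts = [os.path.basename(p) for p in after]
```

In a real deployment the list of changed files would feed an alert rather than a Python list, and the baseline would be refreshed once the change had been reviewed.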

File Transfer
Once you are notified that a critical file has been changed, then what? How can you replace it or correct it? The File Transfer utility provides a graphical interface you can use to copy a file from the Manager system to the remote system. This could be used to copy a back-up of the changed configuration file from the Manager to the Client, thus preventing an inoperable system upon reboot.

In addition to transferring any file between the NetFinity Manager and NetFinity Client, the File Transfer utility can also transfer entire directories, or delete files from the local or remote system. This also makes it a useful tool for making back-ups of configuration files.

Event Scheduler
If you use the File Transfer tool to replace changed files, how do you get the back-up copy of the file to the Manager system to begin with? Well, you can always use File Transfer to copy it from the Client to the Manager. However, that requires connecting to each system and manually copying the desired files. Since you typically want to back-up these files periodically, you may wish to automate this process. Using the Event Scheduler, you can periodically perform the file transfer unattended to ensure you have the latest copies of these configuration files.

In addition to transferring files, you can also use the Event Scheduler to schedule other NetFinity services such as rebooting the system, executing a command, or collecting software and hardware information. This action can be scheduled for a single system, multiple systems, or an entire group of systems concurrently.

Remote Session
Instead of replacing the changed file, you may wish to view the changes or even edit the file to correct them. The Remote Session tool enables you to perform that function remotely.

The Remote Session tool establishes a remote command line session with any Client system, regardless of the operating system. From here, you can issue operating system commands such as TYPE to view the contents of the file. Additionally, you can execute a text-based application such as EDIT or TEDIT to make corrections to the files.

Applications other than editors can be executed as well, as long as they are text-based. On OS/2 systems, Presentation Manager applications can be remotely launched in a separate session using the START command. Since they are launched in a separate window, and since they are graphical rather than text-based, you cannot interact with them. When opening a Remote Session on a NetWare server, you are actually establishing a Remote Console session with that server. From there, you can perform any function as if you were sitting at the console.

Screen View
Sometimes executing text-based commands isn’t enough. It may be necessary to view the screen on the remote system, especially when a graphical application is being executed. Using the Screen View tool, you can take a snapshot of a desktop to view exactly what is on the screen. This is very useful when performing remote problem determination.

For example, suppose a user calls the help desk asking how to execute a program. The help desk staff can take a snapshot of the screen and instruct the user what commands to type or icons to execute. After the user completes the instructions, an updated snapshot can be captured to ensure that the instructions completed successfully. This cycle of giving instructions and updating the screen capture can be repeated until the application is successfully executing.

Another way to use the Screen View tool is capturing error information. If a user calls the help desk to report an error, the help desk personnel can capture the screen containing this information and optionally save it as a bitmap for future reference.

Process Manager
Just because a system is running doesn’t mean that it is available to perform its intended function. This may be the case when the operating system is running but a critical application has failed. The Process Manager allows you to monitor individual processes to ensure that they are executing and to optionally generate an alert if they are not. For example, you may wish to monitor a LAN Server service such as LSSERVER to verify it is running and to generate an alert if it stops. Critical application processes such as those within LAN Server are typically started automatically when the system is restarted.

Process Manager can also monitor these processes to ensure that they are started within a specific amount of time after the system is restarted. The amount of time given after the system is started for the application to be started is specified by the user.

Finally, there are some processes you do not want to execute. These include games - specifically those which run over the network and generate a lot of traffic. You can use the Process Manager to monitor a system for one of these specific processes and to generate an alert when one starts running.
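
The three kinds of check described in this section (a process that must be running, a process that must start within a set time after restart, and a process that should never run) can be sketched together. This is a hypothetical illustration, not the Process Manager's actual logic: the `check_processes` function, its parameters, and the process name NETGAME are invented for the example, and the grace period is simplified to a single value for all required processes.

```python
def check_processes(running, required, forbidden,
                    seconds_since_boot, start_deadline):
    """Return alert messages for one system.

    running            -- set of process names currently executing
    required           -- processes that must be running
    forbidden          -- processes that must not be running
    seconds_since_boot -- time elapsed since the system restarted
    start_deadline     -- seconds allowed for required processes to start
    """
    alerts = []
    # Required processes are only reported once the start-up grace
    # period has elapsed.
    if seconds_since_boot >= start_deadline:
        for name in sorted(required - running):
            alerts.append(f"{name} is not running")
    # Unwanted processes are reported as soon as they are seen.
    for name in sorted(forbidden & running):
        alerts.append(f"{name} should not be running")
    return alerts


# Ten minutes after restart, LSSERVER is missing and a game is running.
late = check_processes({"NETGAME"}, {"LSSERVER"}, {"NETGAME"},
                       seconds_since_boot=600, start_deadline=300)
# One minute after restart, LSSERVER is still within its grace period.
early = check_processes({"NETGAME"}, {"LSSERVER"}, {"NETGAME"},
                        seconds_since_boot=60, start_deadline=300)
```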

System Information
An important function when it comes to managing your systems is tracking exactly what hardware is installed where. This becomes critical when installing new software which has specific hardware requirements, when adding new users to a network which will result in increased usage of disk space or memory by the server, or at the end of the year when assets must be counted for tax purposes. In each case, having an updated list of your assets is invaluable.

The System Information tool collects detailed information about your hardware and displays it graphically. This includes things like the amount, size, and speed of SIMMs installed, number and size of disk drives, adapters in expansion slots, etc. Optionally, this information can be exported to a Lotus Notes or DB2/2 database for future reporting purposes. If you don’t have Lotus Notes or DB2/2, you can save this information to an ASCII file or to a formatted file for importing into another database.

Software Inventory
Having a complete software inventory is as important as your hardware inventory, especially when planning a roll-out of new software.

The Software Inventory tool allows you to scan your system to see what software is installed. It searches your disk for applications registered in a database to perform the detection. Currently, it has the capability to detect more than 2000 applications. You can also import software dictionaries from the Software Publishers Association SPAudit or from Qsoft to use during the search. Optionally, you can add your own software to the dictionary, giving you the capability to search for custom-written applications.

Applications are identified by a file name (or group of files), the date, and size. Optionally, the software can be grouped based on application type, and searches made only for a specific type of application. The results of the system scan can be exported to a Lotus Notes or DB2/2 database, exported to an ASCII file, or formatted for importing into another database.
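
Dictionary-based detection of this kind can be sketched as a lookup keyed on file name, date, and size. The example below is an assumption for illustration only: the `DICTIONARY` entries, the application names, and the `scan` function are hypothetical and do not reflect the format of the NetFinity software dictionary.

```python
from datetime import date

# Hypothetical dictionary: (file name, date, size in bytes) identifies
# an application and the type it belongs to.
DICTIONARY = {
    ("WP.EXE", date(1996, 3, 1), 251904):
        ("Example Word Processor 2.0", "Word Processing"),
    ("SHEET.EXE", date(1995, 11, 15), 412160):
        ("Example Spreadsheet 1.1", "Spreadsheet"),
}


def scan(files, app_type=None):
    """Return the applications recognized among the given files.

    files    -- iterable of (file name, date, size) tuples found on disk
    app_type -- if given, report only applications of that type
    """
    found = []
    for name, modified, size in files:
        entry = DICTIONARY.get((name.upper(), modified, size))
        if entry and (app_type is None or entry[1] == app_type):
            found.append(entry[0])
    return found


files_on_disk = [
    ("wp.exe", date(1996, 3, 1), 251904),       # matches name, date, size
    ("WP.EXE", date(1996, 3, 1), 100),          # same name, wrong size
    ("SHEET.EXE", date(1995, 11, 15), 412160),
]

everything = scan(files_on_disk)
spreadsheets_only = scan(files_on_disk, app_type="Spreadsheet")
```

Matching on all three attributes is what lets a scan tell one version of a program from another that shares the same file name.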

System Profile
To supplement the information collected about your system, NetFinity provides the System Profile tool to allow you to record user- and hardware-specific information. This includes data such as user name, department, and hours worked as well as hardware make, model, and serial numbers. Additionally, the System Profile tool provides space for entering any miscellaneous information about the system. This is a handy place to keep track of information such as NetBIOS names, Locally Administered Addresses, TCP/IP host names, etc. This information can also be exported to a Lotus Notes or DB2/2 database for future reporting purposes.

System Monitor
Just because your system is running doesn’t mean that it is operating optimally. The System Monitor tool allows you to monitor critical system resources for potential problems. This includes things like CPU Utilization, Memory Usage, and Disk Usage. Thresholds can be configured by the user, and alerts generated when those values are exceeded. This enables notification of exceptions such as CPU utilization being too high or the amount of available disk space being too low. This data can also be collected and exported to a Lotus Notes or DB2/2 database.
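
Threshold checking of this kind reduces to comparing each reading against a configured upper or lower bound. The following is a minimal sketch under stated assumptions: the monitor names, the sample values, and the `check_thresholds` function are hypothetical and are not taken from the System Monitor itself.

```python
def check_thresholds(readings, thresholds):
    """Return (monitor, direction, value) for every reading out of bounds.

    readings   -- {monitor name: current value}
    thresholds -- {monitor name: (low, high)}; use None for an unused bound
    """
    alerts = []
    for monitor, (low, high) in thresholds.items():
        value = readings.get(monitor)
        if value is None:
            continue  # no reading collected for this monitor
        if high is not None and value > high:
            alerts.append((monitor, "above", value))
        elif low is not None and value < low:
            alerts.append((monitor, "below", value))
    return alerts


thresholds = {
    "CPU Utilization %": (None, 90),     # alert when too high
    "Disk Space Free MB": (50, None),    # alert when too low
    "Memory Usage %": (None, 85),
}
readings = {
    "CPU Utilization %": 97,
    "Disk Space Free MB": 40,
    "Memory Usage %": 60,
}

exceptions = check_thresholds(readings, thresholds)
```

Supporting both a low and a high bound covers the two cases named in the text: utilization that climbs too high and free disk space that falls too low.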

Monitoring system operation and receiving alerts isn’t sufficient for maintaining system availability. You must be able to automate responses to the alerts you receive. The Alert Manager provides the capability to process these alerts once they are received. There is a variety of proactive actions which can be executed automatically when an alert is received. These include logging the alert, executing commands, dialing a pager, or forwarding the alert to another NetFinity Manager or to an SNMP Manager. This enables error correction activities to be initiated before users begin calling the help desk.

Alert processing is very specific. You can configure actions based on a number of qualifying factors.

These include the following:
 * Alert Type
 * Severity
 * The NetFinity application which generated the alert
 * Application Alert Type
 * Sender ID

Using these factors, you can configure a specific resolution action for each system, alert type, or severity. For example, if Server A has a performance problem, you may want to dial the administrator’s pager. If Server B has the same performance problem, you can send an E-Mail message to the back-up administrator.
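
Matching alerts to actions on these qualifying factors can be sketched as a small rule table. The code below is a hypothetical illustration rather than the Alert Manager's real configuration model: the `Alert` and `Rule` classes, the `actions_for` function, the field names, the numeric severity scale, and the sender IDs are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    alert_type: str       # e.g. "Performance"
    severity: int         # assumed scale; lower numbers are more severe
    application: str      # NetFinity application that generated the alert
    app_alert_type: str   # application-specific alert type
    sender_id: str        # system that sent the alert


@dataclass
class Rule:
    action: str
    # Factor name -> required value; factors left out match anything.
    conditions: dict = field(default_factory=dict)


def actions_for(alert, rules):
    """Return the actions of every rule whose conditions all match."""
    return [
        rule.action for rule in rules
        if all(getattr(alert, factor) == value
               for factor, value in rule.conditions.items())
    ]


rules = [
    Rule("dial administrator's pager",
         {"sender_id": "SERVER_A", "alert_type": "Performance"}),
    Rule("send e-mail to back-up administrator",
         {"sender_id": "SERVER_B", "alert_type": "Performance"}),
    Rule("log alert"),  # no conditions, so it applies to every alert
]

alert_a = Alert("Performance", 2, "System Monitor", "CPU", "SERVER_A")
alert_b = Alert("Performance", 2, "System Monitor", "CPU", "SERVER_B")

actions_a = actions_for(alert_a, rules)
actions_b = actions_for(alert_b, rules)
```

The same performance problem produces a different response depending on the sender, while the unconditional rule ensures every alert is still logged.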

Web Manager
Some organizations may need to remotely perform some systems management activities. Other organizations may need to give management capabilities to people who don’t necessarily need or want the NetFinity Manager code installed on their system.

NetFinity provides the Web Manager to accomplish this task. The Web Manager is a mini HTTPD server which allows any of the NetFinity tools to be served remotely over an internet or intranet and executed by any system using a Web browser. This provides a method of creating an administrator workstation without having to install the NetFinity product on it. A user with a Web browser simply connects to the Managing system’s Web Manager, and from there can manage any of the Clients on the network. To execute the Remote Session tool, the Web browser must support Java.

Serial Control
Some organizations might need remote management capabilities, but lack a connection between two sites, as in small branch offices with stand-alone networks under the responsibility of the main office. In this environment, you can use the Serial protocol of NetFinity to connect from one NetFinity Manager system to another over an async line. The Manager at the main office can connect to the Manager at the remote branch, then pass through to the clients needing to be managed. The Serial Control tool allows you to configure this connection, including the phone numbers, user IDs, and passwords to be used for authentication.

Specific Management Tools
In addition to the general management capabilities for all systems, NetFinity also provides specific management tools for systems like the PC Servers which implement enhanced architectural designs. This includes enhanced memory monitoring, monitoring reference partitions on the hard disk, and collecting additional hardware information from systems with either a Micro Channel or PCI bus.

Power-On Error Detect
On the non-array models of PC Servers, hardware configuration is performed using utilities stored on a special partition on the hard disk, typically referred to as a Reference Partition or IML partition. When the system is restarted, it verifies the configuration by running a power-on self-test, or POST for short. If a configuration change is recognized or a failed component is discovered, a POST error is displayed on the screen. NetFinity provides a utility called Power-On Error Detect (or POED for short) which will generate an alert based on receiving a POST error. This utility is loaded on the IML partition on the hard disk and the alert is generated over NetBIOS.

The POED service will generate warnings whenever the IML partition is booted. These alerts will be received by the POED service on the NetFinity Manager, which can generate other alerts to be received and processed by the Alert Manager.

System Partition Access
Maintaining the latest version of Adapter Definition Files either on the reference diskette or on the IML partition of the hard disk can greatly improve the performance of the hardware.

Updating the IML partition on the non-array models of the PC Server systems with the latest version of these files typically requires you to shut down the system and boot into the IML partition. Then, you must manually copy the files from each diskette to the IML partition using the Copy Option Diskette utility.

The System Partition Access tool provides a method to access the IML partition without having to shut down the system. This allows you to upgrade these files while maintaining system availability. Additionally, you can make a back-up of the IML partition without ever having to shut down the system. Once these files have been updated, changing the configuration is a simple matter of booting into the IML partition, running change configuration, saving the configuration, and then rebooting the system again.

RAID Manager
To meet expanded storage needs, some models of the PC Servers ship with an optional RAID adapter. This enables the configuration of disk drives with enhanced data integrity, providing the ability to recreate the data dynamically if a drive fails. However, along with this enhanced functionality comes the additional burden of managing the drive array configuration.

The RAID Manager allows you to remotely configure the array on a Client system from the Manager system by stopping or starting a device, adding a device, or setting a hot-spare drive. By viewing the RAID Manager interface, you can graphically see which devices make up the array and which adapter they are connected to. The RAID array can also be monitored with the System Monitor to ensure that the devices are functioning. Optionally, alerts can be generated if a device stops or if the array configuration changes.

Predictive Failure Analysis
The ultimate way to prevent data loss would be to know when a drive is going to fail before it actually fails. That is exactly the function that Predictive Failure Analysis (or PFA) performs. It monitors IBM disk drives on the non-array models of the PC Servers for potential failures, and generates an alert 24 to 48 hours before the failure actually occurs. This alert can be used as the basis to place a service call to IBM and have the drive replaced before a failure actually occurs, thus preventing the loss of critical data on the drive.

ECC Memory Manager
In addition to ensuring data integrity on the disk drive, you should also monitor the data in memory to ensure its integrity. The ECC Memory Manager provides a method of counting single-bit errors and attempting to correct those errors on PC Servers with error checking and correction memory installed. The System Monitor tool can monitor these errors and generate an alert if a specific threshold is exceeded. Additionally, if a user-defined threshold is exceeded, the ECC Memory Manager can generate a non-maskable interrupt (NMI). Special caution should be taken when using this feature of the ECC Memory Manager, since an NMI will cause most systems to halt.

Enhanced Management Functionality
Beyond the basic operating system and hardware management, NetFinity also provides monitoring of specialized hardware such as the PC Server 720 and of network operating systems to ensure that they are operating satisfactorily. This section discusses some of these enhanced management capabilities.

PC Server 720
Because excessive heat or reduced voltage can result in system failure, environmental indicators have been engineered into the hardware of the PC Server 720. Using the System Monitor tool, you can monitor these environmental indicators, including:
 * Power Supply Temperature
 * System Temperature
 * Planar Temperature
 * Power Supply Voltage

As with the other System Monitor monitors, you can configure the generation of alerts when thresholds are exceeded. This allows you to be notified when the operating temperature becomes excessive, or there is an unexpected change in the voltage of the system.

In addition to monitoring environmental indicators, the System Monitor tool also has enhanced monitors for the CPUs. Therefore, in addition to having one monitor with an average across all CPUs, there is a monitor for each CPU in the system. This enables you to determine each individual processor's workload.

ServerGuard
IBM's ServerGuard adapter provides environmental monitoring capability for the PC Server 500 and the Micro Channel versions of the PC Servers 310, 320, and 520. The ServerGuard management program integrates seamlessly into the NetFinity folder. Just as with the PC Server 720, you can monitor these environmental indicators using the System Monitor tool and have alerts generated when user-defined thresholds are exceeded.

Additionally, the ServerGuard adapter has the ability to cleanly shut down the system as well as remotely power off and power on the system. This is done through the ServerGuard management application, which is also installed in the NetFinity graphical user interface.

Network Operating Systems
An optimally tuned PC Server won’t ensure you have the best system performance unless the software is also performing optimally. Using the System Monitor tool, you can monitor indicators of your network operating system performance to ensure it is operating optimally. As with the other System Monitor monitors, alerts can be generated when user-defined thresholds are exceeded.

Additional information on NetFinity can be found at various sites on the Internet.