A network that measures downtime in millions of dollars per minute (or per second!) needs a serious, enterprise-level network management tool. Nothing less will do.
The ideal network management platform accurately discovers devices, computers and applications on the network, works on networks of any size and uses computing resources frugally (after all, it performs no data processing - it's there to watch over the network).
It can work within the framework of a global directory (LDAP, for example). It graphically depicts the entire network, subsets of the network and individual devices. It monitors the status and health of every device or computer on the network. It can glean its data from a variety of sources, including agents, probes, SNMP-enabled devices, log files and Windows performance files.
That's not all. It works as well with IPv6 as it does with IPv4, and it accepts and uses complex descriptions of thresholds. It can send alert notifications via e-mail, pager or text message to different individuals or groups depending on the nature of the problem, and it can escalate these notifications when the problem persists.
It also must perform root cause analysis to identify a problem device or computer that's causing a cascade of network error messages. It can correct some problems automatically by restarting a process, resetting a port or running a script. It works within virtual environments and cloud-based environments. It integrates with help desk software and with other monitoring tools. It produces useful, easy-to-understand and timely reports. It's highly scalable and reliable. And the ideal network management platform is easy to use.
We invited four enterprise-level network management software vendors to submit their best products for review in our Alabama lab. IBM sent us Tivoli Netcool/OMNIbus and Tivoli Network Manager IP Edition, CA Technologies sent us CA eHealth and CA NetQoS ReporterAnalyzer, and HP sent both the Windows and Linux versions of its Automated Network Management Suite. BMC initially accepted our invitation, but then offered us "a guided tour of the products in our environment" instead of sending us a product to review. (See how we conducted our test.)
Picking a winner among these three network managers is impossible. Each one is a sophisticated, mature and highly capable tool for achieving maximum network availability, uptime and performance. If you have a serious network, any one of these three network managers will help you quickly solve network problems and will save your organization megabucks.
HP Automated Network Management Suite: Flawless, Scalable, Modular
HP's Automated Network Management Suite's high points are its modularity, its ability to monitor service level compliance and its automation of many of a network engineer's daily tasks - i.e., it's scalable, it helps track actual vs. expected performance and it saves time. As we tested, we didn't find any drawbacks in Automated Network Management Suite.
Automated Network Management Suite consists of Network Node Manager (NNM) and a spate of components and Smart Plug-ins (SPI), including HP Network Automation, NNMi Integration Enablement, NNM iSPI Network Engineering Toolset, NNM iSPI Performance for Metrics, NNM iSPI Performance for Traffic, NNM iSPI Performance for Quality Assurance, NNM iSPI for IP Telephony, NNM iSPI for IP Multicast and NNM iSPI for Multiprotocol Label Switching (MPLS), all under an umbrella of network automation. Network Node Manager monitors for faults and network availability, while the performance-related plug-ins gather utilization data and monitor specific devices, protocols and applications.
Automated Network Management Suite accurately discovered our network (noting all our network devices, servers and virtualized environments), tracked device status, processed SNMP alerts, graphically displayed our network, alerted us to problems, fixed problems automatically, gathered statistics and produced useful reports.
HP supplies more than 2,000 Management Information Bases (MIB) with Automated Network Management Suite. These cover a wide variety of network equipment from over 50 major hardware vendors, including routers, switches, bridges and repeaters.
Automated Network Management Suite captured some Layer 2 data, but for the most part it mapped Layer 3 details. Among the many details it reported were utilization and error percentages, total packets by category and by protocol, retransmits, server memory utilization and full-duplex utilization percentage.
Automated Network Management Suite collected network health data, analyzed the stored device status and event data and reported results in useful charts and graphs. The system's root-cause problem analysis was especially helpful in zeroing in on a specific device that was causing an outage or performance problem, while its path-analysis capability was similarly helpful in pinpointing problems and performance degradations involving network pathways and linkages.
Automated Network Management Suite's automatic baseline feature set alarm thresholds for us by analyzing collected device status and event data, thus giving it the ability to more realistically detect exceptions, faults and errors. After it created a baseline for our network, we manually added a few thresholds of our own. Automated Network Management Suite thereafter generated prompt and highly informational alarms, via pager or e-mail, to notify us when the thresholds were exceeded.
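The review does not describe how HP derives its baselines, so the following is only a generic sketch of the idea, not HP's method. It assumes a common statistical approach: treat the mean of past samples plus a multiple of their standard deviation as the alarm threshold. The function names and the multiplier `k` are hypothetical.

```python
from statistics import mean, stdev

def baseline_threshold(history, k=3.0):
    """Return an alarm threshold derived from past samples.

    Assumption: threshold = mean + k * sample standard deviation.
    `history` needs at least two samples for stdev() to work.
    """
    return mean(history) + k * stdev(history)

def is_exception(value, history, k=3.0):
    """True if a new sample exceeds the baseline-derived threshold."""
    return value > baseline_threshold(history, k)

# Hypothetical utilization percentages collected during a learning period.
history = [10, 12, 11, 13, 9, 10, 12, 11]
print(is_exception(20, history))  # well above the usual range
print(is_exception(13, history))  # within the usual range
```

A manually added threshold, as described above, would simply be a fixed number checked alongside the computed one.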
Automated Network Management Suite's distributed architecture scales well to handle larger and more complex network environments. Automated Network Management Suite even monitored itself to ensure it was running normally. It pages an administrator and sends e-mail alerts if the self-monitor finds, for instance, that Network Node Manager, or its server, has died. Automated Network Management Suite can also initiate corrective actions, such as restarting a background process or resetting a router port.
The Web browser-based user interface is responsive, thoughtfully designed and highly configurable. Automated Network Management Suite provides a central console for controlling multiple Network Node Manager instances. This central console consolidated event management, performance monitoring and automated alert processing in the lab. Our network administrator used its high-level Visual Basic Script-like language to customize the Automated Network Management Suite's behavior and display. The console dashboard's network health indicators were helpful and informative.
For business-oriented service-level agreements (SLA) we established, Automated Network Management Suite tracked our transactions, their network travel, their processing at the server and their storage in a database. Automated Network Management Suite gave us availability and response time details, and it alerted us when any of our SLA parameters were exceeded.
Automated Network Management Suite runs on Windows Server 2003, Windows Server 2008, Red Hat Enterprise Linux and Solaris.
IBM Tivoli Netcool/OMNIbus: Highly scalable, excellent problem solving abilities, highly configurable and integrates well with other systems
Tivoli Netcool/OMNIbus consolidates network status and health data from multiple network domains and subnets. Netcool/OMNIbus supervises and manages network events across a network of virtually any size and complexity. Netcool/OMNIbus gets much of its data from Tivoli Network Manager IP Edition, which collects and stores data from network layers 2 and 3.
Tivoli Network Manager's stored network knowledge includes information about both physical and logical network connections. It accurately and helpfully recognized, for instance, VPN, virtual LAN, asynchronous transfer mode (ATM), frame relay and MPLS connections in addition to our physical, port-to-port device connections.
Together, Netcool/OMNIbus and Network Manager gave us a clear and accurate picture of the test networks we asked them to manage, no matter how complex. Through Netcool/OMNIbus and Network Manager, we configured quite sophisticated threshold tests, such as "Emit an alert if the San Francisco WAN link's utilization exceeds 5% on Saturdays and Sundays, 20% after 8 p.m. during the week, 50% during weekdays or 75% at 10 a.m. and 2 p.m. on weekdays."
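To make the quoted rule concrete, here is a minimal sketch of a time-dependent threshold check. It is not Netcool syntax; the function names are hypothetical, and the precedence among overlapping conditions is an assumption on our part (weekend first, then the 10 a.m. and 2 p.m. hours, then evenings, then the general weekday limit).

```python
from datetime import datetime

def threshold_for(ts: datetime) -> float:
    """Return the utilization limit (percent) in force at time `ts`.

    Assumed precedence: weekend, then the 10 a.m. and 2 p.m. peak hours,
    then weekday evenings from 8 p.m., then the general weekday limit.
    """
    if ts.weekday() >= 5:      # Saturday (5) or Sunday (6)
        return 5.0
    if ts.hour in (10, 14):    # the 10 a.m. and 2 p.m. hours on weekdays
        return 75.0
    if ts.hour >= 20:          # after 8 p.m. during the week
        return 20.0
    return 50.0                # any other time on a weekday

def should_alert(ts: datetime, utilization_pct: float) -> bool:
    """True if the sampled utilization exceeds the limit for that moment."""
    return utilization_pct > threshold_for(ts)

# A Saturday sample at 12% breaches the 5% weekend limit.
print(should_alert(datetime(2024, 1, 6, 12, 0), 12.0))
```

Expressing the schedule as one function that returns the active limit keeps the comparison itself trivial and makes the precedence between overlapping rules explicit.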
For reliability's sake, Netcool/OMNIbus and Network Manager monitored themselves and restarted automatically when we artificially caused a monitoring/management component to fail.
Netcool/OMNIbus and Network Manager support current and evolving standards, including ITIL, COBIT, eTOM, IPv4 and IPv6, and use FIPS 140-2 approved cryptographic providers.
To our delight, Netcool/OMNIbus and Network Manager worked well in both mixed and pure environments when we confronted them with IPv4 and IPv6 packets.
We also noted that network-intensive organizations that use an operational support system (OSS) to track network inventory, the provisioning of services and the configuration of network components will appreciate Network Manager's ability to integrate with an OSS.
Tivoli Netcool/OMNIbus and Tivoli Network Manager excelled at handling millions and even tens of millions of events per day in our tests. Moreover, for each network problem we artificially induced, Netcool/OMNIbus and Network Manager quickly and accurately sifted through and analyzed the events to distill root causes for us. Netcool/OMNIbus and Network Manager saved us the equivalent of hundreds of hours of network troubleshooting when they pinpointed the actual problem devices that were responsible for a cascade of network error messages. Netcool/OMNIbus and Network Manager even located a fault we caused in a backup data path. If the primary path had failed, the fault would have kept the backup path from taking over.
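The general idea behind topology-based root cause analysis can be sketched in a few lines. This is not IBM's algorithm, which the review does not describe; it assumes a simplified tree topology in which every device has a single upstream neighbor, and all device names are hypothetical.

```python
def root_causes(upstream, down):
    """Return the devices in `down` whose upstream neighbor is not also down.

    upstream: dict mapping each device to its upstream neighbor
              (None for the device closest to the monitoring station).
    down:     set of devices currently reported as unreachable.

    A device whose upstream neighbor is also down is treated as a symptom,
    since it may only be unreachable because of the failure above it.
    """
    return {device for device in down if upstream.get(device) not in down}

# Hypothetical topology: core -> dist1 -> access1 -> server1, core -> dist2
upstream = {
    "core": None,
    "dist1": "core",
    "access1": "dist1",
    "server1": "access1",
    "dist2": "core",
}
down = {"dist1", "access1", "server1"}
print(root_causes(upstream, down))
```

Real networks have redundant paths, so a production tool has to reason over a graph rather than a tree, but the filtering principle is the same: suppress alarms that are explained by a failure closer to the observer.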
On the downside, Netcool/OMNIbus' and Network Manager's browser-based user interface, Netcool/Webtop, was somewhat cumbersome and not as responsive as we'd have liked. Netcool/Webtop is a Java application that displays dashboards of maps, charts, tables and event lists. To its credit, when we logged on as super-administrators, we could easily configure Netcool/Webtop to show just those dashboard components we wanted to see.
We've seen some complex AJAX-enabled (i.e., JavaScript-based) Web browser interfaces that were snappier and more responsive than Netcool/Webtop. IBM provides additional graphical tools in the form of Netcool/Desktop, a native Motif- or Windows-based client that presents an alternative view of network activity. Like Netcool/Webtop's, Netcool/Desktop's display is highly configurable.
IBM supplies more than 1,000 software-based Netcool Probes with Netcool/OMNIbus and Network Manager. These are lightweight agents we easily deployed across the far reaches of our network.
Netcool Probes stand watch over a wide variety of network devices, servers and server processes, and they report status and health information to a central console. We also noted that organizations with vertical-market business applications can painlessly create Netcool Probes that can monitor the running of the business application to alert an administrator when, for example, the application crashes or it begins consuming excessive CPU resources. IBM ships more than 600 MIBs with Netcool/OMNIbus.
Netcool/OMNIbus works hand in glove with help desk trouble-ticket-tracking software such as Siebel, Peregrine and of course Tivoli Service Request Manager, automatically opening and closing trouble tickets.
Netcool/OMNIbus and Network Manager run on Solaris, HP-UX, AIX, Windows Server 2003, Windows Server 2008, Red Hat Enterprise Linux, SLED, SuSE Linux Enterprise Server and VMware ESX for Red Hat EL.
CA eHealth and NetQoS ReporterAnalyzer: Powerful predictive analysis, excellent reporting features
EHealth's and NetQoS ReporterAnalyzer's strong suits are their ability to handle diverse device types, their ability to do predictive performance analysis and the wealth of useful reports they offer. If eHealth and NetQoS ReporterAnalyzer have a weakness, it's their consumption of computing resources. You might need a somewhat faster server, for instance, on which to run eHealth and NetQoS ReporterAnalyzer.
EHealth is CA's enterprise-level network monitoring and management tool for finding and fixing network faults, while NetQoS ReporterAnalyzer is a network traffic analysis tool that reveals how a particular type of traffic or a specific network node is exceeding thresholds.
At an interval we could configure, eHealth polled our network devices to collect status and health data. EHealth then used a patented set of highly complex algorithms to determine which part of the network was failing or was likely to fail soon. This predictive analysis feature is a godsend for organizations that can little afford network downtime and that want to stay ahead of potential network problems.
When eHealth detected a threshold breach that we created, it sent us e-mail and paged us. If we ignored the initial alerts, it escalated matters by e-mailing and paging a second tier of people. Alerts can be triggered for hard outages such as loss of communication with a device or when, for example, a WAN link exceeds a threshold because network utilization is higher than, say, 75%.
We could express quite complex thresholds with eHealth, which used CA's Time-Over-Threshold (TOT) or Deviation-From-Normal (DFN) algorithms to keep false alarms to a minimum. We could specify that we wanted to be alerted if network utilization exceeded a threshold even once, or we could specify that we wanted to be alerted only if high network utilization persisted for a specified period of time.
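The difference between alerting on a single breach and alerting only on a sustained one can be shown with a small sketch. This is a generic illustration of the time-over-threshold idea, not CA's patented algorithm; the function name, parameters and sample data are hypothetical.

```python
def time_over_threshold(samples, threshold, min_seconds):
    """Return True if the value stays above `threshold` continuously
    for at least `min_seconds`.

    samples: list of (timestamp_in_seconds, value) pairs, sorted by time.
    With min_seconds=0, a single breaching sample is enough to alert.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts          # a new breach begins here
            if ts - breach_start >= min_seconds:
                return True                # breach has lasted long enough
        else:
            breach_start = None            # value recovered, reset the clock
    return False

# Hypothetical utilization samples polled once a minute.
samples = [(0, 80), (60, 40), (120, 80), (180, 85), (240, 90)]
print(time_over_threshold(samples, threshold=75, min_seconds=0))
print(time_over_threshold(samples, threshold=75, min_seconds=120))
```

Requiring persistence filters out the brief spike at the first sample, which is the kind of false alarm the review credits these algorithms with suppressing.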
EHealth's dashboard display provided real-time status information for the network. EHealth also has a central console user interface that graphically depicts the entire network or any portion of it. Clicking on a yellow (minor alert) or red (major alert) network device drills down through eHealth's data to reveal the nature of a problem as well as details about the problem. We liked that we could generate instant reports to help document the problem.
EHealth's reports are informative, easy to understand and easy to produce. We used its reports to help troubleshoot problems, identify unusual network behavior for future investigation, document SLA compliance and identify trends for capacity-planning purposes. Through the simple-to-use reports interface, we could select the network elements or groups of elements we wished to document, specify a chart type (Line, Bar, Stacked Line) and choose a calendar window such as "Today" or "Previous 7 Days." We could also set up custom date and time ranges for our reports.
EHealth's At-a-Glance Reports were our first line of defense when we needed to document a problem so we could collaboratively share the nature of the problem with other network engineers. At-a-Glance Reports provide a high-level, quick view of key data, including network utilization, server utilization (CPU, memory or hard disk), the identity of a failed application and network connectivity errors.
We found eHealth's Trend Reports made quick work of capacity planning chores. For all or any part of the network and for whatever time period we wished, we could configure and schedule reports that showed exactly the device, computer, application or network behaviors we wanted to document. We used these reports initially to produce a baseline of the network. Then, over time, we used these reports' graphs and charts to precisely identify utilization trends that revealed the upgrades we should plan for. We also set up a number of tabular reports to document uptime and availability as well as provide utilization statistics for billing (chargeback) purposes.
We particularly liked eHealth's report customization features, which let us produce, for example, trend reports for a specific user group and/or specific set of network resources, such as databases.
Impressively, CA includes more than 5,000 MIBs in eHealth.
EHealth and NetQoS ReporterAnalyzer run on Windows Server 2003 and Solaris.
Conclusion
All three of these network managers - IBM Tivoli Netcool/OMNIbus and Tivoli Network Manager IP Edition, CA eHealth and CA NetQoS ReporterAnalyzer and HP Automated Network Management Suite - are top-of-the-line, mature and highly capable tools for ensuring maximum availability, uptime and performance.
Barry Nance runs Network Testing Labs and is the author of Network Programming in C, Introduction to Networking, 4th Edition and Client/Server LAN Programming. His e-mail address is barryn@erols.com.
Read more about infrastructure management in Network World's Infrastructure Management section.