Cisco IOS XR Troubleshooting Guide for the Cisco CRS-1 Router
OL-21483-02
Chapter 8 Process Monitoring and Troubleshooting
If the hog is persistent and the node is reset, contact Cisco Technical Support. For contact information
for Cisco Technical Support, see the “Obtaining Documentation and Submitting a Service Request”
section on page viii in the Preface. Copy the error message exactly as it appears on the console or in the
system log and provide the representative with the gathered information.
Note: For more information on wdsysmon and memory thresholds, see the “Watchdog System Monitor”
section on page 9-197 in Chapter 9, “Troubleshooting Memory.”
Troubleshooting High CPU Utilization and Process Timeouts
This section describes how to troubleshoot common problems caused by high CPU
utilization, which in some cases lead to process timeouts. It includes the following topics:
• General Guidelines for Troubleshooting CPU Utilization Problems, page 8-185
• Troubleshooting a Process Block, page 8-188
• Troubleshooting a Process Crash on Line Cards, page 8-192
• Troubleshooting a Memory Leak, page 8-193
• Troubleshooting a Hardware Failure, page 8-194
• Troubleshooting SNMP Timeouts, page 8-194
• Troubleshooting Communication Among Multiple Processes, page 8-194
General Guidelines for Troubleshooting CPU Utilization Problems
Optimal CPU utilization is vital for the routers to function properly. In general, the following cases can
cause high CPU utilization:
• Normal conditions—One or more processes might be using a large percentage (or all) of the
available CPU due to the following reasons:
– Routing table convergence calculations (until the routing table converges)
– SNMP polling
– Any query that requires a large amount of CPU
– Communication among multiple processes
• Abnormal conditions—A process might be using excessive CPU due to the following reasons:
– Process (thread) loop
– Memory leak
– Process blocking due to a bug or hardware problem, which causes other processes to wait for a
reply (loop)
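One rough way to separate the two cases above is to check whether a process stays at high CPU across several consecutive samples or only spikes briefly. The following Python sketch illustrates that idea only. The function name, the 80 percent threshold, the three-sample window, and the process names are all hypothetical assumptions, and the input is a plain list of per-interval readings rather than the output of any router command.

```python
from collections import defaultdict


def classify_cpu_usage(samples, threshold=80.0, min_consecutive=3):
    """Label each process as 'ok', 'transient', or 'persistent'.

    samples: list of dicts mapping a process name to its CPU percent,
             one dict per polling interval, oldest first.
    threshold, min_consecutive: illustrative assumptions, not vendor values.
    """
    longest = defaultdict(int)   # longest run of high-CPU samples per process
    current = defaultdict(int)   # length of the run that is still in progress
    seen = set()

    for sample in samples:
        seen.update(sample)
        for name in seen:
            # A process missing from a sample is treated as idle (0 percent).
            if sample.get(name, 0.0) >= threshold:
                current[name] += 1
                longest[name] = max(longest[name], current[name])
            else:
                current[name] = 0

    result = {}
    for name in seen:
        if longest[name] >= min_consecutive:
            result[name] = "persistent"   # candidate for an abnormal condition
        elif longest[name] > 0:
            result[name] = "transient"    # brief spike, often a normal condition
        else:
            result[name] = "ok"
    return result


# Hypothetical readings for three made-up processes over three intervals.
samples = [
    {"proc_a": 95.0, "proc_b": 5.0, "proc_c": 90.0},
    {"proc_a": 10.0, "proc_b": 4.0, "proc_c": 92.0},
    {"proc_a": 8.0, "proc_b": 6.0, "proc_c": 97.0},
]
labels = classify_cpu_usage(samples)
```

Here proc_a spikes once and then settles, proc_b stays low, and proc_c remains above the threshold in every sample. A "persistent" label is only a hint that further investigation is warranted. It does not by itself identify a loop, leak, or blocked process.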
There is no single definition of “high CPU utilization.” Utilization depends on many factors, including
the number of clients served and the current configuration on the router. The following example
illustrates one approach to troubleshooting utilization. (Details of the commands are provided in the
sections that follow.)
Example:
You run the top processes command. (It shows the top ten processes in terms of CPU usage.)