Ubuntu Server Crash Troubleshooting Guide
Introduction
This guide is designed to help you diagnose and troubleshoot crashes on Ubuntu Server. If you’re experiencing unexpected shutdowns, freezes, or performance issues, the steps below will teach you how to identify and resolve the most common problems using logs and diagnostic tools.
By the end of this guide, you’ll know:
Where to find system logs
How to read logs to identify problems
Basic troubleshooting steps for common issues
Chapter 1: Understanding Ubuntu Server Crashes
Common Causes of Crashes
Here are some common causes of crashes on Ubuntu Server:
Insufficient memory (RAM): Your server may run out of memory, especially if too many processes are running.
Disk space issues: Low disk space can cause crashes when the server cannot write critical data.
Driver issues: Although servers usually have fewer driver-related issues than desktops, outdated or incompatible hardware drivers can still cause crashes.
Kernel panics: The kernel (the operating system's core) may crash due to misconfiguration, software bugs, or hardware failures.
Chapter 2: How to Check Logs
Where to Find Logs
Ubuntu Server stores detailed logs of system activities that are essential for troubleshooting. These logs are located in the /var/log/ directory. Critical log files to check include:
/var/log/syslog: The primary system log. It logs boot messages, service events, and general system errors.
/var/log/kern.log: Logs related to the Linux kernel. This is useful for diagnosing kernel panics and hardware-related issues.
/var/log/dmesg: Logs system boot and hardware information. Use this to check for hardware or startup issues.
/var/log/auth.log: Tracks authentication events like login attempts, which can be helpful if you suspect security-related crashes.
Example: Checking System Logs
If your server crashes or becomes unresponsive, the first step is to check the logs. Use the following command to view the latest system logs:
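A minimal sketch of such a command, assuming the default setup where the system log lives at /var/log/syslog. The helper name show_recent_log and the 100-line default are illustrative choices, not part of any standard tool:

```shell
# Hypothetical helper: print the last N lines of a log file.
# Defaults: /var/log/syslog and 100 lines.
show_recent_log() {
  log_file="${1:-/var/log/syslog}"
  line_count="${2:-100}"
  if [ -r "$log_file" ]; then
    tail -n "$line_count" "$log_file"
  else
    echo "Cannot read $log_file (try sudo, or check that the file exists)" >&2
    return 1
  fi
}

# Show the most recent system log entries.
show_recent_log /var/log/syslog 100 || true
```

On a typical server the same result comes from running tail -n 100 /var/log/syslog directly; the function form only adds a readable error when the file is missing or not readable.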
This will display recent log entries. Pay attention to any errors or warnings near the time the crash occurred. For example, an entry reporting "Out of memory" together with a killed process points to a memory issue.
In that case, the server ran out of memory, and the kernel had to terminate a process to recover.
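To look for entries of this kind, one option is to search the log for the messages written by the kernel's out-of-memory (OOM) killer. A rough sketch, assuming those messages contain "Out of memory" or "oom-kill" (the exact wording varies between kernel versions) and using a hypothetical helper name find_oom_events:

```shell
# Hypothetical helper: list OOM-killer related lines in a log file.
find_oom_events() {
  log_file="${1:-/var/log/syslog}"
  # -i: ignore case, -E: extended regex so "|" means "or".
  grep -i -E 'out of memory|oom-kill' "$log_file"
}

# grep exits non-zero when nothing matches or the file cannot be read.
find_oom_events /var/log/syslog || echo "No OOM messages found (or log not readable)"
```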
Chapter 3: Basic Troubleshooting Steps
1. Check Available Disk Space
Running out of disk space can cause severe issues for a server. To check available disk space, use the following command:
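The usual tool for this is df. A short sketch follows; the 90% threshold and the helper name filter_full_filesystems are illustrative assumptions, and the filter expects mount points without spaces:

```shell
# Human-readable usage for all mounted filesystems.
df -h

# Hypothetical helper: reads `df -P` output on stdin and prints only the
# filesystems at or above a usage threshold (default: 90 percent).
filter_full_filesystems() {
  threshold="${1:-90}"
  awk -v limit="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                     # strip the percent sign
    if (use + 0 >= limit) print $6 " is at " $5
  }'
}

# -P selects the portable one-line-per-filesystem output format.
df -P | filter_full_filesystems 90
```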
Look for any partitions that are close to 100% usage. If your root partition (/) is full, consider clearing space by removing unnecessary files or moving data to another storage location.
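Before removing anything, it helps to see where the space is going. One possible approach, sketched here with du, is to list the largest directories under a starting point. The starting directory /var, the count of 10, and the helper name largest_entries are assumptions for illustration:

```shell
# Hypothetical helper: list the largest entries one level below a directory,
# biggest first. The first line is the total for the directory itself.
largest_entries() {
  target_dir="${1:-/var}"
  entry_count="${2:-10}"
  # -x: stay on one filesystem, -k: sizes in KiB, -d 1: one level deep.
  du -x -k -d 1 "$target_dir" 2>/dev/null | sort -rn | head -n "$entry_count"
}

largest_entries /var 10
```

Run it with sudo if some directories are not readable by your user; unreadable directories are otherwise skipped silently.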
2. Check for Memory Issues
If your server is low on RAM, it can slow down or crash entirely. Use this command to check memory usage:
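The standard tool here is free. The sketch below also reads the MemAvailable value from /proc/meminfo, which is the kernel's own estimate of available memory; the helper name available_kib is hypothetical:

```shell
# Show total, used, and available memory in human-readable units.
# The guard keeps the snippet from failing on minimal systems without free.
if command -v free >/dev/null 2>&1; then
  free -h
fi

# Hypothetical helper: print the kernel's MemAvailable estimate in KiB.
# Takes a meminfo-style file as an optional argument (default: /proc/meminfo).
available_kib() {
  meminfo_file="${1:-/proc/meminfo}"
  awk '/^MemAvailable:/ { print $2 }' "$meminfo_file"
}

if [ -r /proc/meminfo ]; then
  echo "Available memory: $(available_kib) KiB"
fi
```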
Look at the "available" column. If this number is very low, your server may be about to run out of memory. Consider stopping resource-heavy processes or upgrading the server's RAM.
3. Check for Kernel Panics
Kernel panics can cause the server to crash or reboot unexpectedly. To diagnose a kernel panic, check the kernel logs with the following command:
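A simple way to do this, assuming the kernel log is at /var/log/kern.log, is a case-insensitive search for the word "panic". The helper name find_kernel_panics is hypothetical:

```shell
# Hypothetical helper: list lines mentioning a panic in a kernel log file.
find_kernel_panics() {
  log_file="${1:-/var/log/kern.log}"
  # -i: match "panic", "Panic", and "PANIC" alike.
  grep -i 'panic' "$log_file"
}

# grep exits non-zero when nothing matches or the file cannot be read.
find_kernel_panics /var/log/kern.log || echo "No panic messages found (or log not readable)"
```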
A matching entry typically contains the text "Kernel panic - not syncing", followed by a reason.
This means the kernel encountered a critical issue and halted the system. Kernel panics can indicate serious hardware problems or kernel misconfigurations.
4. Verify Software and Drivers
On servers, driver-related crashes are less common, but they can still happen. Ensure your server is up-to-date with the latest software and driver updates by running:
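On Ubuntu, package updates (including kernel and driver packages) are normally installed with apt. A minimal sketch, wrapped in a hypothetical function named update_packages so that nothing is installed until you call it:

```shell
# Hypothetical wrapper: refresh the package index, then install upgrades.
# Requires sudo rights; apt asks for confirmation before installing.
update_packages() {
  sudo apt update && sudo apt upgrade
}

# Run when you are ready to update:
# update_packages
```

A reboot may be needed afterwards for a new kernel to take effect.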
5. Check for Application-Specific Crashes
Sometimes, only a specific application might crash. For example, if your web server (like NGINX or Apache) crashes frequently, check its logs for clues. Logs for most applications are stored in the /var/log/ directory or a subdirectory (e.g., /var/log/nginx/error.log for NGINX).
You can use the following command to check the most recent lines of an application log:
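As an illustration, the sketch below prints the end of the NGINX error log mentioned above. The 50-line default and the helper name tail_app_log are assumptions; substitute the path of whichever application you are investigating:

```shell
# Hypothetical helper: print the last N lines of an application log.
tail_app_log() {
  app_log="${1:-/var/log/nginx/error.log}"
  line_count="${2:-50}"
  if [ -r "$app_log" ]; then
    tail -n "$line_count" "$app_log"
  else
    echo "Cannot read $app_log (is the application installed? try sudo)" >&2
    return 1
  fi
}

tail_app_log /var/log/nginx/error.log 50 || true
```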
Look for errors related to the time when the crash occurred.
Chapter 4: Log Rotation and Wiped Logs
Log Rotation
Ubuntu Server uses a process called log rotation to prevent logs from growing too large and consuming too much disk space. Log rotation works by archiving old logs and creating new ones. Archived logs may be compressed and stored with extensions like .gz. The most recent logs will be in their regular form (e.g., syslog, kern.log), while older logs will be named something like syslog.1 or syslog.1.gz.
How to View Older Logs
To view older logs, you can either open the archived log files directly or use zcat to read compressed logs. For example:
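A small sketch of this, assuming a rotated file named /var/log/syslog.2.gz exists (the actual file names depend on your rotation settings). The helper name read_archived_log is hypothetical:

```shell
# Hypothetical helper: print a rotated log, decompressing it when needed.
read_archived_log() {
  archive="$1"
  case "$archive" in
    *.gz) zcat "$archive" ;;   # compressed archive
    *)    cat "$archive" ;;    # rotated but not compressed
  esac
}

# Show the last 20 lines of one archived system log, if it exists.
if [ -r /var/log/syslog.2.gz ]; then
  read_archived_log /var/log/syslog.2.gz | tail -n 20
fi
```

For a single compressed file, running zcat on it directly gives the same output.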
This command will display the contents of the compressed log file.
Logs After Reboot
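After a reboot, some logs may no longer be available. The kernel ring buffer shown by the dmesg command is held in memory, so its contents (and the boot-time snapshot in /var/log/dmesg) reflect only the current boot. However, the system log (/var/log/syslog) is written to disk and persists across reboots.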
To check boot-specific logs or events, use the following command to view logs from the previous boot:
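The sketch below wraps journalctl in a hypothetical helper named show_previous_boot. It assumes the journal is stored persistently; if it is kept only in memory, there is no previous boot to show:

```shell
# Hypothetical helper: show the journal from the previous boot (-b -1).
# --no-pager prints directly to the terminal; extra arguments
# (for example "-p err" to show only errors) are passed through.
show_previous_boot() {
  if command -v journalctl >/dev/null 2>&1; then
    journalctl -b -1 --no-pager "$@"
  else
    echo "journalctl not found on this system" >&2
    return 1
  fi
}

# Equivalent to running: journalctl -b -1 --no-pager -p err
# show_previous_boot -p err
```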
This command tells journalctl to show logs from the previous boot (-b -1), which can be helpful if a crash occurred before the server was restarted.
Why Logs May Disappear
Reboots: Certain logs, such as dmesg, are wiped on reboot unless configured otherwise.
Log Rotation: Logs are renamed and archived when they are rotated. Older logs may be compressed, and the oldest are eventually deleted.
Log Settings: The logrotate configuration determines how often logs are rotated and how long they are kept. The configuration for log rotation is located in /etc/logrotate.conf and /etc/logrotate.d/.
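To review the rotation settings mentioned above, one option is to print the configuration with comments and blank lines removed. A minimal sketch, using a hypothetical helper named show_rotation_settings:

```shell
# Hypothetical helper: print the active (non-comment, non-blank) lines
# of a logrotate configuration file.
show_rotation_settings() {
  config_file="${1:-/etc/logrotate.conf}"
  grep -v -E '^[[:space:]]*(#|$)' "$config_file"
}

show_rotation_settings /etc/logrotate.conf || true
```

Directives such as weekly and rotate 4 in the output control how often logs are rotated and how many old copies are kept. Per-application settings in /etc/logrotate.d/ can be inspected the same way.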
If important logs seem to be missing after a crash, reviewing log rotation settings or ensuring persistent logging is enabled may help.
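For the systemd journal, a quick check is whether the directory /var/log/journal exists, since journald's default Storage=auto setting keeps logs on disk only when that directory is present. The sketch below assumes that default and uses a hypothetical helper named journal_is_persistent:

```shell
# Hypothetical helper: succeed if the persistent journal directory exists.
journal_is_persistent() {
  journal_dir="${1:-/var/log/journal}"
  [ -d "$journal_dir" ]
}

if journal_is_persistent; then
  echo "Persistent journal directory found"
else
  echo "No /var/log/journal directory; the journal may be kept in memory only"
fi
```

With the default setting, creating the directory (sudo mkdir -p /var/log/journal) and restarting systemd-journald is generally enough to turn persistence on. An explicit Storage= value in /etc/systemd/journald.conf overrides this behavior.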
Conclusion
Understanding how to access logs and perform basic diagnostics can help you troubleshoot Ubuntu Server crashes effectively. Monitoring memory usage, disk space, and system logs will also help you prevent and resolve many of the most common issues.