How to Monitor Your VPS: CPU, RAM, Disk, and Uptime Alerting with Netdata and UptimeRobot

Running a VPS without monitoring means flying blind. You will not know when CPU usage sits at 90%, when disk space is running out, or when your site goes down until a user or client tells you. Monitoring catches these problems early, often before they affect users. This guide sets up a complete monitoring stack using two free tools: Netdata for real-time server metrics and UptimeRobot for external uptime monitoring and alerting.

What You Need to Monitor

Effective VPS monitoring covers four categories:

  • Uptime: Is the server reachable? Is the web application responding correctly?
  • Resource utilization: CPU, RAM, swap, and disk usage trends over time
  • I/O performance: Disk read/write rates, I/O wait, network throughput
  • Application health: Is Nginx running? Is MySQL responding? Are error rates elevated?

The two tools in this guide cover all four categories: UptimeRobot handles external uptime checks, Netdata handles everything on the server itself.

Part 1: UptimeRobot — External Uptime Monitoring

Why External Monitoring Matters

Server-side monitoring tools cannot tell you if your server is unreachable from the outside world — they are running on the very server that might be down. External monitoring from a third-party network provides the definitive answer: is your site accessible to users right now?

Setting Up UptimeRobot (Free Tier)

UptimeRobot’s free tier provides:

  • 50 monitors
  • 5-minute check intervals
  • Email, Slack, Discord, webhook, and SMS alerts
  • Public status pages

To create your first monitor:

  1. Create a free account at uptimerobot.com
  2. Click “Add New Monitor”
  3. Select monitor type: HTTP(S) for websites, Port for specific services, Ping for basic connectivity
  4. Enter the URL or IP you want to monitor
  5. Configure alert contacts (email is configured automatically; add Slack or other integrations in Settings → Alert Contacts)

Recommended UptimeRobot Monitors

Add these monitors for a typical VPS deployment:

  • HTTPS monitor for your main domain: https://yourdomain.com — checks that Nginx and your application are responding with HTTP 200
  • Ping monitor for your VPS IP — checks that the server is reachable at the network level (distinguishes server outages from application crashes)
  • Port monitor for SSH (port 22 or your custom port) — confirms the SSH daemon is running
  • Port monitor for SMTP (port 25) — if running a mail server, confirms Postfix is accepting connections
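
When a monitor reports an outage, it can help to repeat roughly the same checks by hand from a machine other than the VPS. The sketch below is one way to do that with curl and bash; the function names are illustrative, the host and URL in the usage comments are placeholders, and it only approximates what UptimeRobot does on its side.

```shell
# Manual spot checks that roughly mirror the monitors listed above.
# Function names are illustrative; nothing here is part of UptimeRobot.

# Print the HTTP status code returned by a URL (curl prints 000 if the request fails).
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Succeed if a TCP port accepts a connection within 5 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash and coreutils timeout are assumed.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Turn a check's exit status into an UP/DOWN line.
report() {
    local label=$1 status=$2
    if [ "$status" -eq 0 ]; then
        echo "$label: UP"
    else
        echo "$label: DOWN"
    fi
}

# Example usage (placeholders):
#   http_status https://yourdomain.com
#   port_open YOUR_VPS_IP 22; report "SSH" $?
#   ping -c 3 YOUR_VPS_IP >/dev/null; report "Ping" $?
```

If the ping succeeds but the HTTPS check does not return 200, the server is up and the problem is most likely in Nginx or the application.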

Keyword Monitoring for Application Health

UptimeRobot’s “Keyword” monitor type checks not just whether a page returns HTTP 200, but whether it contains a specific text string. This detects cases where the server is running but the application has an error state:

  1. Add a health endpoint to your application (e.g., /health that returns “OK”)
  2. Create an UptimeRobot Keyword monitor for that URL, checking for the keyword “OK”
  3. If your application crashes and the page returns an error, the keyword won’t be found and you’ll receive an alert
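
The keyword check can be reproduced locally before creating the monitor, which confirms that the health endpoint really returns the expected text. This is a minimal sketch, assuming curl and grep are available; the function names are hypothetical and the URL in the usage comment is a placeholder.

```shell
# Succeed if the response body contains the keyword (fixed-string match).
body_has_keyword() {
    printf '%s' "$1" | grep -qF -- "$2"
}

# Fetch a URL and apply the same keyword test.
# curl -f makes HTTP errors (4xx/5xx) count as failures.
check_health() {
    local body
    body=$(curl -fsS -m 10 "$1") || return 1
    body_has_keyword "$body" "$2"
}

# Example usage (placeholder URL):
#   check_health https://yourdomain.com/health OK && echo healthy || echo unhealthy
```

A short keyword such as “OK” can also appear inside an error page, so a more distinctive string is a safer choice if your health endpoint can be changed.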

Part 2: Netdata — Real-Time Server Monitoring

Installing Netdata

Netdata provides hundreds of pre-configured metrics collectors that auto-detect running services. Two commands download and run the installer:

wget -O /tmp/netdata-install.sh https://my-netdata.io/kickstart.sh
bash /tmp/netdata-install.sh --stable-channel --disable-telemetry

Netdata starts automatically and listens on port 19999. Access the dashboard temporarily by allowing port 19999 through the firewall:

sudo ufw allow 19999/tcp

Visit http://YOUR_VPS_IP:19999 to see the real-time dashboard. Once you have verified it works, close port 19999 and access Netdata via SSH tunnel instead:

# Close port 19999 to the internet
sudo ufw delete allow 19999/tcp

# Access Netdata securely via SSH tunnel from your local machine
ssh -L 19999:localhost:19999 user@YOUR_VPS_IP

Then visit http://localhost:19999 in your local browser.

What Netdata Monitors Automatically

Netdata auto-detects and collects metrics for:

  • CPU (per-core usage, interrupts, context switches)
  • Memory (RAM, swap, page faults)
  • Disk I/O (per-disk throughput, IOPS, utilization, latency)
  • Network (per-interface throughput, packets, errors)
  • Nginx (requests/second, active connections, response codes)
  • MySQL/MariaDB (queries/second, slow queries, connections, InnoDB metrics)
  • PHP-FPM (active workers, requests per second, queue length)
  • Redis (operations/second, memory usage, hit rate)
  • Docker containers (per-container CPU, RAM, network, I/O)
  • System processes (CPU and memory per process)

Configuring Netdata Alerts

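Netdata ships with hundreds of pre-configured alert rules. The stock definitions are typically installed under /usr/lib/netdata/conf.d/health.d/, and your own overrides go in /etc/netdata/health.d/. List both directories to see which rule files exist:

ls /usr/lib/netdata/conf.d/health.d/ /etc/netdata/health.d/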

Customize alert thresholds by creating override files. For example, to alert when disk usage exceeds 80%:

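sudo nano /etc/netdata/health.d/disk-custom.conf

template: disk_usage_warning
      on: disk.space
    calc: $used * 100 / ($avail + $used)
   units: %
   every: 1m
    warn: $this > 80
    crit: $this > 90
    info: disk space utilization
      to: sysadmin

The template keyword applies the rule to every mounted filesystem, and the calc line converts the chart's used and avail dimensions, which are reported as sizes, into a percentage that the 80 and 90 thresholds can be compared against. Apply the change with sudo netdatacli reload-health.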
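
As a cross-check on what a disk usage alert should report, the same percentage can be computed by hand from df. The sketch below assumes the percentage is defined as used * 100 / (used + avail); the function names are illustrative and are not part of Netdata.

```shell
# Percentage used = used * 100 / (used + avail), using integer arithmetic.
disk_used_pct() {
    echo $(( $1 * 100 / ($1 + $2) ))
}

# Read used and available KB for a mount point from POSIX-format df output
# and print the resulting percentage.
disk_pct_for_mount() {
    df -Pk "$1" | awk 'NR == 2 { print $3, $4 }' | {
        read -r used avail
        disk_used_pct "$used" "$avail"
    }
}

# Example usage:
#   disk_pct_for_mount /
```

If the number printed here is far from what the Netdata dashboard shows for the same mount point, the alert definition is probably looking at the wrong chart or dimension.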

Configuring Email Alerts from Netdata

sudo nano /etc/netdata/health_alarm_notify.conf

Find and configure the email section:

SEND_EMAIL="YES"
DEFAULT_RECIPIENT_EMAIL="admin@yourcompany.com"
EMAIL_SENDER="netdata@yourdomain.com"

Install a mail transfer agent if not already present:

sudo apt install msmtp msmtp-mta -y

Configure msmtp to relay through your email provider (Gmail, SendGrid, Postmark, or your own mail server).
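
A relay configuration for msmtp might look roughly like the file generated below. This is a sketch under stated assumptions: the account name relay, the password file /etc/msmtp-password, and the function name are all hypothetical, and the host, port, and addresses must come from your email provider.

```shell
# Write a minimal msmtp configuration to the given path.
# Arguments: output path, SMTP host, port, login user, sender address.
# Every value passed in is a placeholder for your provider's settings.
write_msmtprc() {
    local path=$1 host=$2 port=$3 user=$4 from=$5
    cat > "$path" <<EOF
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt

account        relay
host           $host
port           $port
from           $from
user           $user
# Hypothetical password file; keeps the secret out of this config.
passwordeval   cat /etc/msmtp-password

account default : relay
EOF
    # Restrict access because the file describes mail credentials.
    chmod 600 "$path"
}

# Example usage (placeholders):
#   write_msmtprc /tmp/msmtprc smtp.example.com 587 alerts@example.com alerts@example.com
```

Review the generated file before installing it as /etc/msmtprc, and make sure both it and the password file are readable by the netdata user, since that is the account that sends the alert mail.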

Configuring Slack Alerts from Netdata

Create an incoming webhook in your Slack workspace, then configure Netdata:

sudo nano /etc/netdata/health_alarm_notify.conf

Find and configure the Slack section:

SEND_SLACK="YES"
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
DEFAULT_RECIPIENT_SLACK="#alerts"

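Send a test notification. The argument after test is a recipient role rather than a channel name, so use sysadmin, the role the alert rules notify; every enabled notification method for that role, including Slack, receives the test:

sudo -u netdata /usr/libexec/netdata/plugins.d/alarm-notify.sh test sysadmin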
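
If no message arrives in Slack, it can help to rule out the webhook itself by posting to it directly, bypassing Netdata. The sketch below builds the {"text": "..."} JSON body that Slack incoming webhooks accept; the function names are illustrative, and the escaping only handles backslashes and double quotes.

```shell
# Build the JSON body for a Slack incoming webhook.
# Escapes backslashes and double quotes; other control characters are not handled.
slack_payload() {
    local escaped
    escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"text":"%s"}' "$escaped"
}

# Post a message to a webhook URL. Arguments: webhook URL, message text.
send_slack() {
    curl -fsS -m 10 -X POST -H 'Content-type: application/json' \
        --data "$(slack_payload "$2")" "$1"
}

# Example usage (placeholder URL):
#   send_slack "https://hooks.slack.com/services/YOUR/WEBHOOK/URL" "Test from VPS"
```

If this direct post works but Netdata's test does not, the problem is in health_alarm_notify.conf rather than in the Slack workspace.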

Key Metrics to Watch and Their Alert Thresholds

Metric                 Warning            Critical          Action
CPU utilization        >70% for 10 min    >90% for 5 min    Identify process, optimize or scale
RAM utilization        >80%               >95%              Check for memory leaks, add RAM
Swap usage             >20%               >50%              Immediate: add RAM or reduce memory usage
Disk usage             >80%               >90%              Clean logs/cache, expand storage
I/O wait               >10%               >20%              Optimize queries, add caching, check storage
Disk I/O utilization   >70%               >90%              Optimize or move to faster storage
Network error rate     >0.1%              >1%               Check network configuration and hardware
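
To make the threshold logic concrete, the sketch below classifies a reading against a warning and a critical level and computes RAM utilization from /proc/meminfo. It is an illustration only: the function names are hypothetical, values are whole-number percentages, and RAM used is assumed to mean (MemTotal - MemAvailable) / MemTotal.

```shell
# Map a value to OK / WARNING / CRITICAL given warning and critical thresholds.
classify() {
    local value=$1 warn=$2 crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo CRITICAL
    elif [ "$value" -gt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Print RAM used as an integer percentage, read from a meminfo-format file.
# Defaults to /proc/meminfo; a different path can be passed for testing.
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Example usage, with the RAM thresholds from the table:
#   classify "$(mem_used_pct)" 80 95
```

Netdata evaluates its own rules continuously, so a script like this is only useful for spot checks or for understanding why an alert fired.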

Reading Netdata Charts Effectively

Identifying Traffic Spikes

Correlate the Nginx “requests/second” chart with CPU and RAM usage. A traffic spike that pushes CPU to 90% while RAM remains stable indicates a CPU-bound workload: add caching or scale vertically. A traffic spike that consumes all available RAM points to insufficient object caching or an over-allocated PHP-FPM pool.

Database Performance Analysis

The MySQL/MariaDB section shows slow queries per second. If slow queries increase during traffic spikes, check the slow query log to identify which queries need optimization or indexing:

# Enable slow query log in MariaDB
sudo nano /etc/mysql/mariadb.conf.d/50-server.cnf

# Add these lines under the [mysqld] section of that file:
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 2

# Restart MariaDB to apply the change
sudo systemctl restart mariadb

# Analyze slow queries (show the top 10)
sudo mysqldumpslow -t 10 /var/log/mysql/slow.log

Setting Up a Public Status Page

UptimeRobot’s free status page feature lets you create a public-facing page showing your service uptime history. Share the URL with clients so they can self-check service status during incidents rather than contacting support:

  1. In UptimeRobot dashboard, go to “Status Pages”
  2. Create new status page, add your monitors
  3. Optionally configure a custom domain (e.g., status.yourdomain.com)

Getting Started

Both Netdata and UptimeRobot work on any VPS, whether it is hosted in the USA, Hong Kong, or elsewhere. Netdata requires approximately 100–200 MB RAM for its collector processes, so factor this into your VPS sizing. The UptimeRobot free tier is sufficient for most single-server deployments; the paid tier reduces check intervals to 1 minute if faster detection is required.

Conclusion

Complete VPS monitoring is a two-layer problem: external uptime checks that confirm your service is reachable from the outside world, and internal server metrics that show what is happening inside the box. UptimeRobot solves the external layer in five minutes with zero ongoing cost. Netdata solves the internal layer with automatic detection of hundreds of metrics and configurable alerts. Together, they give you the visibility to diagnose problems proactively rather than reactively — the foundation of reliable VPS operations.
