How to Set Up a Headless Chrome Browser on a VPS for Automation and Scraping

A headless browser running on a VPS is one of the most powerful automation tools available to developers. Unlike simple HTTP requests, a headless Chrome instance renders JavaScript, executes dynamic content, handles login sessions, fills forms, takes screenshots, and generates PDFs — exactly what you need for modern web automation, scraping JavaScript-heavy sites, and automated testing.

This guide covers installing Chromium and Puppeteer on an Ubuntu VPS, running headless browser automation scripts, and keeping them running 24/7 with PM2.

Use Cases for Headless Chrome on VPS

  • Web scraping — Scrape JavaScript-rendered content that simple HTTP requests can’t reach (SPAs, React apps)
  • Screenshot automation — Generate website screenshots, social media card images, or PDF reports automatically
  • Automated testing — Run end-to-end tests against staging environments on a CI schedule
  • Form automation — Fill and submit forms, simulate user flows
  • PDF generation — Convert web pages or HTML templates to PDFs
  • Price monitoring — Track product prices on JavaScript-heavy e-commerce sites
  • Social media automation — Schedule posts, monitor mentions (where permitted by platform ToS)

💡 Resource note: Headless Chrome is memory-hungry. Allocate at least 2 GB RAM for a single browser instance and 4 GB for concurrent sessions. VPS.DO’s 500SSD plan (4 GB RAM) handles most automation workloads.


Step 1: Update System and Install Dependencies

sudo apt update && sudo apt upgrade -y

# Install Chromium dependencies
sudo apt install -y \
  chromium-browser \
  libgbm1 \
  libxshmfence1 \
  libasound2 \
  libatk1.0-0 \
  libatk-bridge2.0-0 \
  libcups2 \
  libdrm2 \
  libxkbcommon0 \
  libxcomposite1 \
  libxdamage1 \
  libxrandr2 \
  libnss3 \
  libxss1 \
  fonts-liberation \
  xdg-utils

Note: on Ubuntu 20.04 and later, chromium-browser is a transitional package that installs Chromium as a snap. If you use Puppeteer’s bundled Chromium (Step 2), you can drop chromium-browser from the list and install only the shared libraries.

Step 2: Install Node.js and Puppeteer

curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt install -y nodejs

mkdir -p /var/automation/mybot
cd /var/automation/mybot
npm init -y

# Install Puppeteer (uses bundled Chromium by default)
npm install puppeteer

# Or use puppeteer-core with system Chromium (lighter)
npm install puppeteer-core

Step 3: Your First Puppeteer Script

nano screenshot.js
const puppeteer = require('puppeteer');

async function takeScreenshot(url, outputPath) {
    const browser = await puppeteer.launch({
        headless: true,            // New headless mode; the string 'new' is deprecated in recent Puppeteer
        args: [
            '--no-sandbox',        // Required when running as root
            '--disable-setuid-sandbox',
            '--disable-dev-shm-usage', // Prevent /dev/shm overflow
            '--disable-gpu',
            '--no-zygote',
            '--single-process'     // Lower memory footprint, but can be less stable
        ]
    });

    const page = await browser.newPage();
    
    // Set viewport
    await page.setViewport({ width: 1280, height: 800 });
    
    // Set user agent
    await page.setUserAgent(
        'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
    );
    
    // Navigate with timeout
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
    
    // Take screenshot
    await page.screenshot({ path: outputPath, fullPage: true });
    
    await browser.close();
    console.log(`Screenshot saved: ${outputPath}`);
}

takeScreenshot('https://example.com', '/var/automation/screenshots/example.png')
    .catch(console.error);

mkdir -p /var/automation/screenshots
node screenshot.js
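One pitfall with scripts like the one above: if page.goto() throws (timeout, DNS failure), browser.close() is never reached, and orphaned Chrome processes pile up until the VPS runs out of RAM. Wrapping the work in try/finally guarantees cleanup. A sketch, where launchFn and task are stand-ins for puppeteer.launch(...) and your own page logic:

```javascript
// Run a task against a freshly launched browser and always close it,
// even when the task throws.
async function withBrowser(launchFn, task) {
    const browser = await launchFn();
    try {
        return await task(browser);
    } finally {
        await browser.close(); // runs on success AND on error
    }
}

module.exports = { withBrowser };

// Usage (sketch):
// await withBrowser(
//     () => puppeteer.launch({ headless: true, args: ['--no-sandbox'] }),
//     async (browser) => {
//         const page = await browser.newPage();
//         await page.goto('https://example.com');
//         await page.screenshot({ path: 'out.png' });
//     }
// );
```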

Step 4: Web Scraper for JavaScript-Rendered Content

nano scraper.js
const puppeteer = require('puppeteer');
const fs = require('fs');

async function scrapeProducts(url) {
    const browser = await puppeteer.launch({
        headless: true,
        args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage']
    });

    const page = await browser.newPage();
    
    // Block images and CSS to speed up scraping
    await page.setRequestInterception(true);
    page.on('request', (req) => {
        if (['image', 'stylesheet', 'font'].includes(req.resourceType())) {
            req.abort();
        } else {
            req.continue();
        }
    });

    await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30000 });
    
    // Wait for a specific element to appear
    await page.waitForSelector('.product-title', { timeout: 10000 }).catch(() => {});
    
    // Extract data
    const products = await page.evaluate(() => {
        return Array.from(document.querySelectorAll('.product-card')).map(card => ({
            title: card.querySelector('.product-title')?.textContent?.trim() || '',
            price: card.querySelector('.product-price')?.textContent?.trim() || '',
            url: card.querySelector('a')?.href || ''
        }));
    });

    await browser.close();
    
    // Save results (ensure the output directory exists first)
    fs.mkdirSync('/var/automation/data', { recursive: true });
    const output = JSON.stringify(products, null, 2);
    fs.writeFileSync('/var/automation/data/products.json', output);
    console.log(`Scraped ${products.length} products`);
    return products;
}

scrapeProducts('https://example-shop.com/products')
    .catch(console.error);
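Scrapes fail transiently (slow pages, rate limits, flaky DNS), so it is worth retrying with a delay rather than failing the whole cron run. A hedged retry sketch, with retries and delayMs as tunable assumptions:

```javascript
// Retry an async function with exponential backoff:
// attempt 1 fails -> wait delayMs, attempt 2 fails -> wait delayMs * 2, ...
async function withRetry(fn, retries = 3, delayMs = 1000) {
    let lastError;
    for (let attempt = 0; attempt < retries; attempt++) {
        try {
            return await fn();
        } catch (err) {
            lastError = err;
            if (attempt < retries - 1) {
                await new Promise(r => setTimeout(r, delayMs * 2 ** attempt));
            }
        }
    }
    throw lastError;
}

module.exports = { withRetry };

// Usage (sketch):
// const products = await withRetry(() =>
//     scrapeProducts('https://example-shop.com/products'));
```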

Step 5: PDF Generation from HTML

nano generate-pdf.js
const puppeteer = require('puppeteer');

async function generatePDF(url, outputPath) {
    const browser = await puppeteer.launch({
        headless: true,
        args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage']
    });

    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });

    await page.pdf({
        path: outputPath,
        format: 'A4',
        margin: { top: '1cm', bottom: '1cm', left: '1cm', right: '1cm' },
        printBackground: true
    });

    await browser.close();
    console.log(`PDF saved: ${outputPath}`);
}

generatePDF('https://yourdomain.com/invoice/123', '/var/automation/pdfs/invoice-123.pdf')
    .catch(console.error);
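You don’t have to host the invoice at a URL: page.setContent(html) followed by page.pdf() renders an in-memory HTML string. A minimal template-filling sketch (the template syntax and field names here are made up for illustration; escaping keeps user data from breaking the markup):

```javascript
// Escape text for safe interpolation into HTML.
function escapeHtml(s) {
    return String(s)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Fill {{placeholders}} in a template string with escaped values.
function renderTemplate(template, data) {
    return template.replace(/\{\{(\w+)\}\}/g, (_, key) =>
        escapeHtml(data[key] ?? ''));
}

module.exports = { escapeHtml, renderTemplate };

// Usage with Puppeteer (sketch):
// const html = renderTemplate('<h1>Invoice {{id}}</h1><p>{{customer}}</p>',
//                             { id: 123, customer: 'ACME <Ltd>' });
// await page.setContent(html, { waitUntil: 'networkidle0' });
// await page.pdf({ path: '/var/automation/pdfs/invoice-123.pdf', format: 'A4' });
```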

Step 6: Run Automation Scripts on a Schedule

nano /var/automation/run-scraper.sh
#!/bin/bash
cd /var/automation/mybot
node scraper.js >> /var/log/automation.log 2>&1
echo "Scrape completed: $(date)" >> /var/log/automation.log
chmod +x /var/automation/run-scraper.sh
crontab -e
# Run every 6 hours
0 */6 * * * /bin/bash /var/automation/run-scraper.sh

Step 7: Keep Long-Running Browser Services Alive with PM2

For bots or automation services that run continuously (not just on a schedule):

sudo npm install -g pm2

pm2 start /var/automation/mybot/bot.js --name "browser-bot"
pm2 startup
pm2 save
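PM2 can also restart the bot automatically if Chrome slowly leaks memory. An ecosystem.config.js sketch (the memory limit and restart settings are assumptions to tune for your plan):

```javascript
// ecosystem.config.js — start with: pm2 start ecosystem.config.js
module.exports = {
    apps: [{
        name: 'browser-bot',
        script: '/var/automation/mybot/bot.js',
        // Restart the process if it exceeds this memory usage,
        // a guard against slow Chrome leaks (tune for your VPS).
        max_memory_restart: '1G',
        // Restart on crash, but give up after rapid crash loops.
        max_restarts: 10,
        restart_delay: 5000
    }]
};
```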

Performance Tips for Headless Chrome on VPS

Limit concurrent browser instances

Each headless Chrome instance uses roughly 200–400 MB of RAM. On a 4 GB VPS, run at most 5–8 concurrent instances.
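A promise pool enforces that cap in code. A sketch that runs task functions with at most limit in flight, using no external library (scrapeOne in the usage comment is a hypothetical function of yours):

```javascript
// Run async task factories with at most `limit` running concurrently.
// Returns results in the same order as `tasks`.
async function runPool(tasks, limit) {
    const results = new Array(tasks.length);
    let next = 0;
    async function worker() {
        while (next < tasks.length) {
            const i = next++;           // claim the next task index
            results[i] = await tasks[i]();
        }
    }
    const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker);
    await Promise.all(workers);
    return results;
}

module.exports = { runPool };

// Usage (sketch): scrape many URLs with at most 4 pages in flight:
// const tasks = urls.map(url => () => scrapeOne(browser, url));
// const results = await runPool(tasks, 4);
```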

Reuse browser instances

// Open browser once, open multiple pages, close browser at end
const browser = await puppeteer.launch({...});
const page1 = await browser.newPage();
const page2 = await browser.newPage();
// ... do work ...
await browser.close(); // Close once at the end

Block unnecessary resources

// Block images, CSS, fonts — not needed for data scraping
await page.setRequestInterception(true);
page.on('request', req => {
    ['image', 'stylesheet', 'font', 'media'].includes(req.resourceType())
        ? req.abort()
        : req.continue();
});

Set a swap file

Headless Chrome crashes when RAM is exhausted. A 2 GB swap file provides a safety net:

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

Troubleshooting Common Issues

Error: Running as root without --no-sandbox

Chrome’s sandbox cannot run as root, which is why this error appears. Either include --no-sandbox and --disable-setuid-sandbox in your launch args, or (safer) run your scripts as a dedicated non-root user so the sandbox stays enabled.

Browser crashes with “out of memory”

Add --disable-dev-shm-usage to the Puppeteer args. This makes Chrome write shared memory to /tmp instead of /dev/shm, which is often too small, especially in containerized environments.

Page.goto() times out

Increase the timeout and use domcontentloaded instead of networkidle2 for faster but less complete page loads:

await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 60000 });

Website blocks headless browser

Set a realistic user agent, add realistic delays between actions, and use puppeteer-extra with the stealth plugin to reduce fingerprinting detection:

npm install puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

Final Thoughts

Headless Chrome on a VPS is one of the most versatile developer tools available. Whether you’re scraping dynamic content, generating automated reports, running visual regression tests, or building browser automation tools, a persistent VPS environment gives you the always-on reliability that a local machine can’t provide.

VPS.DO’s USA KVM VPS plans, with 4 GB RAM, 1 Gbps networking, and full root access, are well-suited for Puppeteer workloads. KVM virtualization provides the kernel features (such as the user namespaces Chrome’s sandbox relies on) that container-based virtualization sometimes lacks.
