There is nothing worse than getting a ‘Process Out of Memory’ error in production and having zero visibility into what caused it. I’ve spent too many late nights digging through raw logs trying to reconstruct the timeline of a memory leak. That’s why I switched to a proactive monitoring stack. In this tutorial on building a custom Grafana dashboard for Node.js, I’ll show you exactly how to instrument your application, export metrics to Prometheus, and build a dashboard that actually tells you when things are going wrong.
Before we dive in, it’s important to understand the pipeline: your Node.js app exposes metrics via an HTTP endpoint, Prometheus scrapes that endpoint at regular intervals, and Grafana queries Prometheus to visualize the data. If you are running this in a containerized environment, you might want to check out how to set up Prometheus and Grafana on Kubernetes to handle the infrastructure side of things.
Prerequisites
- A running Node.js application (v16+ recommended).
- Docker and Docker Compose installed on your machine.
- Basic familiarity with PromQL (Prometheus Query Language).
- A basic understanding of scaling Prometheus for high-cardinality metrics if you plan to track thousands of unique users or IDs.
Step 1: Instrumenting your Node.js App
To get data into Grafana, we first need to expose it. The industry standard for this is the prom-client library. I use this in all my production apps because it provides both default Node.js metrics (like heap size and event loop lag) and the ability to create custom counters.
```bash
npm install prom-client express
```
Now, let’s set up a basic server that exposes a /metrics endpoint. This is the door Prometheus will use to knock and collect data.
```js
const express = require('express');
const client = require('prom-client');

const app = express();

// Create a Registry to register the metrics
const register = new client.Registry();

// Add default metrics (CPU, memory, event loop lag, etc.)
client.collectDefaultMetrics({ register });

// Create a custom metric to track HTTP requests
const httpRequestCounter = new client.Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status'],
});
register.registerMetric(httpRequestCounter);

app.get('/api/data', (req, res) => {
  // Increment our custom counter
  httpRequestCounter.inc({ method: 'GET', route: '/api/data', status: 200 });
  res.send({ data: 'Hello World' });
});

// The endpoint Prometheus scrapes
app.get('/metrics', async (req, res) => {
  res.setHeader('Content-Type', register.contentType);
  res.send(await register.metrics());
});

app.listen(3000, () => console.log('Server running on port 3000'));
```
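To see what Prometheus actually receives, it helps to know that /metrics serves plain text in the Prometheus exposition format. Here’s a dependency-free sketch (my own illustration of the format, not prom-client’s internals) that renders a counter the same way prom-client would:

```javascript
// Render a labeled counter in the Prometheus text exposition format.
// This mimics the output prom-client produces for http_requests_total.
function renderCounter(name, help, samples) {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} counter`];
  for (const { labels, value } of samples) {
    const labelStr = Object.entries(labels)
      .map(([k, v]) => `${k}="${v}"`)
      .join(',');
    lines.push(`${name}{${labelStr}} ${value}`);
  }
  return lines.join('\n') + '\n';
}

const text = renderCounter('http_requests_total', 'Total number of HTTP requests', [
  { labels: { method: 'GET', route: '/api/data', status: '200' }, value: 42 },
]);
console.log(text);
```

If you open http://localhost:3000/metrics in a browser, you’ll see the same `# HELP` / `# TYPE` / sample-line structure for every registered metric.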
Step 2: Setting up Prometheus and Grafana
The fastest way to get the monitoring stack running locally is via Docker Compose. I’ve found that using a prometheus.yml config file is the most reliable way to ensure Prometheus knows where your Node.js app is living.
Create a prometheus.yml file:
```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'nodejs-app'
    static_configs:
      - targets: ['host.docker.internal:3000']
```
Then, use this docker-compose.yml to spin up both services:
```yaml
version: '3.8'

services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana
    ports:
      # Map to 3001 on the host, since our Node.js app already occupies port 3000
      - "3001:3000"
    depends_on:
      - prometheus
```
Run docker-compose up -d and your infrastructure is ready. Once you log into Grafana, your first step is to connect the Prometheus data source.
Step 3: Building the Custom Dashboard
Now we get to the visual part. Log into Grafana (default: admin/admin), go to Connections → Data Sources, and add Prometheus. Use http://prometheus:9090 as the URL.
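If you’d rather not click through the UI on every fresh container, Grafana can also provision the data source from a file mounted into /etc/grafana/provisioning/datasources/. A minimal sketch (the name is arbitrary; the url matches the Compose service above):

```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```

Mount this file via a volume in the grafana service and the data source will exist on first boot.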
Visualizing Memory Usage
I always start with memory. Create a new dashboard, add a Time Series panel, and use this PromQL query:
```promql
nodejs_heap_size_used_bytes
```
Set the unit to Data (IEC) → bytes. This will give you a clear line graph of how your app’s memory is trending over time.
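A raw byte count is useful, but a ratio is easier to alert on. prom-client’s default metrics also include nodejs_heap_size_total_bytes, so you can chart heap utilization directly:

```promql
nodejs_heap_size_used_bytes / nodejs_heap_size_total_bytes
```

Set the panel unit to Percent (0.0-1.0) and this reads as “how full is the heap,” which is often the earlier warning sign for a leak.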
Tracking Request Rates
To see how many requests your app is handling per second, use the rate() function on our custom counter:
```promql
sum(rate(http_requests_total[5m])) by (route)
```
This tells Grafana: “Calculate the per-second rate of requests over the last 5 minutes, grouped by the API route.” This is the most powerful way to spot traffic spikes before they crash your server.
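If rate() feels like magic, the arithmetic behind it is simple: take two samples of a monotonically increasing counter and divide the increase by the elapsed time. A toy sketch of that idea in plain JavaScript (real PromQL also handles counter resets and extrapolation to the range boundaries, which this ignores):

```javascript
// Approximate what rate() computes: per-second increase between two
// counter samples. PromQL additionally corrects for counter resets and
// extrapolates to the window edges; this toy version skips that.
function simpleRate(sampleA, sampleB) {
  const deltaValue = sampleB.value - sampleA.value;
  const deltaSeconds = (sampleB.timestampMs - sampleA.timestampMs) / 1000;
  return deltaValue / deltaSeconds;
}

// 600 extra requests over 5 minutes -> 2 requests/second
const reqRate = simpleRate(
  { value: 1000, timestampMs: 0 },
  { value: 1600, timestampMs: 5 * 60 * 1000 }
);
console.log(reqRate); // 2
```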
Pro Tips for Better Monitoring
- Use Gauges for Current State: Use client.Gauge for things that go up and down, like the number of active WebSocket connections.
- Alerting is Key: Don’t just stare at the dashboard. Set up Grafana Alerts to ping your Slack or Discord when nodejs_eventloop_lag_seconds exceeds 0.2s.
- Avoid High Cardinality: Never put a User ID or Email in a metric label. This will bloat your Prometheus DB and slow down your queries.
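To make the gauge tip concrete, here is a dependency-free sketch of the semantics; prom-client’s client.Gauge exposes a similar inc()/dec()/set() style API for values that move in both directions:

```javascript
// Minimal gauge: a value that can go up AND down, unlike a counter,
// which must only ever increase.
class SimpleGauge {
  constructor() { this.value = 0; }
  inc(n = 1) { this.value += n; }
  dec(n = 1) { this.value -= n; }
  set(v) { this.value = v; }
}

// Track active WebSocket connections: +1 on connect, -1 on close.
const activeConnections = new SimpleGauge();
activeConnections.inc(); // first client connects
activeConnections.inc(); // second client connects
activeConnections.dec(); // first client disconnects
console.log(activeConnections.value); // 1
```

This is exactly why you never rate() a gauge: the current value is the signal, not its rate of change.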
Troubleshooting
Prometheus can’t find the app: If you are using Docker on Mac/Windows, ensure you use host.docker.internal instead of localhost in your prometheus.yml. Docker containers cannot see the host’s localhost by default.
Metrics are ‘NaN’ or empty: Check if your Node.js app is actually serving the /metrics page. Open http://localhost:3000/metrics in your browser; if you don’t see a wall of text, the prom-client isn’t configured correctly.
What’s Next?
Now that you have basic visibility, you can move toward distributed tracing. I recommend looking into OpenTelemetry to see how requests flow across multiple microservices. If you’re managing a large cluster, remember to optimize your storage—read more about scaling Prometheus for high-cardinality metrics to avoid crashing your monitoring server.