I’ve been there: you deploy a new feature, the staging environment looks perfect, and then—the moment you hit production—the API latency spikes and the server starts throwing 504 Gateway Timeouts. It’s a nightmare scenario that happens when we treat performance as an afterthought. That’s why I started using k6; it’s developer-centric, uses JavaScript, and integrates perfectly into CI/CD pipelines.

In this k6 load testing tutorial, I’m going to show you how to move from ‘hoping it works’ to ‘knowing it works’ by simulating real-world traffic patterns. Whether you’re comparing Postman vs Insomnia for functional API checks or looking for a robust way to stress test, k6 is the tool for the job.

Prerequisites

Before we dive into the code, make sure you have the following installed on your machine:

- k6 — available via Homebrew, Chocolatey, your Linux package manager, or Docker
- A code editor and basic familiarity with JavaScript
- An API endpoint you’re allowed to load test (never aim a load test at infrastructure you don’t own)
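If you don’t have k6 yet, it ships as a single binary through the usual package managers, or you can run it via Docker without installing anything (commands assume macOS with Homebrew or Windows with Chocolatey):

```shell
# macOS
brew install k6

# Windows
choco install k6

# Or skip installation entirely and run a script via the official Docker image
docker run --rm -i grafana/k6 run - <loadtest.js
```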

Step 1: Writing Your First k6 Script

One of the reasons I prefer k6 over other testing frameworks for microservices is that you don’t need to learn a proprietary GUI. You just write a JS file. Create a file named loadtest.js:

import http from 'k6/http';
import { sleep, check } from 'k6';

export default function () {
  // 1. Send a GET request to the target API
  const res = http.get('https://test-api.ajmani.dev/users');

  // 2. Validate the response (Check if status is 200)
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  // 3. Wait 1 second between iterations to simulate human behavior
  sleep(1);
}

In this script, we’re performing a basic health check. We use check to ensure the API isn’t just responding, but responding correctly and quickly — a 200 status in under 500ms.

Step 2: Configuring Load Profiles (The ‘Magic’ of k6)

Running a script with one user is just a functional test. To actually load test, we need options. You can define these inside the script to control how the virtual users (VUs) scale.

export const options = {
  stages: [
    { duration: '30s', target: 20 }, // Ramp-up: 0 to 20 users in 30s
    { duration: '1m', target: 20 },  // Stay: hold 20 users for 1 minute
    { duration: '20s', target: 0 },  // Ramp-down: 20 to 0 users in 20s
  ],
  thresholds: {
    http_req_duration: ['p(95)<200'], // 95% of requests must be under 200ms
  },
};

I highly recommend using stages. In my experience, jumping from 0 to 1,000 users instantly often crashes the network layer before the application layer, which gives you skewed results. As shown in the terminal output image below, these stages allow you to see exactly where the system starts to degrade.
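Latency isn’t the only thing you can gate on. Here’s a hedged sketch that also fails the run on error rate, using k6’s built-in http_req_failed metric (the specific budget values are illustrative, not recommendations):

```javascript
// Sketch: gate the run on both tail latency and error rate.
// http_req_failed is a built-in k6 metric (rate of failed requests).
export const options = {
  stages: [
    { duration: '30s', target: 20 },
    { duration: '1m', target: 20 },
    { duration: '20s', target: 0 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<200', 'p(99)<500'], // latency budgets
    http_req_failed: ['rate<0.01'],                // fewer than 1% errors
  },
};
```

If any threshold is crossed, k6 exits with a non-zero code, which is exactly what you want in CI.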

Step 3: Executing the Test and Analyzing Results

Now, run the test from your terminal:

k6 run loadtest.js

As the test runs, k6 will stream real-time metrics to your console. Pay close attention to the http_req_duration. This is the gold standard for measuring user experience.

k6 terminal output showing performance metrics and successful check percentages

Pro Tips for Advanced Load Testing

1. Parametrize Your Data

Don’t hit the same endpoint with the same ID 10,000 times; you’ll just hit the database cache. Use a JSON file to feed unique IDs to your users:

import http from 'k6/http';
import { SharedArray } from 'k6/data';

// Load the user list once and share it across all VUs (memory-efficient)
const data = new SharedArray('users', function () {
  return JSON.parse(open('./users.json'));
});

export default function () {
  const user = data[Math.floor(Math.random() * data.length)];
  http.get(`https://api.example.com/users/${user.id}`);
}

2. Test the ‘Cold Start’

If you’re using Serverless functions (AWS Lambda, Vercel), the first request is always slow. I always run a ‘warm-up’ phase in my k6 options to ensure I’m measuring the system’s peak performance, not just the provider’s boot time.
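There’s no dedicated ‘warm-up’ flag in k6 — what I mean is simply a cheap, low-VU leading stage. A sketch (durations and targets are illustrative):

```javascript
// Sketch: absorb serverless cold starts in a low-traffic leading stage,
// then measure only once instances are warm.
export const options = {
  stages: [
    { duration: '30s', target: 5 },  // warm-up: wake the functions
    { duration: '30s', target: 50 }, // ramp to the load you care about
    { duration: '2m', target: 50 },  // measured steady state
    { duration: '20s', target: 0 },  // ramp down
  ],
};
```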

Troubleshooting Common k6 Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| socket: timeout | Server is overwhelmed or firewall is blocking VUs | Increase server resources or check security groups |
| High CPU on local machine | Too many VUs for your local RAM/CPU | Run k6 in a Docker container on a larger EC2 instance |
| Unexpected 429 errors | Rate limiting is active on the API | Coordinate with DevOps to whitelist your test IP |

What’s Next?

Now that you’ve mastered the basics of this k6 load testing tutorial, you should integrate these tests into your GitHub Actions or GitLab CI pipeline. Setting a threshold (like we did in Step 2) makes k6 exit with a non-zero code when the budget is blown, which automatically fails the build if a PR pushes API latency past your limit.
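As a starting point, here’s a minimal GitHub Actions sketch using the grafana/k6-action from the marketplace (the action name and version here are assumptions — check the current listing before copying):

```yaml
# .github/workflows/load-test.yml (sketch)
name: load-test
on: [pull_request]

jobs:
  k6:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run k6 load test
        uses: grafana/k6-action@v0.3.1
        with:
          filename: loadtest.js
```

Because the thresholds live in the script itself, the step fails the workflow on its own — no extra result parsing required.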

If you’re still figuring out your overall quality strategy, check out my guide on testing frameworks for microservices to see how load testing fits with unit and integration tests.