There is nothing more stressful than deploying a new feature and watching your API crash the moment a marketing campaign hits. In my experience, the difference between a stable system and a 500 Internal Server Error is almost always a lack of rigorous load testing. That’s why I’ve switched most of my projects to k6—it’s developer-centric, uses JavaScript, and integrates perfectly into CI/CD pipelines.

In this k6 api load testing tutorial, I’ll show you exactly how to move from a basic ‘hello world’ request to a complex stress test that simulates thousands of concurrent users. Whether you’re fighting latency issues or planning for a massive product launch, k6 provides the visibility you need.

Prerequisites

Before we dive into the code, make sure you have the following ready:

- k6 installed (brew install k6 on macOS, choco install k6 on Windows, or the official packages for Linux)
- Basic familiarity with JavaScript
- An API endpoint you’re allowed to load test—never load test a service you don’t own without permission

If you’re still deciding on your toolkit, check out my list of the best open source api testing tools to see how k6 compares to alternatives like JMeter.

Step 1: Writing Your First Load Test Script

Unlike many tools that require a GUI, k6 uses “Tests as Code.” This means your load tests can be version-controlled in Git right alongside your application code. Create a file named loadtest.js and add the following:

import http from 'k6/http';
import { check, sleep } from 'k6';

export default function () {
  // 1. Send a GET request to your API
  const res = http.get('https://test-api.k6.io/public/crocodiles/');

  // 2. Validate the response
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  // 3. Think time: simulate real user behavior
  sleep(1);
}

In this snippet, I’m using the check function to ensure the API isn’t just responding, but responding correctly and quickly. Without checks, k6 will tell you the request was sent, but it won’t tell you if the server returned a 500 error.

Step 2: Defining Load Profiles (Virtual Users)

Running a script once is just a smoke test. To actually perform load testing, we need to define an options object. This is where we specify how many Virtual Users (VUs) to spawn and for how long.

Update your script by adding an options object at the top:

export const options = {
  stages: [
    { duration: '30s', target: 20 }, // Ramp-up: 0 to 20 users in 30s
    { duration: '1m', target: 20 },  // Stay: hold 20 users for 1 minute
    { duration: '20s', target: 0 },  // Ramp-down: back to 0 users
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests must be under 500ms
  },
};

The thresholds property is a game-changer. It allows you to define “pass/fail” criteria. If the 95th percentile (p95) of your response time exceeds 500ms, k6 will exit with a non-zero code, which automatically fails your CI/CD pipeline. This is essential if you want to enforce API response time targets automatically rather than relying on someone eyeballing a dashboard.
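If you want the run to stop as soon as a threshold is breached, rather than only reporting the failure at the end, thresholds also accept an object form with abortOnFail. A minimal sketch (the 10s warm-up delay is an arbitrary example value):

```javascript
export const options = {
  thresholds: {
    http_req_duration: [
      // Abort the whole test early if p95 latency exceeds 500ms,
      // but wait 10s before evaluating so the ramp-up can settle.
      { threshold: 'p(95)<500', abortOnFail: true, delayAbortEval: '10s' },
    ],
  },
};
```

Aborting early saves CI minutes and stops a misbehaving test from hammering a struggling server.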

Step 3: Executing the Test and Analyzing Results

Now, run the test from your terminal:

k6 run loadtest.js

As the test runs, you’ll see a live summary in your terminal. Pay close attention to the following metrics:

- http_req_duration — total time per request, broken down into avg, min, med, p(90), and p(95)
- http_reqs — total requests sent and requests per second (your throughput)
- http_req_failed — the percentage of requests that failed
- checks — the pass rate of the check assertions you defined

As shown in the terminal output during my own tests, the p(99) value is often more important than the average. The average hides the outliers—the users who experienced a 5-second lag while everyone else had 100ms.

k6 terminal output showing load test results with p95 and p99 metrics
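To see why the average is so misleading, here is a toy illustration in plain JavaScript (not a k6 script—just ten made-up latency samples, nine fast and one slow):

```javascript
// Nine 100ms responses and one 5000ms outlier, in milliseconds.
const samples = [100, 100, 100, 100, 100, 100, 100, 100, 100, 5000];

// The average blends the outlier away.
const avg = samples.reduce((a, b) => a + b, 0) / samples.length;

// The 99th percentile surfaces the worst user experience.
const sorted = [...samples].sort((a, b) => a - b);
const p99 = sorted[Math.ceil(sorted.length * 0.99) - 1];

console.log(avg); // 590 — looks "fine"
console.log(p99); // 5000 — one user waited 5 seconds
```

An average of 590ms might pass a naive SLA check, while the p99 reveals the 5-second experience your unluckiest users actually had.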

Pro Tips for Advanced Testing

1. Parametrize Your Requests

Don’t hit the same endpoint with the same ID 10,000 times. Your database will cache the result, and you’ll get fake performance numbers. Use a JSON file to load different IDs for each VU.
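A minimal sketch of that pattern using k6’s SharedArray, assuming a local ids.json file containing an array of IDs (the filename and the endpoint path are placeholders for your own):

```javascript
import http from 'k6/http';
import { SharedArray } from 'k6/data';

// SharedArray loads the file once and shares it across all VUs,
// instead of copying it into every VU's memory.
// ids.json might look like: [1, 2, 3, 4, 5]
const ids = new SharedArray('ids', function () {
  return JSON.parse(open('./ids.json'));
});

export default function () {
  // Combine the VU number and iteration count so each request
  // hits a different record, defeating server-side caching.
  const id = ids[(__VU + __ITER) % ids.length];
  http.get(`https://test-api.k6.io/public/crocodiles/${id}/`);
}
```

The __VU and __ITER globals are built into k6, so each virtual user cycles through the dataset independently.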

2. Simulate Network Throttling

Remember that your server is in a data center, but your users are on 4G in a subway. While k6 doesn’t throttle the network directly, you can simulate varied behavior by adding random sleep() intervals using Math.random().
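For example, replacing the fixed sleep(1) from the first script with a randomized pause between 1 and 4 seconds:

```javascript
import { sleep } from 'k6';

export default function () {
  // ... your HTTP requests here ...

  // Random think time between 1 and 4 seconds to mimic
  // uneven human pacing instead of a synchronized request wave.
  sleep(Math.random() * 3 + 1);
}
```

Randomized think time also prevents all VUs from firing in lockstep, which can create artificial request spikes that no real traffic pattern produces.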

3. Use the k6 Cloud for Distributed Testing

If you need to simulate 50,000 users, your laptop’s CPU will bottleneck before the API does. In these cases, I recommend using the k6 Cloud or running k6 in a Kubernetes cluster to distribute the load.
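For instance, once you’ve generated an API token in Grafana Cloud, the same script can be dispatched to distributed cloud load generators (exact commands vary slightly between k6 versions, so check the docs for yours):

```shell
# Authenticate once with your API token (placeholder shown)
k6 login cloud --token YOUR_API_TOKEN

# Run the same local script from distributed cloud load zones
k6 cloud loadtest.js
```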

Troubleshooting Common k6 Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| High http_req_connecting | TCP handshake bottleneck | Check keep-alive settings or load balancer limits. |
| Out of Memory (OOM) | Too many VUs for available RAM | Reduce the VU count or increase system swap. |
| 429 Too Many Requests | Rate limiting is active | Make your ramp-up stages more gradual. |

What’s Next?

Now that you’ve mastered the basics of this k6 api load testing tutorial, the next step is to integrate this into your GitHub Actions or GitLab CI pipeline. By running a small load test on every Pull Request, you can catch performance regressions before they ever reach your users.
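As a sketch, a minimal GitHub Actions job might look like this, using the community grafana/k6-action (check its README for the current version tag before copying):

```yaml
name: load-test
on: [pull_request]

jobs:
  k6:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run k6 load test
        uses: grafana/k6-action@v0.3.1
        with:
          filename: loadtest.js
```

Because the thresholds we defined earlier make k6 exit non-zero on failure, this job will mark the Pull Request red automatically whenever p95 latency regresses.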

Want to dive deeper into performance? Explore our guides on reducing API latency or discover more open source tools to complement your k6 setup.