Configure per-API-key rate limits for the Runtime API
This guide explains how to configure rate limits for the internal API key that
connects the OpenHands Server to the Runtime API. This is an administrator task
typically performed after initial deployment if you need to enforce request limits.
By default, the internal API key has no rate limit. This means the OpenHands Server
can make unlimited requests to the Runtime API.
You may want to add a rate limit if:
You’re experiencing resource contention in the Runtime API
You want to prevent runaway automation from overwhelming the system
You need to enforce fair usage across multiple OpenHands Server instances
We provide a script that handles all the steps: retrieving credentials from Kubernetes,
authenticating to the Runtime API, and updating the rate limit.
Save this script as set-rate-limit.sh and make it executable with chmod +x set-rate-limit.sh:
#!/bin/bash
#
# set-rate-limit.sh
#
# Configure or check the rate limit for the internal API key used between
# the OpenHands Server and the Runtime API.
#
# This script runs commands inside the runtime-api pod using kubectl exec,
# so it works regardless of whether the Runtime API is exposed externally.
#
# Usage:
#   ./set-rate-limit.sh --check        # Check current rate limit
#   ./set-rate-limit.sh <rate-limit>   # Set rate limit
#
# Examples:
#   ./set-rate-limit.sh --check        # Display current rate limit
#   ./set-rate-limit.sh 500            # Set limit to 500 requests per minute
#   ./set-rate-limit.sh null           # Remove limit (allow unlimited)
#
# Prerequisites:
#   - kubectl configured with access to the openhands namespace
#

set -e

# ==============================================================================
# Configuration
# ==============================================================================

NAMESPACE="openhands"
RUNTIME_API_URL="http://localhost:5000"  # Internal URL within the pod

# ==============================================================================
# Parse command line arguments
# ==============================================================================

if [ $# -lt 1 ]; then
    echo "Usage: $0 [--check | <rate-limit>]"
    echo ""
    echo "Options:"
    echo "  --check       Display the current rate limit without changing it"
    echo ""
    echo "Arguments:"
    echo "  rate-limit    Requests per minute (integer), or 'null' to remove the limit"
    echo ""
    echo "Examples:"
    echo "  $0 --check    # Check current rate limit"
    echo "  $0 500        # Set limit to 500 requests per minute"
    echo "  $0 null       # Remove limit (allow unlimited requests)"
    exit 1
fi

CHECK_ONLY=false
RATE_LIMIT=""

if [ "$1" == "--check" ]; then
    CHECK_ONLY=true
    echo "Checking current rate limit..."
else
    RATE_LIMIT="$1"
    # Validate rate limit is either a number or "null"
    if [ "$RATE_LIMIT" != "null" ] && ! [[ "$RATE_LIMIT" =~ ^[0-9]+$ ]]; then
        echo "Error: rate-limit must be a positive integer or 'null'"
        exit 1
    fi
    echo "Rate limit to set: $RATE_LIMIT"
fi
echo ""

# ==============================================================================
# Step 1: Find the runtime-api pod
# ==============================================================================

echo "Step 1: Finding runtime-api pod..."

# Get the name of a running runtime-api pod.
# '|| true' keeps 'set -e' from aborting before the error message below.
POD=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/name=runtime-api \
    -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)

if [ -z "$POD" ]; then
    echo "Error: Could not find a runtime-api pod in namespace '$NAMESPACE'"
    echo "Make sure the runtime-api deployment is running."
    exit 1
fi
echo "  ✓ Found pod: $POD"

# ==============================================================================
# Step 2: Retrieve the admin password from Kubernetes secrets
# ==============================================================================

echo "Step 2: Retrieving admin password from Kubernetes secret..."

# The admin password was created during installation and stored in the
# 'admin-password' secret in the openhands namespace
ADMIN_PASSWORD=$(kubectl get secret admin-password -n "$NAMESPACE" \
    -o jsonpath='{.data.admin-password}' | base64 -d)

if [ -z "$ADMIN_PASSWORD" ]; then
    echo "Error: Could not retrieve admin password from Kubernetes secret."
    echo "Make sure the 'admin-password' secret exists in the '$NAMESPACE' namespace."
    exit 1
fi
echo "  ✓ Admin password retrieved"

# ==============================================================================
# Step 3: Run the rate limit update inside the pod
# ==============================================================================

# Determine the action description for output
if [ "$CHECK_ONLY" = true ]; then
    echo "Step 3: Connecting to runtime-api pod and checking rate limit..."
else
    echo "Step 3: Connecting to runtime-api pod and updating rate limit..."
fi

# We'll execute a Python script inside the pod that:
# 1. Gets a challenge from the local API
# 2. Computes the PBKDF2 hash
# 3. Authenticates and gets a JWT token
# 4. Finds the default API key
# 5. Optionally updates its rate limit (if not --check mode)

# Pass CHECK_ONLY and RATE_LIMIT to the Python script.
# For check-only mode, we pass "CHECK" as the rate limit.
# 'null' is translated to None because Python has no 'null' literal.
if [ "$CHECK_ONLY" = true ]; then
    RATE_LIMIT_ARG="'CHECK'"
elif [ "$RATE_LIMIT" = "null" ]; then
    RATE_LIMIT_ARG="None"
else
    RATE_LIMIT_ARG="$RATE_LIMIT"
fi

kubectl exec -n "$NAMESPACE" "$POD" -- python3 -c "
import json
import hashlib
import binascii
import urllib.request
import urllib.error

RUNTIME_API_URL = '$RUNTIME_API_URL'
ADMIN_PASSWORD = '''$ADMIN_PASSWORD'''
RATE_LIMIT_ARG = $RATE_LIMIT_ARG  # This will be an int, None (from 'null'), or 'CHECK'
CHECK_ONLY = RATE_LIMIT_ARG == 'CHECK'
RATE_LIMIT = None if CHECK_ONLY else RATE_LIMIT_ARG

def api_request(path, method='GET', data=None, token=None):
    \"\"\"Make an HTTP request to the Runtime API.\"\"\"
    url = f'{RUNTIME_API_URL}{path}'
    headers = {'Content-Type': 'application/json'}
    if token:
        headers['Authorization'] = f'Bearer {token}'

    req = urllib.request.Request(url, method=method, headers=headers)
    if data:
        req.data = json.dumps(data).encode('utf-8')

    try:
        with urllib.request.urlopen(req) as response:
            return json.loads(response.read().decode('utf-8'))
    except urllib.error.HTTPError as e:
        error_body = e.read().decode('utf-8')
        raise Exception(f'HTTP {e.code}: {error_body}')

# Step 3a: Get authentication challenge
print('  Getting authentication challenge...')
challenge_resp = api_request('/api/admin/challenge')
challenge = challenge_resp['challenge']
salt = challenge_resp['salt']

# Step 3b: Compute PBKDF2 hash
# The salt is: salt + challenge (concatenated as strings, then UTF-8 encoded)
# Parameters: 10000 iterations, 32-byte output
combined_salt = (salt + challenge).encode('utf-8')
dk = hashlib.pbkdf2_hmac('sha256', ADMIN_PASSWORD.encode(), combined_salt, 10000, dklen=32)
hash_hex = binascii.hexlify(dk).decode()

# Step 3c: Authenticate and get JWT token
print('  Authenticating...')
login_resp = api_request('/api/admin/login', method='POST', data={
    'challenge': challenge,
    'hash': hash_hex
})
token = login_resp['token']
print('  ✓ Authentication successful')

# Step 3d: Get all API keys and find the 'default' key
print('  Finding default API key...')
keys = api_request('/api/admin/api-keys', token=token)

default_key = None
for key in keys:
    if key.get('name') == 'default':
        default_key = key
        break

if not default_key:
    print('  Error: Could not find API key named \"default\"')
    print(f'  Available keys: {[k.get(\"name\") for k in keys]}')
    raise SystemExit(1)

key_id = default_key['id']
current_limit = default_key.get('max_requests_per_minute')
current_display = 'unlimited' if current_limit is None else current_limit
print(f'  ✓ Found default key (ID: {key_id})')
print()
print('================================================')
print(f'Current rate limit: {current_display}')
print('================================================')

# Step 3e: Update the rate limit (only if not in check-only mode)
if not CHECK_ONLY:
    new_display = 'unlimited' if RATE_LIMIT is None else RATE_LIMIT
    print()
    print(f'  Updating rate limit to {new_display}...')
    updated_key = api_request(f'/api/admin/api-keys/{key_id}', method='PUT', token=token, data={
        'max_requests_per_minute': RATE_LIMIT
    })
    final_limit = updated_key.get('max_requests_per_minute')
    final_display = 'unlimited' if final_limit is None else final_limit
    print('  ✓ Rate limit updated successfully')
    print()
    print('================================================')
    print(f'New rate limit: {final_display}')
    print('================================================')
"
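The hash computed in step 3b of the script can also be reproduced outside the pod, which may help when debugging login failures. The following is a minimal sketch that mirrors the parameters the script uses (SHA-256, salt + challenge as the PBKDF2 salt, 10000 iterations, 32-byte output). The function name compute_login_hash and the sample values are illustrative and not part of the Runtime API.

```python
import hashlib


def compute_login_hash(password: str, salt: str, challenge: str) -> str:
    """Return the hex-encoded PBKDF2 hash, mirroring step 3b of the script."""
    # The PBKDF2 salt is the server salt and challenge concatenated as strings.
    combined_salt = (salt + challenge).encode("utf-8")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), combined_salt, 10000, dklen=32
    )
    return derived.hex()


if __name__ == "__main__":
    # Placeholder values; real ones come from the /api/admin/challenge response.
    print(compute_login_hash("example-password", "example-salt", "example-challenge"))
```

Because the challenge is part of the salt, the resulting hash differs for every challenge, so a hash captured from one login attempt should not be expected to work for another.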
If you upgraded your deployment but are still experiencing 429 errors, the most likely
cause is that you’re running an older version of the Runtime API that has hardcoded
rate limits.
Prior to Helm chart version 0.2.8, the Runtime API had a hardcoded limit of
100 requests per minute on all endpoints. This was not configurable: every
deployment was subject to this limit regardless of settings.
Starting with chart version 0.2.8 (image sha-1a920e8), rate limiting was changed to:
No rate limit by default — the internal API key is created without a limit
Configurable per-key — administrators can optionally set limits via the admin API
Verify what image is actually running in your cluster:
kubectl get deployment -n openhands -l app.kubernetes.io/name=runtime-api \
  -o jsonpath='{.items[*].spec.template.spec.containers[*].image}'
You should see ghcr.io/openhands/runtime-api:sha-1a920e8 (or sha-7857be8 or newer).
If you see an older image tag (like sha-20ec8b3 or earlier), you’re running the old
code with hardcoded limits.
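If you check several clusters, the comparison against the tags named above can be scripted. This is a rough sketch only: the helper classify_image and its return labels are hypothetical, and it recognizes just the three tags mentioned in this guide. SHA-based tags carry no ordering, so any other tag is reported as unknown and still needs a manual check.

```python
# Tags taken from this guide; the sets are not an exhaustive release list.
KNOWN_CONFIGURABLE_TAGS = {"sha-1a920e8", "sha-7857be8"}
KNOWN_HARDCODED_TAGS = {"sha-20ec8b3"}


def classify_image(image: str) -> str:
    """Classify a runtime-api image reference by its tag."""
    # Split on the last ':'; if the remainder contains '/', the ':' belonged
    # to a registry port rather than a tag, so there is no tag to inspect.
    _, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        return "unknown"
    if tag in KNOWN_CONFIGURABLE_TAGS:
        return "configurable"
    if tag in KNOWN_HARDCODED_TAGS:
        return "hardcoded"
    return "unknown"


if __name__ == "__main__":
    print(classify_image("ghcr.io/openhands/runtime-api:sha-1a920e8"))
```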
429 errors after setting a limit: Your limit may be too low. Check the logs to see
how many requests are being made, then adjust the limit accordingly.
Authentication failures: JWT tokens expire after 24 hours. If you get 401 errors,
repeat the authentication steps to get a new token.
“Admin functionality is disabled” error: The ADMIN_PASSWORD environment variable
may not be set in the Runtime API deployment. Check the deployment configuration.
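For the 429 case, one way to choose a new limit is to measure the busiest minute in your request logs. The sketch below assumes you have already extracted one timestamp per request; the log format is not covered in this guide, so parsing is left out, and the function name peak_requests_per_minute is illustrative. It buckets by calendar minute, which may not match how the Runtime API counts its window, so treat the result as an estimate and leave some headroom.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def peak_requests_per_minute(timestamps: Iterable[datetime]) -> int:
    """Return the highest request count observed in any calendar minute."""
    # Truncate each timestamp to the start of its minute and count per bucket.
    per_minute = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    return max(per_minute.values(), default=0)


if __name__ == "__main__":
    sample = [
        datetime(2024, 1, 1, 12, 0, 5),
        datetime(2024, 1, 1, 12, 0, 40),
        datetime(2024, 1, 1, 12, 1, 0),
    ]
    print(peak_requests_per_minute(sample))
```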