Step-by-Step BenLTscale Benchmarking: Real-World Example and Results

Overview

This article walks through a practical benchmarking run using BenLTscale — a hypothetical distributed load-testing tool — covering setup, test design, execution, analysis, and conclusions. Assumed defaults: a REST API endpoint that returns JSON, a target environment hosted on AWS, a peak of 100 concurrent users, and a test duration of 10 minutes.

Goals

  • Primary: Measure throughput (RPS), average latency, and error rate under load.
  • Secondary: Identify bottlenecks, validate autoscaling, and collect resource metrics (CPU, memory).

Test environment

  • SUT (system under test): API server behind an Application Load Balancer (ALB), autoscaling group (2–10 instances), t3.medium equivalent.
  • BenLTscale controller: single manager node (m5.large) coordinating workers.
  • BenLTscale workers: 5 distributed workers (c5.large equivalent) in same region.
  • Monitoring: Prometheus + Grafana for metrics, CloudWatch for autoscaling events, and server-side logs.

Test design

  • Endpoint: POST /v1/orders (JSON payload ~2 KB).
  • Authentication: Bearer token header.
  • Warm-up: 60 seconds at 10% load.
  • Ramp-up: linear increase to 100 concurrent users over 4 minutes.
  • Steady state: maintain 100 concurrent users for 4 minutes.
  • Ramp-down: 60 seconds.
  • Total duration: ~10 minutes.
  • Assertions: error rate <1%, 95th percentile latency <500 ms.
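The phased load profile above can be sketched as a simple schedule function. The phase boundaries (60 s warm-up at 10% load, 4-minute linear ramp to 100 users, 4-minute steady state, 60 s ramp-down) come from the design above; the function itself is an illustrative helper, not part of BenLTscale:

```python
def target_users(t: float) -> int:
    """Target concurrent users at t seconds into the 10-minute run."""
    WARMUP, RAMP_UP, STEADY, RAMP_DOWN = 60, 240, 240, 60
    PEAK, WARM = 100, 10  # warm-up runs at 10% of peak load

    if t < WARMUP:
        return WARM
    t -= WARMUP
    if t < RAMP_UP:
        # linear ramp from the warm-up level up to peak
        return round(WARM + (PEAK - WARM) * t / RAMP_UP)
    t -= RAMP_UP
    if t < STEADY:
        return PEAK
    t -= STEADY
    if t < RAMP_DOWN:
        # linear ramp back down to zero
        return round(PEAK * (1 - t / RAMP_DOWN))
    return 0
```

A driver can poll this once per second to decide how many virtual users should be active.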

BenLTscale configuration (example)

  • Test plan: 5 workers, each spawning 20 virtual users, total 100.
  • Think time: random 200–500 ms between requests.
  • Request timeout: 10s.
  • Payload generator: fixed sample order JSON.
  • Metrics export: push to Prometheus gateway every 5s.
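The pacing described above (200–500 ms think time, 10 s request timeout) amounts to a simple virtual-user loop. A minimal sketch, where `send_order` is a caller-supplied stand-in for the real HTTP request rather than a BenLTscale API:

```python
import random
import time

THINK_MS = (200, 500)   # think-time range from the test plan, in milliseconds
TIMEOUT_S = 10          # per-request timeout from the test plan

def think_time_s() -> float:
    """Sample a think time uniformly from the configured range, in seconds."""
    return random.uniform(*THINK_MS) / 1000.0

def run_virtual_user(send_order, duration_s, clock=time.monotonic, sleep=time.sleep):
    """Run one virtual user: request, then pause, until the phase ends."""
    results = []
    start = clock()
    while clock() - start < duration_s:
        results.append(send_order(timeout=TIMEOUT_S))  # record e.g. status code
        sleep(think_time_s())
    return results
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting.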

Example BenLTscale YAML snippet:

```yaml
test:
  name: orders-load-test
  duration: 10m
  warmup: 1m
  ramp_up: 4m
  ramp_down: 1m
  workers: 5
  users_per_worker: 20
request:
  endpoint: https://api.example.com/v1/orders
  method: POST
  headers:
    Authorization: "Bearer <token>"
  body_file: order_sample.json
  think_time: [200, 500]
metrics:
  prometheus_push_interval: 5s
  export_tags: [instance_id, region]
```
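Before a run it is worth sanity-checking that the plan's phases fit inside the declared duration. A small sketch, assuming the `10m`/`5s` duration notation used in the plan; `parse_duration` and `check_plan` are hypothetical helpers, not BenLTscale API:

```python
import re

def parse_duration(text: str) -> int:
    """Convert a '10m' / '90s' / '1h' style duration string to seconds."""
    m = re.fullmatch(r"(\d+)([smh])", text.strip())
    if not m:
        raise ValueError(f"bad duration: {text!r}")
    value, unit = int(m.group(1)), m.group(2)
    return value * {"s": 1, "m": 60, "h": 3600}[unit]

def check_plan(plan: dict) -> int:
    """Return the steady-state length in seconds; fail if phases overflow."""
    phases = sum(parse_duration(plan[k]) for k in ("warmup", "ramp_up", "ramp_down"))
    steady = parse_duration(plan["duration"]) - phases
    if steady <= 0:
        raise ValueError("warm-up and ramps exceed the total duration")
    return steady
```

For the plan above this yields a 240 s steady state, matching the test design.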

Execution steps

  1. Provision BenLTscale controller and workers in same region as SUT.
  2. Upload test plan and payload to controller.
  3. Start Prometheus and Grafana dashboards; ensure CloudWatch export is enabled.
  4. Run a short smoke test (10 users, 1 minute) to validate authentication and payload.
  5. Execute the full test plan.
  6. Collect BenLTscale logs, Prometheus metrics, server logs, and autoscaling events.
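Step 4's smoke test can be made an explicit go/no-go gate on the short run's status codes before committing to the full plan. The thresholds here are illustrative:

```python
def smoke_test_passes(status_codes, max_error_rate=0.01):
    """Gate the full run on a smoke test: auth must work, errors stay under threshold."""
    if not status_codes:
        return False
    if any(code in (401, 403) for code in status_codes):
        return False  # authentication problem: fix the bearer token before the full run
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes) <= max_error_rate
```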

Real-world results (example run)

Summary metrics:

  • Total requests: ~40,000 (≈24,000 during the 4-minute steady state, plus warm-up and ramps)
  • Average throughput: 100 RPS (during steady state)
  • Average latency: 210 ms
  • Median latency (50th): 180 ms
  • 95th percentile latency: 470 ms
  • 99th percentile latency: 820 ms
  • Error rate: 0.8% (~320 errors, mostly 503s from occasional instance restarts)
  • CPU average (instances): 68% during steady state
  • Memory average: 54%
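The percentile figures above can be reproduced offline from raw latency samples, for example with the nearest-rank method (a sketch; BenLTscale's own estimator may differ, e.g. by interpolating):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of data at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

# Illustrative latency samples in milliseconds, not the actual run data.
latencies_ms = [120, 150, 180, 200, 210, 230, 300, 470, 650, 820]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```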

Grafana snapshot highlights:

  • Smooth ramp-up in RPS with small spikes at 3:30 and 7:10.
  • A latency spike correlated with a scale-in event at 6:45 that caused 503s for ~30 s.
  • Request queue length briefly increased from 0 to 15 during scale event.

Analysis

  • Performance met the 95th percentile latency goal (470 ms < 500 ms).
  • 99th percentile exceeded target due to transient errors during autoscaling.
  • Error rate was under 1% but concentrated around a scale-in event; the likely root cause is that terminating instances did not drain in-flight traffic before shutting down.
  • CPU at 68% indicates healthy utilization; reducing instance size might risk higher latency under spikes.

Actionable recommendations

  • Implement connection draining with a longer timeout during instance termination to avoid 503s.
  • Add a brief cool-down before scale-in or adjust autoscaling policy to scale earlier using CPU + request queue metrics.
  • Reduce think time slightly or increase worker count for more realistic pacing if production traffic has shorter intervals.
  • Retest after changes, adding a longer steady-state run (30m) and higher peak concurrency to validate stability.
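The second recommendation (scaling on CPU plus request-queue depth) can be sketched as a combined decision function. The thresholds are illustrative and the helper is hypothetical, not an AWS autoscaling API; in practice this logic would live in an autoscaling policy:

```python
def scale_decision(cpu_pct: float, queue_len: int, instances: int,
                   min_instances: int = 2, max_instances: int = 10) -> int:
    """Return the desired instance count from CPU and request-queue signals."""
    if (cpu_pct > 70 or queue_len > 10) and instances < max_instances:
        return instances + 1   # scale out early on either signal
    if cpu_pct < 30 and queue_len == 0 and instances > min_instances:
        return instances - 1   # scale in only when the fleet is fully idle
    return instances
```

Scaling out on either signal catches queue buildup before CPU climbs; the conservative scale-in guard avoids the mid-run scale-in seen in this run.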

Conclusion

This BenLTscale benchmark showed the system meets primary latency targets but revealed autoscaling-induced tail-latency issues. Apply connection draining and autoscaling tuning, then rerun the test with an extended steady state and higher load to confirm improvements.
