Step-by-Step BenLTscale Benchmarking: Real-World Example and Results
Overview
This article walks through a practical benchmarking run using BenLTscale (a hypothetical distributed load-testing tool), covering setup, test design, execution, analysis, and conclusions. Assumed defaults: a REST API endpoint that returns JSON, a target environment hosted on AWS, a peak of 100 concurrent users, roughly 30,000 total requests, and a 10-minute test duration.
Goals
- Primary: Measure throughput (RPS), average latency, and error rate under load.
- Secondary: Identify bottlenecks, validate autoscaling, and collect resource metrics (CPU, memory).
Test environment
- SUT (system under test): API server behind an Application Load Balancer (ALB), autoscaling group (2–10 instances), t3.medium equivalent.
- BenLTscale controller: single manager node (m5.large) coordinating workers.
- BenLTscale workers: 5 distributed workers (c5.large equivalent) in same region.
- Monitoring: Prometheus + Grafana for metrics, CloudWatch for autoscaling events, and server-side logs.
Test design
- Endpoint: POST /v1/orders (JSON payload ~2 KB).
- Authentication: Bearer token header.
- Warm-up: 60 seconds at 10% load.
- Ramp-up: linear increase to 100 concurrent users over 4 minutes.
- Steady state: maintain 100 concurrent users for 4 minutes.
- Ramp-down: 60 seconds.
- Total duration: ~10 minutes.
- Assertions: error rate <1%, 95th percentile latency <500 ms.
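The load profile above (warm-up, linear ramp-up, steady state, ramp-down) can be sketched as a function mapping elapsed time to target concurrency. This is an illustrative sketch, not BenLTscale internals; the phase boundaries follow the plan above:

```python
def target_users(t_s: float, peak: int = 100) -> int:
    """Target concurrent users at t_s seconds into the 10-minute plan.

    Phases: 60 s warm-up at 10% load, 240 s linear ramp-up,
    240 s steady state, 60 s ramp-down.
    """
    warmup, ramp_up, steady = 60, 240, 240
    if t_s < warmup:
        return peak // 10                        # warm-up: 10% of peak
    if t_s < warmup + ramp_up:
        frac = (t_s - warmup) / ramp_up          # linear ramp 10% -> 100%
        return round(peak * (0.1 + 0.9 * frac))
    if t_s < warmup + ramp_up + steady:
        return peak                              # steady state
    remaining = max(0.0, 600 - t_s)              # ramp-down over final 60 s
    return round(peak * remaining / 60)
```

A worker scheduler would poll this function every few seconds and spawn or retire virtual users to match the target.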
BenLTscale configuration (example)
- Test plan: 5 workers, each spawning 20 virtual users, total 100.
- Think time: random 200–500 ms between requests.
- Request timeout: 10s.
- Payload generator: fixed sample order JSON.
- Metrics export: push to Prometheus gateway every 5s.
Example BenLTscale YAML snippet:
```yaml
test:
  name: orders-load-test
  duration: 10m
  warmup: 1m
  ramp_up: 4m
  ramp_down: 1m
  workers: 5
  users_per_worker: 20
  request:
    endpoint: https://api.example.com/v1/orders
    method: POST
    headers:
      Authorization: "Bearer <token>"
    body_file: order_sample.json
    think_time: [200, 500]
  metrics:
    prometheus_push_interval: 5s
    export_tags: [instance_id, region]
```
Execution steps
- Provision BenLTscale controller and workers in same region as SUT.
- Upload test plan and payload to controller.
- Start Prometheus and Grafana dashboards; ensure CloudWatch export is enabled.
- Run a short smoke test (10 users, 1 minute) to validate authentication and payload.
- Execute the full test plan.
- Collect BenLTscale logs, Prometheus metrics, server logs, and autoscaling events.
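The smoke test in step 4 boils down to a pass/fail check on each response. A minimal sketch of that check, where the 2xx requirement and the latency threshold are assumptions drawn from the plan's assertions rather than BenLTscale features:

```python
def check_response(status: int, latency_ms: float,
                   max_latency_ms: float = 500) -> bool:
    """Pass/fail check applied to each smoke-test response:
    a 2xx status and latency under the assertion threshold."""
    return 200 <= status < 300 and latency_ms < max_latency_ms

# In a real run you would POST order_sample.json with the bearer token
# to the /v1/orders endpoint and feed each response's status and timing
# into check_response; (status, latency_ms) pairs below are illustrative.
sample = [(201, 180.0), (201, 230.5), (503, 90.0)]
passed = sum(check_response(s, lat) for s, lat in sample)
print(f"{passed}/{len(sample)} smoke requests passed")
```

If any smoke request fails, fix authentication or the payload before launching the full plan.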
Real-world results (example run)
Summary metrics:
- Total requests: ~30,000
- Average throughput: 100 RPS (during steady state)
- Average latency: 210 ms
- Median latency (50th): 180 ms
- 95th percentile latency: 470 ms
- 99th percentile latency: 820 ms
- Error rate: 0.8% (~240 errors, mostly 503s clustered around an instance scale-in)
- CPU average (instances): 68% during steady state
- Memory average: 54%
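Percentile figures like those above are computed from the raw per-request latencies rather than from averages. A minimal nearest-rank sketch (the sample data is illustrative, not from this run):

```python
import math

def percentile(samples: list[float], q: float) -> float:
    """Nearest-rank percentile: the smallest value such that at least
    a fraction q of the samples are at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered))       # 1-based rank
    return ordered[max(rank, 1) - 1]

latencies = [float(ms) for ms in range(1, 101)]  # 1..100 ms, illustrative
print(percentile(latencies, 0.50))  # 50.0
print(percentile(latencies, 0.95))  # 95.0
print(percentile(latencies, 0.99))  # 99.0
```

Tail percentiles (p95/p99) surface problems that averages hide, which is why the assertions target p95 rather than mean latency.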
Grafana snapshot highlights:
- Smooth ramp-up in RPS with small spikes at 3:30 and 7:10.
- A latency spike correlated with a scale-in event at 6:45 that caused 503s for ~30 s.
- Request queue length briefly increased from 0 to 15 during scale event.
Analysis
- Performance met the 95th percentile latency goal (470 ms < 500 ms).
- 99th percentile exceeded target due to transient errors during autoscaling.
- Error rate under 1%, but the errors were concentrated around a scale-in event. The likely root cause is that graceful shutdown did not drain in-flight traffic before instances were terminated.
- CPU at 68% indicates healthy utilization; reducing instance size might risk higher latency under spikes.
Actionable recommendations
- Implement connection draining with a longer timeout during instance termination to avoid 503s.
- Add a brief cool-down before scale-in or adjust autoscaling policy to scale earlier using CPU + request queue metrics.
- Reduce think time slightly or increase worker count for more realistic pacing if production traffic has shorter intervals.
- Retest after changes, adding a longer steady-state run (30m) and higher peak concurrency to validate stability.
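For the first recommendation, "connection draining" on an ALB corresponds to the target group's deregistration delay. A hedged boto3 sketch; the target group ARN is a placeholder and the 120 s value is an assumption to tune per your traffic (the AWS default is 300 s):

```python
def raise_drain_timeout(target_group_arn: str, seconds: int = 120) -> None:
    """Raise the ALB target group's deregistration delay so in-flight
    requests can finish before a terminating instance is removed."""
    import boto3  # assumed available where this runs

    elbv2 = boto3.client("elbv2")
    elbv2.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=[{
            "Key": "deregistration_delay.timeout_seconds",
            "Value": str(seconds),
        }],
    )
```

Pair this with an instance-termination lifecycle hook so the instance stays registered until the delay elapses.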
Conclusion
This BenLTscale benchmark showed the system meets primary latency targets but revealed autoscaling-induced tail-latency issues. Apply connection draining and autoscaling tuning, then rerun the test with an extended steady state and higher load to confirm improvements.