User Guide
==========

Guides for configuring and running benchmarks, interpreting results, and integrating with external tools.

Configuration System
- Configuration methods
- Polymorphic options
- Exporting JSON schema
- Common configuration sections
- Environment variables
- Stop conditions
- Output directory
- Trace recording
- Validation
- Splitting configuration across files

Workload recipes
- Replay a request log (CSV or JSONL)
- Replay conversation datasets
- Replay timed multi-turn traces
- Replay shared-prefix traces
- Multi-turn conversations (synthetic)
- Agentic workloads (branching sessions)
- LM-Eval accuracy benchmarks
- Throughput saturation test
- See also

Output Files
- Output directory structure
- Configuration file
- Metrics directory
- Metric distribution files
- Traces directory
- Health check results
- WandB files
- See also

Capacity Search
- How it works
- Running capacity search
- Rate-based vs concurrency-based searches
- Configuration reference
- Defining SLOs
- Output structure
- WandB integration
- Example: Production capacity planning
- Example: Concurrency capacity search

Configuration Sweeps
- The !expand tag
- Cartesian product expansion
- Basic example
- Output structure
- Sweep summary
- Cross-file expansion
- Common sweep patterns

Server Management
- Supported servers
- Basic configuration
- Server configuration options
- GPU resource management
- Server logs
- Example: Full managed benchmark
- Example: Comparing servers

Weights & Biases Integration
- Enabling WandB
- Configuration options
- What gets logged
- Using with advanced features
- Viewing results in WandB
- Filtering and comparing runs
- Offline mode
- Environment variables
- Example: Complete WandB config
- See also

Microbenchmarks
- Prefill vs decode
- Prefill microbenchmark
- Decode microbenchmark
- Stress microbenchmark
- Common options
- Output directory structure