SglangServerConfig
class SglangServerConfig(BaseServerConfig)
Polymorphic Type:
type: sglang
All BaseServerConfig types:
vllm: VllmServerConfig
vajra: VajraServerConfig
sglang: SglangServerConfig
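The type key listed above suggests that a config is selected by its `type` value. The following is a minimal sketch of that kind of dispatch, not the library's actual loader: the classes are empty stand-ins defined here for illustration, and `CONFIG_TYPES` and `config_from_dict` are hypothetical names.

```python
from dataclasses import dataclass


# Stand-in classes for illustration only; the real classes come from the library.
@dataclass
class BaseServerConfig:
    host: str = "localhost"
    port: int = 8000


@dataclass
class VllmServerConfig(BaseServerConfig):
    pass


@dataclass
class VajraServerConfig(BaseServerConfig):
    pass


@dataclass
class SglangServerConfig(BaseServerConfig):
    pass


# Hypothetical registry mirroring the documented type keys.
CONFIG_TYPES = {
    "vllm": VllmServerConfig,
    "vajra": VajraServerConfig,
    "sglang": SglangServerConfig,
}


def config_from_dict(data: dict) -> BaseServerConfig:
    """Pick the config class from the "type" key and build it from the rest."""
    fields = dict(data)  # copy so the caller's dict is not modified
    type_name = fields.pop("type")
    if type_name not in CONFIG_TYPES:
        raise ValueError(f"unknown server config type: {type_name!r}")
    return CONFIG_TYPES[type_name](**fields)


cfg = config_from_dict({"type": "sglang", "port": 30000})
print(type(cfg).__name__, cfg.port)
```

With this shape, supporting another server backend only requires one more registry entry.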
Fields:
env_path: Optional[str] = None
    Path to a Python environment directory (virtualenv/conda).
model: str = "meta-llama/Meta-Llama-3-8B-Instruct"
    Model name or path.
host: str = "localhost"
    Host address for the server.
port: int = 8000
    Port number for the server.
api_key: str = "token-abc123"
    API key for server authentication.
gpu_ids: Optional[list[int]] = None
    List of GPU IDs to use (None means auto-assign).
startup_timeout: int = 300
    Timeout in seconds for server startup.
health_check_interval: float = 2.0
    Interval in seconds between health checks.
require_contiguous_gpus: bool = True
    Require contiguous GPU allocation (e.g., GPUs 0,1,2 instead of 0,2,5).
tensor_parallel_size: int = 1
    Number of GPUs for tensor parallelism.
dtype: str = "auto"
    Data type for model weights (auto, float16, bfloat16, etc.).
max_model_len: Optional[int] = None
    Maximum model context length.
additional_args: Optional[str] = "{}"
    Additional engine-specific arguments as JSON string, dict, or None.
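To make the field list concrete, here is a rough sketch that mirrors the documented names and defaults in a local dataclass. It does not import the real class, and the helpers `parse_additional_args` and `is_contiguous` are hypothetical illustrations of how `additional_args` and `require_contiguous_gpus` could be interpreted; the library's own handling may differ.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class SglangServerConfigSketch:
    """Local stand-in that copies the documented fields and defaults."""
    env_path: Optional[str] = None
    model: str = "meta-llama/Meta-Llama-3-8B-Instruct"
    host: str = "localhost"
    port: int = 8000
    api_key: str = "token-abc123"
    gpu_ids: Optional[list[int]] = None
    startup_timeout: int = 300
    health_check_interval: float = 2.0
    require_contiguous_gpus: bool = True
    tensor_parallel_size: int = 1
    dtype: str = "auto"
    max_model_len: Optional[int] = None
    additional_args: Union[str, dict, None] = "{}"


def parse_additional_args(value: Union[str, dict, None]) -> dict:
    """Normalize a JSON string, dict, or None into a plain dict (assumed behavior)."""
    if value is None:
        return {}
    if isinstance(value, dict):
        return dict(value)
    parsed = json.loads(value)
    if not isinstance(parsed, dict):
        raise ValueError("additional_args must decode to a JSON object")
    return parsed


def is_contiguous(gpu_ids: list[int]) -> bool:
    """True when the sorted IDs increase by exactly one, e.g. 0,1,2 but not 0,2,5."""
    ordered = sorted(gpu_ids)
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))


# "example_flag" is a made-up key used only to show the JSON-string form.
cfg = SglangServerConfigSketch(
    gpu_ids=[0, 1],
    tensor_parallel_size=2,
    additional_args='{"example_flag": true}',
)
print(parse_additional_args(cfg.additional_args))
print(is_contiguous(cfg.gpu_ids), is_contiguous([0, 2, 5]))
```

Normalizing `additional_args` once keeps later code independent of which of the three accepted forms the caller used.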