vLLM Inference
Overview

The llm-inference application deploys a model from the Hugging Face Hub behind vLLM, an open-source, high-throughput inference and serving engine for large language models. The service runs on a resource preset of your choice and can optionally be exposed over the internet via HTTPS ingress.
Managing the application via the Apolo CLI

Download the application template configuration and save it to a local file:
```bash
apolo app-template get llm-inference > llm.yaml
```

The generated llm.yaml looks like this:

```yaml
# Application template configuration for: llm-inference
# Fill in the values below to configure your application.
# To use values from another app, use the following format:
# my_param:
#   type: "app-instance-ref"
#   instance_id: "<app-instance-id>"
#   path: "<path-from-get-values-response>"
template_name: llm-inference
template_version: v25.7.0
input:
  # Select the resource preset used per service replica.
  preset:
    # The name of the preset.
    name: <>
  # Enable access to your application over the internet using HTTPS.
  ingress_http:
    # Enable or disable HTTP ingress.
    enabled: true
  # Hugging Face Model Configuration.
  hugging_face_model:
    # The name of the Hugging Face model.
    model_hf_name: <>
    # The Hugging Face API token.
    hf_token: <>
```
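For illustration, a completed configuration might look like the following. The preset name, model, and token are placeholders; substitute values that are valid for your cluster and your Hugging Face account.

```yaml
template_name: llm-inference
template_version: v25.7.0
input:
  preset:
    # Hypothetical preset name; use a GPU preset available on your cluster.
    name: gpu-small
  ingress_http:
    enabled: true
  hugging_face_model:
    # Any vLLM-compatible model from the Hugging Face Hub.
    model_hf_name: meta-llama/Llama-3.1-8B-Instruct
    # Placeholder token; required for gated or private models.
    hf_token: hf_xxxxxxxxxxxxxxxx
```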
Usage
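Once the placeholders are filled in, install the application from the configuration file. The command below is a sketch of the Apolo CLI app workflow; verify the exact flags with `apolo app install --help`.

```bash
# Install the llm-inference app from the edited configuration file.
apolo app install -f llm.yaml
```

Once the application is running, it serves vLLM's OpenAI-compatible HTTP API. The hostname below is a placeholder; use the ingress URL reported for your app instance and the model name you configured.

```bash
# Send a chat completion request to the deployed model.
curl https://<your-app-hostname>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```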