Live workflow syntax
The live workflow is always located at the .apolo/live.yml or .apolo/live.yaml file in the flow's root. The following YAML attributes are supported:
kind
Required The workflow kind, must be live for live workflows.
Expression contexts: This attribute cannot contain expressions.
id
Optional Identifier of the workflow. By default, the id is live. It's available as ${{ flow.flow_id }} in expressions.
Note: Don't confuse this with ${{ flow.project_id }}, which is defined in the project configuration file.
Expression contexts: This attribute only allows expressions that don't access contexts.
title
Optional Workflow title, any valid string is allowed. It's accessible via ${{ flow.title }}. If this is not set manually, the default workflow title live will be used.
Expression contexts: This attribute only allows expressions that don't access contexts.
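A minimal sketch of the three top-level attributes described so far. The id and title values are hypothetical:

```yaml
kind: live
id: my_live_flow         # hypothetical identifier; "live" is used when omitted
title: My live workflow  # hypothetical title
```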
defaults
Optional section A map of default settings that will apply to all jobs in the workflow. You can override these global default settings for specific jobs.
defaults.env
When two or more environment variables are defined with the same name, apolo-flow uses the most specific environment variable. For example, an environment variable defined in a job will override the workflow's default.
Example:
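A minimal sketch, with hypothetical variable names and values:

```yaml
defaults:
  env:
    ENV1: val1  # hypothetical variable
    ENV2: val2  # hypothetical variable
```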
This attribute also supports dictionaries as values:
defaults.life_span
A float number representing the amount of seconds (3600 represents an hour)
A string of the following format: 1d6h15m (1 day, 6 hours, 15 minutes)
For lifespan-disabling emulation, use an arbitrarily large value (e.g. 365d). Keep in mind that this may be dangerous, as a forgotten job will consume cluster resources.
A lifespan shorter than 1 minute is forbidden.
Example:
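A minimal sketch using the string format described above. The 14-day value is an arbitrary illustration:

```yaml
defaults:
  life_span: 14d  # 14 days
```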
defaults.preset
Example:
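A minimal sketch. The preset name gpu-small is hypothetical, since the available presets depend on the cluster:

```yaml
defaults:
  preset: gpu-small  # hypothetical preset name
```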
defaults.volumes
Volumes that will be mounted to all jobs by default.
Example:
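A minimal sketch, assuming volume references are written as `storage:<remote-path>:<mount-path>` strings. The paths are hypothetical:

```yaml
defaults:
  volumes:
    - storage:some/dir:/mnt/some/dir                  # hypothetical paths
    - storage:some/another/dir:/mnt/some/another/dir
```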
Default volumes are not passed to actions.
defaults.schedule_timeout
The attribute accepts the following values:
A float number representing the amount of seconds (3600 represents an hour)
A string of the following format: 1d6h15m45s (1 day, 6 hours, 15 minutes, 45 seconds)
The cluster-wide timeout is used if both defaults.schedule_timeout and jobs.<job-id>.schedule_timeout are omitted.
Example:
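A minimal sketch using the string format described above. The one-day value is an arbitrary illustration:

```yaml
defaults:
  schedule_timeout: 1d  # wait up to 1 day for resources before failing
```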
defaults.tags
Example:
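A minimal sketch with hypothetical tag names:

```yaml
defaults:
  tags: [tag-a, tag-b]  # hypothetical tags
```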
This attribute supports lists as values.
defaults.workdir
Example:
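A minimal sketch with a hypothetical directory:

```yaml
defaults:
  workdir: /users/my_user  # hypothetical path inside the job
```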
images
Optional section A mapping of image definitions used by the live workflow.
images.<image-id>
The key image-id is a string and its value is a map of the image's configuration data. You must replace <image-id> with a string that is unique to the images object. <image-id> must start with a letter and contain only alphanumeric characters or underscore symbols _. Dashes - are not allowed.
images.<image-id>.ref
Example of self-hosted image:
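A minimal sketch, using the image: prefix for images on the Apolo Registry. The image name and tag are hypothetical:

```yaml
images:
  my_image:
    ref: image:my_image:latest  # hypothetical name and tag
```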
Example of external image:
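A minimal sketch referencing a public image. The image ID and tag are illustrative:

```yaml
images:
  python:
    ref: python:3.9.0  # external image; cannot be built by apolo-flow
```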
Example of auto-calculated stable hash:
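A possible form, assuming the expression function is named hash_files and accepts file patterns. Both the function name and the patterns are assumptions here:

```yaml
images:
  my_image:
    # Assumed function name; the tag would change whenever the listed files change.
    ref: image:my_image:${{ hash_files('Dockerfile', 'requirements/*.txt') }}
```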
images.<image-id>.context
Optional The Docker context used to build an image, a local path relative to the flow's root folder. The context should contain the Dockerfile and any additional files and folders that should be copied to the image.
Example:
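A minimal sketch with a hypothetical path relative to the flow's root:

```yaml
images:
  my_image:
    context: path/to/context  # hypothetical folder containing the Dockerfile
```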
apolo-flow cannot build images without the context.
images.<image-id>.dockerfile
Optional The name of the Dockerfile used to build the image. If not set, the name Dockerfile is used by default.
Example:
images.<image-id>.build_preset
Optional A name of the resource preset used to build the docker image. Consider using it if, for instance, a GPU is required to build dependencies within the image.
Example:
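A minimal sketch. The preset name is hypothetical and depends on the cluster:

```yaml
images:
  my_image:
    build_preset: gpu-small  # hypothetical preset name
```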
images.<image-id>.build_args
Example:
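A minimal sketch, assuming each build argument is written as a NAME=value string. The names and values are hypothetical:

```yaml
images:
  my_image:
    build_args:
      - ARG1=val1  # hypothetical build argument
      - ARG2=val2
```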
images.<image-id>.env
A mapping of environment variables passed to the image builder.
Example:
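A minimal sketch with hypothetical variable names:

```yaml
images:
  my_image:
    env:
      ENV1: val1  # hypothetical variable passed to the image builder
      ENV2: val2
```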
This attribute also supports dictionaries as values:
images.<image-id>.volumes
A list of volume references mounted to the image building process.
Example:
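A minimal sketch, assuming the `storage:<remote-path>:<mount-path>[:ro]` volume reference format. The paths are hypothetical:

```yaml
images:
  my_image:
    volumes:
      - storage:folder1:/mnt/folder1:ro  # assumed read-only suffix
      - storage:folder2:/mnt/folder2
```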
This attribute also supports lists as values:
volumes
Optional section A mapping of volume definitions available in the live workflow. A volume defines a link between the Apolo storage folder, a remote folder that can be mounted to a live job, and a local folder.
volumes.<volume-id>
The key volume-id is a string and its value is a map of the volume's configuration data. You must replace <volume-id> with a string that is unique to the volumes object. The <volume-id> must start with a letter and contain only alphanumeric characters or underscore symbols _. Dashes - are not allowed.
volumes.<volume-id>.remote
Example:
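A minimal fragment with a hypothetical volume ID and storage path:

```yaml
volumes:
  folder:
    remote: storage:path/to/dir  # hypothetical storage path
```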
volumes.<volume-id>.mount
Required The mount path inside a job.
Example:
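A minimal fragment with a hypothetical volume ID and mount point:

```yaml
volumes:
  folder:
    mount: /mnt/folder  # hypothetical path inside the job
```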
volumes.<volume-id>.local
Optional Volumes can also be associated with folders on a local machine. A local path should be relative to the flow's root and will be used for uploading/downloading content to the storage.
Volumes without a set local attribute cannot be used by the apolo-flow upload and apolo-flow download commands.
Example:
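A minimal fragment with a hypothetical volume ID and a local path relative to the flow's root:

```yaml
volumes:
  folder:
    local: path/to/dir  # hypothetical local folder
```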
apolo-flow upload and apolo-flow download will not work for volumes whose remote is the Apolo Disk due to specifics of how disks work.
volumes.<volume-id>.read_only
Optional If this attribute is set, the volume is mounted as read-only by default; otherwise, read-write mode is used.
Example:
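A minimal fragment with a hypothetical volume ID:

```yaml
volumes:
  folder:
    read_only: true
```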
jobs
jobs.<job-id>
Each job must have an associated ID. The key job-id is a string and its value is a map of the job's configuration data or action call. You must replace <job-id> with a string that is unique to the jobs object. The <job-id> must start with a letter and contain only alphanumeric characters or underscore symbols _. Dashes - are not allowed.
The attributes described in this section can be applied both to plain jobs and action calls. To simplify reading, this section uses the term "job" instead of "job or action call".
jobs.<job-id>.params
Params is a mapping of key-value pairs that have default values and can be overridden from the command line by using apolo-flow run <job-id> --param name1 val1 --param name2 val2.
This attribute describes a set of names and default values of parameters accepted by a job.
Parameters can be specified in short and long forms.
The short form is compact, but only allows specifying the parameter's name and default value:
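A minimal sketch of the short form. The job ID and parameter names are hypothetical:

```yaml
jobs:
  my_job:
    params:
      name1: default1
      name2: ~   # null by default
      name3: ""  # empty string by default
```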
The long form also allows specifying parameter descriptions. This can be useful for apolo-flow run command introspection, shell autocompletion, and generation of more detailed error messages.
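A possible long form, assuming the per-parameter keys are named default and descr. These key names are an assumption, as are the job ID and parameter names:

```yaml
jobs:
  my_job:
    params:
      name1:
        default: default1            # assumed key for the default value
        descr: The name1 description # assumed key for the description
      name2:
        default: ~
        descr: The name2 description
```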
Example:
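A minimal sketch of parameters used inside another attribute, assuming they are exposed to expressions as ${{ params.<name> }}. The job ID, parameter names, and command are hypothetical:

```yaml
jobs:
  my_job:
    params:
      folder: "."   # search in the current folder by default
      pattern: "*"  # match all files by default
    cmd: find ${{ params.folder }} -name "${{ params.pattern }}"
```

With this definition, the defaults could be overridden with `apolo-flow run my_job --param folder /data --param pattern '*.py'`.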
Expression contexts: This attribute only allows expressions that don't access contexts.
The attributes described in this section are only applicable to plain jobs that are executed by running docker images on the Apolo platform.
jobs.<job-id>.image
Example with a constant image string:
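A minimal sketch with a hypothetical job ID and an Apolo Registry image:

```yaml
jobs:
  my_job:
    image: image:my_image:v2.3
```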
Example with a reference to the images section:
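A minimal sketch, assuming an image with the hypothetical ID my_image is defined in the images section:

```yaml
jobs:
  my_job:
    image: ${{ images.my_image.ref }}
```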
jobs.<job-id>.cmd
Example:
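A minimal sketch with a hypothetical job ID and an illustrative command:

```yaml
jobs:
  my_job:
    cmd: echo "Hello from the job"  # illustrative command
```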
jobs.<job-id>.bash
Optional This attribute contains a bash script to run.
Using cmd to run bash scripts can be tedious: you need to apply quotes to the executed script and set proper bash flags to fail on error.
The bash attribute is essentially a shortcut for cmd: bash -euo pipefail -c <shell_quoted_attr>.
This form is especially handy for executing complex multi-line bash scripts.
cmd, bash, and python are mutually exclusive.
bash should be pre-installed on the image to make this attribute work.
Example:
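A minimal sketch of a multi-line script, using a YAML block scalar. The job ID and script are illustrative:

```yaml
jobs:
  my_job:
    bash: |
      for arg in 1 2 3
      do
        echo "Step ${arg}"
        sleep 1
      done
```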
jobs.<job-id>.python
This attribute contains a python script to run.
Python is usually considered to be one of the best languages for scientific calculation. If you prefer writing simple inlined commands in python instead of bash, this notation is great for you.
The python attribute is essentially a shortcut for cmd: python3 -uc <shell_quoted_attr>.
The cmd, bash, and python attributes are mutually exclusive.
python3 should be pre-installed on the image to make this attribute work.
Example:
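A minimal sketch of an inline script. The job ID and script are illustrative:

```yaml
jobs:
  my_job:
    python: |
      import sys
      print("The Python version is", sys.version)
```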
jobs.<job-id>.browse
Optional Open a job's Http URL in a browser after the job startup. false by default.
Use this attribute in scenarios like starting a Jupyter Notebook job and opening the notebook session in a browser.
Example:
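A minimal sketch with a hypothetical job ID:

```yaml
jobs:
  my_job:
    browse: true
```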
jobs.<job-id>.detach
Optional By default, apolo-flow run <job-id> keeps the terminal attached to the spawned job. This can help with viewing the job's logs and running commands in its embedded bash session.
Enable the detach attribute to disable this behavior.
Example:
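A minimal sketch with a hypothetical job ID:

```yaml
jobs:
  my_job:
    detach: true
```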
jobs.<job-id>.entrypoint
Example:
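A minimal sketch with a hypothetical job ID and an illustrative executable string:

```yaml
jobs:
  my_job:
    entrypoint: sh -c "echo $HOME"  # illustrative entrypoint
```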
jobs.<job-id>.env
Example:
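A minimal sketch with a hypothetical job ID and variable names:

```yaml
jobs:
  my_job:
    env:
      ENV1: val1  # hypothetical variable
      ENV2: val2
```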
This attribute also supports dictionaries as values:
jobs.<job-id>.http_auth
Optional Control whether the HTTP port exposed by the job requires the Apolo Platform authentication for access.
You may want to disable the authentication to allow everybody to access your job's exposed web resource.
By default, jobs have HTTP protection enabled.
Example:
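A minimal sketch that disables the authentication for a hypothetical job:

```yaml
jobs:
  my_job:
    http_auth: false  # anyone with the URL can access the exposed port
```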
jobs.<job-id>.http_port
Optional The job's HTTP port number that will be exposed globally on the platform.
By default, the Apolo Platform exposes the job's internal 80 port as an HTTPS-protected resource. This will be listed in the output of the apolo-flow status <job-id> command as Http URL.
You may want to expose a different local port. Use 0 to disable the feature entirely.
Example:
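A minimal sketch with a hypothetical job ID and an arbitrary port:

```yaml
jobs:
  my_job:
    http_port: 8080  # expose port 8080 instead of the default 80
```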
Only HTTP traffic is allowed. The platform will automatically encapsulate it into TLS to provide an HTTPS connection.
jobs.<job-id>.life_span
Optional The time period after which a job will be automatically killed.
By default, jobs live 1 day. You may want to change this period by customizing the attribute.
The value can be:
A float number representing an amount of seconds (3600 for an hour)
An expression in the following format: 1d6h15m (1 day, 6 hours, 15 minutes)
Use an arbitrarily large value (e.g. 365d) for lifespan-disabling emulation. Keep in mind that this can be dangerous, as a forgotten job will consume cluster resources.
Example:
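A minimal sketch with a hypothetical job ID and an arbitrary lifespan:

```yaml
jobs:
  my_job:
    life_span: 1d6h  # 1 day and 6 hours
```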
jobs.<job-id>.name
Optional Allows you to specify a job's name. This name becomes a part of the job's internal hostname and exposed HTTP URL, and the job can then be controlled by its name through the low-level apolo tool.
The name is completely optional.
Example:
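A minimal sketch with a hypothetical job ID and name:

```yaml
jobs:
  my_job:
    name: my-name  # hypothetical job name
```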
If the name is not specified in the name attribute, the default name for the job will be automatically generated as follows:
The [-<MULTI_SUFFIX>] part makes sure that a job will have a unique name even if it's a multi job.
jobs.<job-id>.multi
Optional By default, a job can only have one running instance at a time. Calling apolo-flow run <job-id> for the same job ID will attach to the already running job instead of creating a new one. This can be overridden by enabling the multi attribute.
Example:
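A minimal sketch with a hypothetical job ID:

```yaml
jobs:
  my_job:
    multi: true  # each run creates a new job instance
```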
Expression contexts: This attribute only allows expressions that don't access contexts.
jobs.<job-id>.pass_config
Optional Set this attribute to true if you want to pass the Apolo config used to execute the apolo-flow run ... command into the spawned job. This can be useful if you're using a job image with Apolo CLI installed and want to execute apolo ... commands inside the running job.
By default, the config is not passed.
Example:
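A minimal sketch with a hypothetical job ID:

```yaml
jobs:
  my_job:
    pass_config: true
```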
The lifetime of passed credentials is bound to the job's lifetime. It will be impossible to use them when the job is terminated.
jobs.<job-id>.restart
Optional Control the job behavior when the main process exits.
Possible values: never (default), on-failure, and always.
Set this attribute to on-failure if you want your job to be restarted when the main process exits with a non-zero exit code. If you set this attribute to always, the job will be restarted even if the main process exits with 0. In this case, you will need to terminate the job manually, or it will be automatically terminated when its lifespan ends. never means the platform does not restart the job; this value is used by default.
Example:
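A minimal sketch with a hypothetical job ID, using one of the values listed above:

```yaml
jobs:
  my_job:
    restart: on-failure  # restart only on a non-zero exit code
```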
jobs.<job-id>.port-forward
Optional You can define a list of TCP tunnels for the job.
Each port forward entry is a string in the <LOCAL_PORT>:<REMOTE_PORT> format.
When the job starts, all enumerated remote ports on the job's side are bound to the developer's box and are available under the corresponding local port numbers.
You can use this feature for remote debugging, accessing a database running in the job, etc.
Example:
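A possible form. The key is written here as port_forward, following the underscore convention of the other attributes; this spelling is an assumption, since the heading uses a dash. The job ID and ports are hypothetical, and the entries are quoted so that YAML keeps them as strings:

```yaml
jobs:
  my_job:
    port_forward:    # assumed key spelling
      - "6379:6379"  # <LOCAL_PORT>:<REMOTE_PORT>
      - "9200:9200"
```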
This attribute also supports lists as values:
jobs.<job-id>.preset
jobs.<job-id>.schedule_timeout
Optional Use this attribute if you want to increase the schedule timeout. This will prevent a job from failing if the Apolo cluster is under high load and requested resources are likely to not be available at the moment.
If the Apolo cluster has no resources to launch a job immediately, this job is pushed into the waiting queue. If the job is still not started at the end of the schedule timeout, it fails.
The default system-wide schedule timeout is controlled by the cluster administrator and is usually about 5-10 minutes.
The value of this attribute can be:
A float number representing an amount of seconds
A string in the following format: 1d6h15m45s (1 day, 6 hours, 15 minutes, 45 seconds)
Example:
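A minimal sketch with a hypothetical job ID and an arbitrary timeout:

```yaml
jobs:
  my_job:
    schedule_timeout: 1d  # wait up to 1 day for resources before failing
```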
jobs.<job-id>.tags
Optional A list of additional job tags.
Example:
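A minimal sketch with a hypothetical job ID and tag names:

```yaml
jobs:
  my_job:
    tags:
      - tag-a  # hypothetical tag
      - tag-b
```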
This attribute also supports lists as values:
jobs.<job-id>.title
Optional A job's title. Equal to <job-id> by default if not overridden.
jobs.<job-id>.volumes
Optional A list of job volumes. You can specify a plain string for a volume reference or use the ${{ volumes.<volume-id>.ref }} expression.
Example:
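A minimal sketch that combines both forms, assuming the `storage:<remote-path>:<mount-path>` string format and a volume with the hypothetical ID my_volume defined in the volumes section:

```yaml
jobs:
  my_job:
    volumes:
      - storage:path/to:/mnt/path/to   # plain string reference (assumed format)
      - ${{ volumes.my_volume.ref }}   # reference to a defined volume
```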
This attribute also supports lists as values:
jobs.<job-id>.workdir
Optional A working directory to use inside the job.
Example:
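A minimal sketch with a hypothetical job ID and directory:

```yaml
jobs:
  my_job:
    workdir: /users/my_user  # hypothetical path inside the job
```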
jobs.<job-id>.action
Required A URL that selects an action to run. It supports two schemes:
workspace: or ws: for action files that are stored locally
github: or gh: for actions that are bound to a GitHub repository
The ws: scheme expects a valid filesystem path to the action file.
The gh: scheme expects the following format: {owner}/{repo}@{tag}. Here, {owner} is the owner of the GitHub repository, {repo} is the repository's name, and {tag} is the commit tag. Commit tags are used to allow versioning of actions.
Example of the ws: scheme:
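A minimal sketch with a hypothetical job ID and action file path:

```yaml
jobs:
  my_job:
    action: ws:path/to/file.yml  # hypothetical path to a local action file
```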
Example of the gh: scheme:
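A minimal sketch following the {owner}/{repo}@{tag} format. The owner, repository, and tag are hypothetical:

```yaml
jobs:
  my_job:
    action: gh:username/repository@v1  # hypothetical repository and tag
```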
Expression contexts: This attribute only allows expressions that don't access contexts.
jobs.<job-id>.args
Example:
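A minimal sketch, assuming the called action declares arguments named param1 and param2. Both names are hypothetical:

```yaml
jobs:
  my_job:
    args:
      param1: value1               # a constant string
      param2: ${{ flow.flow_id }}  # or the result of an expression
```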
defaults.env: A mapping of environment variables that will be set in all jobs of the workflow. You can also set environment variables that are only available to specific jobs. For more information, see jobs.<job-id>.env.
Expression contexts: flow context.
defaults.life_span: The default lifespan for jobs run by the workflow. It can be overridden by jobs.<job-id>.life_span. If not set manually, the default job lifespan is 1 day. The accepted lifespan values are listed under the defaults.life_span heading.
Expression contexts: flow context.
defaults.preset: The default preset used by all jobs if not overridden by jobs.<job-id>.preset. The system-wide default preset is used if both defaults.preset and jobs.<job-id>.preset are omitted.
Expression contexts: flow context.
defaults.schedule_timeout: The default timeout for job scheduling. See jobs.<job-id>.schedule_timeout for more information.
Expression contexts: flow context.
defaults.tags: A list of tags that are added to every job created by the workflow. A specific job's definition can extend this global list by using jobs.<job-id>.tags.
Expression contexts: flow context.
defaults.workdir: The default working directory for jobs created by this workflow. See jobs.<job-id>.workdir for more information.
Expression contexts: flow context.
images.<image-id>: apolo-flow build <image-id> creates an image from the passed Dockerfile and uploads it to the Apolo Registry. The ${{ images.img_id.ref }} expression can be used for pointing to an image from a jobs.<job-id>.image.
images.<image-id>.ref: Required Image reference that can be used in the jobs.<job-id>.image expression.
You can use the image definition to address images hosted on Docker Hub as an external source (while you can't use apolo-flow to build this image). All other attributes except for ref don't work for external images.
Use the embedded hash_files() function to generate the built image's tag based on its content.
Expression contexts: flow context.
Expression contexts (images.<image-id>.context, images.<image-id>.dockerfile, images.<image-id>.build_preset): flow context.
images.<image-id>.build_args: A list of optional build arguments passed to the image builder. See the Docker build documentation for details.
Expression contexts: flow context.
Expression contexts (images.<image-id>.env, images.<image-id>.volumes): flow context.
volumes.<volume-id>: Volumes can be synchronized between local and storage versions with the apolo-flow upload and apolo-flow download commands, and they can be mounted to a job by using the jobs.<job-id>.volumes attribute.
volumes.<volume-id>.remote: Required The volume URI on the Apolo Storage ('storage:path/on/storage') or Apolo Disk ('disk:').
Expression contexts: flow context.
Expression contexts (volumes.<volume-id>.mount, volumes.<volume-id>.local, volumes.<volume-id>.read_only): flow context.
jobs: A live workflow can run jobs by their identifiers using the apolo-flow run <job-id> command. Each job runs remotely on the Apolo Platform. Jobs can be defined in two different ways: (1) directly in this file, or (2) in a separate file and called as an action.
jobs.<job-id>.params: The parameters can be used in expressions for calculating other job attributes, e.g. jobs.<job-id>.cmd.
jobs.<job-id>.image: Required Each job is executed remotely on the Apolo cluster using a Docker image. This image can be hosted on Docker Hub (python:3.9 or ubuntu:20.04) or on the Apolo Registry (image:my_image:v2.3). If the image is hosted on the Apolo Registry, the image name must start with the image: prefix.
You may often want to use the reference to images.<image-id>.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.cmd: Optional A job executes either a command, a bash script, or a python script. The cmd, bash, and python attributes are mutually exclusive: only one of the three is allowed at the same time. If none of these three attributes are specified, the CMD from the image is used.
The cmd attribute points to the command with optional arguments that is available in the executed jobs.<job-id>.image.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
Expression contexts (jobs.<job-id>.bash, jobs.<job-id>.python, jobs.<job-id>.browse, jobs.<job-id>.detach): flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.entrypoint: Optional You can override a Docker image ENTRYPOINT if needed or set it if one wasn't already specified. Unlike the Docker ENTRYPOINT instruction, which has a shell and exec form, the entrypoint attribute only accepts a single string defining an executable to be run.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.env: Optional Set environment variables for <job-id> to use in the executed job. You can also set environment variables for the entire workflow. For more information, see defaults.env.
When two or more environment variables are defined with the same name, apolo-flow uses the most specific environment variable. For example, an environment variable defined in a job will override the workflow's default.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
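Expression contexts (jobs.<job-id>.http_auth, jobs.<job-id>.http_port): flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).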
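jobs.<job-id>.life_span: The defaults.life_span value is used if the attribute is not set.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).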
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
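jobs.<job-id>.preset: Optional The preset to execute the job with. defaults.preset is used if the preset is not specified for the job.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.schedule_timeout: See defaults.schedule_timeout if you want to set a workflow-wide schedule timeout for all jobs.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.tags: Each live job is tagged. A job's tags are taken from this attribute, defaults.tags, and system tags (project:<project-id> and job:<job-id>).
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).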
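Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).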
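jobs.<job-id>.workdir: This attribute takes precedence if specified. Otherwise, defaults.workdir takes priority. If none of the previous are specified, a WORKDIR definition from the image is used.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
The attributes described in this section are only applicable to action calls. An action is a reusable part that can be integrated into a workflow. Refer to the actions reference to learn more about actions.
jobs.<job-id>.args: Optional Mapping of values that will be passed to the actions as arguments. This should correspond to inputs defined in the action file. Each value should be a string.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).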