Live workflow syntax
Live workflow
The live workflow is always located at the .apolo/live.yml or .apolo/live.yaml file in the flow's root. The following YAML attributes are supported:
kind
Required The workflow kind, must be live for live workflows.
Expression contexts: This attribute cannot contain expressions.
id
Optional Identifier of the workflow. By default, the id is live. It's available as ${{ flow.flow_id }} in expressions.
Note: Don't confuse this with ${{ flow.project_id }}, which is defined in the project configuration file.
Expression contexts: This attribute only allows expressions that don't access contexts.
title
Optional Workflow title, any valid string is allowed. It's accessible via ${{ flow.title }}. If this is not set manually, the default workflow title live will be used.
Expression contexts: This attribute only allows expressions that don't access contexts.
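A minimal sketch of the top-level attributes described so far. The id and title values are hypothetical and chosen only for illustration:

```yaml
kind: live              # required, must be "live" for live workflows
id: my_workflow         # optional, hypothetical identifier
title: My live workflow # optional, hypothetical title
```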
defaults
Optional section A map of default settings that will apply to all jobs in the workflow. You can override these global default settings for specific jobs.
defaults.env
A mapping of environment variables that will be set in all jobs of the workflow. You can also set environment variables that are only available to specific jobs. For more information, see jobs.<job-id>.env.
When two or more environment variables are defined with the same name, apolo-flow uses the most specific environment variable. For example, an environment variable defined in a job will override the workflow's default.
Example:
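A minimal sketch of a workflow-wide environment mapping. The variable names and values are hypothetical:

```yaml
defaults:
  env:
    ENV1: val1   # hypothetical variable, set in every job
    ENV2: val2   # a job-level env entry with the same name would override it
```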
This attribute also supports dictionaries as values:
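A sketch of the dictionary form, assuming the expression syntax accepts a dictionary literal. The keys and values are hypothetical, and the expression is quoted so that the colons inside it do not confuse the YAML parser:

```yaml
defaults:
  # The whole mapping is produced by a single expression (assumed syntax)
  env: "${{ {'global_a': 'val-a', 'global_b': 'val-b'} }}"
```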
Expression contexts: flow context.
defaults.life_span
The default lifespan for jobs run by the workflow. It can be overridden by jobs.<job-id>.life_span. If not set manually, the default job lifespan is 1 day. The lifespan value can be one of the following:
A float number representing the amount of seconds (3600 represents an hour)
A string of the following format: 1d6h15m (1 day, 6 hours, 15 minutes)
To emulate a disabled lifespan, use an arbitrarily large value (e.g. 365d). Keep in mind that this may be dangerous, as a forgotten job will consume cluster resources.
A lifespan shorter than 1 minute is forbidden.
Example:
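A minimal sketch using the string format described above. The 14-day value is an arbitrary illustration:

```yaml
defaults:
  life_span: 14d   # every job lives up to 14 days unless it sets its own life_span
```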
Expression contexts: flow context.
defaults.preset
The default preset used by all jobs if not overridden by jobs.<job-id>.preset. The system-wide default preset is used if both defaults.preset and jobs.<job-id>.preset are omitted.
Example:
defaults.volumes
Volumes that will be mounted to all jobs by default.
Example:
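A minimal sketch, assuming each entry is a volume reference string of the form remote:mount-path. The storage paths and mount points are hypothetical:

```yaml
defaults:
  volumes:
    # <remote URI>:<mount path inside the job>
    - storage:some/dir:/mnt/some/dir
    - storage:some/another/dir:/mnt/some/another/dir
```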
Default volumes are not passed to actions.
Expression contexts: flow context.
defaults.schedule_timeout
The default timeout for job scheduling. See jobs.<job-id>.schedule_timeout for more information.
The attribute accepts the following values:
A float number representing the amount of seconds (3600 represents an hour)
A string of the following format: 1d6h15m45s (1 day, 6 hours, 15 minutes, 45 seconds)
The cluster-wide timeout is used if both defaults.schedule_timeout and jobs.<job-id>.schedule_timeout are omitted.
Example:
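A minimal sketch using the string format described above. The one-day value is an arbitrary illustration:

```yaml
defaults:
  schedule_timeout: 1d   # jobs may wait in the queue for up to 1 day before failing
```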
Expression contexts: flow context.
defaults.tags
A list of tags that are added to every job created by the workflow. A specific job's definition can extend this global list by using jobs.<job-id>.tags.
Example:
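A minimal sketch with hypothetical tag names:

```yaml
defaults:
  tags: [tag-a, tag-b]   # added to every job created by the workflow
```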
This attribute supports lists as values.
Expression contexts: flow context.
defaults.workdir
The default working directory for jobs created by this workflow. See jobs.<job-id>.workdir for more information.
Example:
Expression contexts: flow context.
images
Optional section A mapping of image definitions used by the live workflow.
apolo-flow build <image-id> creates an image from the passed Dockerfile and uploads it to the Apolo Registry. The ${{ images.img_id.ref }} expression can be used to point to an image from jobs.<job-id>.image.
The images section is not required. A job can specify the image name as a plain string without referring to the ${{ images.my_image.ref }} context.
However, this section exists for convenience: there is no need to repeat yourself if you can reference the image by name everywhere in the YAML.
images.<image-id>
The key image-id is a string and its value is a map of the image's configuration data. You must replace <image-id> with a string that is unique to the images object. <image-id> must start with a letter and contain only alphanumeric characters or underscore symbols (_). Dashes (-) are not allowed.
images.<image-id>.ref
Required Image reference that can be used in the jobs.<job-id>.image expression.
Example of self-hosted image:
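A minimal sketch of an image hosted on the Apolo Registry. The image id, name, and tag are hypothetical:

```yaml
images:
  my_image:
    ref: image:my_image:latest   # the image: prefix points to the Apolo Registry
```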
You can use the image definition to address images hosted on Docker Hub as an external source (although you can't use apolo-flow to build this image). All attributes other than ref don't work for external images.
Example of external image:
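A minimal sketch of an external image definition. The image id is hypothetical, and python:3.9 is one of the Docker Hub images mentioned elsewhere in this reference:

```yaml
images:
  python:
    ref: python:3.9   # hosted on Docker Hub, cannot be built with apolo-flow
```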
Use the embedded hash_files() function to generate the built image's tag based on its content.
Example of auto-calculated stable hash:
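A sketch of a content-based tag, assuming hash_files() accepts one or more file patterns relative to the flow's root. The file names and patterns are hypothetical:

```yaml
images:
  my_image:
    # The tag changes whenever one of the listed files changes (assumed behavior)
    ref: image:my_image:${{ hash_files('Dockerfile', 'requirements/*.txt') }}
```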
Expression contexts: flow context.
images.<image-id>.context
Optional The Docker context used to build an image, a local path relative to the flow's root folder. The context should contain the Dockerfile and any additional files and folders that should be copied to the image.
Example:
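A minimal sketch with a hypothetical context path:

```yaml
images:
  my_image:
    ref: image:my_image:latest
    context: path/to/context   # relative to the flow's root folder
```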
The flow's root folder is the folder that contains the .apolo directory. Its path can be referenced via ${{ flow.workspace }}/.
apolo-flow cannot build images without the context.
Expression contexts: flow context.
images.<image-id>.dockerfile
Optional The name of the Dockerfile used to build the image. If not set, the name Dockerfile is used by default.
Example:
Expression contexts: flow context.
images.<image-id>.build_preset
Optional A name of the resource preset used to build the Docker image. Consider using it if, for instance, a GPU is required to build dependencies within the image.
Example:
Expression contexts: flow context.
images.<image-id>.build_args
A list of optional build arguments passed to the image builder. See Docker documentation for details.
Example:
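A minimal sketch, assuming each entry uses the NAME=value form of a Docker build argument. The argument names and values are hypothetical:

```yaml
images:
  my_image:
    ref: image:my_image:latest
    context: path/to/context
    build_args:
      - ARG1=val1   # hypothetical build argument
      - ARG2=val2
```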
Expression contexts: flow context.
images.<image-id>.env
A mapping of environment variables passed to the image builder.
Example:
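A minimal sketch with hypothetical variable names and values:

```yaml
images:
  my_image:
    ref: image:my_image:latest
    context: path/to/context
    env:
      ENV1: val1   # available to the image builder
      ENV2: val2
```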
This attribute also supports dictionaries as values:
You can also map platform secrets as the values of environment variables and later utilize them when building an image.
Let's assume you have a secret:github_password which gives you access to a needed private repository. In this case, map it as an environment variable GH_PASS: secret:github_password into the builder job and pass it further as --build-arg GH_PASS=$GH_PASS while building the container.
Expression contexts: flow context.
images.<image-id>.volumes
A list of volume references mounted to the image building process.
Example:
This attribute also supports lists as values:
You can also map platform secrets as files and later utilize them when building an image.
Let's assume you have a secret:aws_account_credentials file which gives you access to an S3 bucket needed during building. In this case, attach it as a volume - secret:aws_account_credentials:/kaniko_context/aws_account_credentials into the builder job. A file with credentials will appear in the root of the build context, since the build context is mounted in the /kaniko_context folder within the builder job.
Expression contexts: flow context.
volumes
Optional section A mapping of volume definitions available in the live workflow. A volume defines a link between the Apolo storage folder, a remote folder that can be mounted to a live job, and a local folder.
Volumes can be synchronized between local and storage versions with the apolo-flow upload and apolo-flow download commands, and they can be mounted to a job by using the jobs.<job-id>.volumes attribute.
The volumes section is optional. A job can mount a volume by a direct reference string.
However, this section is very handy in combination with the run, upload, and download commands: define a volume once and refer to it everywhere by name instead of using full definition details.
volumes.<volume-id>
The key volume-id is a string and its value is a map of the volume's configuration data. You must replace <volume-id> with a string that is unique to the volumes object. The <volume-id> must start with a letter and contain only alphanumeric characters or underscore symbols (_). Dashes (-) are not allowed.
volumes.<volume-id>.remote
Required The volume URI on the Apolo Storage ('storage:path/on/storage') or Apolo Disk ('disk:').
Example:
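A minimal sketch of a storage-backed volume. The volume id and storage path are hypothetical:

```yaml
volumes:
  folder:
    remote: storage:path/to/folder   # URI on the Apolo Storage
```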
Expression contexts: flow context.
volumes.<volume-id>.mount
Required The mount path inside a job.
Example:
Expression contexts: flow context.
volumes.<volume-id>.local
Optional Volumes can also be associated with folders on a local machine. A local path should be relative to the flow's root and will be used for uploading/downloading content to the storage.
Volumes without a set local attribute cannot be used by the apolo-flow upload and apolo-flow download commands.
Example:
apolo-flow upload and apolo-flow download will not work for volumes whose remote is the Apolo Disk due to specifics of how disks work.
Expression contexts: flow context.
volumes.<volume-id>.read_only
Optional If this attribute is set, the volume is mounted as read-only by default; otherwise, read-write mode is used.
Example:
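A minimal sketch of a read-only volume that combines the attributes described in this section. The volume id and all paths are hypothetical:

```yaml
volumes:
  my_volume:
    remote: storage:path/to/folder   # where the data lives on the Apolo Storage
    mount: /mnt/folder               # where it appears inside a job
    local: path/to/folder            # local folder used by upload/download
    read_only: true                  # jobs cannot write to the mounted folder
```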
Expression contexts: flow context.
jobs
A live workflow can run jobs by their identifiers using the apolo-flow run <job-id> command. Each job runs remotely on the Apolo Platform. Jobs can be defined in two different ways: (1) directly in this file or (2) in a separate file and called as an action.
jobs.<job-id>
Each job must have an associated ID. The key job-id is a string and its value is a map of the job's configuration data or action call. You must replace <job-id> with a string that is unique to the jobs object. The <job-id> must start with a letter and contain only alphanumeric characters or underscore symbols (_). Dashes (-) are not allowed.
Attributes for jobs and action calls
The attributes described in this section can be applied both to plain jobs and action calls. To simplify reading, this section uses the term "job" instead of "job or action call".
jobs.<job-id>.params
Params is a mapping of key-value pairs that have default values and can be overridden from the command line by using apolo-flow run <job-id> --param name1 val1 --param name2 val2.
This attribute describes a set of names and default values of parameters accepted by a job.
Parameters can be specified in short and long forms.
The short form is compact, but only allows specifying the parameter's name and default value:
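A minimal sketch of the short form. The parameter names and default values are hypothetical:

```yaml
jobs:
  my_job:
    params:
      name1: default1   # parameter name mapped directly to its default value
      name2: ""         # an empty string as the default
```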
The long form also allows specifying parameter descriptions. This can be useful for apolo-flow run command introspection, shell autocompletion, and generation of more detailed error messages.
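A sketch of the long form, assuming the nested keys are named default and descr. These key names are an assumption and should be checked against your apolo-flow version. The parameter name and texts are hypothetical:

```yaml
jobs:
  my_job:
    params:
      name1:
        default: default1                # assumed key for the default value
        descr: The name1 description     # assumed key for the description
```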
The parameters can be used in expressions for calculating other job attributes, e.g. jobs.<job-id>.cmd.
Example:
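A minimal sketch that uses parameters to build a command through the params context. The job id, parameter names, and defaults are hypothetical:

```yaml
jobs:
  my_job:
    image: ubuntu:20.04
    params:
      folder: "."     # search in the current folder by default
      pattern: "*"    # match all files by default
    cmd: find ${{ params.folder }} -name ${{ params.pattern }}
```

With this definition, a call such as apolo-flow run my_job --param folder /mnt/data would override only the folder parameter.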
Expression contexts: This attribute only allows expressions that don't access contexts.
Attributes for jobs
The attributes described in this section are only applicable to plain jobs that are executed by running docker images on the Apolo platform.
jobs.<job-id>.image
Required Each job is executed remotely on the Apolo cluster using a Docker image. This image can be hosted on Docker Hub (python:3.9 or ubuntu:20.04) or on the Apolo Registry (image:my_image:v2.3). If the image is hosted on the Apolo Registry, the image name must start with the image: prefix.
Example with a constant image string:
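A minimal sketch using the Apolo Registry image name mentioned above. The job id is hypothetical:

```yaml
jobs:
  my_job:
    image: image:my_image:v2.3   # constant string with the image: prefix
```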
You may often want to use the reference to images.<image-id>.
Example with a reference to the images section:
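A minimal sketch that defines an image once and refers to it from a job. The image id, job id, and paths are hypothetical:

```yaml
images:
  my_image:
    ref: image:my_image:latest
    context: path/to/context

jobs:
  my_job:
    image: ${{ images.my_image.ref }}   # resolves to image:my_image:latest
```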
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.cmd
Optional A job executes either a command, a bash script, or a python script. The cmd, bash, and python attributes are mutually exclusive: only one of the three is allowed at the same time. If none of these three attributes is specified, the CMD from the jobs.<job-id>.image is used.
The cmd attribute points to the command with optional arguments that is available in the executed jobs.<job-id>.image.
Example:
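A minimal sketch with a hypothetical job id and a trivial command:

```yaml
jobs:
  my_job:
    image: ubuntu:20.04
    cmd: echo "Hello from the job"   # any command available in the image
```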
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.bash
Optional This attribute contains a bash script to run.
Using cmd to run bash scripts can be tedious: you need to apply quotes to the executed script and set proper bash flags to make it fail on error.
The bash attribute is essentially a shortcut for cmd: bash -euo pipefail -c <shell_quoted_attr>.
This form is especially handy for executing complex multi-line bash scripts.
cmd, bash, and python are mutually exclusive.
bash should be pre-installed on the image to make this attribute work.
Example:
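A minimal sketch of a multi-line script passed as a YAML block scalar. The job id and the script itself are hypothetical:

```yaml
jobs:
  my_job:
    image: ubuntu:20.04
    bash: |
      # Runs under "bash -euo pipefail", so any failing command stops the script
      for i in 1 2 3
      do
        echo "Step $i"
        sleep 1
      done
```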
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.python
This attribute contains a python script to run.
Python is usually considered to be one of the best languages for scientific calculation. If you prefer writing simple inlined commands in python instead of bash, this notation is great for you.
The python attribute is essentially a shortcut for cmd: python3 -uc <shell_quoted_attr>.
cmd, bash, and python are mutually exclusive.
python3 should be pre-installed on the image to make this attribute work.
Example:
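A minimal sketch of an inlined script. The job id and the script are hypothetical:

```yaml
jobs:
  my_job:
    image: python:3.9
    python: |
      # Executed as "python3 -uc <script>" inside the job
      import sys
      print("The Python version is", sys.version)
```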
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.browse
Optional Open a job's Http URL in a browser after the job startup. false by default.
Use this attribute in scenarios like starting a Jupyter Notebook job and opening the notebook session in a browser.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.detach
Optional By default, apolo-flow run <job-id> keeps the terminal attached to the spawned job. This can help with viewing the job's logs and running commands in its embedded bash session.
Enable the detach attribute to disable this behavior.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.entrypoint
Optional You can override a Docker image ENTRYPOINT if needed or set it if one wasn't already specified. Unlike the Docker ENTRYPOINT instruction which has a shell and exec form, the entrypoint attribute only accepts a single string defining an executable to be run.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.env
Optional Set environment variables for <job-id> to use in the executed job. You can also set environment variables for the entire workflow. For more information, see defaults.env.
When two or more environment variables are defined with the same name, apolo-flow uses the most specific environment variable. For example, an environment variable defined in a job will override the workflow default.
Example:
This attribute also supports dictionaries as values:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.http_auth
Optional Control whether the HTTP port exposed by the job requires the Apolo Platform authentication for access.
You may want to disable the authentication to allow everybody to access your job's exposed web resource.
By default, jobs have HTTP protection enabled.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.http_port
Optional The job's HTTP port number that will be exposed globally on the platform.
By default, the Apolo Platform exposes the job's internal port 80 as an HTTPS-protected resource. This will be listed in the output of the apolo-flow status <job-id> command as Http URL.
You may want to expose a different local port. Use 0 to disable the feature entirely.
Example:
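A minimal sketch that exposes a non-default port. The job id and port number are hypothetical:

```yaml
jobs:
  my_job:
    image: python:3.9
    http_port: 8080   # expose port 8080 instead of the default 80; 0 disables the feature
```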
Only HTTP traffic is allowed. The platform will automatically encapsulate it into TLS to provide an HTTPS connection.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.life_span
Optional The time period after which a job will be automatically killed.
By default, jobs live 1 day. You may want to change this period by customizing the attribute.
The value could be:
A float number representing an amount of seconds (3600 for an hour)
An expression in the following format: 1d6h15m (1 day, 6 hours, 15 minutes)
Use an arbitrarily large value (e.g. 365d) to emulate a disabled lifespan. Keep in mind that this can be dangerous, as a forgotten job will consume cluster resources.
The defaults.life_span value is used if the attribute is not set.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.name
Optional Allows you to specify a job's name. This name becomes a part of the job's internal hostname and exposed HTTP URL, and the job can then be controlled by its name through the low-level apolo tool.
The name is completely optional.
Example:
If the name is not specified in the name attribute, the default name for the job will be automatically generated as follows:
The [-<MULTI_SUFFIX>] part makes sure that a job will have a unique name even if it's a multi job.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.multi
Optional By default, a job can only have one running instance at a time. Calling apolo-flow run <job-id> for the same job ID will attach to the already running job instead of creating a new one. This can be overridden by enabling the multi attribute.
Example:
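A minimal sketch with a hypothetical job id:

```yaml
jobs:
  my_job:
    image: ubuntu:20.04
    multi: true   # each "apolo-flow run my_job" call starts a new instance
```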
Expression contexts: This attribute only allows expressions that don't access contexts.
jobs.<job-id>.pass_config
Optional Set this attribute to true if you want to pass the Apolo config used to execute the apolo-flow run ... command into the spawned job. This can be useful if you're using a job image with Apolo CLI installed and want to execute apolo ... commands inside the running job.
By default, the config is not passed.
Example:
The lifetime of passed credentials is bound to the job's lifetime. It will be impossible to use them when the job is terminated.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.restart
Optional Control the job behavior when the main process exits.
Possible values: never (default), on-failure, and always.
Set this attribute to on-failure if you want your job to be restarted if the main process exits with a non-zero exit code. If you set this attribute to always, the job will be restarted even if the main process exits with 0. In this case, you will need to terminate the job manually, or it will be automatically terminated when its lifespan ends. never implies the platform does not restart the job, and this value is used by default.
Example:
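A minimal sketch using one of the three values listed above. The job id is hypothetical:

```yaml
jobs:
  my_job:
    image: ubuntu:20.04
    restart: on-failure   # restart only when the main process exits with a non-zero code
```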
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.port-forward
Optional You can define a list of TCP tunnels for the job.
Each port forward entry is a string in the <LOCAL_PORT>:<REMOTE_PORT> format.
When the job starts, all enumerated remote ports on the job's side are bound to the developer's box and are available under the corresponding local port numbers.
You can use this feature for remote debugging, accessing a database running in the job, etc.
Example:
This attribute also supports lists as values:
jobs.<job-id>.preset
Optional The preset to execute the job with. defaults.preset is used if the preset is not specified for the job.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.schedule_timeout
Optional Use this attribute if you want to increase the schedule timeout. This will prevent a job from failing if the Apolo cluster is under high load and requested resources are likely to not be available at the moment.
If the Apolo cluster has no resources to launch a job immediately, this job is pushed into the waiting queue. If the job is still not started at the end of the schedule timeout, it fails.
The default system-wide schedule timeout is controlled by the cluster administrator and is usually about 5-10 minutes.
The value of this attribute can be:
A float number representing an amount of seconds
A string in the following format: 1d6h15m45s (1 day, 6 hours, 15 minutes, 45 seconds)
See defaults.schedule_timeout if you want to set a workflow-wide schedule timeout for all jobs.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.tags
Optional A list of additional job tags.
Each live job is tagged. A job's tags are taken from this attribute, defaults.tags, and system tags (project:<project-id> and job:<job-id>).
Example:
This attribute also supports lists as values:
jobs.<job-id>.title
Optional A job's title. Equal to <job-id> by default if not overridden.
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.volumes
Optional A list of job volumes. You can specify a plain string for a volume reference or use the ${{ volumes.<volume-id>.ref }} expression.
Example:
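A minimal sketch that mixes both styles: a plain reference string and a reference to a volume from the volumes section. The ids and paths are hypothetical:

```yaml
volumes:
  my_volume:
    remote: storage:path/to/folder
    mount: /mnt/folder

jobs:
  my_job:
    image: ubuntu:20.04
    volumes:
      - storage:path/to:/mnt/path/to     # plain reference string
      - ${{ volumes.my_volume.ref }}     # reference to the definition above
```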
This attribute also supports lists as values:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
jobs.<job-id>.workdir
Optional A working directory to use inside the job.
This attribute takes precedence if specified. Otherwise, defaults.workdir takes priority. If none of the previous are specified, a WORKDIR definition from the image is used.
Example:
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).
Attributes for action calls
The attributes described in this section are only applicable to action calls. An action is a reusable part that can be integrated into a workflow. Refer to the actions reference to learn more about actions.
jobs.<job-id>.action
Required A URL that selects an action to run. It supports two schemes:
workspace: or ws: for action files that are stored locally
github: or gh: for actions that are bound to a GitHub repository
The ws: scheme expects a valid filesystem path to the action file.
The gh: scheme expects the following format: {owner}/{repo}@{tag}. Here, {owner} is the owner of the GitHub repository, {repo} is the repository's name, and {tag} is the commit tag. Commit tags are used to allow versioning of actions.
Example of the ws: scheme
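A minimal sketch of a locally stored action. The job id and the file path are hypothetical:

```yaml
jobs:
  my_job:
    action: ws:path/to/file/some-action.yml   # path to a local action file
```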
Example of the gh: scheme
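A minimal sketch following the {owner}/{repo}@{tag} format described above. The owner, repository, and tag are hypothetical placeholders and do not point to a real action:

```yaml
jobs:
  my_job:
    action: gh:my-org/my-action@v1.0.0   # hypothetical owner/repo pinned to a tag
```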
Expression contexts: This attribute only allows expressions that don't access contexts.
jobs.<job-id>.args
Optional Mapping of values that will be passed to the actions as arguments. This should correspond to inputs defined in the action file. Each value should be a string.
Example:
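A minimal sketch of passing arguments to an action call. The action path and the argument names are hypothetical and would have to match the inputs declared in the action file:

```yaml
jobs:
  my_job:
    action: ws:path/to/file/some-action.yml
    args:
      param1: value1              # a constant string
      param2: ${{ flow.title }}   # a value calculated from an expression
```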
Expression contexts: flow context, env context, tags context, volumes context, images context, params context, multi context (if jobs.<job-id>.multi is set).