Getting Started
Introduction
There are two things you will need to do before you start working with Apolo:
After this, you're free to explore the platform and its functionality. As a good starting point, we've included a section about development on GPU with Jupyter Notebooks.
Understanding the main concepts
In Apolo, you work with jobs, environments, and storage. More specifically, a job (an execution unit) runs in a given environment (a Docker container) on a given preset (a combination of CPU, GPU, and memory resources allocated to the job), with several storage instances (block or object storage) attached.
Here are some examples.
Hello, World!
Run a job on CPU that prints “Hello, World!” and shuts down:
apolo run --preset cpu-small --name test ubuntu -- echo Hello, World!
Executing this command will result in an output like this:
√ Job ID: job-7dd12c3c-ae8d-4492-bdb9-99509fda4f8c
√ Name: test
- Status: pending Creating
- Status: pending Scheduling
- Status: pending ContainerCreating
√ Http URL: https://test--jane-doe.jobs.default.org.neu.ro
√ The job will die in a day. See --life-span option documentation for details.
√ Status: succeeded
√ =========== Job is running in terminal mode ===========
√ (If you don't see a command prompt, try pressing enter)
√ (Use Ctrl-P Ctrl-Q key sequence to detach from the job)
Hello, World!
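To connect this command back to the concepts above, here is a minimal Python sketch that models a job as a record (image, preset, optional name, volumes, command) and assembles the matching `apolo run` command line. The `JobSpec` class and its `to_argv` method are hypothetical names introduced for illustration; they are not part of the Apolo CLI or any SDK, and the sketch only uses the `--preset`, `--name`, and `--volume` options that appear in this guide. It builds the command but does not execute it.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobSpec:
    """Hypothetical helper: one job = environment + preset + storage + command."""

    image: str                      # environment (Docker image), e.g. "ubuntu"
    preset: str                     # named CPU/GPU/memory combination
    command: List[str]              # what the job executes inside the container
    name: Optional[str] = None      # optional job name
    volumes: List[str] = field(default_factory=list)  # e.g. "storage:demo:/demo:rw"

    def to_argv(self) -> List[str]:
        """Assemble the argument list for `apolo run` without running it."""
        argv = ["apolo", "run", "--preset", self.preset]
        if self.name:
            argv += ["--name", self.name]
        for volume in self.volumes:
            argv += ["--volume", volume]
        # Everything after "--" is the command executed in the container.
        return argv + [self.image, "--", *self.command]


spec = JobSpec(
    image="ubuntu",
    preset="cpu-small",
    name="test",
    command=["echo", "Hello, World!"],
)
# Print a shell-quoted version of the command for inspection.
print(shlex.join(spec.to_argv()))
```

Keeping the arguments as a list until the last moment avoids shell quoting mistakes if the list is later passed to `subprocess.run`.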
A simple GPU job
Run a job on GPU in the default Apolo environment (ghcr.io/neuro-inc/base) that checks if CUDA is available in this environment:
apolo run --preset gpu-small --name test ghcr.io/neuro-inc/base -- python -c "import torch; print(torch.cuda.is_available());"
We used the gpu-small preset for this job. To see the full list of presets you can use, run the following command:
apolo config show
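The CUDA check passed to `python -c` in the GPU job above can also be kept as a small script, which is easier to extend than a one-liner. This is a sketch, assuming PyTorch is installed in the image (as the one-liner itself assumes); the function name `cuda_available` is ours, and the fallback to `False` when `torch` is missing is an added convenience, not platform behavior.

```python
def cuda_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch  # assumed to be present in the job image
    except ImportError:
        # Without PyTorch we cannot query CUDA, so report it as unavailable.
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print(cuda_available())
```

On a CPU preset, or in an image without PyTorch, the script prints `False` instead of failing.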
Working with platform storage
Create a new demo directory in the root directory of your platform storage:
apolo mkdir -p storage:demo
Run a job that mounts the demo directory from platform storage to the /demo directory in the job container and creates a file in it:
apolo run --volume storage:demo:/demo:rw ubuntu -- bash -c "echo Hello >> /demo/hello.txt"
Check that the file you have just created is actually on the storage:
apolo ls storage:demo
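The `--volume` value used in this section packs three pieces of information into one string: the storage location, the mount path inside the container, and the access mode. The sketch below splits such a string, assuming the `<storage URI>:<container path>:<mode>` layout seen in the example and assuming `rw` and `ro` are the accepted modes. `VolumeSpec` and `parse_volume` are hypothetical names for illustration; this is not how the Apolo CLI itself parses the option.

```python
from typing import NamedTuple


class VolumeSpec(NamedTuple):
    storage_uri: str     # e.g. "storage:demo"
    container_path: str  # e.g. "/demo"
    mode: str            # "rw" or "ro" (assumed set of modes)


def parse_volume(spec: str) -> VolumeSpec:
    """Split 'storage:demo:/demo:rw' into its three parts."""
    # Split from the right so the colon inside "storage:demo" is preserved.
    storage_uri, container_path, mode = spec.rsplit(":", 2)
    if mode not in ("rw", "ro"):
        raise ValueError(f"unexpected mode {mode!r} in volume spec {spec!r}")
    return VolumeSpec(storage_uri, container_path, mode)


print(parse_volume("storage:demo:/demo:rw"))
```

Splitting from the right matters here: a left-to-right split on `:` would cut the `storage:` scheme away from the directory name.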
Developing on GPU with Jupyter Notebooks
Development in Jupyter Notebooks is a good example of how the Apolo Platform can be used. While you can run a Jupyter Notebooks session with one command through the CLI or with one click in the Console, we recommend project-based development. To simplify the process, we provide a project template based on the cookiecutter package. This template provides the basic folder structure and integrations with several recommended tools.
Initializing an Apolo cookiecutter flow
First, you will need to install the cookiecutter package via pip or pipx:
pipx install cookiecutter
Now, to initialize a new Apolo flow using the cookiecutter template, run:
cookiecutter gh:neuro-inc/cookiecutter-neuro-project --checkout release
This command will prompt you to enter some info about your new flow:
project_name [Neuro Project]: New Cookiecutter Project
project_dir [new cookiecutter project]:
project_id [new_cookiecutter_project]:
code_directory [modules]:
preserve Neuro Flow template hints [yes]:
To navigate to the flow directory, run:
cd new-cookiecutter-project
Flow structure
The structure of the project's folder will look like this:
new-cookiecutter-project
├── .github/ <- GitHub workflows and a dependabot.yml file
├── .neuro/ <- apolo and apolo-flow CLI configuration files
├── config/ <- configuration files for various integrations
├── data/ <- training and testing datasets (we don't keep it under source control)
├── notebooks/ <- Jupyter notebooks
├── modules/ <- models' source code
├── results/ <- training artifacts
├── .gitignore <- default .gitignore file for a Python ML project
├── .neuro.toml <- autogenerated config file for Apolo CLI
├── .neuroignore <- a file telling Apolo CLI which files to ignore while uploading to the platform storage
├── HELP.md <- autogenerated template reference
├── README.md <- autogenerated informational file
├── Dockerfile <- description of the docker image used for training in your flow
├── apt.txt <- list of system packages to be installed in the training environment
├── requirements.txt <- list of Python dependencies to be installed in the training environment
├── setup.cfg <- linter settings (Python code quality checking)
└── update_actions.py <- script used to update apolo-flow actions in one of the GitHub workflows
The template contains the .neuro/live.yaml configuration file for apolo-flow. This file guarantees a proper connection between the flow structure, the base environment that we provide, and actions with storage and jobs. For example, the upload command synchronizes sub-folders on your local machine with sub-folders on the persistent platform storage, and those sub-folders are synchronized with the corresponding sub-folders in job containers.
Setting up the environment and running Jupyter
To set up the project environment, run:
apolo-flow build train
apolo-flow mkvolumes
When these commands are executed, system packages from apt.txt and pip dependencies from requirements.txt are installed to the base environment. It supports CUDA by default and contains the most popular ML/AI frameworks such as TensorFlow and PyTorch.
For Jupyter Notebooks to run properly, the train.py script and the notebook itself should be available on the storage. Upload the code directory containing this file to the storage by using the following command:
apolo-flow upload ALL
Now you need to choose a preset on which you want to run your Jupyter jobs. To view the list of presets available on the current cluster, run:
apolo config show
To start a Jupyter Notebooks session, run:
apolo-flow run jupyter
This command will open the Jupyter Notebooks interface in your default browser.
Now, when you edit notebooks, they are updated on your platform storage. To download them locally (for example, to save them under a version control system), run:
apolo-flow download notebooks
Don’t forget to terminate your job when you no longer need it (the files won’t disappear after that):
apolo-flow kill jupyter
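Because it is easy to forget the final `kill` step, the commands in this section can be wrapped so that the job is terminated even if an earlier step fails. The Python sketch below is one way to do that. It is illustrative only: `jupyter_session`, `dry_run`, and `execute` are hypothetical names, and the sketch assumes each `apolo-flow` command returns once its step has finished, which may not match how `apolo-flow run jupyter` behaves on your setup. By default it only prints the commands.

```python
import shlex
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def dry_run(cmd: Sequence[str]) -> None:
    """Print the command instead of executing it (the safe default)."""
    print(shlex.join(cmd))


def execute(cmd: Sequence[str]) -> None:
    """Run the command for real and raise if it exits with an error."""
    subprocess.run(list(cmd), check=True)


def jupyter_session(run: Runner = dry_run) -> None:
    """Run the steps from this section in order, always ending with kill."""
    run(["apolo-flow", "build", "train"])
    run(["apolo-flow", "mkvolumes"])
    run(["apolo-flow", "upload", "ALL"])
    try:
        run(["apolo-flow", "run", "jupyter"])
        run(["apolo-flow", "download", "notebooks"])
    finally:
        # Terminate the job even if starting or downloading failed.
        run(["apolo-flow", "kill", "jupyter"])


if __name__ == "__main__":
    jupyter_session()  # pass run=execute to invoke the real CLI
```

Passing the runner as a parameter keeps the sequence testable without the CLI installed and makes the switch from printing to executing a one-argument change.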
To check how many credits you have left, run:
apolo config show