Text Embeddings Inference

The Text Embeddings Inference app transforms raw text into dense, high-dimensional vectors using state-of-the-art embedding models such as BERT, RoBERTa, and others. These embeddings capture semantic meaning and can be used as input for downstream ML tasks or stored in vector databases.

Supported Models

Text Embeddings Inference currently supports Nomic, BERT, CamemBERT, and XLM-RoBERTa models with absolute positions; JinaBERT models with ALiBi positions; Mistral, Alibaba GTE, and Qwen2 models with RoPE positions; MPNet; and ModernBERT.

A more detailed description can be found in the GitHub repository.

Key Features

Apolo deployment

| Field | Description |
| --- | --- |
| Resource Preset | Required. Apolo preset that defines the resources, e.g. gpu-xlarge, H100X1, mi210x2. Sets CPU, memory, GPU count, and GPU provider. |
| Hugging Face Model | Required. The Hugging Face model name, e.g. sentence-transformers/all-mpnet-base-v2. If the model is gated, also provide a Hugging Face token. |
| Enable HTTP Ingress | Exposes the application externally over HTTPS. |

Web Console UI

Step 1 - Select the preset you want to use (currently only GPU-accelerated presets are supported).

Step 2 - Select a model from the Hugging Face repositories.

[Screenshots: Text Embeddings Inference installation process, parts 1 and 2]

If the model is gated, provide your Hugging Face token as an Apolo Secret.

Step 3 - Install and wait for the application to be deployed. Once installed, you can find the API endpoint URL in the Outputs section of the app details page.

Application details

Usage
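
Once the app is running, you can request embeddings over HTTP. Below is a minimal sketch in Python using the requests library against the Text Embeddings Inference /embed route; the endpoint URL placeholder and the example sentences are assumptions, so substitute the API endpoint URL from the Outputs section of your deployment.

```python
import requests

# Hypothetical placeholder: replace with the API endpoint URL shown in the
# Outputs section of the app details page.
TEI_URL = "https://<your-app-endpoint>"

# Text Embeddings Inference exposes a POST /embed route that accepts a single
# string or a list of strings under "inputs" and returns one vector per input.
response = requests.post(
    f"{TEI_URL}/embed",
    json={"inputs": ["Apolo deploys embedding models.", "Vectors capture semantic meaning."]},
    headers={"Content-Type": "application/json"},
    timeout=30,
)
response.raise_for_status()

embeddings = response.json()  # list of float vectors, one per input string
print(len(embeddings), "embeddings of dimension", len(embeddings[0]))
```

The returned vectors can then be stored in a vector database or passed to downstream ML tasks, as described above.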

References
