LiteLLM is a unified API layer that standardizes calls across 100+ LLM providers using the OpenAI format. It supports load balancing, fallbacks, spend tracking, and rate limiting. You can use LiteLLM to route requests to MARA Cloud alongside other providers.
## Prerequisites
- Python 3.8+
- A MARA Cloud API key. See API Keys and URLs to generate one.
## Setup
Install LiteLLM:
```bash
pip install litellm
```

## Configuration
Use LiteLLM's `openai/` prefix with a custom API base to route requests to MARA Cloud:

```python
import litellm

response = litellm.completion(
    model="openai/MiniMax-M2.5",
    api_base="https://api.cloud.mara.com/v1",
    api_key="your-mara-api-key",
    messages=[
        {"role": "user", "content": "Explain what LiteLLM does in two sentences."}
    ],
)

print(response.choices[0].message.content)
```
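For intuition, the `openai/` prefix only selects LiteLLM's OpenAI-format adapter; it is stripped before the request goes upstream. A simplified sketch of that routing (not LiteLLM's actual code — the function name is invented for illustration):

```python
def sketch_openai_route(model, api_base, api_key, messages):
    """Illustrative only: roughly how an `openai/`-prefixed model plus a
    custom api_base becomes an OpenAI-format HTTP request."""
    provider, _, upstream_model = model.partition("/")
    # The prefix picks the adapter; the upstream provider never sees it.
    url = f"{api_base}/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}"}
    payload = {"model": upstream_model, "messages": messages}
    return url, headers, payload

url, headers, payload = sketch_openai_route(
    "openai/MiniMax-M2.5",
    "https://api.cloud.mara.com/v1",
    "your-mara-api-key",
    [{"role": "user", "content": "hi"}],
)
# url     -> https://api.cloud.mara.com/v1/chat/completions
# payload -> {"model": "MiniMax-M2.5", ...}  (prefix removed)
```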
## Using with LiteLLM Proxy

If you're running LiteLLM as a proxy server, add MARA Cloud to your `config.yaml`:

```yaml
model_list:
  - model_name: mara-minimax-m25
    litellm_params:
      model: openai/MiniMax-M2.5
      api_base: https://api.cloud.mara.com/v1
      api_key: your-mara-api-key
```

Start the proxy:
```bash
litellm --config config.yaml
```

All requests to `mara-minimax-m25` will now be routed to MARA Cloud.
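Once the proxy is up, any OpenAI-compatible client can call it using the alias from `config.yaml`. A minimal sketch using only the standard library, assuming the proxy listens on LiteLLM's default port 4000:

```python
import json
import urllib.request

# LiteLLM proxy default address; adjust if you started it with --port.
PROXY_URL = "http://localhost:4000/v1/chat/completions"

def build_proxy_request(model, prompt):
    """Build an OpenAI-format chat request aimed at the LiteLLM proxy."""
    payload = {
        "model": model,  # the alias from config.yaml, not the upstream model name
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_proxy_request("mara-minimax-m25", "Hello from the proxy!")
# To send (requires the proxy to be running):
#   response = json.load(urllib.request.urlopen(req))
#   print(response["choices"][0]["message"]["content"])
```

The response has the same OpenAI-format shape as calling MARA Cloud directly, so switching between direct and proxied access requires no client changes.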
## Learn more

- Model Catalog - Browse all available models.
- LiteLLM Documentation - Official LiteLLM docs.