Catalio supports a wide range of AI/LLM providers, giving you flexibility to choose the right provider for your organization’s needs. Whether you prioritize performance, cost, data residency, or self-hosting, Catalio has you covered.
Provider Overview
| Provider | Chat | Embeddings | Vision | Requires Endpoint | Best For |
|---|---|---|---|---|---|
| OpenAI | Yes | Yes | Yes | No | General use, best quality |
| Anthropic | Yes | No | Yes | No | Long context, safety |
| Azure OpenAI | Yes | Yes | Yes | Yes | Enterprise compliance |
| Google Gemini | Yes | Yes | Yes | No | Multimodal, Google integration |
| Groq | Yes | No | No | No | Ultra-fast inference |
| xAI (Grok) | Yes | No | No | No | Alternative reasoning |
| Ollama | Yes | Yes | Yes | Yes | Self-hosted, privacy |
| OpenRouter | Yes | Yes | Yes | No | Multi-model access, flexibility |
| GitHub Copilot | Yes | No | Yes | No | GitHub Enterprise integration |
OpenAI
Website: platform.openai.com
OpenAI provides the industry-leading GPT family of models, offering excellent quality across chat, embeddings, and vision capabilities.
Supported Models
Chat Models:
- GPT-4o (recommended for complex analysis)
- GPT-4o-mini (fast, cost-effective)
- GPT-4 Turbo
- GPT-3.5 Turbo
- o1, o1-mini (reasoning models)
Embedding Models:
- text-embedding-3-small (recommended, 1536 dimensions)
- text-embedding-3-large (3072 dimensions)
- text-embedding-ada-002 (legacy)
Vision-Capable Models:
- GPT-4o
- GPT-4o-mini
- GPT-4 Turbo
Configuration
Provider: OpenAI
API Key: sk-proj-XXXXXXXXXXXXXXXXX
Default Chat Model: gpt-4o
Default Embedding Model: text-embedding-3-small
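If you want to sanity-check an OpenAI key outside Catalio, the request shape is simple. The sketch below builds (but does not send) a Chat Completions request; the key and prompt are placeholders, and sending it would require an HTTP client such as `requests`.

```python
# Builds an OpenAI Chat Completions request (URL, headers, JSON body).
# The API key here is a placeholder; nothing is sent over the network.
import json

OPENAI_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(api_key: str, model: str, prompt: str):
    """Return the URL, headers, and JSON body for a chat completion."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return OPENAI_URL, headers, body

url, headers, body = build_chat_request(
    "sk-proj-XXXX", "gpt-4o", "Summarize this requirement.")
```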
Pricing (Approximate)
- GPT-4o: $2.50 input / $10.00 output per 1M tokens
- GPT-4o-mini: $0.15 input / $0.60 output per 1M tokens
- text-embedding-3-small: $0.02 per 1M tokens
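Because input and output tokens are billed at different rates, a quick estimate helps when budgeting. A minimal sketch using the approximate prices above:

```python
# Rough per-request cost estimate. Prices are USD per 1M tokens,
# with input and output tokens billed separately.
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price: float, output_price: float) -> float:
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000

# e.g. a GPT-4o-mini call with 10k prompt tokens and 2k completion tokens:
cost = estimate_cost_usd(10_000, 2_000, 0.15, 0.60)  # ≈ $0.0027
```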
Best Use Cases
- Complex requirement analysis requiring nuanced reasoning
- High-quality semantic search via embeddings
- Vision-based document analysis
- General-purpose AI features
Anthropic (Claude)
Website: console.anthropic.com
Anthropic’s Claude models excel at long-context understanding, making them ideal for analyzing large requirement portfolios. Claude prioritizes safety and helpful, harmless responses.
Supported Models
Chat Models:
- Claude Sonnet 4.5 (latest; the default configured below)
- Claude Opus 4 (most capable)
- Claude Sonnet 4 (balanced performance)
- Claude 3.5 Sonnet (previous generation, excellent)
- Claude 3 Opus
- Claude 3 Haiku (fast, cost-effective)
Vision-Capable Models:
- Claude 3+ models (all support vision)
Note: Anthropic does not support embedding generation. Use OpenAI, Gemini, or Ollama for embeddings if you need semantic search.
Configuration
Provider: Anthropic
API Key: sk-ant-api03-XXXXXXXXXXXXXXXXX
Default Chat Model: claude-sonnet-4-5-20250929
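To verify an Anthropic key directly, note that the Messages API differs from OpenAI's format: it requires a `max_tokens` field and an `anthropic-version` header. A sketch that builds (but does not send) such a request:

```python
# Builds an Anthropic Messages API request. The version string and
# max_tokens value are illustrative defaults; nothing is sent.
import json

ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"

def build_claude_request(api_key: str, model: str, prompt: str):
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",  # required version header
        "content-type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "max_tokens": 1024,  # required by the Messages API
        "messages": [{"role": "user", "content": prompt}],
    })
    return ANTHROPIC_URL, headers, body
```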
Pricing (Approximate)
- Claude Opus 4: $15 input / $75 output per 1M tokens
- Claude Sonnet 4: $3 input / $15 output per 1M tokens
- Claude 3 Haiku: $0.25 input / $1.25 output per 1M tokens
Best Use Cases
- Analyzing large requirement portfolios (200K token context)
- Long-form document processing
- Safety-critical applications
- Nuanced business reasoning
Azure OpenAI
Website: azure.microsoft.com/products/ai-services/openai-service
Azure OpenAI provides OpenAI’s models hosted in your Azure tenant, offering enterprise-grade security, compliance certifications, and data residency guarantees.
Supported Models
Same models as OpenAI, hosted in your Azure resource under deployment names you choose:
- GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
- text-embedding-3-small, text-embedding-ada-002
Configuration
Provider: Azure OpenAI
API Key: [Your Azure OpenAI API Key]
Endpoint URL: https://your-resource.openai.azure.com
Default Chat Model: gpt-4o (deployment name)
Default Embedding Model: text-embedding-3-small (deployment name)
Important: Azure OpenAI requires an endpoint URL pointing to your Azure OpenAI resource.
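Unlike the public OpenAI API, Azure routes requests by deployment name and an `api-version` query parameter. The sketch below shows how the request URL is assembled; the `api-version` value is an example, so check which versions your resource supports.

```python
# Assembles an Azure OpenAI chat-completions URL. Azure addresses the
# deployment name you created, not the underlying model name, and the
# api-version shown is an example value.
def azure_chat_url(endpoint: str, deployment: str,
                   api_version: str = "2024-02-01") -> str:
    return (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")

url = azure_chat_url("https://your-resource.openai.azure.com", "gpt-4o")
```

Authentication uses an `api-key` header (or a Microsoft Entra ID token) rather than OpenAI's `Authorization: Bearer` header.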
Enterprise Features
- SOC 2, HIPAA, FedRAMP compliance
- Data residency (EU, US, Asia regions)
- Private Link and VNet integration
- Microsoft Entra ID authentication
- Customer-managed encryption keys
Best Use Cases
- Regulated industries (healthcare, finance, government)
- European organizations requiring GDPR compliance
- Enterprises with existing Azure infrastructure
- Organizations requiring data residency guarantees
Google Gemini
Website: ai.google.dev
Google’s Gemini models offer strong multimodal capabilities, including native image understanding and embedding generation.
Supported Models
Chat Models:
- Gemini 2.0 Pro (recommended)
- Gemini 2.0 Flash
- Gemini 1.5 Pro
- Gemini 1.5 Flash
Embedding Models:
- gemini-embedding-001
Vision-Capable Models:
- All Gemini 1.5 and 2.0 models (vision is supported natively)
Configuration
Provider: Google Gemini
API Key: [Your Google AI API Key]
Default Chat Model: gemini-2.0-pro
Default Embedding Model: gemini-embedding-001
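Gemini's REST API uses a different request shape from OpenAI's: the model name is part of the URL and the prompt goes in a `contents`/`parts` structure. A sketch that builds (but does not send) a `generateContent` request, using the model name configured above:

```python
# Builds a Gemini generateContent request. The endpoint path and body
# shape follow Google's public REST API; nothing is sent here.
import json

GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta"

def build_gemini_request(api_key: str, model: str, prompt: str):
    url = f"{GEMINI_BASE}/models/{model}:generateContent"
    headers = {"x-goog-api-key": api_key,
               "Content-Type": "application/json"}
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return url, headers, body
```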
Best Use Cases
- Multimodal analysis (images + text)
- Organizations already using Google Cloud
- Applications requiring strong multilingual support
Groq
Website: groq.com
Groq provides ultra-fast inference for open-source models like Llama and Mixtral using custom hardware. Response times are significantly faster than those of traditional cloud providers.
Supported Models
Chat Models:
- llama3-70b-8192 (recommended)
- llama3-8b-8192
- mixtral-8x7b-32768
- gemma-7b-it
- gemma2-9b-it
Note: Groq does not support embeddings or vision. Use another provider for semantic search or image analysis.
Configuration
Provider: Groq
API Key: gsk_XXXXXXXXXXXXXXXXX
Default Chat Model: llama3-70b-8192
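Groq exposes an OpenAI-compatible endpoint, so the request shape matches OpenAI's Chat Completions; only the base URL, key, and model name differ. A sketch that builds (but does not send) such a request:

```python
# Builds a Groq chat request. Groq's API is OpenAI-compatible and
# served under the /openai/v1 path; nothing is sent here.
import json

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_groq_request(api_key: str, prompt: str,
                       model: str = "llama3-70b-8192"):
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    body = json.dumps({"model": model,
                       "messages": [{"role": "user", "content": prompt}]})
    return GROQ_URL, headers, body
```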
Pricing (Approximate)
- Llama 3 70B: $0.59 input / $0.79 output per 1M tokens
- Llama 3 8B: $0.05 input / $0.08 output per 1M tokens
Best Use Cases
- Real-time applications requiring fast responses
- High-volume, simple tasks
- Cost-effective chat without embeddings
- Testing with open-source models
xAI (Grok)
Website: x.ai
xAI’s Grok models offer alternative reasoning approaches and real-time knowledge, and the API uses an OpenAI-compatible format.
Supported Models
Chat Models:
- grok-2-1212 (recommended)
- grok-2
- grok-beta
Note: xAI does not support embeddings or vision.
Configuration
Provider: xAI (Grok)
API Key: xai-XXXXXXXXXXXXXXXXX
Default Chat Model: grok-2-1212
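Because the API is OpenAI-compatible, an existing OpenAI-style client typically needs only a different base URL, key prefix, and model name. A minimal sketch (base URL per xAI's public docs; verify against current documentation):

```python
# xAI's OpenAI-compatible endpoint: same request/response shape as
# OpenAI's Chat Completions, different base URL and key prefix.
XAI_BASE = "https://api.x.ai/v1"

def xai_chat_url() -> str:
    return f"{XAI_BASE}/chat/completions"

settings = {"base_url": XAI_BASE, "model": "grok-2-1212"}  # key omitted
```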
Best Use Cases
- Alternative reasoning perspectives
- Applications benefiting from real-time knowledge
- Organizations wanting LLM diversity
Ollama (Self-Hosted)
Website: ollama.com
Ollama enables running open-source models locally on your own infrastructure. Perfect for maximum data privacy, air-gapped environments, or cost-sensitive high-volume usage.
Supported Models
Models depend on what you install locally:
Chat Models:
- llama3, llama3:70b
- mistral, mixtral
- phi3
- codellama
- Many others
Embedding Models:
- nomic-embed-text (recommended)
- mxbai-embed-large
- all-minilm
Vision-Capable Models:
- llava
- bakllava
Configuration
Provider: Ollama
Endpoint URL: http://localhost:11434 (or your server address)
Default Chat Model: llama3
Default Embedding Model: nomic-embed-text
Important: Ollama requires an endpoint URL. No API key is needed for local Ollama, but you may configure authentication for remote deployments.
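Ollama exposes a local REST API with separate chat and embedding endpoints. The sketch below builds (but does not send) both request types against a default local install; adjust the base URL for remote deployments.

```python
# Builds requests for Ollama's local REST API. No API key is needed
# for a default local install; nothing is sent here.
import json

OLLAMA_BASE = "http://localhost:11434"

def build_ollama_chat(prompt: str, model: str = "llama3"):
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a stream
    })
    return f"{OLLAMA_BASE}/api/chat", body

def build_ollama_embedding(text: str, model: str = "nomic-embed-text"):
    body = json.dumps({"model": model, "prompt": text})
    return f"{OLLAMA_BASE}/api/embeddings", body
```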
Infrastructure Requirements
- CPU: 8+ cores recommended
- RAM: 16GB minimum, 32GB+ for larger models
- GPU: Optional but significantly improves performance
- Storage: 10-100GB depending on models installed
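When sizing hardware, a common rule of thumb is parameter count times bytes per parameter, plus roughly 20% overhead for the KV cache and runtime. This is a rough heuristic, not a vendor-published formula, but it explains why 70B models need far more memory than 8B ones.

```python
# Rule-of-thumb memory estimate for loading a model: parameters times
# bytes per parameter, plus ~20% overhead. A rough heuristic only.
def model_memory_gb(params_billions: float, bits_per_param: int = 4) -> float:
    bytes_total = params_billions * 1e9 * (bits_per_param / 8)
    return bytes_total * 1.2 / 1e9

# llama3 (8B) at 4-bit quantization: ~4.8 GB
# llama3:70b at 4-bit quantization: ~42 GB
```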
Best Use Cases
- Air-gapped or offline environments
- Maximum data privacy (data never leaves your infrastructure)
- High-volume usage without per-token costs
- Testing and development without API costs
OpenRouter
Website: openrouter.ai
OpenRouter provides unified access to 100+ models from multiple providers through a single API. Pay per use with no commitments.
Supported Models
Access to all major providers through OpenRouter:
- OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
- Anthropic: Claude 3/4 family
- Meta: Llama 3
- Mistral: Mixtral, Mistral Large
- Google: Gemini Pro
- Many more
Embedding Models:
- openai/text-embedding-3-small
- openai/text-embedding-ada-002
Configuration
Provider: OpenRouter
API Key: sk-or-v1-XXXXXXXXXXXXXXXXX
Default Chat Model: openai/gpt-4o
Default Embedding Model: openai/text-embedding-3-small
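OpenRouter is also OpenAI-compatible, with two optional attribution headers of its own (`HTTP-Referer` and `X-Title`) that identify your app on the platform. A sketch of the headers, per OpenRouter's public docs (verify against current documentation):

```python
# Builds OpenRouter request headers. HTTP-Referer and X-Title are
# optional app-attribution headers specific to OpenRouter.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def openrouter_headers(api_key: str, app_url: str, app_name: str) -> dict:
    return {
        "Authorization": f"Bearer {api_key}",
        "HTTP-Referer": app_url,   # optional: your app's URL
        "X-Title": app_name,       # optional: your app's display name
        "Content-Type": "application/json",
    }
```

Model names are prefixed with the upstream provider, e.g. `openai/gpt-4o` or `anthropic/claude-3.5-sonnet`.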
Pricing
Pay-as-you-go with per-model pricing. OpenRouter adds a small margin to provider costs for the unified access layer.
Best Use Cases
- Trying multiple models without separate accounts
- Fallback routing when primary provider is unavailable
- Cost optimization across providers
- Accessing models not directly available in your region
GitHub Copilot Enterprise
Website: github.com/features/copilot
GitHub Copilot Enterprise provides access to OpenAI GPT models (including GPT-4o) through GitHub’s infrastructure. Ideal for organizations already using GitHub Enterprise.
Supported Models
Chat Models:
- gpt-4o
- gpt-4
- gpt-3.5-turbo
Vision-Capable Models:
- gpt-4o
Note: GitHub Copilot does not support embeddings. Use another provider for semantic search.
Configuration
Provider: GitHub Copilot
API Key: [Your GitHub Copilot Token]
Default Chat Model: gpt-4o
Requirements
- GitHub Enterprise Cloud subscription
- GitHub Copilot Enterprise license
- Appropriate permissions in your GitHub organization
Best Use Cases
- Organizations already invested in GitHub Enterprise
- Unified billing through GitHub
- Development-focused AI features
Choosing the Right Provider
For Most Organizations: OpenAI
Start with OpenAI for the best balance of quality, features, and ease of setup. OpenAI provides:
- Excellent chat and embedding models
- Vision capabilities
- Straightforward API access
- Comprehensive documentation
For Enterprise Compliance: Azure OpenAI
Choose Azure OpenAI when you need:
- SOC 2, HIPAA, or FedRAMP compliance
- Data residency guarantees (EU, specific regions)
- Integration with existing Azure infrastructure
- Customer-managed encryption keys
For Maximum Privacy: Ollama
Choose Ollama for self-hosted deployment when:
- Data must never leave your infrastructure
- You operate in air-gapped environments
- You need to eliminate per-token costs
- You want full control over the AI stack
For Long Context: Anthropic
Choose Anthropic Claude when:
- You need to analyze large documents or portfolios (200K tokens)
- Safety and alignment are priorities
- You prefer alternative reasoning approaches
For Speed: Groq
Choose Groq when:
- Response time is critical
- You need cost-effective high-volume processing
- Embedding support is not required
For Flexibility: OpenRouter
Choose OpenRouter when:
- You want to try multiple providers easily
- You need fallback options across providers
- You want unified billing across multiple models
Multiple Provider Strategy
Many organizations benefit from using multiple providers:
| Feature | Primary Provider | Rationale |
|---|---|---|
| Chat/Analysis | OpenAI GPT-4o | Best quality for complex reasoning |
| Embeddings | OpenAI | Industry-leading embeddings |
| Long Documents | Anthropic Claude | 200K token context window |
| Cost-Sensitive | Groq | Fast, affordable for simple tasks |
| Compliance | Azure OpenAI | When required by regulations |
Configure multiple providers in Catalio and assign each to specific features based on your needs.
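The feature-to-provider mapping above can be sketched as a simple routing table. Provider names and feature keys here are illustrative, not Catalio's actual configuration API:

```python
# Illustrative routing table mapping features to providers, mirroring
# the strategy table above. Keys and names are hypothetical examples.
FEATURE_PROVIDERS = {
    "chat": "openai",            # best quality for complex reasoning
    "embeddings": "openai",      # semantic search
    "long_documents": "anthropic",  # 200K token context window
    "bulk_tasks": "groq",        # fast, affordable for simple tasks
}

def provider_for(feature: str, default: str = "openai") -> str:
    """Return the provider assigned to a feature, with a fallback."""
    return FEATURE_PROVIDERS.get(feature, default)
```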
Next Steps
- Setting Up LLM API Keys - Detailed configuration guide
- Bring Your Own LLM (BYOLLM) - Configure your own AI provider
- AI Features and Data Privacy - How Catalio protects your data
Last Updated: December 2025