Catalio supports a wide range of AI/LLM providers, giving you flexibility to choose the right provider for your organization’s needs. Whether you prioritize performance, cost, data residency, or self-hosting, Catalio has you covered.
Provider Overview
| Provider | Chat | Embeddings | Vision | Requires Endpoint | Best For |
|---|---|---|---|---|---|
| OpenAI | Yes | Yes | Yes | No | General use, best quality |
| Anthropic | Yes | No | Yes | No | Long context, safety |
| Azure OpenAI | Yes | Yes | Yes | Yes | Enterprise compliance |
| Google Gemini | Yes | Yes | Yes | No | Multimodal, Google integration |
| Groq | Yes | No | No | No | Ultra-fast inference |
| xAI (Grok) | Yes | No | No | No | Alternative reasoning |
| Ollama | Yes | Yes | Yes | Yes | Self-hosted, privacy |
| OpenRouter | Yes | Yes | Yes | No | Multi-model access, flexibility |
| GitHub Copilot | Yes | No | Yes | No | GitHub Enterprise integration |
OpenAI
Website: platform.openai.com
OpenAI provides the industry-leading GPT family of models, offering excellent quality across chat, embeddings, and vision capabilities.
Supported Models
Chat Models:
- GPT-4o (recommended for complex analysis)
- GPT-4o-mini (fast, cost-effective)
- GPT-4 Turbo
- GPT-3.5 Turbo
- o1, o1-mini (reasoning models)
Embedding Models:
- text-embedding-3-small (recommended, 1536 dimensions)
- text-embedding-3-large (3072 dimensions)
- text-embedding-ada-002 (legacy)
Vision-Capable Models:
- GPT-4o
- GPT-4o-mini
- GPT-4 Turbo
Configuration
Provider: OpenAI
API Key: sk-proj-XXXXXXXXXXXXXXXXX
Default Chat Model: gpt-4o
Default Embedding Model: text-embedding-3-small
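If you want to sanity-check an OpenAI key outside Catalio, the request shape is simple. The sketch below builds (but does not send) a Chat Completions request; the key and prompt are placeholders, and sending it would require an HTTP client such as `requests`.

```python
# Builds an OpenAI Chat Completions request (URL, headers, JSON body).
# The API key here is a placeholder; nothing is sent over the network.
import json

OPENAI_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(api_key: str, model: str, prompt: str):
    """Return the URL, headers, and JSON body for a chat completion."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return OPENAI_URL, headers, body

url, headers, body = build_chat_request(
    "sk-proj-XXXX", "gpt-4o", "Summarize this requirement.")
```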
Pricing (Approximate)
- GPT-4o: $2.50 input / $10.00 output per 1M tokens
- GPT-4o-mini: $0.15 input / $0.60 output per 1M tokens
- text-embedding-3-small: $0.02 per 1M tokens
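Because input and output tokens are billed at different rates, a quick estimate helps when budgeting. A minimal sketch using the approximate prices above:

```python
# Rough per-request cost estimate. Prices are USD per 1M tokens,
# with input and output tokens billed separately.
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price: float, output_price: float) -> float:
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000

# e.g. a GPT-4o-mini call with 10k prompt tokens and 2k completion tokens:
cost = estimate_cost_usd(10_000, 2_000, 0.15, 0.60)  # ≈ $0.0027
```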
Best Use Cases
- Complex requirement analysis requiring nuanced reasoning
- High-quality semantic search via embeddings
- Vision-based document analysis
- General-purpose AI features
Anthropic (Claude)
Website: console.anthropic.com
Anthropic’s Claude models excel at long-context understanding, making them ideal for analyzing large requirement portfolios. Claude prioritizes safety and helpful, harmless responses.
Supported Models
Chat Models:
- Claude Sonnet 4.5 (latest; the default configured below)
- Claude Opus 4 (most capable)
- Claude Sonnet 4 (balanced performance)
- Claude 3.5 Sonnet (previous generation, excellent)
- Claude 3 Opus
- Claude 3 Haiku (fast, cost-effective)
Vision-Capable Models:
- Claude 3+ models (all support vision)
Note: Anthropic does not support embedding generation. Use OpenAI, Gemini, or Ollama for embeddings if you need semantic search.
Configuration
Provider: Anthropic
API Key: sk-ant-api03-XXXXXXXXXXXXXXXXX
Default Chat Model: claude-sonnet-4-5-20250929
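To verify an Anthropic key directly, note that the Messages API differs from OpenAI's format: it requires a `max_tokens` field and an `anthropic-version` header. A sketch that builds (but does not send) such a request:

```python
# Builds an Anthropic Messages API request. The version string and
# max_tokens value are illustrative defaults; nothing is sent.
import json

ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"

def build_claude_request(api_key: str, model: str, prompt: str):
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",  # required version header
        "content-type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "max_tokens": 1024,  # required by the Messages API
        "messages": [{"role": "user", "content": prompt}],
    })
    return ANTHROPIC_URL, headers, body
```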
Pricing (Approximate)
- Claude Opus 4: $15 input / $75 output per 1M tokens
- Claude Sonnet 4: $3 input / $15 output per 1M tokens
- Claude 3 Haiku: $0.25 input / $1.25 output per 1M tokens
Best Use Cases
- Analyzing large requirement portfolios (200K token context)
- Long-form document processing
- Safety-critical applications
- Nuanced business reasoning
Azure OpenAI
Website: azure.microsoft.com/products/ai-services/openai-service
Azure OpenAI provides OpenAI’s models hosted in your Azure tenant, offering enterprise-grade security, compliance certifications, and data residency guarantees.
Supported Models
Same models as OpenAI, hosted in your Azure resource under deployment names you choose:
- GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
- text-embedding-3-small, text-embedding-ada-002
Configuration
Provider: Azure OpenAI
API Key: [Your Azure OpenAI API Key]
Endpoint URL: https://your-resource.openai.azure.com
Default Chat Model: gpt-4o (deployment name)
Default Embedding Model: text-embedding-3-small (deployment name)
Important: Azure OpenAI requires an endpoint URL pointing to your Azure OpenAI resource.
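Unlike the public OpenAI API, Azure routes requests by deployment name and an `api-version` query parameter. The sketch below shows how the request URL is assembled; the `api-version` value is an example, so check which versions your resource supports.

```python
# Assembles an Azure OpenAI chat-completions URL. Azure addresses the
# deployment name you created, not the underlying model name, and the
# api-version shown is an example value.
def azure_chat_url(endpoint: str, deployment: str,
                   api_version: str = "2024-02-01") -> str:
    return (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")

url = azure_chat_url("https://your-resource.openai.azure.com", "gpt-4o")
```

Authentication uses an `api-key` header (or a Microsoft Entra ID token) rather than OpenAI's `Authorization: Bearer` header.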
Enterprise Features
- SOC 2, HIPAA, FedRAMP compliance
- Data residency (EU, US, Asia regions)
- Private Link and VNet integration
- Microsoft Entra ID authentication
- Customer-managed encryption keys
Best Use Cases
- Regulated industries (healthcare, finance, government)
- European organizations requiring GDPR compliance
- Enterprises with existing Azure infrastructure
- Organizations requiring data residency guarantees
Google Gemini
Website: ai.google.dev
Google’s Gemini models offer strong multimodal capabilities, including native image understanding and embedding generation.
Supported Models
Chat Models:
- Gemini 2.0 Pro (recommended)
- Gemini 2.0 Flash
- Gemini 1.5 Pro
- Gemini 1.5 Flash
Embedding Models:
- gemini-embedding-001
Vision-Capable Models:
- All Gemini 1.5 and 2.0 models (vision is supported natively)
Configuration
Provider: Google Gemini
API Key: [Your Google AI API Key]
Default Chat Model: gemini-2.0-pro
Default Embedding Model: gemini-embedding-001
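Gemini's REST API uses a different request shape from OpenAI's: the model name is part of the URL and the prompt goes in a `contents`/`parts` structure. A sketch that builds (but does not send) a `generateContent` request, using the model name configured above:

```python
# Builds a Gemini generateContent request. The endpoint path and body
# shape follow Google's public REST API; nothing is sent here.
import json

GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta"

def build_gemini_request(api_key: str, model: str, prompt: str):
    url = f"{GEMINI_BASE}/models/{model}:generateContent"
    headers = {"x-goog-api-key": api_key,
               "Content-Type": "application/json"}
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return url, headers, body
```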
Best Use Cases
- Multimodal analysis (images + text)
- Organizations already using Google Cloud
- Applications requiring strong multilingual support
Groq
Website: groq.com
Groq provides ultra-fast inference for open-source models like Llama and Mixtral using custom hardware. Response times are significantly faster than those of traditional cloud providers.
Supported Models
Chat Models:
- llama3-70b-8192 (recommended)
- llama3-8b-8192
- mixtral-8x7b-32768
- gemma-7b-it
- gemma2-9b-it
Note: Groq does not support embeddings or vision. Use another provider for semantic search or image analysis.
Configuration
Provider: Groq
API Key: gsk_XXXXXXXXXXXXXXXXX
Default Chat Model: llama3-70b-8192
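Groq exposes an OpenAI-compatible endpoint, so the request shape matches OpenAI's Chat Completions; only the base URL, key, and model name differ. A sketch that builds (but does not send) such a request:

```python
# Builds a Groq chat request. Groq's API is OpenAI-compatible and
# served under the /openai/v1 path; nothing is sent here.
import json

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_groq_request(api_key: str, prompt: str,
                       model: str = "llama3-70b-8192"):
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    body = json.dumps({"model": model,
                       "messages": [{"role": "user", "content": prompt}]})
    return GROQ_URL, headers, body
```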
Pricing (Approximate)
- Llama 3 70B: $0.59 input / $0.79 output per 1M tokens
- Llama 3 8B: $0.05 input / $0.08 output per 1M tokens
Best Use Cases
- Real-time applications requiring fast responses
- High-volume, simple tasks
- Cost-effective chat without embeddings
- Testing with open-source models
xAI (Grok)
Website: x.ai
xAI’s Grok models offer alternative reasoning approaches and real-time knowledge, and the API uses an OpenAI-compatible format.
Supported Models
Chat Models:
- grok-2-1212 (recommended)
- grok-2
- grok-beta
Note: xAI does not support embeddings or vision.
Configuration
Provider: xAI (Grok)
API Key: xai-XXXXXXXXXXXXXXXXX
Default Chat Model: grok-2-1212
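Because the API is OpenAI-compatible, an existing OpenAI-style client typically needs only a different base URL, key prefix, and model name. A minimal sketch (base URL per xAI's public docs; verify against current documentation):

```python
# xAI's OpenAI-compatible endpoint: same request/response shape as
# OpenAI's Chat Completions, different base URL and key prefix.
XAI_BASE = "https://api.x.ai/v1"

def xai_chat_url() -> str:
    return f"{XAI_BASE}/chat/completions"

settings = {"base_url": XAI_BASE, "model": "grok-2-1212"}  # key omitted
```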
Best Use Cases
- Alternative reasoning perspectives
- Applications benefiting from real-time knowledge
- Organizations wanting LLM diversity
Ollama (Self-Hosted)
Website: ollama.com
Ollama enables running open-source models locally on your own infrastructure. Perfect for maximum data privacy, air-gapped environments, or cost-sensitive high-volume usage.
Supported Models
Models depend on what you install locally:
Chat Models:
- llama3, llama3:70b
- mistral, mixtral
- phi3
- codellama
- Many others
Embedding Models:
- nomic-embed-text (recommended)
- mxbai-embed-large
- all-minilm
Vision-Capable Models:
- llava
- bakllava
Configuration
Provider: Ollama
Endpoint URL: http://localhost:11434 (or your server address)
Default Chat Model: llama3
Default Embedding Model: nomic-embed-text
Important: Ollama requires an endpoint URL. No API key is needed for local Ollama, but you may configure authentication for remote deployments.
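Ollama exposes a local REST API with separate chat and embedding endpoints. The sketch below builds (but does not send) both request types against a default local install; adjust the base URL for remote deployments.

```python
# Builds requests for Ollama's local REST API. No API key is needed
# for a default local install; nothing is sent here.
import json

OLLAMA_BASE = "http://localhost:11434"

def build_ollama_chat(prompt: str, model: str = "llama3"):
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a stream
    })
    return f"{OLLAMA_BASE}/api/chat", body

def build_ollama_embedding(text: str, model: str = "nomic-embed-text"):
    body = json.dumps({"model": model, "prompt": text})
    return f"{OLLAMA_BASE}/api/embeddings", body
```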
Infrastructure Requirements
- CPU: 8+ cores recommended
- RAM: 16GB minimum, 32GB+ for larger models
- GPU: Optional but significantly improves performance
- Storage: 10-100GB depending on models installed
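When sizing hardware, a common rule of thumb is parameter count times bytes per parameter, plus roughly 20% overhead for the KV cache and runtime. This is a rough heuristic, not a vendor-published formula, but it explains why 70B models need far more memory than 8B ones.

```python
# Rule-of-thumb memory estimate for loading a model: parameters times
# bytes per parameter, plus ~20% overhead. A rough heuristic only.
def model_memory_gb(params_billions: float, bits_per_param: int = 4) -> float:
    bytes_total = params_billions * 1e9 * (bits_per_param / 8)
    return bytes_total * 1.2 / 1e9

# llama3 (8B) at 4-bit quantization: ~4.8 GB
# llama3:70b at 4-bit quantization: ~42 GB
```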
Best Use Cases
- Air-gapped or offline environments
- Maximum data privacy (data never leaves your infrastructure)
- High-volume usage without per-token costs
- Testing and development without API costs
OpenRouter
Website: openrouter.ai
OpenRouter provides unified access to 100+ models from multiple providers through a single API. Pay per use with no commitments.
Supported Models
Access to all major providers through OpenRouter:
- OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
- Anthropic: Claude 3/4 family
- Meta: Llama 3
- Mistral: Mixtral, Mistral Large
- Google: Gemini Pro
- Many more
Embedding Models:
- openai/text-embedding-3-small
- openai/text-embedding-ada-002
Configuration
Provider: OpenRouter
API Key: sk-or-v1-XXXXXXXXXXXXXXXXX
Default Chat Model: openai/gpt-4o
Default Embedding Model: openai/text-embedding-3-small
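OpenRouter is also OpenAI-compatible, with two optional attribution headers of its own (`HTTP-Referer` and `X-Title`) that identify your app on the platform. A sketch of the headers, per OpenRouter's public docs (verify against current documentation):

```python
# Builds OpenRouter request headers. HTTP-Referer and X-Title are
# optional app-attribution headers specific to OpenRouter.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def openrouter_headers(api_key: str, app_url: str, app_name: str) -> dict:
    return {
        "Authorization": f"Bearer {api_key}",
        "HTTP-Referer": app_url,   # optional: your app's URL
        "X-Title": app_name,       # optional: your app's display name
        "Content-Type": "application/json",
    }
```

Model names are prefixed with the upstream provider, e.g. `openai/gpt-4o` or `anthropic/claude-3.5-sonnet`.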
Pricing
Pay-as-you-go with per-model pricing. OpenRouter adds a small margin to provider costs for the unified access layer.
Best Use Cases
- Trying multiple models without separate accounts
- Fallback routing when primary provider is unavailable
- Cost optimization across providers
- Accessing models not directly available in your region
GitHub Copilot Enterprise
Website: github.com/features/copilot
GitHub Copilot Enterprise provides access to OpenAI GPT models (including GPT-4o) through GitHub’s infrastructure. Ideal for organizations already using GitHub Enterprise.
Supported Models
Chat Models:
- gpt-4o
- gpt-4
- gpt-3.5-turbo
Vision-Capable Models:
- gpt-4o
Note: GitHub Copilot does not support embeddings. Use another provider for semantic search.
Configuration
Provider: GitHub Copilot
API Key: [Your GitHub Copilot Token]
Default Chat Model: gpt-4o
Requirements
- GitHub Enterprise Cloud subscription
- GitHub Copilot Enterprise license
- Appropriate permissions in your GitHub organization
Best Use Cases
- Organizations already invested in GitHub Enterprise
- Unified billing through GitHub
- Development-focused AI features
Choosing the Right Provider
For Most Organizations: OpenAI
Start with OpenAI for the best balance of quality, features, and ease of setup. OpenAI provides:
- Excellent chat and embedding models
- Vision capabilities
- Straightforward API access
- Comprehensive documentation
For Enterprise Compliance: Azure OpenAI
Choose Azure OpenAI when you need:
- SOC 2, HIPAA, or FedRAMP compliance
- Data residency guarantees (EU, specific regions)
- Integration with existing Azure infrastructure
- Customer-managed encryption keys
For Maximum Privacy: Ollama
Choose Ollama for self-hosted deployment when:
- Data must never leave your infrastructure
- You operate in air-gapped environments
- You need to eliminate per-token costs
- You want full control over the AI stack
For Long Context: Anthropic
Choose Anthropic Claude when:
- You need to analyze large documents or portfolios (200K tokens)
- Safety and alignment are priorities
- You prefer alternative reasoning approaches
For Speed: Groq
Choose Groq when:
- Response time is critical
- You need cost-effective high-volume processing
- Embedding support is not required
For Flexibility: OpenRouter
Choose OpenRouter when:
- You want to try multiple providers easily
- You need fallback options across providers
- You want unified billing across multiple models
Multiple Provider Strategy
Many organizations benefit from using multiple providers:
| Feature | Primary Provider | Rationale |
|---|---|---|
| Chat/Analysis | OpenAI GPT-4o | Best quality for complex reasoning |
| Embeddings | OpenAI | Industry-leading embeddings |
| Long Documents | Anthropic Claude | 200K token context window |
| Cost-Sensitive | Groq | Fast, affordable for simple tasks |
| Compliance | Azure OpenAI | When required by regulations |
Configure multiple providers in Catalio and assign each to specific features based on your needs.
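The feature-to-provider mapping above can be sketched as a simple routing table. Provider names and feature keys here are illustrative, not Catalio's actual configuration API:

```python
# Illustrative routing table mapping features to providers, mirroring
# the strategy table above. Keys and names are hypothetical examples.
FEATURE_PROVIDERS = {
    "chat": "openai",            # best quality for complex reasoning
    "embeddings": "openai",      # semantic search
    "long_documents": "anthropic",  # 200K token context window
    "bulk_tasks": "groq",        # fast, affordable for simple tasks
}

def provider_for(feature: str, default: str = "openai") -> str:
    """Return the provider assigned to a feature, with a fallback."""
    return FEATURE_PROVIDERS.get(feature, default)
```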
Next Steps
- Setting Up LLM API Keys - Detailed configuration guide
- Bring Your Own LLM (BYOLLM) - Configure your own AI provider
- AI Features and Data Privacy - How Catalio protects your data
Last Updated: December 2025