
Supported AI Providers

Complete reference of all AI/LLM providers supported by Catalio, including capabilities, model availability, and configuration requirements.

Catalio supports a wide range of AI/LLM providers, giving you flexibility to choose the right provider for your organization’s needs. Whether you prioritize performance, cost, data residency, or self-hosting, Catalio has you covered.

Provider Overview

| Provider | Chat | Embeddings | Vision | Requires Endpoint | Best For |
|---|---|---|---|---|---|
| OpenAI | Yes | Yes | Yes | No | General use, best quality |
| Anthropic | Yes | No | Yes | No | Long context, safety |
| Azure OpenAI | Yes | Yes | Yes | Yes | Enterprise compliance |
| Google Gemini | Yes | Yes | Yes | No | Multimodal, Google integration |
| Groq | Yes | No | No | No | Ultra-fast inference |
| xAI (Grok) | Yes | No | No | No | Alternative reasoning |
| Ollama | Yes | Yes | Yes | Yes | Self-hosted, privacy |
| OpenRouter | Yes | Yes | Yes | No | Multi-model access, flexibility |
| GitHub Copilot | Yes | No | Yes | No | GitHub Enterprise integration |

OpenAI

Website: platform.openai.com

OpenAI provides the industry-leading GPT family of models, offering excellent quality across chat, embeddings, and vision capabilities.

Supported Models

Chat Models:

  • GPT-4o (recommended for complex analysis)
  • GPT-4o-mini (fast, cost-effective)
  • GPT-4 Turbo
  • GPT-3.5 Turbo
  • o1, o1-mini (reasoning models)

Embedding Models:

  • text-embedding-3-small (recommended, 1536 dimensions)
  • text-embedding-3-large (3072 dimensions)
  • text-embedding-ada-002 (legacy)

Vision-Capable Models:

  • GPT-4o
  • GPT-4o-mini
  • GPT-4 Turbo

Configuration

Plaintext
Provider: OpenAI
API Key: sk-proj-XXXXXXXXXXXXXXXXX
Default Chat Model: gpt-4o
Default Embedding Model: text-embedding-3-small
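
Catalio makes the API calls for you, but for reference, a chat request sent to OpenAI under this configuration looks roughly like the following sketch (the key and prompt are placeholders, not real values):

```python
import json

API_KEY = "sk-proj-XXXXXXXXXXXXXXXXX"  # placeholder, not a real key

# Request shape for OpenAI's chat completions endpoint.
url = "https://api.openai.com/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
payload = {
    "model": "gpt-4o",  # matches "Default Chat Model" above
    "messages": [
        {"role": "user", "content": "Summarize this requirement."}  # illustrative prompt
    ],
}

print(json.dumps(payload, indent=2))
```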

Pricing (Approximate)

  • GPT-4o: $2.50 input / $10.00 output per 1M tokens
  • GPT-4o-mini: $0.15 input / $0.60 output per 1M tokens
  • text-embedding-3-small: $0.02 per 1M tokens
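
As a rough worked example, assuming the ranges above are input/output prices, a single GPT-4o call with 100,000 input tokens and 20,000 output tokens costs about:

```python
# Approximate GPT-4o prices per 1M tokens (from the list above).
INPUT_PER_M = 2.50
OUTPUT_PER_M = 10.00

input_tokens = 100_000
output_tokens = 20_000

cost = (input_tokens / 1_000_000) * INPUT_PER_M + (output_tokens / 1_000_000) * OUTPUT_PER_M
print(f"${cost:.2f}")  # → $0.45
```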

Best Use Cases

  • Complex requirement analysis requiring nuanced reasoning
  • High-quality semantic search via embeddings
  • Vision-based document analysis
  • General-purpose AI features

Anthropic (Claude)

Website: console.anthropic.com

Anthropic’s Claude models excel at long-context understanding, making them ideal for analyzing large requirement portfolios. Claude prioritizes safety and helpful, harmless responses.

Supported Models

Chat Models:

  • Claude Opus 4 (most capable)
  • Claude Sonnet 4 (balanced performance)
  • Claude 3.5 Sonnet (previous generation, excellent)
  • Claude 3 Opus
  • Claude 3 Haiku (fast, cost-effective)

Vision-Capable Models:

  • Claude 3+ models (all support vision)

Note: Anthropic does not support embedding generation. Use OpenAI, Gemini, or Ollama for embeddings if you need semantic search.

Configuration

Plaintext
Provider: Anthropic
API Key: sk-ant-api03-XXXXXXXXXXXXXXXXX
Default Chat Model: claude-sonnet-4-5-20250929
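
For reference, Anthropic's Messages API differs slightly from OpenAI's: the key goes in an `x-api-key` header, an `anthropic-version` header is required, and `max_tokens` is mandatory. A minimal sketch (key and prompt are placeholders):

```python
API_KEY = "sk-ant-api03-XXXXXXXXXXXXXXXXX"  # placeholder, not a real key

url = "https://api.anthropic.com/v1/messages"
headers = {
    "x-api-key": API_KEY,
    "anthropic-version": "2023-06-01",  # required version header
    "content-type": "application/json",
}
payload = {
    "model": "claude-sonnet-4-5-20250929",  # matches "Default Chat Model" above
    "max_tokens": 1024,  # required by the Messages API
    "messages": [{"role": "user", "content": "Hello"}],
}
```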

Pricing (Approximate)

  • Claude Opus 4: $15 input / $75 output per 1M tokens
  • Claude Sonnet 4: $3 input / $15 output per 1M tokens
  • Claude 3 Haiku: $0.25 input / $1.25 output per 1M tokens

Best Use Cases

  • Analyzing large requirement portfolios (200K token context)
  • Long-form document processing
  • Safety-critical applications
  • Nuanced business reasoning

Azure OpenAI

Website: azure.microsoft.com/products/ai-services/openai-service

Azure OpenAI provides OpenAI’s models hosted in your Azure tenant, offering enterprise-grade security, compliance certifications, and data residency guarantees.

Supported Models

Same models as OpenAI, deployed as named deployments in your Azure resource:

  • GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
  • text-embedding-3-small, text-embedding-ada-002

Configuration

Plaintext
Provider: Azure OpenAI
API Key: [Your Azure OpenAI API Key]
Endpoint URL: https://your-resource.openai.azure.com
Default Chat Model: gpt-4o (deployment name)
Default Embedding Model: text-embedding-3-small (deployment name)

Important: Azure OpenAI requires an endpoint URL pointing to your Azure OpenAI resource.
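
Unlike openai.com, Azure routes requests to a named deployment inside your resource, with the API version as a query parameter. A sketch of how the request URL is assembled (the resource name, deployment name, and API version below are placeholders; check your Azure resource for the correct values):

```python
resource = "your-resource"
deployment = "gpt-4o"        # your deployment name, not necessarily the model name
api_version = "2024-06-01"   # example API version; confirm in the Azure portal

url = (
    f"https://{resource}.openai.azure.com"
    f"/openai/deployments/{deployment}/chat/completions"
    f"?api-version={api_version}"
)
# Azure uses an "api-key" header instead of "Authorization: Bearer ...".
headers = {"api-key": "[Your Azure OpenAI API Key]"}
print(url)
```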

Enterprise Features

  • SOC 2, HIPAA, FedRAMP compliance
  • Data residency (EU, US, Asia regions)
  • Private Link and VNet integration
  • Microsoft Entra ID authentication
  • Customer-managed encryption keys

Best Use Cases

  • Regulated industries (healthcare, finance, government)
  • European organizations requiring GDPR compliance
  • Enterprises with existing Azure infrastructure
  • Organizations requiring data residency guarantees

Google Gemini

Website: ai.google.dev

Google’s Gemini models offer strong multimodal capabilities, including native image understanding and embedding generation.

Supported Models

Chat Models:

  • Gemini 2.0 Pro (recommended)
  • Gemini 2.0 Flash
  • Gemini 1.5 Pro
  • Gemini 1.5 Flash

Embedding Models:

  • gemini-embedding-001

Vision-Capable Models:

  • Gemini 1.5 and 2.0 models (all support vision)

Configuration

Plaintext
Provider: Google Gemini
API Key: [Your Google AI API Key]
Default Chat Model: gemini-2.0-pro
Default Embedding Model: gemini-embedding-001
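
For reference, Gemini's REST API addresses the model in the URL path and uses a `generateContent` action, with the key passed as a query parameter. A minimal sketch (key and prompt are placeholders):

```python
API_KEY = "YOUR_GOOGLE_AI_API_KEY"  # placeholder, not a real key
model = "gemini-2.0-pro"            # matches "Default Chat Model" above

url = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{model}:generateContent?key={API_KEY}"
)
payload = {
    "contents": [{"parts": [{"text": "Hello"}]}]
}
```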

Best Use Cases

  • Multimodal analysis (images + text)
  • Organizations already using Google Cloud
  • Applications requiring strong multilingual support

Groq

Website: groq.com

Groq provides ultra-fast inference for open-source models like Llama and Mixtral using custom hardware. Response times are significantly faster than traditional cloud providers.

Supported Models

Chat Models:

  • llama3-70b-8192 (recommended)
  • llama3-8b-8192
  • mixtral-8x7b-32768
  • gemma-7b-it
  • gemma2-9b-it

Note: Groq does not support embeddings or vision. Use another provider for semantic search.

Configuration

Plaintext
Provider: Groq
API Key: gsk_XXXXXXXXXXXXXXXXX
Default Chat Model: llama3-70b-8192
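
Groq exposes an OpenAI-compatible endpoint, so any OpenAI-style client works by pointing it at Groq's base URL. A minimal sketch (the key is a placeholder):

```python
BASE_URL = "https://api.groq.com/openai/v1"  # OpenAI-compatible base URL
url = f"{BASE_URL}/chat/completions"
headers = {"Authorization": "Bearer gsk_XXXXXXXXXXXXXXXXX"}  # placeholder key
payload = {
    "model": "llama3-70b-8192",  # matches "Default Chat Model" above
    "messages": [{"role": "user", "content": "Hello"}],
}
```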

Pricing (Approximate)

  • Llama 3 70B: $0.59 input / $0.79 output per 1M tokens
  • Llama 3 8B: $0.05 input / $0.08 output per 1M tokens

Best Use Cases

  • Real-time applications requiring fast responses
  • High-volume, simple tasks
  • Cost-effective chat without embeddings
  • Testing with open-source models

xAI (Grok)

Website: x.ai

xAI’s Grok models offer alternative reasoning approaches and real-time knowledge, and expose an OpenAI-compatible API format.

Supported Models

Chat Models:

  • grok-2-1212 (recommended)
  • grok-2
  • grok-beta

Note: xAI does not support embeddings or vision.

Configuration

Plaintext
Provider: xAI (Grok)
API Key: xai-XXXXXXXXXXXXXXXXX
Default Chat Model: grok-2-1212
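
Because the API format is OpenAI-compatible, switching between OpenAI and xAI is mostly a matter of swapping the base URL and key. A small sketch of that pattern (the helper function is illustrative, not part of Catalio):

```python
def chat_endpoint(base_url: str) -> str:
    """Build the chat completions URL for any OpenAI-compatible provider."""
    return f"{base_url.rstrip('/')}/chat/completions"

print(chat_endpoint("https://api.openai.com/v1"))  # OpenAI
print(chat_endpoint("https://api.x.ai/v1"))        # xAI (Grok)
```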

Best Use Cases

  • Alternative reasoning perspectives
  • Applications benefiting from real-time knowledge
  • Organizations wanting LLM diversity

Ollama (Self-Hosted)

Website: ollama.com

Ollama enables running open-source models locally on your own infrastructure. Perfect for maximum data privacy, air-gapped environments, or cost-sensitive high-volume usage.

Supported Models

Models depend on what you install locally:

Chat Models:

  • llama3, llama3:70b
  • mistral, mixtral
  • phi3
  • codellama
  • Many others

Embedding Models:

  • nomic-embed-text (recommended)
  • mxbai-embed-large
  • all-minilm

Vision-Capable Models:

  • llava
  • bakllava

Configuration

Plaintext
Provider: Ollama
Endpoint URL: http://localhost:11434 (or your server address)
Default Chat Model: llama3
Default Embedding Model: nomic-embed-text

Important: Ollama requires an endpoint URL. No API key is needed for local Ollama, but you may configure authentication for remote deployments.
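
For reference, Ollama serves a local REST API at the endpoint above; a chat request looks roughly like the following sketch (assumes the `llama3` model has already been pulled):

```python
url = "http://localhost:11434/api/chat"  # default local Ollama endpoint
payload = {
    "model": "llama3",  # matches "Default Chat Model" above
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,  # return one JSON response instead of a token stream
}
# Embeddings use a separate route, e.g. POST /api/embeddings with
# {"model": "nomic-embed-text", "prompt": "..."}.
```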

Infrastructure Requirements

  • CPU: 8+ cores recommended
  • RAM: 16GB minimum, 32GB+ for larger models
  • GPU: Optional but significantly improves performance
  • Storage: 10-100GB depending on models installed

Best Use Cases

  • Air-gapped or offline environments
  • Maximum data privacy (data never leaves your infrastructure)
  • High-volume usage without per-token costs
  • Testing and development without API costs

OpenRouter

Website: openrouter.ai

OpenRouter provides unified access to 100+ models from multiple providers through a single API. Pay per use with no commitments.

Supported Models

Access to all major providers through OpenRouter:

  • OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
  • Anthropic: Claude 3/4 family
  • Meta: Llama 3
  • Mistral: Mixtral, Mistral Large
  • Google: Gemini Pro
  • Many more

Embedding Models:

  • openai/text-embedding-3-small
  • openai/text-embedding-ada-002

Configuration

Plaintext
Provider: OpenRouter
API Key: sk-or-v1-XXXXXXXXXXXXXXXXX
Default Chat Model: openai/gpt-4o
Default Embedding Model: openai/text-embedding-3-small
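
OpenRouter is also OpenAI-compatible; models are addressed by provider-prefixed ids, and two optional headers attribute traffic to your app. A minimal sketch (key and referer URL are placeholders):

```python
url = "https://openrouter.ai/api/v1/chat/completions"
headers = {
    "Authorization": "Bearer sk-or-v1-XXXXXXXXXXXXXXXXX",  # placeholder key
    # Optional attribution headers supported by OpenRouter:
    "HTTP-Referer": "https://example.com",  # your app's URL (placeholder)
    "X-Title": "Catalio",
}
payload = {
    "model": "openai/gpt-4o",  # provider-prefixed model id
    "messages": [{"role": "user", "content": "Hello"}],
}
```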

Pricing

Pay-as-you-go with per-model pricing. OpenRouter adds a small margin to provider costs for the unified access layer.

Best Use Cases

  • Trying multiple models without separate accounts
  • Fallback routing when primary provider is unavailable
  • Cost optimization across providers
  • Accessing models not directly available in your region

GitHub Copilot Enterprise

Website: github.com/features/copilot

GitHub Copilot Enterprise provides access to OpenAI GPT models through GitHub’s infrastructure. Ideal for organizations already using GitHub Enterprise.

Supported Models

Chat Models:

  • gpt-4o
  • gpt-4
  • gpt-3.5-turbo

Vision-Capable Models:

  • gpt-4o

Note: GitHub Copilot does not support embeddings. Use another provider for semantic search.

Configuration

Plaintext
Provider: GitHub Copilot
API Key: [Your GitHub Copilot Token]
Default Chat Model: gpt-4o

Requirements

  • GitHub Enterprise Cloud subscription
  • GitHub Copilot Enterprise license
  • Appropriate permissions in your GitHub organization

Best Use Cases

  • Organizations already invested in GitHub Enterprise
  • Unified billing through GitHub
  • Development-focused AI features

Choosing the Right Provider

For Most Organizations: OpenAI

Start with OpenAI for the best balance of quality, features, and ease of setup. OpenAI provides:

  • Excellent chat and embedding models
  • Vision capabilities
  • Straightforward API access
  • Comprehensive documentation

For Enterprise Compliance: Azure OpenAI

Choose Azure OpenAI when you need:

  • SOC 2, HIPAA, or FedRAMP compliance
  • Data residency guarantees (EU, specific regions)
  • Integration with existing Azure infrastructure
  • Customer-managed encryption keys

For Maximum Privacy: Ollama

Choose Ollama for self-hosted deployment when:

  • Data must never leave your infrastructure
  • You operate in air-gapped environments
  • You need to eliminate per-token costs
  • You want full control over the AI stack

For Long Context: Anthropic

Choose Anthropic Claude when:

  • You need to analyze large documents or portfolios (200K tokens)
  • Safety and alignment are priorities
  • You prefer alternative reasoning approaches

For Speed: Groq

Choose Groq when:

  • Response time is critical
  • You need cost-effective high-volume processing
  • Embedding support is not required

For Flexibility: OpenRouter

Choose OpenRouter when:

  • You want to try multiple providers easily
  • You need fallback options across providers
  • You want unified billing across multiple models

Multiple Provider Strategy

Many organizations benefit from using multiple providers:

| Feature | Primary Provider | Rationale |
|---|---|---|
| Chat/Analysis | OpenAI GPT-4o | Best quality for complex reasoning |
| Embeddings | OpenAI | Industry-leading embeddings |
| Long documents | Anthropic Claude | 200K token context window |
| Cost-sensitive tasks | Groq | Fast, affordable for simple tasks |
| Compliance | Azure OpenAI | When required by regulations |

Configure multiple providers in Catalio and assign each to specific features based on your needs.
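
Conceptually, a multi-provider setup is a feature-to-provider routing map. The sketch below is illustrative only; the feature keys and provider names are hypothetical, not Catalio configuration syntax:

```python
# Illustrative routing table mirroring the strategy above.
ROUTING = {
    "chat": "openai",               # complex reasoning
    "embeddings": "openai",         # semantic search
    "long_documents": "anthropic",  # 200K token context
    "bulk_tasks": "groq",           # fast, low-cost
    "compliance": "azure_openai",   # regulated workloads
}

def provider_for(feature: str) -> str:
    """Return the provider assigned to a feature, falling back to chat's."""
    return ROUTING.get(feature, ROUTING["chat"])

print(provider_for("long_documents"))  # → anthropic
print(provider_for("vision"))          # no entry, falls back → openai
```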


Last Updated: December 2025