# AI Coding Assistants
This section covers how to set up and configure AI coding assistants to work with LLM Gateway.
## Approved Assistants
The following AI coding assistants have been tested and approved for use with LLMGW:
| Assistant | Description | Guide |
|---|---|---|
| Claude Code | Anthropic’s official CLI for Claude | Setup Guide |
| Claude Code VS Code Extension | Claude Code extension for VS Code (Beta) | Setup Guide |
| Cline | VS Code extension for AI-assisted coding | Setup Guide |
## Prerequisites
Before setting up any AI coding assistant, ensure you have:
- **LLMGW API Token** - Request your personal user-based token from your PM or administrator. Note that user-based tokens are different from API keys in LLMGW.
- **LLMGW Endpoint** - The base URL depends on your assistant:
  - Claude Code (Bedrock): `https://<llmgw-deployment-url>/aws-bedrock`
  - Cline (OpenAI-compatible): `https://<llmgw-deployment-url>/openai`
- **Project Assignment** - Your token must be associated with a project that has appropriate spend limits configured.
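As a sketch of how these prerequisites come together for Claude Code, the environment variables below follow Claude Code's documented Bedrock configuration; verify the exact names against your installed version, and treat the token and URL values as placeholders. (Cline is configured through its VS Code settings UI rather than environment variables.)

```shell
# Claude Code via the LLMGW Bedrock endpoint (variable names per Claude
# Code's Bedrock documentation; check them against your installed version)
export CLAUDE_CODE_USE_BEDROCK=1
export ANTHROPIC_BEDROCK_BASE_URL="https://<llmgw-deployment-url>/aws-bedrock"
# LLMGW authenticates the request itself, so AWS credential signing is skipped
export CLAUDE_CODE_SKIP_BEDROCK_AUTH=1
export ANTHROPIC_AUTH_TOKEN="<your-llmgw-user-token>"
```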
## How It Works
AI coding assistants connect to LLMGW using the OpenAI-compatible API endpoint (`/openai`) or the AWS Bedrock endpoint (`/aws-bedrock`). This allows assistants that support these API formats to work seamlessly with LLMGW.
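For instance, a raw request to the OpenAI-compatible endpoint looks like any other OpenAI chat-completions call. The `/v1/chat/completions` path suffix and the `gpt-4o` model name below follow the OpenAI convention and are assumptions; substitute a model group your project actually has access to.

```shell
# Minimal chat-completions request through LLMGW's OpenAI-compatible endpoint
# (URL, token, and model name are placeholders)
curl -s "https://<llmgw-deployment-url>/openai/v1/chat/completions" \
  -H "Authorization: Bearer <your-llmgw-user-token>" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```

If this call succeeds, any assistant that speaks the OpenAI API format should work with the same base URL and token.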
### Endpoint Model Access
Different endpoints provide access to different model providers:
- `/openai` - OpenAI-compatible endpoint providing access to:
  - Azure OpenAI models (GPT-4.1 Turbo, GPT-4o, GPT-5 Nano, etc.)
  - Azure DeepSeek models
- `/aws-bedrock` - AWS Bedrock endpoint providing access to:
  - Anthropic Claude models (Claude Sonnet 4.5, Claude Sonnet 4, Claude Haiku 4.5, etc.)
  - Other AWS Bedrock-supported models
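To check which models your token can reach on the OpenAI-compatible endpoint, the standard OpenAI model-listing route can be queried directly. The `/v1/models` path is the OpenAI convention, not confirmed LLMGW behavior; URL and token are placeholders.

```shell
# List models visible to your token (path per the OpenAI API convention)
curl -s "https://<llmgw-deployment-url>/openai/v1/models" \
  -H "Authorization: Bearer <your-llmgw-user-token>"
```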
## What LLMGW Provides
LLMGW handles:
- **Authentication** - Validates your API token
- **Spend Tracking** - Monitors usage against your budget limits
- **Load Balancing** - Routes requests to optimal model instances
- **Model Selection** - Allows you to use different models via model groups
## Monitoring Your Usage
For details on how to check your current spend and remaining budget limits from the command line, see the internal MS Teams channel “Access token-API requests”.
## Available Models
For details on available models, their pricing, and recommended configurations, see the internal MS Teams channel “Access token-API requests”.