To use Google Cloud’s Vertex AI as the inference provider, set inferenceProvider to vertex and supply the project, region, and credentials described below.
Prerequisites
- A Google Cloud project with the Vertex AI API enabled
- Claude models enabled for that project in the Vertex AI Model Garden
- Credentials for calling Vertex AI (see Authentication)
Configuration keys
| Setting | Required | Description |
|---|---|---|
| GCP project ID (inferenceVertexProjectId) | Yes | Google Cloud project ID. |
| GCP region (inferenceVertexRegion) | Yes | Google Cloud region for the Vertex AI endpoint, for example us-east5 or europe-west4. On supported builds, global is also accepted. |
| GCP credentials file path (inferenceVertexCredentialsFile) | No | Absolute path to a service-account key JSON or Application Default Credentials file. No ~ or environment-variable expansion. If set, this file is used and Google sign-in is disabled. |
| Vertex AI base URL (inferenceVertexBaseUrl) | No | Override the public regional endpoint, for example with a Private Service Connect address. Must be https://. |
| Vertex OAuth client ID (inferenceVertexOAuthClientId) | No | Client ID of a Desktop-app OAuth client in your Google Cloud project. Enables per-user Google sign-in. |
| Vertex OAuth client secret (inferenceVertexOAuthClientSecret) | No | Client secret paired with the client ID above. |
| Vertex OAuth scopes (inferenceVertexOAuthScopes) | No | Space-separated OAuth scopes for Google sign-in. Defaults to openid email https://www.googleapis.com/auth/cloud-platform. |
You must also set inferenceModels to a list of Vertex publisher model IDs, for example claude-sonnet-4@20250514. See the Configuration reference.
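Before distributing a configuration, it can help to check it against the constraints in the table. The following is a minimal sketch, not part of the product: check_vertex_config is a hypothetical helper, and it assumes the managed configuration has already been parsed into a Python dict with inferenceModels as a list. The check for environment variables is a simple heuristic.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlparse

# Keys the documentation marks as required, plus inferenceModels.
REQUIRED_KEYS = ("inferenceVertexProjectId", "inferenceVertexRegion", "inferenceModels")


def check_vertex_config(config: dict) -> list[str]:
    """Return a list of problems found in a parsed Vertex configuration (hypothetical helper)."""
    problems = []

    if config.get("inferenceProvider") != "vertex":
        problems.append("inferenceProvider must be 'vertex'")

    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"{key} is required")

    # The path must be absolute; ~ and environment variables are not expanded.
    path = config.get("inferenceVertexCredentialsFile")
    if path is not None:
        is_absolute = (
            PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute()
        )
        if not is_absolute or "$" in path:
            problems.append(
                "inferenceVertexCredentialsFile must be an absolute path "
                "without ~ or environment variables"
            )

    # A base URL override must use https://.
    base_url = config.get("inferenceVertexBaseUrl")
    if base_url is not None and urlparse(base_url).scheme != "https":
        problems.append("inferenceVertexBaseUrl must use https://")

    return problems
```

An empty list means none of these checks failed. It does not confirm that the project, region, or model IDs exist.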
Authentication
Vertex AI uses Google Cloud Application Default Credentials, which the app passes to the google-auth-library inside the session sandbox. Choose one of the following.
Credentials file
Distribute a credentials JSON file to each device and point inferenceVertexCredentialsFile at its absolute path. The file can be any format that Application Default Credentials accepts:
- A service-account key JSON file. Grant the service account the Vertex AI User role (roles/aiplatform.user) on the project.
- An authorized_user file produced by gcloud auth application-default login.
- An external_account Workload Identity Federation file. Use a credential_source.file or credential_source.url source; credential_source.executable is not supported because the sandbox does not set GOOGLE_EXTERNAL_ACCOUNT_ALLOW_EXECUTABLES.
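The three formats listed above are distinguished by the top-level type field of the JSON file, so a file can be checked before it is distributed. This is a rough sketch under stated assumptions: check_credentials_json is a hypothetical helper, it inspects only the type field and the credential_source keys, and it does not verify that the credentials actually work.

```python
import json

# Values of the "type" field for the three formats listed above.
ACCEPTED_TYPES = {"service_account", "authorized_user", "external_account"}


def check_credentials_json(text: str) -> str:
    """Return the credential type, or raise ValueError if the file looks unusable.

    Hypothetical helper for illustration; it is not part of the app.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("credentials file must contain a JSON object")

    cred_type = data.get("type")
    if cred_type not in ACCEPTED_TYPES:
        raise ValueError(f"unexpected credential type: {cred_type!r}")

    if cred_type == "external_account":
        source = data.get("credential_source") or {}
        # Executable-sourced credentials are not supported in the sandbox.
        if "executable" in source:
            raise ValueError("credential_source.executable is not supported")

    return cred_type
```

For example, calling check_credentials_json(open(path).read()) on a service-account key returns the string service_account, while a Workload Identity Federation file with an executable source raises ValueError.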
If inferenceVertexCredentialsFile is not set and Google sign-in is not configured, the library falls back to the standard Application Default Credentials search path on the device (~/.config/gcloud/application_default_credentials.json, then the environment’s metadata server).
Google sign-in
Instead of a shared credentials file, each user can sign in with their own Google Workspace identity. You create a Desktop-app OAuth client in your Google Cloud project and distribute its client ID and secret via managed configuration; the app then shows a Sign in with Google page and stores the user’s refresh token encrypted on the device.
See Sign in with Google for Google Cloud’s Vertex AI for the full setup.
inferenceCredentialHelper is not invoked when inferenceProvider is vertex, because Vertex authentication is file-based rather than token-based. Use one of the two options above.
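The precedence between the authentication options described in this section can be summarized in a short sketch. The function name and the returned labels are hypothetical, and the sketch assumes that Google sign-in counts as configured whenever inferenceVertexOAuthClientId is set; the app's actual check may also require the client secret.

```python
def resolve_vertex_auth(config: dict) -> str:
    """Return which credential source would apply (illustrative labels only)."""
    # A credentials file takes priority and disables Google sign-in.
    if config.get("inferenceVertexCredentialsFile"):
        return "credentials_file"
    # Assumption: a configured OAuth client ID means Google sign-in is enabled.
    if config.get("inferenceVertexOAuthClientId"):
        return "google_sign_in"
    # Otherwise the library uses the standard Application Default Credentials search path.
    return "adc_search_path"
```

With both a credentials file and an OAuth client ID present, this returns credentials_file, matching the rule that setting the file disables Google sign-in.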
Example
<key>inferenceProvider</key>
<string>vertex</string>
<key>inferenceVertexProjectId</key>
<string>your-gcp-project</string>
<key>inferenceVertexRegion</key>
<string>us-east5</string>
<key>inferenceVertexCredentialsFile</key>
<string>/etc/claude/vertex-sa.json</string>
<key>inferenceModels</key>
<string>["claude-sonnet-4@20250514"]</string>