Overview
The AI API provides endpoints for chat completions, image generation, and text embeddings, using AI models available through the OpenRouter integration.
For authenticated requests:
Authorization: Bearer your-jwt-token-or-anon-key
Content-Type: application/json
For admin endpoints:
Authorization: Bearer admin-jwt-token-Or-API-Key
Content-Type: application/json
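As a minimal Python sketch of the two header sets above, the helper below builds a request object with the `Authorization` and `Content-Type` headers attached but does not send it. The name `build_request` and the `BASE_URL` constant are illustrative assumptions, not part of the API; the host is the placeholder used in the examples.

```python
import json
import urllib.request

BASE_URL = "https://your-app.insforge.app"  # placeholder host from the examples


def build_request(path, token, body=None):
    """Build (but do not send) a request carrying the documented headers."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data)
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Content-Type", "application/json")
    return request


# urllib issues a GET when there is no body and a POST when a body is attached.
models_request = build_request("/api/ai/models", "admin-jwt-token-Or-API-Key")
print(models_request.get_method(), models_request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; any other HTTP client works equally well as long as it sets the same two headers.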
List Available Models
Get all available AI models for text and image generation. Requires admin authentication.
GET /api/ai/models
Example
curl "https://your-app.insforge.app/api/ai/models" \
-H "Authorization: Bearer admin-jwt-token-Or-API-Key"
Response
{
"text": [
{
"provider": "openrouter",
"configured": true,
"models": [
{
"id": "openai/gpt-4",
"name": "GPT-4",
"description": "OpenAI's most capable model",
"context_length": 8192,
"max_completion_tokens": 4096,
"pricing": {
"prompt": "0.00003",
"completion": "0.00006"
}
},
{
"id": "anthropic/claude-3.5-haiku",
"name": "Claude 3.5 Haiku",
"description": "Anthropic's fast and efficient model",
"context_length": 200000,
"max_completion_tokens": 4096
}
]
}
],
"image": [
{
"provider": "openrouter",
"configured": true,
"models": [
{
"id": "openai/dall-e-3",
"name": "DALL-E 3",
"description": "OpenAI's image generation model"
},
{
"id": "google/gemini-2.5-flash-image-preview",
"name": "Gemini 2.5 Flash Image",
"description": "Google's multimodal image generation"
}
]
}
]
}
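The response groups models by modality and then by provider, so a client usually flattens it before use. The Python sketch below lists model IDs for one modality and estimates a request cost from the optional `pricing` block. It assumes the pricing strings are per-token prices, which the sample does not state explicitly; `model_ids` and `estimate_cost` are hypothetical helper names.

```python
from decimal import Decimal

# Trimmed copy of the sample response above.
sample = {
    "text": [{"provider": "openrouter", "configured": True, "models": [
        {"id": "openai/gpt-4",
         "pricing": {"prompt": "0.00003", "completion": "0.00006"}},
        {"id": "anthropic/claude-3.5-haiku"},
    ]}],
    "image": [{"provider": "openrouter", "configured": True, "models": [
        {"id": "openai/dall-e-3"},
        {"id": "google/gemini-2.5-flash-image-preview"},
    ]}],
}


def model_ids(response, modality):
    """Collect model IDs for one modality, from configured providers only."""
    return [
        model["id"]
        for provider in response.get(modality, [])
        if provider.get("configured")
        for model in provider.get("models", [])
    ]


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Return an estimated cost, or None when the model has no pricing block.

    Assumption: pricing values are per-token prices encoded as strings.
    """
    pricing = model.get("pricing")
    if pricing is None:
        return None
    return (Decimal(pricing["prompt"]) * prompt_tokens
            + Decimal(pricing["completion"]) * completion_tokens)


print(model_ids(sample, "image"))
```

`Decimal` is used instead of `float` so that the string prices are not rounded before multiplication.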
Chat Completion
Generate an AI chat completion.
POST /api/ai/chat/completion
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier (e.g., openai/gpt-4) |
| messages | array | Yes | Array of chat messages |
| stream | boolean | No | Enable streaming response (default: false) |
| temperature | number | No | Sampling temperature (0-2) |
| maxTokens | integer | No | Maximum tokens to generate |
| topP | number | No | Nucleus sampling parameter (0-1) |
| systemPrompt | string | No | System prompt to guide behavior |
Each object in the messages array has the following shape:
{
"role": "user|assistant|system",
"content": "message text"
}
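The field table and message shape above translate into a small amount of client-side validation. The following Python sketch checks the required fields, the three roles, and the documented numeric ranges before building the JSON body. Treating both ranges as inclusive is an assumption, and `build_chat_payload` is a hypothetical helper, not something the API provides.

```python
ALLOWED_ROLES = {"user", "assistant", "system"}


def build_chat_payload(model, messages, *, stream=False, temperature=None,
                       max_tokens=None, top_p=None, system_prompt=None):
    """Validate arguments against the request table and return the JSON body."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    # Assumption: both documented ranges are inclusive at each end.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("topP must be between 0 and 1")

    payload = {"model": model, "messages": list(messages)}
    optional = {
        "temperature": temperature,
        "maxTokens": max_tokens,
        "topP": top_p,
        "systemPrompt": system_prompt,
    }
    # Send only the optional fields the caller actually set.
    payload.update({key: value for key, value in optional.items()
                    if value is not None})
    if stream:
        payload["stream"] = True
    return payload
```

Failing early on the client avoids a round trip that would otherwise end in a validation error from the server.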
Example (Basic)
curl -X POST "https://your-app.insforge.app/api/ai/chat/completion" \
-H "Authorization: Bearer your-jwt-token-or-anon-key" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3.5-haiku",
"messages": [
{"role": "user", "content": "What is the capital of France?"}
]
}'
Response
{
"success": true,
"content": "The capital of France is Paris.",
"metadata": {
"model": "anthropic/claude-3.5-haiku",
"usage": {
"promptTokens": 15,
"completionTokens": 8,
"totalTokens": 23
}
}
}
Example (With Parameters)
curl -X POST "https://your-app.insforge.app/api/ai/chat/completion" \
-H "Authorization: Bearer your-jwt-token-or-anon-key" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain quantum computing in simple terms."}
],
"temperature": 0.7,
"maxTokens": 1000
}'
Example (Multi-turn Conversation)
curl -X POST "https://your-app.insforge.app/api/ai/chat/completion" \
-H "Authorization: Bearer your-jwt-token-or-anon-key" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3.5-haiku",
"messages": [
{"role": "user", "content": "What is Kotlin?"},
{"role": "assistant", "content": "Kotlin is a modern programming language..."},
{"role": "user", "content": "What are its main features?"}
]
}'
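In the multi-turn example the client resends the whole history on every request, including earlier assistant replies. One way to manage that, sketched in Python under the assumption that replies have the `success` and `content` fields shown in the basic response, is a small history holder. The `Conversation` class is illustrative only.

```python
class Conversation:
    """Keeps the running message list for one multi-turn chat."""

    def __init__(self, model):
        self.model = model
        self.messages = []

    def next_request(self, user_text):
        """Append the user's turn and return the body for the next request."""
        self.messages.append({"role": "user", "content": user_text})
        return {"model": self.model, "messages": list(self.messages)}

    def record_reply(self, response):
        """Store the assistant's answer so the following turn includes it."""
        if not response.get("success"):
            raise ValueError("response did not report success")
        self.messages.append(
            {"role": "assistant", "content": response["content"]}
        )


chat = Conversation("anthropic/claude-3.5-haiku")
first_body = chat.next_request("What is Kotlin?")
# A stand-in for the parsed JSON response of the first request.
chat.record_reply({"success": True,
                   "content": "Kotlin is a modern programming language..."})
second_body = chat.next_request("What are its main features?")
print(len(second_body["messages"]))
```

`second_body` then carries the same three messages as the curl example above.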
Streaming Chat Completion
Set stream to true to receive the response incrementally as Server-Sent Events (SSE).
POST /api/ai/chat/completion
Request
curl -X POST "https://your-app.insforge.app/api/ai/chat/completion" \
-H "Authorization: Bearer your-jwt-token-or-anon-key" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3.5-haiku",
"messages": [
{"role": "user", "content": "Tell me a story about a robot."}
],
"stream": true
}'
Response
Server-Sent Events stream:
data: {"chunk": "Once"}
data: {"chunk": " upon"}
data: {"chunk": " a"}
data: {"chunk": " time"}
data: {"chunk": "..."}
data: {"tokenUsage": {"promptTokens": 15, "completionTokens": 50, "totalTokens": 65}}
data: {"done": true}
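A client can reassemble the streamed text by reading the `data:` lines in order. The Python sketch below concatenates `chunk` values, keeps the `tokenUsage` object when it arrives, and stops at `done`. It assumes every event is a single `data:` line containing one JSON object, as in the sample stream; `consume_stream` is a hypothetical name.

```python
import json


def consume_stream(lines):
    """Return (full_text, token_usage) from an iterable of SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        # HTTP clients often yield bytes; the sample above is text.
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        event = json.loads(line[len("data:"):].strip())
        if "chunk" in event:
            text_parts.append(event["chunk"])
        if "tokenUsage" in event:
            usage = event["tokenUsage"]
        if event.get("done"):
            break
    return "".join(text_parts), usage


sample_lines = [
    'data: {"chunk": "Once"}',
    'data: {"chunk": " upon"}',
    'data: {"chunk": " a"}',
    'data: {"chunk": " time"}',
    'data: {"chunk": "..."}',
    'data: {"tokenUsage": {"promptTokens": 15, "completionTokens": 50, "totalTokens": 65}}',
    'data: {"done": true}',
]
text, usage = consume_stream(sample_lines)
print(text)
```

Because the function accepts any iterable of lines, it can be fed directly from a streaming HTTP response body.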
Image Generation
Generate images using AI models.
POST /api/ai/image/generation
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Image model identifier (e.g., openai/dall-e-3) |
| prompt | string | Yes | Text prompt describing the image |
Example
curl -X POST "https://your-app.insforge.app/api/ai/image/generation" \
-H "Authorization: Bearer your-jwt-token-or-anon-key" \
-H "Content-Type: application/json" \
-d '{
"model": "google/gemini-2.5-flash-image-preview",
"prompt": "A serene mountain landscape at sunset with a lake reflection"
}'
Response
{
"model": "google/gemini-2.5-flash-image-preview",
"images": [
{
"type": "image_url",
"image_url": {
"url": "..."
}
}
],
"text": null,
"count": 1,
"metadata": {
"model": "google/gemini-2.5-flash-image-preview",
"provider": "openrouter"
},
"nextActions": "Images have been generated successfully. Use the returned URLs or base64 data to access them."
}
Image URLs can be either:
- Direct URLs (e.g., https://...)
- Base64 data URLs (e.g., data:image/png;base64,...)
Check the URL format to determine how to handle the image.
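The format check described above can be sketched in Python as follows. The function decodes base64 data URLs in place and flags direct URLs for a separate download. The name `read_image_url` and the `"inline"` and `"remote"` labels are illustrative choices, not API values.

```python
import base64


def read_image_url(url):
    """Return ("inline", mime_type, image_bytes) or ("remote", None, None)."""
    if url.startswith("data:"):
        header, separator, encoded = url.partition(",")
        if not separator or not header.endswith(";base64"):
            raise ValueError("expected a base64 data URL")
        # header looks like "data:image/png;base64"
        mime_type = header[len("data:"):-len(";base64")]
        return "inline", mime_type, base64.b64decode(encoded)
    if url.startswith(("https://", "http://")):
        # The caller still has to download the image from this URL.
        return "remote", None, None
    raise ValueError("unrecognized image URL format")


kind, mime_type, _ = read_image_url("data:image/png;base64,iVBORw0KGgo=")
print(kind, mime_type)
```

Apply it to each `image_url.url` value in the `images` array of the response.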
Generate Embeddings
Generate vector embeddings for text input using AI models.
POST /api/ai/embeddings
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Embedding model identifier (e.g., openai/text-embedding-3-small) |
| input | string \| string[] | Yes | Single text or array of texts to embed |
| encoding_format | string | No | Output format: float (default) or base64 |
| dimensions | integer | No | Number of dimensions for output embeddings (model-dependent) |
Example
curl -X POST "https://your-app.insforge.app/api/ai/embeddings" \
-H "Authorization: Bearer your-jwt-token-or-anon-key" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/text-embedding-3-small",
"input": "Hello world"
}'
Response
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.0023, -0.0142, 0.0234, 0.0156, ...],
"index": 0
}
],
"metadata": {
"model": "openai/text-embedding-3-small",
"usage": {
"promptTokens": 2,
"totalTokens": 2
}
}
}
Response (Multiple Texts)
When input is an array of texts, data contains one embedding object per text:
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.0023, -0.0142, ...],
"index": 0
},
{
"object": "embedding",
"embedding": [0.0045, -0.0089, ...],
"index": 1
},
{
"object": "embedding",
"embedding": [0.0012, -0.0234, ...],
"index": 2
}
],
"metadata": {
"model": "openai/text-embedding-3-small",
"usage": {
"promptTokens": 8,
"totalTokens": 8
}
}
}
When using encoding_format: "base64", the embedding field will be a base64-encoded string instead of an array of numbers. This can reduce response size for large embeddings.
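Turning a base64 embedding back into numbers depends on the binary layout, which this page does not specify. The Python sketch below assumes packed little-endian 32-bit floats, the layout used by OpenAI-compatible embedding APIs; verify this against a `float` response from the same model before relying on it. `decode_embedding` is a hypothetical helper.

```python
import base64
import struct


def decode_embedding(encoded):
    """Decode a base64 embedding string into a list of floats.

    Assumption: the payload is packed little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that float32 represents exactly.
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.25, 2.0)).decode("ascii")
print(decode_embedding(encoded))
```

A quick sanity check is to compare the decoded length with the `dimensions` value requested, or with the length of the same embedding fetched in `float` format.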
Get Remaining Credits
Get remaining credits for the current API key from OpenRouter. Requires admin authentication.
GET /api/ai/credits
Example
curl "https://your-app.insforge.app/api/ai/credits" \
-H "Authorization: Bearer admin-jwt-token-Or-API-Key"
Response
{
"credits": 50.25,
"usage": 149.75
}
Admin Endpoints
List AI Configurations
GET /api/ai/configurations
Create AI Configuration
POST /api/ai/configurations
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| inputModality | array | Yes | Input modalities (text, image) |
| outputModality | array | Yes | Output modalities (text, image) |
| provider | string | Yes | Provider name (e.g., openrouter) |
| modelId | string | Yes | Model identifier |
| systemPrompt | string | No | Default system prompt |
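A configuration body built from the table above might look like the following Python sketch. It assumes `text` and `image` are the only accepted modality values, which the table suggests but does not state outright; `build_configuration` is a hypothetical helper.

```python
MODALITIES = {"text", "image"}  # assumption: the only accepted values


def build_configuration(input_modality, output_modality, provider, model_id,
                        system_prompt=None):
    """Return the JSON body for POST /api/ai/configurations."""
    for name, values in (("inputModality", input_modality),
                         ("outputModality", output_modality)):
        if not values:
            raise ValueError(f"{name} is required")
        unknown = set(values) - MODALITIES
        if unknown:
            raise ValueError(f"{name} has unsupported values: {sorted(unknown)}")
    if not provider:
        raise ValueError("provider is required")
    if not model_id:
        raise ValueError("modelId is required")

    body = {
        "inputModality": list(input_modality),
        "outputModality": list(output_modality),
        "provider": provider,
        "modelId": model_id,
    }
    if system_prompt is not None:
        body["systemPrompt"] = system_prompt
    return body


config = build_configuration(["text"], ["text"], "openrouter",
                             "anthropic/claude-3.5-haiku")
print(config)
```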
Update AI Configuration
PATCH /api/ai/configurations/{id}
Delete AI Configuration
DELETE /api/ai/configurations/{id}
Usage Statistics
Get Usage Summary
GET /api/ai/usage/summary?startDate=2024-01-01&endDate=2024-01-31
Response
{
"totalRequests": 1500,
"totalTokens": 250000,
"totalCost": 12.50,
"byModel": {
"openai/gpt-4": {
"requests": 500,
"tokens": 100000,
"cost": 8.00
},
"anthropic/claude-3.5-haiku": {
"requests": 1000,
"tokens": 150000,
"cost": 4.50
}
}
}
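The per-model figures in the summary can be turned into comparable ratios. As a Python sketch, the helper below computes each model's share of total cost and its cost per 1,000 tokens from the response shape shown above; the output keys `costShare` and `costPer1kTokens` are made up for this example.

```python
def cost_breakdown(summary):
    """Map each model ID to its cost share and cost per 1,000 tokens."""
    total_cost = summary["totalCost"]
    breakdown = {}
    for model_id, stats in summary["byModel"].items():
        share = stats["cost"] / total_cost if total_cost else 0.0
        per_1k = stats["cost"] / stats["tokens"] * 1000 if stats["tokens"] else 0.0
        breakdown[model_id] = {"costShare": share, "costPer1kTokens": per_1k}
    return breakdown


summary = {
    "totalRequests": 1500,
    "totalTokens": 250000,
    "totalCost": 12.50,
    "byModel": {
        "openai/gpt-4": {"requests": 500, "tokens": 100000, "cost": 8.00},
        "anthropic/claude-3.5-haiku": {"requests": 1000, "tokens": 150000, "cost": 4.50},
    },
}
for model_id, row in cost_breakdown(summary).items():
    print(model_id, round(row["costShare"], 2), round(row["costPer1kTokens"], 3))
```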
Get Usage Records
GET /api/ai/usage?startDate=2024-01-01&endDate=2024-01-31&limit=50&offset=0
Response
[
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"configId": "456e4567-e89b-12d3-a456-426614174000",
"modelId": "openai/gpt-4",
"promptTokens": 100,
"completionTokens": 50,
"totalTokens": 150,
"cost": 0.0075,
"createdAt": "2024-01-15T10:30:00Z"
}
]
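The `limit` and `offset` parameters imply page-by-page retrieval. The Python sketch below builds the query string and walks through pages until one comes back shorter than `limit`. Using a short page as the end-of-data signal is an assumption, since the response carries no total count; `fetch_page` stands in for whatever function performs the HTTP call.

```python
from urllib.parse import urlencode


def usage_url(base_url, start_date, end_date, limit=50, offset=0):
    """Build the usage-records URL with the documented query parameters."""
    query = urlencode({
        "startDate": start_date,
        "endDate": end_date,
        "limit": limit,
        "offset": offset,
    })
    return f"{base_url}/api/ai/usage?{query}"


def iter_usage_records(fetch_page, limit=50):
    """Yield records page by page; fetch_page(limit, offset) returns a list.

    Assumption: a page shorter than `limit` means there are no more records.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit


# Stand-in data source so the sketch runs without a server.
fake_records = [{"id": str(i)} for i in range(120)]
records = list(iter_usage_records(
    lambda limit, offset: fake_records[offset:offset + limit]
))
print(len(records))
```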
Error Responses
Model Not Found (400)
{
"error": "MODEL_NOT_FOUND",
"message": "Model 'invalid-model' is not available",
"statusCode": 400
}
Missing Required Field (400)
{
"error": "VALIDATION_ERROR",
"message": "model is required",
"statusCode": 400
}
Rate Limit Exceeded (429)
{
"error": "RATE_LIMIT_EXCEEDED",
"message": "Too many requests. Please try again later.",
"statusCode": 429
}
Provider Error (500)
{
"error": "PROVIDER_ERROR",
"message": "Failed to get response from AI provider",
"details": "OpenRouter API returned error",
"statusCode": 500
}
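The error bodies above share the `error`, `message`, and `statusCode` fields, which makes a single handling path possible. The Python sketch below retries with exponential backoff and raises a typed exception otherwise. Treating 429 and 500 as retryable and 400 as final is a design assumption, not documented behavior, and `send` is a hypothetical callable returning `(http_status, parsed_body)`.

```python
import time

RETRYABLE_STATUS = {429, 500}  # assumption: worth retrying after a delay


class AIRequestError(Exception):
    """Raised with the fields of a documented error body."""

    def __init__(self, body):
        super().__init__(f"{body.get('error')}: {body.get('message')}")
        self.code = body.get("error")
        self.status = body.get("statusCode")


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying only retryable statuses."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            raise AIRequestError(body)
        # Exponential backoff: base_delay, then 2x, then 4x, and so on.
        sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter exists so that tests can inject a recorder instead of actually waiting.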
Popular Models
Text Models
| Model ID | Provider | Description |
|---|---|---|
| openai/gpt-4 | OpenAI | Most capable GPT model |
| openai/gpt-4-turbo | OpenAI | Faster GPT-4 variant |
| anthropic/claude-3.5-haiku | Anthropic | Fast and efficient |
| anthropic/claude-3-opus | Anthropic | Most capable Claude model |
| google/gemini-pro | Google | Google’s multimodal model |
Image Models
| Model ID | Provider | Description |
|---|---|---|
| openai/dall-e-3 | OpenAI | High-quality image generation |
| google/gemini-2.5-flash-image-preview | Google | Multimodal image generation |
Embedding Models
| Model ID | Provider | Description |
|---|---|---|
| openai/text-embedding-3-small | OpenAI | Fast, efficient embedding model |
| openai/text-embedding-3-large | OpenAI | Higher quality embeddings |
| google/gemini-embedding-001 | Google | Google’s embedding model |