# Documentation
EdgeMind is a client-side, edge-deployable AI runtime and API platform. This documentation will help you get started with integrating AI capabilities into your applications.
- 🔌 **API First**: OpenAI-compatible REST API with streaming support
- 🌐 **Edge Native**: Deploy to Cloudflare Workers for global low latency
- 📦 **Multi-Provider**: OpenAI, Claude, Ollama, and Workers AI support
## Quick Start
### 1. Install the SDK

```bash
npm install @edgemind/js
```
### 2. Initialize the client

```js
import { EdgeMind } from '@edgemind/js';

const client = new EdgeMind({
  apiKey: process.env.EDGEMIND_API_KEY
});
```

### 3. Make your first request
```js
const response = await client.chat.completions({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});

console.log(response.choices[0].message.content);
```

## Chat Completions API
The Chat Completions endpoint follows the OpenAI API format for easy migration.
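Because the request shape matches OpenAI's, a request body can be assembled with a small helper before POSTing it to the endpoint below. This is an illustrative sketch: `buildChatRequest` is not part of the EdgeMind SDK, and its defaults simply mirror the documented ones.

```javascript
// Illustrative helper (not an EdgeMind SDK function): build an
// OpenAI-compatible chat request body with the documented defaults.
function buildChatRequest(model, messages, options = {}) {
  if (!model) throw new Error('model is required');
  if (!Array.isArray(messages) || messages.length === 0) {
    throw new Error('messages must be a non-empty array');
  }
  return {
    model,
    messages,
    temperature: options.temperature ?? 0.7, // documented default
    max_tokens: options.max_tokens,          // omitted when undefined
    stream: options.stream ?? false
  };
}

const body = buildChatRequest('gpt-4o', [
  { role: 'user', content: 'Hello!' }
]);
```

The resulting object can be serialized with `JSON.stringify` and sent as the body of a POST to `/api/v1/chat/completions`.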
### Endpoint

```
POST /api/v1/chat/completions
```

### Request Body
```jsonc
{
  "model": "gpt-4o",      // Required: Model ID
  "messages": [           // Required: Array of messages
    { "role": "system", "content": "You are helpful." },
    { "role": "user", "content": "Hello!" }
  ],
  "temperature": 0.7,     // Optional: 0-2, default 0.7
  "max_tokens": 1024,     // Optional: Max tokens in response
  "stream": false         // Optional: Enable streaming
}
```

### Response
```json
{
  "id": "chatcmpl_xxx",
  "object": "chat.completion",
  "created": 1714032456,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 15,
    "total_tokens": 35
  }
}
```

## Embeddings API
Generate embeddings for text search and similarity tasks.
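Embedding vectors are typically compared with cosine similarity. The sketch below is standard vector math, independent of the EdgeMind API; it assumes you have already extracted two embedding arrays from the response.

```javascript
// Cosine similarity between two equal-length embedding vectors:
// dot(a, b) / (|a| * |b|). Returns 1 for identical directions,
// 0 for orthogonal vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const score = cosineSimilarity([1, 0], [1, 0]); // identical direction: 1
```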
### Endpoint

```
POST /api/v1/embeddings
```

### Request Body
```jsonc
{
  "model": "text-embedding-3-small",  // Optional: Embedding model
  "input": "The quick brown fox",     // Required: Text to embed
  "encoding_format": "float"          // Optional: "float" or "base64"
}
```

## Models API
List and query available AI models.
### Endpoint

```
GET /api/v1/models
```

### Query Parameters
- `provider`: Filter by provider (`openai`, `anthropic`, `workers-ai`, `ollama`)
- `mode`: Filter by capability (`chat`, `embeddings`, `images`, `audio`)
- `free`: Show only free models
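These parameters can be combined into a request URL with the standard `URLSearchParams` API (built into Node.js and browsers); for example:

```javascript
// Build a filtered model-list URL from the documented query parameters.
const params = new URLSearchParams({ provider: 'openai', mode: 'chat' });
const url = `/api/v1/models?${params.toString()}`;
// url is "/api/v1/models?provider=openai&mode=chat"
```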
## RAG Query API
Retrieval Augmented Generation for contextual answers.
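Conceptually, a RAG query retrieves the `topK` most relevant chunks from the collection and passes them to the model as context alongside the question. The sketch below shows the prompt-assembly idea only; the function and field names are illustrative, not EdgeMind internals.

```javascript
// Illustrative prompt assembly for RAG: number the retrieved chunks
// and prepend them to the user's question as context.
function buildRagPrompt(query, chunks) {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.text}`).join('\n');
  return `Answer using only the context below.\n\n` +
         `Context:\n${context}\n\nQuestion: ${query}`;
}

const prompt = buildRagPrompt('What is machine learning?', [
  { text: 'Machine learning is a subfield of AI.' }
]);
```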
### Endpoint

```
POST /api/v1/rag/query
```

### Request Body
```jsonc
{
  "query": "What is machine learning?",   // Required: Question
  "collection": "docs",                   // Required: Vector collection
  "topK": 5,                              // Optional: Results to return
  "includeSources": true,                 // Optional: Return source chunks
  "filter": { "category": "tutorials" }   // Optional: Metadata filter
}
```

## Available Models
| Model | Provider | Context | Input Cost (per 1M tokens) |
|---|---|---|---|
| gpt-4o | OpenAI | 128K | $2.50 |
| gpt-4o-mini | OpenAI | 128K | $0.15 |
| claude-3-5-sonnet | Anthropic | 200K | $3.00 |
| @cf/meta/llama-3.3-70b-instruct | Cloudflare | 128K | Free |
| @cf/meta/llama-3.1-8b-instruct | Cloudflare | 128K | Free |