Documentation

EdgeMind is a client-side, edge-deployable AI runtime and API platform. This documentation will help you get started with integrating AI capabilities into your applications.

🔌 API First

OpenAI-compatible REST API with streaming support

🌐 Edge Native

Deploy to Cloudflare Workers for low-latency inference worldwide

📦 Multi-Provider

OpenAI, Claude, Ollama, and Workers AI support

Quick Start

1. Install the SDK

npm install @edgemind/js

2. Initialize the client

import { EdgeMind } from '@edgemind/js';

const client = new EdgeMind({
  apiKey: process.env.EDGEMIND_API_KEY
});

3. Make your first request

const response = await client.chat.completions({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});

console.log(response.choices[0].message.content);

Chat Completions API

The Chat Completions endpoint follows the OpenAI API format for easy migration.

Endpoint

POST /api/v1/chat/completions

Request Body

{
  "model": "gpt-4o",           // Required: Model ID
  "messages": [                // Required: Array of messages
    { "role": "system", "content": "You are helpful." },
    { "role": "user", "content": "Hello!" }
  ],
  "temperature": 0.7,          // Optional: 0-2, default 0.7
  "max_tokens": 1024,          // Optional: Max tokens in response
  "stream": false              // Optional: Enable streaming
}
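With `"stream": true`, OpenAI-compatible endpoints typically deliver the response as server-sent events: one `data:` line per token delta, terminated by a `data: [DONE]` sentinel. Assuming EdgeMind follows that convention, a minimal parser for a buffered chunk of the stream might look like this (the wire format shown in the comments is the OpenAI-style one, not confirmed by this document):

```javascript
// Collect the token deltas from a buffered chunk of an SSE stream.
// Assumes the OpenAI-style wire format: each event is a line of the form
// `data: {"choices":[{"delta":{"content":"..."}}]}`, ending with `data: [DONE]`.
function collectStreamedContent(chunk) {
  const parts = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data: ')) continue;        // skip blank/comment lines
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') break;                 // end-of-stream sentinel
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) parts.push(delta);
  }
  return parts.join('');
}
```

In a real client you would feed decoded chunks from `response.body` into this as they arrive, buffering any partial trailing line.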

Response

{
  "id": "chatcmpl_xxx",
  "object": "chat.completion",
  "created": 1714032456,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 15,
    "total_tokens": 35
  }
}
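In most applications the fields you need are the assistant message and the token usage. A small helper (the name and return shape are illustrative, not part of the SDK) pulls them out of a response shaped like the example above:

```javascript
// Extract the assistant reply and usage counters from a chat.completion
// response object shaped like the example above.
function extractReply(response) {
  const choice = response.choices[0];
  return {
    content: choice.message.content,
    finishReason: choice.finish_reason,   // e.g. "stop" or "length"
    totalTokens: response.usage.total_tokens,
  };
}
```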

Embeddings API

Generate embeddings for text search and similarity tasks.

Endpoint

POST /api/v1/embeddings

Request Body

{
  "model": "text-embedding-3-small",  // Optional: Embedding model
  "input": "The quick brown fox",     // Required: Text to embed
  "encoding_format": "float"          // Optional: "float" or "base64"
}
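For similarity tasks, embedding vectors are usually compared with cosine similarity: 1 for vectors pointing the same way, 0 for orthogonal ones. A self-contained sketch:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];      // dot product
    normA += a[i] * a[i];    // squared magnitude of a
    normB += b[i] * b[i];    // squared magnitude of b
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

You would apply this to the vectors returned in the embeddings response, e.g. to rank documents against a query embedding.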

Models API

List and query available AI models.

Endpoint

GET /api/v1/models

Query Parameters

  • provider - Filter by provider (openai, anthropic, workers-ai, ollama)
  • mode - Filter by capability (chat, embeddings, images, audio)
  • free - Show only free models
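The filters combine as ordinary query-string parameters. Building the URL with `URL` and `URLSearchParams` avoids manual escaping; the base URL below is a placeholder for your deployment's origin:

```javascript
// Build a GET /api/v1/models URL from a filter object.
// The base URL is a placeholder; substitute your deployment's origin.
function modelsURL(base, filters = {}) {
  const url = new URL('/api/v1/models', base);
  for (const [key, value] of Object.entries(filters)) {
    url.searchParams.set(key, String(value));  // values are URL-encoded for you
  }
  return url.toString();
}
```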

RAG Query API

Retrieval-Augmented Generation (RAG) for answers grounded in your indexed content.

Endpoint

POST /api/v1/rag/query

Request Body

{
  "query": "What is machine learning?",    // Required: Question
  "collection": "docs",                     // Required: Vector collection
  "topK": 5,                               // Optional: Results to return
  "includeSources": true,                  // Optional: Return source chunks
  "filter": { "category": "tutorials" }    // Optional: Metadata filter
}
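Since only `query` and `collection` are required, a small builder can fill in the optional fields and omit `filter` when it is not given. Note the `topK` default of 5 mirrors the example above and is an assumption, not documented behavior:

```javascript
// Assemble a /api/v1/rag/query request body. Only query and collection are
// required; the defaults below are illustrative assumptions.
function buildRagQuery(query, collection, opts = {}) {
  const body = {
    query,
    collection,
    topK: opts.topK ?? 5,                       // assumed default
    includeSources: opts.includeSources ?? true,
  };
  if (opts.filter) body.filter = opts.filter;   // metadata filter is optional
  return body;
}
```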

Available Models

| Model                           | Provider   | Context | Input Cost        |
|---------------------------------|------------|---------|-------------------|
| gpt-4o                          | OpenAI     | 128K    | $2.50 / 1M tokens |
| gpt-4o-mini                     | OpenAI     | 128K    | $0.15 / 1M tokens |
| claude-3-5-sonnet               | Anthropic  | 200K    | $3.00 / 1M tokens |
| @cf/meta/llama-3.3-70b-instruct | Cloudflare | 128K    | Free              |
| @cf/meta/llama-3.1-8b-instruct  | Cloudflare | 128K    | Free              |
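The per-million-token rates above make input cost a one-line calculation. The map below transcribes the table's input rates (output-token rates are not listed in this document):

```javascript
// Input-token price in dollars per 1M tokens, transcribed from the table above.
const INPUT_PRICE_PER_M = {
  'gpt-4o': 2.50,
  'gpt-4o-mini': 0.15,
  'claude-3-5-sonnet': 3.00,
  '@cf/meta/llama-3.3-70b-instruct': 0,
  '@cf/meta/llama-3.1-8b-instruct': 0,
};

// Estimated input cost in dollars for a given prompt-token count.
function estimateInputCost(model, promptTokens) {
  const rate = INPUT_PRICE_PER_M[model];
  if (rate === undefined) throw new Error(`unknown model: ${model}`);
  return (promptTokens / 1_000_000) * rate;
}
```

Pair this with the `usage.prompt_tokens` field from a chat completion response to track spend per request.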