5.1 Release Notes

Patch Releases

All patch release notes for 5.1.x are available on the releases page.

AI Models Integration

Harper 5.1 introduces a built-in models layer that provides a unified interface for AI model backends. This enables embedding generation and text generation directly from within Harper applications, without each application managing its own external API connections.

The models layer is exposed via scope.models in application code:

// Generate embeddings
const vector = await scope.models.embed('text to embed', { model: 'my-embedding-model' });

// Generate text
const response = await scope.models.generate([{ role: 'user', content: 'Hello' }], { model: 'my-chat-model' });

// Streaming generation
const messages = [{ role: 'user', content: 'Hello' }];
for await (const chunk of scope.models.generateStream(messages, { model: 'my-chat-model' })) {
	// ...
}
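
As a rough illustration of what the returned embeddings can be used for, the sketch below compares two texts by cosine similarity. It assumes embed resolves to a plain array of numbers, which these notes do not state. cosineSimilarity and textSimilarity are hypothetical helper names, and the scope argument is whatever object exposes scope.models in your application.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
	if (a.length !== b.length) throw new Error('vector lengths differ');
	let dot = 0;
	let normA = 0;
	let normB = 0;
	for (let i = 0; i < a.length; i++) {
		dot += a[i] * b[i];
		normA += a[i] * a[i];
		normB += b[i] * b[i];
	}
	// A zero vector has no direction; treat it as dissimilar to everything.
	if (normA === 0 || normB === 0) return 0;
	return dot / Math.sqrt(normA * normB);
}

// Hypothetical helper: embeds both texts and compares them.
// Assumes scope.models.embed resolves to an array of numbers.
async function textSimilarity(scope, textA, textB, model) {
	const [a, b] = await Promise.all([
		scope.models.embed(textA, { model }),
		scope.models.embed(textB, { model }),
	]);
	return cosineSimilarity(a, b);
}
```

Keeping the similarity function separate from the embedding call makes it easy to test without a configured backend.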

Supported backends are Anthropic, AWS Bedrock, OpenAI, and Ollama, configured under the models key in harper-config.yaml:

models:
  anthropic:
    apiKey: your-api-key
  openai:
    apiKey: your-api-key
    baseUrl: https://api.openai.com/v1 # optional override
  ollama:
    baseUrl: http://localhost:11434
  bedrock:
    region: us-east-1

`@embed` Schema Directive

The @embed directive automates vector embedding at the schema level, eliminating the need for application code to compute and store embeddings on every write. Add it to any Float array field with a source pointing to the text field to embed:

type Document @table {
	id: ID @primaryKey
	content: String
	embedding: [Float] @embed(source: "content", model: "my-embedding-model") @indexed(type: "HNSW")
}

On every write, Harper automatically calls the specified model to compute the embedding from the source field and stores it on the record. If no index is explicitly specified on the field, an @indexed(type: "HNSW") index is attached automatically.
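
For comparison, the manual step that the directive replaces looks roughly like the following. This is only a sketch of the idea, not Harper's implementation: withEmbedding is a hypothetical helper, and the stub models object stands in for scope.models so the example runs on its own.

```javascript
// Hypothetical helper: compute an embedding from one field and store it on another.
async function withEmbedding(record, models, { source, target, model }) {
	const vector = await models.embed(record[source], { model });
	return { ...record, [target]: vector };
}

// Stub standing in for scope.models so the sketch is self-contained.
// It returns a fake two-dimensional "embedding" based on text length.
const stubModels = {
	async embed(text) {
		return [text.length, 0];
	},
};

withEmbedding(
	{ id: 'doc-1', content: 'hello' },
	stubModels,
	{ source: 'content', target: 'embedding', model: 'my-embedding-model' }
).then((record) => console.log(record.embedding)); // [ 5, 0 ]
```

With @embed, this call no longer has to appear in every code path that writes the record.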

MCP Server

Harper 5.1 includes a built-in Model Context Protocol server, allowing LLM clients such as Claude Desktop, Cursor, and Zed to connect directly to a Harper instance and interact with its data and operations.

The MCP server exposes two profiles:

Operations profile — wraps Harper's operations catalog as tools, with a curated default allow-list of read-only operations
Application profile — auto-generates tools from a Harper application's Resource verb methods
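
The allow-list idea behind the operations profile can be sketched as a filter over a catalog of operations. Everything here is illustrative: the catalog entries, the toMcpTools helper, and the choice of allowed operations are assumptions, and the actual default allow-list is not reproduced in these notes. The name, description, and inputSchema fields follow the general shape of an MCP tool definition.

```javascript
// Hypothetical catalog entries; real operations carry richer metadata.
const catalog = [
	{ name: 'describe_all', description: 'Describe all databases and tables' },
	{ name: 'search_by_value', description: 'Search records by attribute value' },
	{ name: 'drop_table', description: 'Drop a table' },
];

// Expose only allow-listed operations as MCP-style tool definitions.
function toMcpTools(operations, allowList) {
	const allowed = new Set(allowList);
	return operations
		.filter((op) => allowed.has(op.name))
		.map((op) => ({
			name: op.name,
			description: op.description,
			// Placeholder schema; a real tool would describe its parameters here.
			inputSchema: { type: 'object', properties: {} },
		}));
}

const tools = toMcpTools(catalog, ['describe_all', 'search_by_value']);
console.log(tools.map((tool) => tool.name)); // [ 'describe_all', 'search_by_value' ]
```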

The harper mcp CLI command provides a stdio bridge for use with MCP clients:

# Generate a config block for Claude Desktop
harper mcp print-config --client claude-desktop

# Run diagnostics against a running instance
harper mcp doctor

MCP is enabled by adding an mcp block to harper-config.yaml. See the MCP reference documentation for full configuration options, authentication, and tool customization.

Deployment Tracking

deploy_component now records a full audit trail for every deployment in the system.hdb_deployment system table. Each deployment gets a deployment_id and tracks phases (prepare → load → replicate → restart → success), per-node outcomes, and a bounded event log capturing install output.

The response from deploy_component now includes a deployment_id:

{
	"deployment_id": "a3f8c2...",
	"message": "Component deployed successfully"
}

New operations provide access to deployment history:

list_deployments — query deployment history with filters
get_deployment — fetch a single deployment record; supports live SSE streaming for in-progress deploys
get_deployment_payload — retrieve the stored tarball for a deployment
delete_deployment_payload — free storage by removing the payload blob after deployment

See Deployment Operations in the Operations API reference for details.
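
As a sketch of how the returned deployment_id might be used, the helper below builds Operations API requests and fetches the record for a deployment that was just started. It rests on several assumptions that should be checked against the Operations API reference: that operations are posted as JSON with an operation field, that get_deployment accepts a deployment_id parameter, and that Basic authentication is in use. operationRequest and deployAndInspect are hypothetical names.

```javascript
// Build fetch() options for an Operations API call.
// Basic auth is an assumption; use whatever auth your instance requires.
function operationRequest(operation, params, { username, password }) {
	const credentials = Buffer.from(`${username}:${password}`).toString('base64');
	return {
		method: 'POST',
		headers: {
			'Content-Type': 'application/json',
			Authorization: `Basic ${credentials}`,
		},
		body: JSON.stringify({ operation, ...params }),
	};
}

// Hypothetical flow: deploy, then fetch the record by its deployment_id.
// baseUrl is the Operations API endpoint of your instance.
async function deployAndInspect(baseUrl, auth, deployParams) {
	const deployRes = await fetch(baseUrl, operationRequest('deploy_component', deployParams, auth));
	const { deployment_id } = await deployRes.json();
	const recordRes = await fetch(baseUrl, operationRequest('get_deployment', { deployment_id }, auth));
	return recordRes.json();
}
```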

HNSW int8 Quantization

HNSW vector indexes now support int8 quantization, which reduces index storage by approximately 3× and improves search throughput by approximately 5×, at a cost of around 1% in recall@10:

type Document @table {
	embedding: [Float] @indexed(type: "HNSW", quantization: "int8")
}

Search uses asymmetric scoring: queries use full-precision float vectors while the index graph uses int8, and results are reranked against full-precision vectors before returning. The full-precision vector is always stored on the record itself.

The ef search parameter can be overridden per query, for applications that need to tune the recall/latency tradeoff dynamically.
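
The asymmetric scoring and rerank steps can be illustrated with a small brute-force sketch. It uses symmetric scalar quantization with one scale per vector and a dot-product score, both of which are assumptions made for illustration; the quantization scheme and distance function Harper actually uses are not specified here, and no HNSW graph is involved. All function names are hypothetical.

```javascript
// Symmetric scalar quantization: map floats in [-maxAbs, maxAbs] to int8 codes in [-127, 127].
function quantizeInt8(vector) {
	const maxAbs = Math.max(...vector.map(Math.abs));
	const scale = maxAbs === 0 ? 1 : maxAbs / 127;
	const codes = Int8Array.from(vector, (v) => Math.round(v / scale));
	return { codes, scale };
}

// Asymmetric score: full-precision query against int8 codes.
function asymmetricDot(query, quantized) {
	let sum = 0;
	for (let i = 0; i < query.length; i++) sum += query[i] * quantized.codes[i];
	return sum * quantized.scale;
}

// Exact score against the full-precision vector stored on the record.
function exactDot(a, b) {
	let sum = 0;
	for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
	return sum;
}

// Pick candidates by approximate score, then rerank them with exact scores.
function searchWithRerank(query, records, k, poolSize = k * 2) {
	const pool = records
		.map((record) => ({ record, score: asymmetricDot(query, record.quantized) }))
		.sort((a, b) => b.score - a.score)
		.slice(0, poolSize);
	return pool
		.map(({ record }) => ({ id: record.id, score: exactDot(query, record.vector) }))
		.sort((a, b) => b.score - a.score)
		.slice(0, k);
}

const records = [
	{ id: 'a', vector: [1, 0] },
	{ id: 'b', vector: [0, 1] },
	{ id: 'c', vector: [0.9, 0.1] },
].map((r) => ({ ...r, quantized: quantizeInt8(r.vector) }));

console.log(searchWithRerank([1, 0], records, 1).map((r) => r.id)); // [ 'a' ]
```

Because the final ordering comes from the full-precision vectors, quantization error only affects which candidates reach the rerank pool.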

Replication Improvements

Several replication reliability improvements are included in 5.1:

Resumable bulk clone — interrupted full-table copies resume from where they left off rather than restarting from the beginning
Client-side receive watchdog — dead WebSocket connections are now detected and recovered without waiting for a server-side timeout
Wedge detection and recovery — stalled replication streams are detected and re-subscribed automatically
isLeader flag on add_node — explicitly request a full-table copy when joining a cluster, independent of the normal subscription logic
replication.pingInterval / replication.pingTimeout — configurable keepalive intervals for replication connections (values in milliseconds)
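
The idea behind a receive watchdog can be sketched in a few lines: record when the last frame arrived, and treat the connection as dead once the silence exceeds a timeout. This is a generic illustration, not Harper's implementation. ReceiveWatchdog is a hypothetical class, and how replication.pingTimeout feeds into the real check is not described in these notes. The clock is injectable so the behavior can be demonstrated without waiting.

```javascript
class ReceiveWatchdog {
	constructor(timeoutMs, now = Date.now) {
		this.timeoutMs = timeoutMs;
		this.now = now;
		this.lastReceived = now();
	}

	// Call whenever any frame (data or ping reply) arrives.
	touch() {
		this.lastReceived = this.now();
	}

	// True when nothing has arrived within the timeout window.
	isDead() {
		return this.now() - this.lastReceived > this.timeoutMs;
	}
}

// Demonstration with a fake clock (all times in milliseconds).
let fakeTime = 0;
const watchdog = new ReceiveWatchdog(30000, () => fakeTime);

fakeTime = 20000;
watchdog.touch(); // a frame arrives at t = 20s

fakeTime = 45000;
console.log(watchdog.isDead()); // false: 25s of silence

fakeTime = 60000;
console.log(watchdog.isDead()); // true: 40s of silence
```

A client would typically run the isDead check on a timer and close and reconnect the socket when it returns true.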

Please see the migration guide for suggestions on how to migrate from v4 to v5.
