5.1 Release Notes
Patch Releases
All patch release notes for 5.1.x are available on the releases page.
AI Models Integration
Harper 5.1 introduces a built-in models layer that provides a unified interface for AI model backends. This enables embedding generation and text generation directly from within Harper applications, without each application having to manage its own external API connections.
The models layer is exposed via scope.models in application code:
// Generate embeddings
const vector = await scope.models.embed('text to embed', { model: 'my-embedding-model' });
// Generate text
const response = await scope.models.generate([{ role: 'user', content: 'Hello' }], { model: 'my-chat-model' });
// Streaming generation
for await (const chunk of scope.models.generateStream(messages, { model: 'my-chat-model' })) {
  // ...
}
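When the full response is needed as a single value, the streaming call above can be wrapped in a small helper. The following is a minimal sketch, not part of Harper: `collectText` and `fakeModels` are hypothetical names, and it assumes each chunk yielded by `generateStream` is a plain string, which these notes do not specify.

```javascript
// Sketch: accumulate a streamed generation into one string.
// Assumption: each chunk is a plain string; adapt the concatenation
// if chunks turn out to be objects.
async function collectText(models, messages, options) {
  let text = '';
  for await (const chunk of models.generateStream(messages, options)) {
    text += chunk;
  }
  return text;
}

// Hypothetical stand-in for scope.models so the sketch runs outside Harper.
const fakeModels = {
  async *generateStream(messages, options) {
    yield 'Hello';
    yield ', ';
    yield 'world';
  },
};

collectText(fakeModels, [{ role: 'user', content: 'Hi' }], { model: 'my-chat-model' })
  .then((text) => console.log(text)); // prints "Hello, world"
```

Inside an application, `scope.models` would be passed in place of `fakeModels`.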
Supported backends are Anthropic, AWS Bedrock, OpenAI, and Ollama, configured under the models key in harper-config.yaml:
models:
  anthropic:
    apiKey: your-api-key
  openai:
    apiKey: your-api-key
    baseUrl: https://api.openai.com/v1 # optional override
  ollama:
    baseUrl: http://localhost:11434
  bedrock:
    region: us-east-1
@embed Schema Directive
The @embed directive automates vector embedding at the schema level, eliminating the need for application code to compute and store embeddings on every write. Add it to any Float array field with a source pointing to the text field to embed:
type Document @table {
  id: ID @primaryKey
  content: String
  embedding: [Float] @embed(source: "content", model: "my-embedding-model") @indexed(type: "HNSW")
}
On every write, Harper automatically calls the specified model to compute the embedding from the source field and stores it on the record. If no @indexed directive is given explicitly, an @indexed(type: "HNSW") index is attached automatically.
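For comparison, the manual pattern that @embed replaces might look roughly like the sketch below. It is an illustration only: `writeWithEmbedding`, `saveRecord`, `stubModels`, and `saveToMemory` are hypothetical names, and it assumes `embed` resolves to an array of numbers.

```javascript
// Sketch of the per-write work that @embed takes over.
// `models` stands in for scope.models; `saveRecord` is a hypothetical
// persistence callback, not a Harper API.
async function writeWithEmbedding(models, saveRecord, record) {
  // Compute the vector from the source text field.
  const embedding = await models.embed(record.content, { model: 'my-embedding-model' });
  // Store the vector on the record alongside the original fields.
  return saveRecord({ ...record, embedding });
}

// Hypothetical stand-ins so the sketch runs outside Harper.
const stubModels = {
  async embed(text, options) {
    return [text.length, 0, 0]; // fake 3-dimensional "embedding"
  },
};
const stored = [];
const saveToMemory = async (record) => {
  stored.push(record);
  return record;
};

writeWithEmbedding(stubModels, saveToMemory, { id: 'doc-1', content: 'hello' })
  .then((saved) => console.log(saved.embedding));
```

With the directive in place, application code writes only `id` and `content`, and this step no longer needs to appear in every write path.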
MCP Server
Harper 5.1 includes a built-in Model Context Protocol server, allowing LLM clients such as Claude Desktop, Cursor, and Zed to connect directly to a Harper instance and interact with its data and operations.
The MCP server exposes two profiles:
- Operations profile — wraps Harper's operations catalog as tools, with a curated default allow-list of read-only operations
- Application profile — auto-generates tools from a Harper application's Resource verb methods
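The allow-list idea behind the operations profile can be pictured with the sketch below. It is a generic illustration and not Harper's implementation: the allow-list contents, the catalog entries and their descriptions, and the `toolsFromCatalog` helper are all hypothetical, and the actual default list may differ.

```javascript
// Hypothetical allow-list of read-only operations (illustrative only).
const READ_ONLY_ALLOW_LIST = new Set(['describe_all', 'search_by_value']);

// Turn a catalog of operations into tool descriptors, keeping only
// the operations named in the allow-list.
function toolsFromCatalog(catalog, allowList = READ_ONLY_ALLOW_LIST) {
  return catalog
    .filter((op) => allowList.has(op.name))
    .map((op) => ({ name: op.name, description: op.description }));
}

// Hypothetical catalog entries used only for this example.
const catalog = [
  { name: 'describe_all', description: 'Describe all databases and tables' },
  { name: 'search_by_value', description: 'Search records by attribute value' },
  { name: 'drop_table', description: 'Drop a table' },
];

console.log(toolsFromCatalog(catalog).map((tool) => tool.name));
// prints [ 'describe_all', 'search_by_value' ]
```

The point of the design is that a write operation such as `drop_table` is never offered to the LLM client unless it is explicitly allowed.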
The harper mcp CLI command provides a stdio bridge for use with MCP clients:
# Generate a config block for Claude Desktop
harper mcp print-config --client claude-desktop
# Run diagnostics against a running instance
harper mcp doctor
MCP is enabled by adding an mcp block to harper-config.yaml. See the MCP reference documentation for full configuration options, authentication, and tool customization.
Deployment Tracking
deploy_component now records a full audit trail for every deployment in the system.hdb_deployment system table. Each deployment gets a deployment_id and tracks phases (prepare → load → replicate → restart → success), per-node outcomes, and a bounded event log capturing install output.
The response from deploy_component now includes a deployment_id:
{
  "deployment_id": "a3f8c2...",
  "message": "Component deployed successfully"
}
New operations provide access to deployment history:
- list_deployments: query deployment history with filters
- get_deployment: fetch a single deployment record; supports live SSE streaming for in-progress deploys
- get_deployment_payload: retrieve the stored tarball for a deployment
- delete_deployment_payload: free storage by removing the payload blob after deployment
See Deployment Operations in the Operations API reference for details.
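A client might chain these operations as in the sketch below, which builds a follow-up request from a deploy_component response and maps a phase name onto the sequence listed above. The helper names are hypothetical, and the `deployment_id` request parameter for get_deployment is an assumption; consult the Operations API reference for the actual request shape.

```javascript
// Phase order as listed in these release notes.
const PHASES = ['prepare', 'load', 'replicate', 'restart', 'success'];

// Build a get_deployment request body from a deploy_component response.
// Assumption: get_deployment accepts a `deployment_id` parameter.
function getDeploymentRequest(deployResponse) {
  return { operation: 'get_deployment', deployment_id: deployResponse.deployment_id };
}

// Describe how far along a deployment is, e.g. "3/5" for "replicate".
// Returns null for a phase that is not in the list above.
function phaseProgress(phase) {
  const index = PHASES.indexOf(phase);
  return index === -1 ? null : `${index + 1}/${PHASES.length}`;
}

const request = getDeploymentRequest({
  deployment_id: 'example-id', // placeholder value
  message: 'Component deployed successfully',
});
console.log(JSON.stringify(request));
console.log(phaseProgress('replicate')); // prints "3/5"
```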
HNSW int8 Quantization
HNSW vector indexes now support int8 quantization, reducing index storage by approximately 3× and improving search throughput approximately 5× with around 1% recall loss at recall@10:
type Document @table {
  embedding: [Float] @indexed(type: "HNSW", quantization: "int8")
}
Search uses asymmetric scoring: queries use full-precision float vectors while the index graph uses int8, and results are reranked against full-precision vectors before returning. The full-precision vector is always stored on the record itself.
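The idea of asymmetric scoring followed by a full-precision rerank can be sketched as follows. This is a generic illustration under stated assumptions, not Harper's implementation: it uses one scale factor per vector, a dot-product score, and a brute-force scan in place of HNSW graph traversal, and all function names are hypothetical.

```javascript
// Quantize a float vector to int8 using a single per-vector scale.
function quantizeInt8(vector) {
  let maxAbs = 0;
  for (const x of vector) maxAbs = Math.max(maxAbs, Math.abs(x));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const values = Int8Array.from(vector, (x) => Math.round(x / scale));
  return { values, scale };
}

// Asymmetric score: full-precision query against an int8 stored vector.
function asymmetricDot(query, quantized) {
  let sum = 0;
  for (let i = 0; i < query.length; i++) sum += query[i] * quantized.values[i];
  return sum * quantized.scale;
}

// Exact score on full-precision vectors, used for reranking.
function exactDot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Shortlist by approximate score, then rerank the shortlist exactly.
function search(query, records, k, shortlistSize = k * 2) {
  const shortlist = records
    .map((record) => ({ record, approx: asymmetricDot(query, record.quantized) }))
    .sort((a, b) => b.approx - a.approx)
    .slice(0, shortlistSize);
  return shortlist
    .map(({ record }) => ({ id: record.id, score: exactDot(query, record.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Each record keeps its full-precision vector plus the int8 copy.
const records = [
  { id: 'a', vector: [1, 0, 0] },
  { id: 'b', vector: [0, 1, 0] },
  { id: 'c', vector: [0.9, 0.1, 0] },
].map((r) => ({ ...r, quantized: quantizeInt8(r.vector) }));

console.log(search([1, 0, 0], records, 2).map((hit) => hit.id)); // prints [ 'a', 'c' ]
```

The int8 copy only has to be accurate enough to put the right candidates on the shortlist; the final ordering comes from the full-precision vectors.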
The ef search parameter can be overridden per query, for applications that need to tune the recall/latency tradeoff dynamically.
Replication Improvements
Several replication reliability improvements are included in 5.1:
- Resumable bulk clone: interrupted full-table copies resume from where they left off rather than restarting from the beginning
- Client-side receive watchdog: dead WebSocket connections are now detected and recovered without waiting for a server-side timeout
- Wedge detection and recovery: stalled replication streams are detected and re-subscribed automatically
- isLeader flag on add_node: explicitly request a full-table copy when joining a cluster, independent of the normal subscription logic
- replication.pingInterval / replication.pingTimeout: configurable keepalive intervals for replication connections (values in milliseconds)
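To illustrate what a client-side receive watchdog does in general, the sketch below marks a connection as dead when nothing has been received within a timeout. It is a generic pattern, not Harper's implementation: the `ReceiveWatchdog` class is hypothetical, and reading pingTimeout as "maximum silence before the connection is considered dead" is an assumption about the setting's meaning.

```javascript
// Generic receive watchdog: the owner calls touch() on every inbound
// message and check() on a timer (for example, every pingInterval ms).
class ReceiveWatchdog {
  constructor({ pingTimeout, onDead, now = Date.now }) {
    this.pingTimeout = pingTimeout; // allowed silence, in milliseconds
    this.onDead = onDead;           // called once when the link is declared dead
    this.now = now;                 // injectable clock, useful for testing
    this.lastReceived = now();
    this.dead = false;
  }

  // Record that something (data or a pong) just arrived.
  touch() {
    this.lastReceived = this.now();
  }

  // Returns true while the connection still looks alive.
  check() {
    if (!this.dead && this.now() - this.lastReceived > this.pingTimeout) {
      this.dead = true;
      this.onDead();
    }
    return !this.dead;
  }
}

// Usage with a fake clock so the behaviour is deterministic.
let clock = 0;
const watchdog = new ReceiveWatchdog({
  pingTimeout: 100,
  onDead: () => console.log('connection considered dead; reconnecting'),
  now: () => clock,
});
clock = 90;
watchdog.touch();              // a message arrived at t=90
clock = 150;
console.log(watchdog.check()); // prints true (60 ms of silence)
clock = 200;
console.log(watchdog.check()); // logs the reconnect message, then prints false
```

Because the check runs on the client, recovery does not depend on the server noticing the broken connection first.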
Please see the migration guide for suggestions on how to migrate from v4 to v5.