Version: v5

Schema

Harper uses GraphQL Schema Definition Language (SDL) to declaratively define table structure. Schema definitions are loaded from .graphql files in a component directory and control table creation, attribute types, indexing, and relationships.

Overview

Added in: v4.2.0

Schemas are defined using standard GraphQL type definitions with Harper-specific directives. A schema definition:

Ensures required tables exist when a component is deployed
Enforces attribute types and required constraints
Controls which attributes are indexed
Defines relationships between tables
Configures computed properties, expiration, and audit behavior

Schemas are flexible by default — records may include additional properties beyond those declared in the schema. Use the @sealed directive to prevent this.

A minimal example:

type Dog @table {
	id: Long @primaryKey
	name: String
	breed: String
	age: Int
}

type Breed @table {
	id: Long @primaryKey
	name: String @indexed
}

Loading Schemas

In a component's config.yaml, specify the schema file with the graphqlSchema plugin:

graphqlSchema:
  files: 'schema.graphql'

Keep in mind that both plugins and applications can specify schemas.

Type Directives

Type directives apply to the entire table type definition.

`@table`

Marks a GraphQL type as a Harper database table. The type name becomes the table name by default.

type MyTable @table {
	id: Long @primaryKey
}

Optional arguments:

Argument	Type	Default	Description
`table`	`String`	type name	Override the table name
`database`	`String`	`"data"`	Database to place the table in
`expiration`	`Int`	—	Seconds until a record goes stale (useful for caching tables)
`eviction`	`Int`	`0`	Additional seconds after `expiration` before a record is physically removed
`scanInterval`	`Int`	`(expiration + eviction) / 4`	Seconds between eviction scans
`replicate`	`Boolean`	`true`	Enable replication of this table

expiration, eviction, and scanInterval

These three arguments work together to control the full lifecycle of a cached record:

expiration — When elapsed, a record is considered stale. The next request for a stale record triggers a fetch from the source. The record may still be served while revalidation is in progress.
eviction — Additional time after expiration before the record is physically removed from the table. Setting eviction > 0 lets you serve the stale record while revalidation happens and controls how long after expiration the data is kept on disk.
scanInterval — How often Harper scans the table for records to evict. Defaults to one quarter of the sum of expiration and eviction, that is, (expiration + eviction) / 4.

You can provide a single expiration value and all three behaviors share the same TTL. To tune them independently:

# Expire after 5 minutes, evict after 1 hour, scan every 10 minutes
type WeatherCache @table(expiration: 300, eviction: 3300, scanInterval: 600) {
	id: ID @primaryKey
	temperature: Float
}
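
The lifecycle above can be sketched in plain JavaScript. This is a minimal model for reasoning about the three arguments, not Harper's implementation: the function names resolveCacheTiming and recordState are hypothetical, and the exact boundary handling (whether a record turns stale at exactly expiration seconds) is an assumption.

```javascript
// Sketch only: models the documented lifecycle, not Harper's internals.
// All times are in seconds, matching the @table arguments.
function resolveCacheTiming({ expiration, eviction = 0, scanInterval }) {
	return {
		expiration,
		eviction,
		// Documented default: one quarter of (expiration + eviction).
		scanInterval: scanInterval ?? (expiration + eviction) / 4,
	};
}

// Classify a record by its age (seconds since it was last written).
// Treating the boundaries as inclusive (>=) is an assumption.
function recordState(ageSeconds, timing) {
	if (ageSeconds < timing.expiration) return 'fresh';
	if (ageSeconds < timing.expiration + timing.eviction) return 'stale';
	return 'evictable'; // eligible for removal on the next eviction scan
}

const weatherCache = resolveCacheTiming({ expiration: 300, eviction: 3300, scanInterval: 600 });
console.log(recordState(60, weatherCache)); // fresh
console.log(recordState(900, weatherCache)); // stale
console.log(recordState(3600, weatherCache)); // evictable
```

With only expiration: 3600 supplied, the same helper resolves eviction to 0 and scanInterval to 900 seconds.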

How `scanInterval` Determines the Eviction Cycle

scanInterval determines fixed clock-aligned times when eviction runs. Harper divides the clock into evenly spaced anchors based on the interval, calculated in the server's local timezone. As a result:

The server's startup time does not affect when eviction runs.
Eviction timings are deterministic and timezone-aware.
For any given configuration, the eviction schedule is the same across restarts and across servers in the same local timezone.

Example: 1-hour expiration — default scanInterval = 15 minutes (one quarter of expiration). Eviction schedule:

00:00, 00:15, 00:30, 00:45, 01:00, ...

If the server starts at 12:05, the first eviction runs at 12:15 — not 12:20. The schedule is clock-aligned, not startup-aligned.

Example: 1-day expiration — default scanInterval = 6 hours. Eviction schedule:

00:00, 06:00, 12:00, 18:00, ...
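
The clock-aligned behavior can be reproduced with a few lines of date arithmetic. The sketch below assumes anchors are counted from local midnight, which matches the schedules shown here but is not confirmed as Harper's exact method. The function name nextEvictionScan is hypothetical, and daylight-saving transitions are ignored.

```javascript
// Sketch only: assumes anchors are counted from local midnight.
// This is consistent with the documented schedules but is not Harper's code.
function nextEvictionScan(now, scanIntervalSeconds) {
	const intervalMs = scanIntervalSeconds * 1000;
	// Local midnight of the current day (Date uses the process's local timezone).
	const midnight = new Date(now.getFullYear(), now.getMonth(), now.getDate());
	const elapsedMs = now.getTime() - midnight.getTime();
	// Number of anchors already passed today; the next one is strictly after `now`.
	const anchorsPassed = Math.floor(elapsedMs / intervalMs);
	return new Date(midnight.getTime() + (anchorsPassed + 1) * intervalMs);
}

// Server starts at 12:05 local time with a 15-minute scan interval.
const startedAt = new Date(2024, 0, 15, 12, 5, 0);
const nextScan = nextEvictionScan(startedAt, 15 * 60);
console.log(nextScan.getHours(), nextScan.getMinutes()); // 12 15
```

Because the result depends only on the wall clock and the interval, two servers in the same timezone compute the same anchor regardless of when each one started.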

Eviction with Indexing

Eviction removes non-indexed record data, but it does not remove a record from its secondary indexes. If an evicted record matches a search query, Harper fetches the full record from the source on demand to satisfy the query. This means indexes remain fully functional even when most of the data has been evicted.

Examples:

# Override table name
type Product @table(table: "products") {
	id: Long @primaryKey
}

# Place in a specific database
type Order @table(database: "commerce") {
	id: Long @primaryKey
}

# Auto-expire records after 1 hour (e.g., a session cache)
type Session @table(expiration: 3600) {
	id: Long @primaryKey
	userId: String
}

# Disable replication for this table explicitly
type LocalRecord @table(replicate: false) {
	id: Long @primaryKey
	value: String
}

# Combine multiple arguments
type Event @table(database: "analytics", expiration: 86400) {
	id: Long @primaryKey
	name: String @indexed
}

Database naming: Since all tables default to the data database, when designing plugins or applications, consider using unique database names to avoid table naming collisions.

Replication: Replication is enabled by default for all tables. If you disable replication on a table and re-enable it later, the table will not catch up on writes made while replication was disabled.

`@export`

Exposes the table as an externally accessible resource endpoint, available via REST, MQTT, and other interfaces.

type MyTable @table @export(name: "my-table") {
	id: Long @primaryKey
}

The optional name parameter specifies the URL path segment (e.g., /my-table/). Without name, the type name is used.

`@sealed`

Prevents records from including any properties beyond those explicitly declared in the type. By default, Harper allows records to have additional properties.

type StrictRecord @table @sealed {
	id: Long @primaryKey
	name: String
}
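
To make the rule concrete, the following sketch checks a record against the fields declared on StrictRecord. It only illustrates which properties fall outside a sealed type; the helper undeclaredProperties is hypothetical, and how Harper responds to such a property is not covered here.

```javascript
// Sketch only: illustrates the rule, not how Harper enforces it.
// `declaredFields` mirrors the StrictRecord type above.
const declaredFields = new Set(['id', 'name']);

// Return the property names that a sealed type would not accept.
function undeclaredProperties(record, declared) {
	return Object.keys(record).filter((key) => !declared.has(key));
}

console.log(undeclaredProperties({ id: 1, name: 'a' }, declaredFields)); // []
console.log(undeclaredProperties({ id: 2, name: 'b', color: 'red' }, declaredFields)); // [ 'color' ]
```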

`@hidden` (Type Directive)

Suppresses the type from introspectable surfaces — MCP tool descriptors and the OpenAPI document. The table still exists; data is still queryable through Harper's other interfaces subject to RBAC. @hidden is a metadata-visibility directive, not an access-control mechanism: use attribute_permissions on roles to control data access.

type InternalConfig @table @hidden {
	id: Long @primaryKey
	value: String
}

@hidden is also available as a field directive to suppress individual attributes.

Documenting Types and Fields

Harper picks up GraphQL's standard triple-quoted docstrings on type and field definitions. Docstrings flow through to:

MCP — Table.description (consumed as a prefix on every verb-tool description) and inputSchema.properties[*].description on derived tool schemas
OpenAPI — components.schemas[*].description, per-property description, and the path-level description for every verb on the resource

"""
Product catalog row — what shows up in the storefront listing,
search, and inventory feeds. One row per SKU.
"""
type Product @table @export {
	"""
	Stock keeping unit — globally unique across catalogs.
	"""
	sku: String! @primaryKey

	"""
	Display name shown in the storefront.
	"""
	name: String!

	"""
	Retail price in cents (USD).
	"""
	priceCents: Int!
}

Docstrings on @hidden fields are dropped from the descriptive surfaces alongside the field itself.

Trust model. Docstrings reach LLMs and public OpenAPI consumers verbatim. Treat them as code: don't put secrets, internal-only commentary, or speculative prose in them. Use @hidden to suppress fields that shouldn't surface publicly.

Field Directives

Field directives apply to individual attributes in a type definition.

`@primaryKey`

Designates the attribute as the table's primary key. Primary keys must be unique; inserts with a duplicate primary key are rejected.

type Product @table {
	id: Long @primaryKey
	name: String
}

If no primary key is provided on insert, Harper auto-generates one:

UUID string — when type is String or ID
Auto-incrementing integer — when type is Int, Long, or Any

Changed in: v4.4.0

Auto-incrementing integer primary keys were added. Previously only UUID generation was supported for ID and String types.

Using Long or Any is recommended for auto-generated numeric keys. Int is limited to 32-bit and may be insufficient for large tables.

`@indexed`

Creates a secondary index on the attribute for fast querying. Required for filtering by this attribute in REST queries, SQL, or NoSQL operations.

type Product @table {
	id: Long @primaryKey
	category: String @indexed
	price: Float @indexed
}

If the field value is an array, each element in the array is individually indexed, enabling queries by any individual value.

Null values are indexed by default (added in v4.3.0), enabling queries like GET /Product/?category=null.
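
A toy in-memory index shows what element-wise and null indexing mean in practice. This is a sketch under stated assumptions, not Harper's storage format: buildIndex is a hypothetical helper, and treating a missing attribute the same as null is an assumption.

```javascript
// Sketch only: a toy in-memory index, not Harper's storage format.
// Maps each indexed value to the set of primary keys that carry it.
function buildIndex(records, attribute) {
	const index = new Map();
	for (const record of records) {
		// Assumption: a missing attribute is indexed the same way as null.
		const value = record[attribute] ?? null;
		// Arrays contribute one index entry per element.
		const entries = Array.isArray(value) ? value : [value];
		for (const entry of entries) {
			if (!index.has(entry)) index.set(entry, new Set());
			index.get(entry).add(record.id);
		}
	}
	return index;
}

const products = [
	{ id: 1, category: ['toys', 'outdoor'] },
	{ id: 2, category: 'toys' },
	{ id: 3, category: null },
];
const categoryIndex = buildIndex(products, 'category');
console.log([...categoryIndex.get('toys')]); // [ 1, 2 ]
console.log([...categoryIndex.get(null)]); // [ 3 ]
```

The record with the array value is reachable through either element, which is why a query on a single value can match it.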

`@embed`

Added in: v5.1.0

Automatically computes an embedding vector for the attribute whenever the source field is written, using a configured embedding model:

type Document @table {
	id: Long @primaryKey
	text: String
	embedding: [Float] @embed(source: "text", model: "default")
}

source — the name of the field to embed. Must be a declared field on the same type, passed as a string literal.
model — the logical name of a configured embedding model, passed as a string literal.

The attribute type must be [Float]. The attribute is automatically indexed with an HNSW vector index, so it is immediately searchable by similarity; an explicit @indexed on the same attribute is allowed only if it is also HNSW.

Write semantics:

Creating a record with the source field, or updating the source field, computes the vector before the write commits (with inputType: 'document'). A failure to compute the embedding fails the write.
An update that does not touch the source field leaves the vector unchanged.
Setting the source field to null sets the vector to null.
Replicated writes and audit-log replays do not re-embed — the vector travels with the record, and only the node that accepted the original write calls the model.

Multiple @embed attributes on one type are computed concurrently.
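
The write semantics reduce to a three-way decision based on whether the write touches the source field. The sketch below encodes only that decision; embedAction and its return values are hypothetical names, and the actual model call is left out.

```javascript
// Sketch only: decides what a write would do to an @embed vector.
// `update` holds only the fields present in the write.
function embedAction(update, sourceField) {
	if (!(sourceField in update)) return 'keep'; // source untouched: vector unchanged
	if (update[sourceField] === null) return 'clear'; // source set to null: vector becomes null
	return 'compute'; // source written: embed before the write commits
}

console.log(embedAction({ text: 'hello' }, 'text')); // compute
console.log(embedAction({ title: 'x' }, 'text')); // keep
console.log(embedAction({ text: null }, 'text')); // clear
```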

`@createdTime`

Automatically assigns a creation timestamp (Unix epoch milliseconds) to the attribute when a record is created.

type Event @table {
	id: Long @primaryKey
	createdAt: Long @createdTime
}

`@updatedTime`

Automatically assigns a timestamp (Unix epoch milliseconds) each time the record is updated.

type Event @table {
	id: Long @primaryKey
	updatedAt: Long @updatedTime
}

`@hidden` (Field Directive)

Suppresses the field from MCP tool descriptors and the OpenAPI document. The attribute still exists in the table; data is still queryable through other interfaces subject to RBAC. Use this for fields that should not appear in introspectable surfaces.

type Customer @table {
	id: Long @primaryKey
	name: String

	"""
	Internal — do not surface to external consumers.
	"""
	creditScore: Int @hidden
}

@hidden is a metadata-visibility directive, not access control: attribute_permissions on roles remains the data-access enforcement mechanism.

Relationships

Added in: v4.3.0

The @relationship directive defines how one table relates to another through a foreign key. Relationships enable join queries and allow related records to be selected as nested properties in query results.

`@relationship(from: attribute)` — many-to-one or many-to-many

The foreign key is in this table, referencing the primary key of the target table.

type RealityShow @table @export {
	id: Long @primaryKey
	networkId: Long @indexed # foreign key
	network: Network @relationship(from: networkId) # many-to-one
	title: String @indexed
}

type Network @table @export {
	id: Long @primaryKey
	name: String @indexed # e.g. "Bravo", "Peacock", "Netflix"
}

Query shows by network name:

GET /RealityShow?network.name=Bravo

If the foreign key is an array, this establishes a many-to-many relationship (e.g., a show with multiple streaming homes):

type RealityShow @table @export {
	id: Long @primaryKey
	networkIds: [Long] @indexed
	networks: [Network] @relationship(from: networkIds)
}

`@relationship(to: attribute)` — one-to-many or many-to-many

The foreign key is in the target table, referencing the primary key of this table. The result type must be an array.

type Network @table @export {
	id: Long @primaryKey
	name: String @indexed # e.g. "Bravo", "Peacock", "Netflix"
	shows: [RealityShow] @relationship(to: networkId) # one-to-many
	# shows like "Real Housewives of Atlanta", "The Traitors", "Vanderpump Rules"
}
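
The difference between from and to is where the foreign key lives. The following sketch resolves both directions over in-memory arrays to make that visible. The sample rows and the helpers resolveNetwork and resolveShows are made up for illustration and do not reflect how Harper executes joins.

```javascript
// Sketch only: resolves relationships in memory to show the direction of each key.
const networks = [
	{ id: 1, name: 'Bravo' },
	{ id: 2, name: 'Netflix' },
];
const shows = [
	{ id: 10, networkId: 1, title: 'Show A' },
	{ id: 11, networkId: 1, title: 'Show B' },
	{ id: 12, networkId: 2, title: 'Show C' },
];

// @relationship(from: networkId): the foreign key lives on this (show) record.
function resolveNetwork(show) {
	return networks.find((network) => network.id === show.networkId) ?? null;
}

// @relationship(to: networkId): the foreign key lives on the target (show) table.
function resolveShows(network) {
	return shows.filter((show) => show.networkId === network.id);
}

console.log(resolveNetwork(shows[2]).name); // Netflix
console.log(resolveShows(networks[0]).map((show) => show.title)); // [ 'Show A', 'Show B' ]
```

A from relationship yields a single record (or an array when the key is an array), while a to relationship always yields an array because many target rows can point back at one record.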

`@relationship(from: attribute, to: attribute)` — foreign key to foreign key

Both from and to can be specified together to define a relationship where neither side uses the primary key — a foreign key to foreign key join. This is useful for many-to-many relationships that join on non-primary-key attributes.

type OrderItem @table @export {
	id: Long @primaryKey
	orderId: Long @indexed
	productSku: Long @indexed
	product: Product @relationship(from: productSku, to: sku) # join on sku, not primary key
}

type Product @table @export {
	id: Long @primaryKey
	sku: Long @indexed
	name: String
}

Schemas can also define self-referential relationships, enabling parent-child hierarchies within a single table.

Computed Properties

Added in: v4.4.0

The @computed directive marks a field as derived from other fields at query time. Computed properties are not stored in the database but are evaluated when the field is accessed.

type Product @table {
	id: Long @primaryKey
	price: Float
	taxRate: Float
	totalPrice: Float @computed(from: "price + (price * taxRate)")
}

The from argument is a JavaScript expression that can reference other record fields.

Computed properties can also be defined in JavaScript for complex logic:

type Product @table {
	id: Long @primaryKey
	totalPrice: Float @computed
}

tables.Product.setComputedAttribute('totalPrice', (record) => {
	return record.price + record.price * record.taxRate;
});

Computed properties are not included in query results by default — use select to include them explicitly.

Computed Indexes

Computed properties can be indexed with @indexed, enabling custom indexing strategies such as composite indexes, full-text search, or vector indexing:

type Product @table {
	id: Long @primaryKey
	tags: String
	tagsSeparated: [String] @computed(from: "tags.split(/\\s*,\\s*/)") @indexed
}
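
Inside the SDL string, each doubled backslash unescapes to a single one, so the expression evaluated is tags.split(/\s*,\s*/). As a quick illustration with a made-up record, this is what that expression yields in plain JavaScript:

```javascript
// Sketch only: evaluates the computed expression by hand for one sample record.
// The record contents are invented for illustration.
const sampleProduct = { id: 1, tags: 'red, blue ,green' };

// Split on commas, discarding any whitespace around each comma.
const tagsSeparated = sampleProduct.tags.split(/\s*,\s*/);
console.log(tagsSeparated); // [ 'red', 'blue', 'green' ]
```

Each element of the resulting array becomes its own index entry, so a query for a single tag can match the record.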

When using a JavaScript function for an indexed computed property, use the version argument to ensure re-indexing when the function changes:

type Product @table {
	id: Long @primaryKey
	totalPrice: Float @computed(version: 1) @indexed
}

Increment version whenever the computation function changes. Failing to do so can result in an inconsistent index.

Vector Indexing

Added in: v4.6.0

Use @indexed(type: "HNSW") to create a vector index using the Hierarchical Navigable Small World algorithm, designed for fast approximate nearest-neighbor search on high-dimensional vectors.

type Document @table {
	id: Long @primaryKey
	textEmbeddings: [Float] @indexed(type: "HNSW")
}

Embedding vectors can also be computed automatically at write time from a text field with the @embed directive, which creates the HNSW index implicitly.

Query by nearest neighbors using the sort parameter:

let results = Document.search({
	sort: { attribute: 'textEmbeddings', target: searchVector },
	limit: 5,
});

HNSW can be combined with filter conditions:

let results = Document.search({
	conditions: [{ attribute: 'price', comparator: 'lt', value: 50 }],
	sort: { attribute: 'textEmbeddings', target: searchVector },
	limit: 5,
});
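
For intuition about what these queries return, here is an exact brute-force version of the search that an HNSW index approximates. It is a sketch, not Harper's algorithm: the helpers cosineDistance and nearest are hypothetical, the distance follows the "negative cosine similarity" description in the HNSW parameter table, and applying the filter before ranking is an assumption about evaluation order.

```javascript
// Sketch only: exact (brute-force) search that an HNSW index approximates.
// Distance is negative cosine similarity, so smaller means more similar.
function cosineDistance(a, b) {
	let dot = 0;
	let normA = 0;
	let normB = 0;
	for (let i = 0; i < a.length; i++) {
		dot += a[i] * b[i];
		normA += a[i] * a[i];
		normB += b[i] * b[i];
	}
	return -dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Filter first (an assumption), then rank by distance and keep `limit` results.
function nearest(records, { filter = () => true, attribute, target, limit }) {
	return records
		.filter(filter)
		.map((record) => ({ ...record, distance: cosineDistance(record[attribute], target) }))
		.sort((x, y) => x.distance - y.distance)
		.slice(0, limit);
}

const documents = [
	{ id: 1, price: 20, textEmbeddings: [1, 0] },
	{ id: 2, price: 80, textEmbeddings: [0.9, 0.1] },
	{ id: 3, price: 30, textEmbeddings: [0, 1] },
];
const hits = nearest(documents, {
	filter: (doc) => doc.price < 50,
	attribute: 'textEmbeddings',
	target: [1, 0],
	limit: 2,
});
console.log(hits.map((doc) => doc.id)); // [ 1, 3 ]
```

Record 2 is the second-closest vector overall but is excluded by the price condition, so the filtered result contains records 1 and 3.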

Filtering by Distance Threshold

To return only records whose distance to a target vector is below a threshold, place target directly on the condition (alongside comparator and value). This returns matches within the threshold without using sort:

let results = Document.search({
	conditions: {
		attribute: 'textEmbeddings',
		comparator: 'lt',
		value: 0.1,
		target: searchVector,
	},
});

This form is useful when you want to bound result quality by a similarity cutoff rather than ranking by similarity.

Selecting the Distance

Use the special $distance field in select to include the computed distance from the target vector in returned records:

let results = Document.search({
	select: ['name', '$distance'],
	sort: { attribute: 'textEmbeddings', target: searchVector },
	limit: 5,
});

$distance is available in both sort-based ranking and conditions-based threshold queries.

Per-Query Search Options

The sort descriptor (and threshold condition) accepts options that tune an individual query:

let results = Document.search({
	sort: { attribute: 'textEmbeddings', target: searchVector, distance: 'dotProduct', ef: 200 },
	limit: 5,
});

distance — overrides the index's distance function for this query: "cosine", "euclidean", or "dotProduct" ("dotProduct" added in v5.1.0).
ef (added in v5.1.0) — overrides the search exploration budget for this query. Higher values improve recall at the cost of latency.

Changed in: v5.1.0 — When a query passes no ef and the index does not explicitly configure efConstructionSearch (or efConstruction), the search budget auto-scales with the size of the index, so recall holds as the table grows instead of decaying with a fixed budget.

HNSW Parameters

Parameter	Default	Description
`distance`	`"cosine"`	Distance function: `"cosine"` (negative cosine similarity), `"euclidean"`, or `"dotProduct"` (added in v5.1.0)
`efConstruction`	`100`	Max nodes explored during index construction. Higher = better recall, lower = better performance
`M`	`16`	Preferred connections per graph layer. Higher = more space, better recall for high-dimensional data
`optimizeRouting`	`0.5`	Heuristic aggressiveness for omitting redundant connections (0 = off, 1 = most aggressive)
`mL`	computed from `M`	Normalization factor for level generation
`efConstructionSearch`	auto-scaled	Max nodes explored during search. When unset, auto-scales with index size (see above); setting it (or `efConstruction`, which seeds it) fixes the budget
`quantization`	—	`"int8"` stores vectors quantized to int8 (added in v5.1.0, see below)

Example with custom parameters:

type Document @table {
	id: Long @primaryKey
	textEmbeddings: [Float] @indexed(type: "HNSW", distance: "euclidean", optimizeRouting: 0, efConstructionSearch: 100)
}

Note: this parameter was previously documented as efSearchConstruction; the option name Harper reads is efConstructionSearch.

Changed in: v5.1.0 — Changing efConstructionSearch on an existing index no longer triggers a rebuild; it only affects searches. Structural parameters (distance, M, efConstruction, quantization) still rebuild the index when changed.

Vector Quantization

Added in: v5.1.0

quantization: "int8" stores the index's vectors quantized to 8-bit integers, substantially reducing index size and memory traffic:

type Document @table {
	id: Long @primaryKey
	textEmbeddings: [Float] @indexed(type: "HNSW", quantization: "int8")
}

Graph navigation runs on the quantized (approximate) distances. For nearest-neighbor sort queries, Harper re-ranks the results against the full-precision vectors stored on the records, restoring exact ordering and exact $distance values. Distance-threshold (lt/le) queries currently filter on the approximate distance.
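
To see why quantized distances are approximate, consider a simple symmetric int8 scheme with one scale factor per vector. This is an assumed, commonly used approach chosen for illustration. Harper's actual quantization scheme is not described on this page, and quantizeInt8 and dequantize are hypothetical helpers.

```javascript
// Sketch only: symmetric int8 quantization with one scale per vector.
// This is an assumed scheme for illustration, not Harper's documented method.
function quantizeInt8(vector) {
	const maxAbs = Math.max(...vector.map(Math.abs));
	// Map the largest magnitude to 127; an all-zero vector stays all zeros.
	const values = Int8Array.from(vector, (x) => (maxAbs === 0 ? 0 : Math.round((x / maxAbs) * 127)));
	return { values, scale: maxAbs / 127 };
}

// Recover approximate floating-point values from the quantized form.
function dequantize({ values, scale }) {
	return Array.from(values, (q) => q * scale);
}

const originalVector = [0.6, -1.0, 0.25];
const quantized = quantizeInt8(originalVector);
console.log(Array.from(quantized.values)); // [ 76, -127, 32 ]
console.log(dequantize(quantized)); // close to, but not exactly, the original values
```

Each component is stored in one byte instead of a full float, and the rounding error per component is at most half the scale. That small error is why re-ranking against the full-precision vectors is needed to restore exact ordering.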

Field Types

Harper supports the following field types:

Type	Description
`String`	Unicode text, UTF-8 encoded
`Int`	32-bit signed integer (−2,147,483,648 to 2,147,483,647)
`Long`	54-bit signed integer (−9,007,199,254,740,992 to 9,007,199,254,740,992)
`Float`	64-bit double precision floating point
`BigInt`	Integer up to ~300 digits. Note: distinct JavaScript type; handle appropriately in custom code
`Boolean`	`true` or `false`
`ID`	String; indicates a non-human-readable identifier
`Any`	Any primitive, object, or array
`Date`	JavaScript `Date` object
`Bytes`	Binary data as `Buffer` or `Uint8Array`
`Blob`	Binary large object; designed for streaming content >20KB

Added BigInt in v4.3.0

Added Blob in v4.5.0

Arrays of a type are expressed with [Type] syntax (e.g., [Float] for a vector).

Blob Type

Added in: v4.5.0

Blob fields are designed for large binary content. Harper's Blob type implements the Web API Blob interface, so all standard Blob methods (.text(), .arrayBuffer(), .stream(), .slice()) are available. Unlike Bytes, blobs are stored separately from the record, support streaming, and do not need to be held entirely in memory. Use Blob for content typically larger than 20KB (images, video, audio, large HTML, etc.).

See Blob usage details below.

Blob Usage

Declare a blob field:

type MyTable @table {
	id: Any! @primaryKey
	data: Blob
}

Create and store a blob using createBlob():

let blob = createBlob(largeBuffer);
await MyTable.put({ id: 'my-record', data: blob });

Retrieve blob data using standard Web API Blob methods:

let record = await MyTable.get('my-record');
let buffer = await record.data.bytes(); // Uint8Array
let text = await record.data.text(); // string
let stream = record.data.stream(); // ReadableStream
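
Because the field exposes the Web API Blob interface, the read methods can be tried outside Harper with the standard global Blob (available in Node.js 18 and later). The sketch below uses that standard Blob as a stand-in for a blob field; inspectBlob is a hypothetical helper and no Harper API is involved.

```javascript
// Sketch only: uses the standard global Blob as a stand-in for a Harper blob
// field, relying on the shared Web API Blob interface.
async function inspectBlob(blob) {
	const text = await blob.text(); // full content as a string
	const buffer = await blob.arrayBuffer(); // full content as an ArrayBuffer
	const head = await blob.slice(0, 5).text(); // first five bytes only
	return { text, byteLength: buffer.byteLength, head };
}

inspectBlob(new Blob(['hello, harper'])).then((summary) => {
	console.log(summary); // { text: 'hello, harper', byteLength: 13, head: 'hello' }
});
```

For large content, prefer stream() or slice() over text() and arrayBuffer(), since the latter two read the whole blob into memory.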

Blobs support asynchronous streaming, meaning a record can reference a blob before it is fully written to storage. Use saveBeforeCommit: true to wait for the full write to finish before committing:

let blob = createBlob(stream, { saveBeforeCommit: true });
await MyTable.put({ id: 'my-record', data: blob });

Any string or buffer assigned to a Blob field in a put, patch, or publish is automatically coerced to a Blob.

When returning a blob via REST, register an error handler to handle interrupted streams:

export class MyEndpoint extends MyTable {
	static async get(target) {
		const record = await super.get(target);
		let blob = record.data;
		blob.on('error', () => {
			MyTable.invalidate(target);
		});
		return { status: 200, headers: {}, body: blob };
	}
}

Dynamic Schema Behavior

When a table is created through the Operations API or Studio without a schema definition, it follows dynamic schema behavior:

Attributes are reflexively created as data is ingested
All top-level attributes are automatically indexed
Records automatically get __createdtime__ and __updatedtime__ audit attributes

Dynamic schema tables are additive — new attributes are added as new data arrives. Existing records will have null for any newly added attributes.
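
The additive behavior can be modeled with a growing set of attribute names. This sketch is a simplified mental model, not Harper's ingestion code; the ingest and read helpers and the table object shape are hypothetical.

```javascript
// Sketch only: models additive attribute discovery, not Harper's ingestion code.
function ingest(table, record) {
	// Every top-level key seen so far becomes a known attribute.
	for (const key of Object.keys(record)) table.attributes.add(key);
	table.records.push(record);
}

// Read a record with every known attribute present; missing ones are null.
function read(table, record) {
	return Object.fromEntries([...table.attributes].map((key) => [key, record[key] ?? null]));
}

const dogs = { attributes: new Set(), records: [] };
ingest(dogs, { id: 1, name: 'Rex' });
ingest(dogs, { id: 2, name: 'Fido', breed: 'Beagle' });
console.log(read(dogs, dogs.records[0])); // { id: 1, name: 'Rex', breed: null }
```

The first record was written before breed existed, so it reads back with breed set to null once the second record introduces the attribute.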

Use create_attribute and drop_attribute operations to manually manage attributes on dynamic schema tables. See the Operations API for details.

OpenAPI Specification

Tables exported with @export are described via an /openapi endpoint on the main HTTP server associated with the REST service (default port 9926).

GET http://localhost:9926/openapi

This provides an OpenAPI 3.x description of all exported resource endpoints. The endpoint is a starting guide and may not cover every edge case.

Renaming Tables

Harper does not support renaming tables. Changing a type name in a schema definition creates a new, empty table — the original table and its data are unaffected.

Related Documentation

JavaScript API — tables, databases, transaction(), and createBlob() globals for working with schema-defined tables in code
Data Loader — Seed tables with initial data alongside schema deployment
REST Querying — Querying tables via HTTP using schema-defined attributes and relationships
Resources — Extending table behavior with custom application logic
Storage Algorithm — How Harper indexes and stores schema-defined data
Configuration — Component configuration for schemas
