Caching AI Generations with Harper
AI API calls are expensive. Generating a product description, summarizing an article, or personalizing a recommendation with a large language model can cost fractions of a cent per call — but those fractions add up fast at scale. And in most applications, the same content gets generated over and over: the same product page viewed by thousands of users, the same document summarized dozens of times.
Harper's caching system is a natural fit for this problem. You can generate AI content once, cache it close to your users, and serve it instantly on every subsequent request. When the underlying data changes, you invalidate the cached generation and let it be regenerated on the next access.
In this guide you will build a product description endpoint backed by an LLM, wrap it in a Harper cache table, and implement an invalidation strategy so descriptions stay fresh when product data changes.
What You Will Learn
- How to build a Resource class that calls an AI API and returns generated content
- How to cache AI generations in a Harper table with an appropriate TTL
- How cached descriptions are invalidated automatically when product data changes
- How to use ETags and conditional requests to avoid redundant content delivery downstream
Prerequisites
- Completed Caching with Harper (recommended — this guide builds directly on those concepts)
- Working Harper installation (local or Fabric)
- An OpenAI API key (or another compatible LLM provider)
- A command-line HTTP client (curl recommended) or familiarity with fetch
The Problem This Solves
Consider a product catalog with hundreds of items. Each product needs a compelling description tailored to its attributes. Generating those descriptions with an LLM produces higher-quality copy than static text, and you can re-generate them easily as your brand voice evolves.
The challenge: you cannot call the LLM on every product page load. A single generation might take 1–3 seconds and cost money. Instead, you generate on first access, cache the result, and regenerate only when needed.
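The shape of that pattern, independent of Harper, is plain memoization with a TTL. Here is a minimal in-process sketch (illustrative only; the rest of this guide replaces it with Harper's cache tables):

```javascript
// Minimal generate-once, cache, expire pattern. This is illustrative —
// Harper's cache tables replace this entire mechanism.
function createCache(generate, ttlMs) {
  const store = new Map(); // id -> { value, expiresAt }
  return async function get(id) {
    const entry = store.get(id);
    if (entry && entry.expiresAt > Date.now()) {
      return entry.value; // cache hit: no generation cost
    }
    const value = await generate(id); // cache miss: pay for generation once
    store.set(id, { value, expiresAt: Date.now() + ttlMs });
    return value;
  };
}
```

Harper gives you the same semantics, plus persistence, replication, and HTTP caching headers, without hand-rolling any of this.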
Harper makes this straightforward:
- A ProductDescription cache table stores generated descriptions with a long TTL (24 hours).
- A DescriptionGenerator Resource class calls the LLM and returns the generated text.
- sourcedFrom connects the generator to the cache — Harper handles all the caching logic.
- The @export directive on ProductDescription provides the REST endpoint; cached descriptions are invalidated automatically when product data changes.
Setting Up the Application
Clone the example repository:
git clone https://github.com/HarperFast/ai-cache-example.git harper-ai-cache
cd harper-ai-cache
Create a .env file with your API key:
OPENAI_API_KEY=sk-...
Then start Harper:
harper dev .
Defining the Schema
Open schema.graphql. There are two tables: Product holds the product catalog, and ProductDescription is the cache table for AI-generated descriptions.
type Product @table @export {
  id: ID @primaryKey
  name: String @indexed
  category: String @indexed
  price: Float
  features: [String]
}

type ProductDescription @table(expiration: 86400) @export {
  id: ID @primaryKey
  description: String
  generatedAt: Long
  product: Product @relationship(from: id)
}
The product field on ProductDescription uses @relationship(from: id) to declare that its own primary key (id) is a foreign key pointing to Product. Since ProductDescription records share the same ID as the product they describe, this gives consumers a way to fetch the full product details alongside the description in a single request.
Both tables use @export — Harper's auto-generated REST endpoints are sufficient for reading and writing. Invalidation is handled automatically by Harper's caching layer rather than through a separate endpoint.
ProductDescription has a 24-hour TTL (expiration: 86400). A one-day window is reasonable for marketing copy — it's long enough to avoid redundant generation, and short enough that descriptions stay reasonably current even without explicit invalidation.
Configuring the Application
Open config.yaml and enable the required plugins:
graphqlSchema:
  files: 'schema.graphql'
rest: true
jsResource:
  files: 'resources.js'
- graphqlSchema loads schema.graphql and creates both tables.
- rest enables Harper's REST API on port 9926.
- jsResource loads resources.js, registering the generator source and its sourcedFrom connection to the cache table.
Seeding the Product Catalog
Rather than seeding data from resources.js, this application uses Harper's built-in Data Loader to populate the Product table from a JSON file on startup. Add dataLoader to config.yaml:
graphqlSchema:
  files: 'schema.graphql'
rest: true
jsResource:
  files: 'resources.js'
dataLoader:
  files: 'data/products.json'
Then create data/products.json:
{
  "table": "Product",
  "records": [
    {
      "id": "prod-001",
      "name": "Titanium Water Bottle",
      "category": "Outdoors",
      "price": 39.99,
      "features": ["BPA-free", "Keeps cold 24h", "Lightweight 200g"]
    },
    {
      "id": "prod-002",
      "name": "Noise-Cancelling Headphones",
      "category": "Electronics",
      "price": 199.99,
      "features": ["40h battery", "Foldable", "USB-C charging"]
    }
  ]
}
The Data Loader runs on every startup and deployment. It uses content hashing to skip records that haven't changed, so it's safe to redeploy without duplicating data or overwriting any manual edits.
Building the Description Generator
The DescriptionGenerator class is the upstream source for the ProductDescription cache. Its get method fetches the product from the Product table (which it extends), builds a prompt, and calls the OpenAI API.
// resources.js
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;

class DescriptionGenerator extends tables.Product {
  static async get(productId) {
    // Load the product data from our base class, the Product table
    const product = await super.get(productId);
    if (!product) {
      const error = new Error('Product not found');
      error.statusCode = 404;
      throw error;
    }

    // Build a prompt from the product's attributes
    const prompt = `Write a compelling, two-sentence product description for the following item.
Product name: ${product.name}
Category: ${product.category}
Price: $${product.price}
Features: ${product.features.join(', ')}
Keep the tone enthusiastic but professional.`;

    // Call the OpenAI chat completions API
    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: prompt }],
        max_tokens: 120,
      }),
    });
    if (!response.ok) {
      // Propagate upstream failures so Harper can surface the error
      // (or serve a stale entry when the client sends stale-if-error)
      throw new Error(`OpenAI API error: ${response.status}`);
    }

    const result = await response.json();
    const description = result.choices[0].message.content.trim();
    return {
      id: productId,
      description,
      generatedAt: Date.now(),
    };
  }
}

tables.ProductDescription.sourcedFrom(DescriptionGenerator);
With sourcedFrom in place, Harper now handles all caching behavior automatically:
- Cache miss: the first request for /ProductDescription/prod-001 calls DescriptionGenerator.get(), stores the result, and returns it.
- Cache hit: subsequent requests within 24 hours return the stored description instantly — no LLM call.
- Cache expiry: after 24 hours, the next request regenerates the description.
- Cache invalidation: because we are extending a table, Harper automatically invalidates the cache when the source table is updated.
Requesting a Generated Description
With Harper running, request a description for the first product:
curl -i 'http://localhost:9926/ProductDescription/prod-001'
const response = await fetch('http://localhost:9926/ProductDescription/prod-001');
const etag = response.headers.get('etag');
const data = await response.json();
console.log(data.description);
console.log('ETag:', etag);
The first request will take a moment — Harper is calling the OpenAI API. You will get back something like:
HTTP/1.1 200 OK
content-type: application/json
etag: "abCDefGHij"
...
{
"id": "prod-001",
"description": "Stay hydrated on every adventure with this ultralight titanium bottle that keeps your drinks cold for a full 24 hours. BPA-free and weighing just 200g, it's built for those who refuse to compromise between performance and sustainability.",
"generatedAt": 1712500000000
}
Make the same request again immediately:
curl -i 'http://localhost:9926/ProductDescription/prod-001'
const second = await fetch('http://localhost:9926/ProductDescription/prod-001');
console.log(second.status);
HTTP/1.1 200 OK
content-type: application/json
etag: "abCDefGHij"
server-timing: db;dur=1.2
...
{ ... same description ... }
The response is instant. Check the server-timing header — the database read time will be in single-digit milliseconds rather than the seconds the LLM call took.
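Server-Timing is a standard header with a metric;dur=milliseconds format, so checking it programmatically is straightforward. Here is a small parser sketch (the db metric name matches the example output above; other deployments may report different metric names):

```javascript
// Parse a Server-Timing header such as "db;dur=1.2, cache;dur=0.1"
// into a plain object: { db: 1.2, cache: 0.1 }.
function parseServerTiming(header) {
  const durations = {};
  if (!header) return durations;
  for (const metric of header.split(',')) {
    const [name, ...params] = metric.trim().split(';');
    for (const param of params) {
      const [key, value] = param.trim().split('=');
      if (key === 'dur') durations[name] = parseFloat(value);
    }
  }
  return durations;
}
```

Comparing durations.db across a cold and a warm request makes the cache hit visible in code rather than by eyeballing headers.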
Using ETags to Avoid Redundant Transfers
Harper automatically includes an ETag header with every record response. You can use If-None-Match to avoid re-transferring a description your client already has:
# Use the etag value from the previous response, double quotes included
curl -i 'http://localhost:9926/ProductDescription/prod-001' \
-H 'If-None-Match: "abCDefGHij"'
const first = await fetch('http://localhost:9926/ProductDescription/prod-001');
const etag = first.headers.get('etag'); // e.g. "abCDefGHij"
const second = await fetch('http://localhost:9926/ProductDescription/prod-001', {
headers: { 'If-None-Match': etag },
});
console.log(second.status);
HTTP/1.1 304 Not Modified
etag: "abCDefGHij"
A 304 Not Modified response means the cached description your client holds is still current. No data is serialized or transmitted. This layering — Harper's internal cache plus HTTP conditional requests for downstream caches — means a regeneration event only propagates to clients when their cached copy actually becomes stale.
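On the client side, that layering can be wrapped in a small helper that stores each body with its ETag and revalidates on every read. This is a sketch, not a library; fetchFn is injectable so the helper works with any fetch-compatible function:

```javascript
// Client-side conditional-request cache: keep the last body and ETag per
// URL, revalidate with If-None-Match, and reuse the stored body on a 304.
function createEtagCache(fetchFn = fetch) {
  const store = new Map(); // url -> { etag, body }
  return async function get(url) {
    const cached = store.get(url);
    const headers = cached ? { 'If-None-Match': cached.etag } : {};
    const response = await fetchFn(url, { headers });
    if (response.status === 304) {
      return cached.body; // still fresh: nothing re-transferred
    }
    const body = await response.json();
    store.set(url, { etag: response.headers.get('etag'), body });
    return body;
  };
}
```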
Querying a Description with Its Product
Because ProductDescription declares a @relationship to Product, you can fetch the description and the full product record together using the select query parameter:
curl -s 'http://localhost:9926/ProductDescription/prod-001?select(description,generatedAt,product(name,price,features))'
const response = await fetch(
'http://localhost:9926/ProductDescription/prod-001?select(description,generatedAt,product(name,price,features))'
);
const data = await response.json();
console.log(data);
{
"description": "Stay hydrated on every adventure...",
"generatedAt": 1712500000000,
"product": {
"name": "Titanium Water Bottle",
"price": 39.99,
"features": ["BPA-free", "Keeps cold 24h", "Lightweight 200g"]
}
}
Harper resolves the relationship and joins the Product record in a single database read — no extra round-trips.
Invalidating Descriptions When Products Change
Updating the product is all it takes. Because DescriptionGenerator extends the Product table, Harper detects the write and invalidates the cached description in the same operation; no separate invalidation request is needed. Send a PATCH with the changed fields:
curl -X PATCH 'http://localhost:9926/Product/prod-001' \
-H 'Content-Type: application/json' \
-d '{"features": ["BPA-free", "Keeps cold 48h", "Lightweight 180g"]}'
await fetch('http://localhost:9926/Product/prod-001', {
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ features: ['BPA-free', 'Keeps cold 48h', 'Lightweight 180g'] }),
});
HTTP/1.1 204 No Content
The next GET /ProductDescription/prod-001 triggers a new LLM call with the updated features. Every subsequent request serves the new cached description instantly.
Handling Errors Gracefully
LLM APIs can fail — rate limits, network errors, service outages. By default, Harper will surface a 500 error to the client if DescriptionGenerator.get() throws. For a production cache you may want to serve the stale description if the LLM is temporarily unavailable.
Add the stale-if-error Cache-Control directive to your request to accept a stale cached response when the source returns an error:
curl -i 'http://localhost:9926/ProductDescription/prod-001' \
-H 'Cache-Control: stale-if-error'
const response = await fetch('http://localhost:9926/ProductDescription/prod-001', {
headers: { 'Cache-Control': 'stale-if-error' },
});
With stale-if-error, Harper returns the most recently cached description rather than propagating the upstream error — a sensible default for AI-generated marketing copy where slightly stale content is better than a broken page.
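You can also reduce how often errors reach the cache layer at all by retrying transient failures inside the generator before throwing. A sketch (withRetry is a hypothetical helper; the attempt count and delays are arbitrary starting points):

```javascript
// Retry a flaky async call with exponential backoff before giving up.
// Transient LLM failures (429s, timeouts) often succeed on a second attempt.
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // Exponential backoff: 500 ms, 1000 ms, 2000 ms, ...
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Exhausted all attempts: let the error propagate so Harper can
  // surface it (or serve a stale entry when the client sent stale-if-error).
  throw lastError;
}
```

Inside DescriptionGenerator.get(), you would wrap the API call as const response = await withRetry(() => fetch(...)); errors that survive every attempt still flow into the stale-if-error path above.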
Going Further
This guide used the passive caching pattern: Harper fetches from the source on demand. For high-traffic applications, you may want to proactively populate the cache — for example, pre-generating descriptions for all products at startup or on a schedule. This is covered in the Active Caching and Subscriptions guide (coming soon).
You might also consider:
- Category-wide invalidation: add a Product patch handler that iterates tables.ProductDescription and calls invalidate on each affected record when a category-level change touches many products at once.
- Version-aware ETags: include a version field in DescriptionGenerator.get() so clients can detect stale descriptions proactively rather than waiting for a server-side invalidation.
- Cost tracking: log generatedAt changes to measure how often you are actually hitting the LLM versus serving from cache.
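A category-wide sweep has this general shape. The table methods below (search, invalidate) are stand-in names, passed in as parameters so the logic is testable in isolation; check Harper's Resource API reference for the exact calls before relying on them:

```javascript
// Shape of a category-wide invalidation sweep. `productTable.search` and
// `descriptionTable.invalidate` are stand-ins for the real Harper table
// APIs, injected so this sketch can run against mocks.
async function invalidateCategory(productTable, descriptionTable, category) {
  let invalidated = 0;
  // Iterate every product in the affected category...
  for await (const product of productTable.search({ category })) {
    // ...and invalidate its cached description (same ID as the product)
    await descriptionTable.invalidate(product.id);
    invalidated += 1;
  }
  return invalidated;
}
```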
Additional Resources
- Caching with Harper — foundational guide covering sourcedFrom, ETags, and TTL expiration
- Resource API — sourcedFrom, invalidate, getContext, static methods
- Database Schema — @table(expiration:) directive reference
- REST Headers — ETag, If-None-Match, Cache-Control directives
- harper-ecommerce-template — full ecommerce application using Harper with AI-generated content