Caching AI Generations with Harper
AI API calls are expensive. Generating a product description, summarizing an article, or personalizing a recommendation with a large language model can cost fractions of a cent per call — but those fractions add up fast at scale. And in most applications, the same content gets generated over and over: the same product page viewed by thousands of users, the same document summarized dozens of times.
Harper's caching system is a natural fit for this problem. You can generate AI content once, cache it close to your users, and serve it instantly on every subsequent request. When the underlying data changes, you invalidate the cached generation and let it be regenerated on the next access.
In this guide you will build a product description endpoint backed by an LLM, wrap it in a Harper cache table, and implement an invalidation strategy so descriptions stay fresh when product data changes.
What You Will Learn
- How to build a Resource class that calls an AI API and returns generated content
- How to cache AI generations in a Harper table with an appropriate TTL
- How cached descriptions are invalidated automatically when product data changes
- How to use ETags and conditional requests to avoid redundant content delivery downstream
Prerequisites
- Completed Caching with Harper (recommended — this guide builds directly on those concepts)
- Working Harper installation (local or Fabric)
- An OpenAI API key (or another compatible LLM provider)
- A command-line HTTP client (curl recommended) or familiarity with fetch
The Problem This Solves
Consider a product catalog with hundreds of items. Each product needs a compelling description tailored to its attributes. Generating those descriptions with an LLM produces higher-quality copy than static text, and you can re-generate them easily as your brand voice evolves.
The challenge: you cannot call the LLM on every product page load. A single generation might take 1–3 seconds and cost money. Instead, you generate on first access, cache the result, and regenerate only when needed.
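The shape of that pattern, independent of Harper, is plain memoization with a TTL. Here is a minimal in-process sketch (illustrative only; the rest of this guide replaces it with Harper's cache tables):

```javascript
// Minimal generate-once, cache, expire pattern. This is illustrative —
// Harper's cache tables replace this entire mechanism.
function createCache(generate, ttlMs) {
  const store = new Map(); // id -> { value, expiresAt }
  return async function get(id) {
    const entry = store.get(id);
    if (entry && entry.expiresAt > Date.now()) {
      return entry.value; // cache hit: no generation cost
    }
    const value = await generate(id); // cache miss: pay for generation once
    store.set(id, { value, expiresAt: Date.now() + ttlMs });
    return value;
  };
}
```

Harper gives you the same semantics, plus persistence, replication, and HTTP caching headers, without hand-rolling any of this.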
Harper makes this straightforward:
- A ProductDescription cache table stores generated descriptions with a long TTL (24 hours).
- A DescriptionGenerator Resource class calls the LLM and returns the generated text.
- sourcedFrom connects the generator to the cache — Harper handles all the caching logic.
- The @export directive on ProductDescription provides the REST endpoint; cached descriptions are invalidated automatically when product data changes.
Setting Up the Application
Clone the example repository:
git clone https://github.com/HarperFast/ai-cache-example.git harper-ai-cache
cd harper-ai-cache
Create a .env file with your API key:
OPENAI_API_KEY=sk-...
Then start Harper:
harper dev .
Defining the Schema
Open schema.graphql. There are two tables: Product holds the product catalog, and ProductDescription is the cache table for AI-generated descriptions.
type Product @table @export {
  id: ID @primaryKey
  name: String @indexed
  category: String @indexed
  price: Float
  features: [String]
}

type ProductDescription @table(expiration: 86400) @export {
  id: ID @primaryKey
  description: String
  generatedAt: Long
  product: Product @relationship(from: id)
}
The product field on ProductDescription uses @relationship(from: id) to declare that its own primary key (id) is a foreign key pointing to Product. Since ProductDescription records share the same ID as the product they describe, this gives consumers a way to fetch the full product details alongside the description in a single request.
Both tables use @export — Harper's auto-generated REST endpoints are sufficient for reading and writing. Invalidation is handled automatically by Harper's caching layer rather than through a separate endpoint.
ProductDescription has a 24-hour TTL (expiration: 86400). A one-day window is reasonable for marketing copy — it's long enough to avoid redundant generation, and short enough that descriptions stay reasonably current even without explicit invalidation.
Configuring the Application
Open config.yaml and enable the required plugins:
graphqlSchema:
  files: 'schema.graphql'
rest: true
jsResource:
  files: 'resources.js'
- graphqlSchema loads schema.graphql and creates both tables.
- rest enables Harper's REST API on port 9926.
- jsResource loads resources.js, registering the generator source and its sourcedFrom connection to the cache table.
Seeding the Product Catalog
Rather than seeding data from resources.js, this application uses Harper's built-in Data Loader to populate the Product table from a JSON file on startup. Add dataLoader to config.yaml:
graphqlSchema:
  files: 'schema.graphql'
rest: true
jsResource:
  files: 'resources.js'
dataLoader:
  files: 'data/products.json'
Then create data/products.json:
{
  "table": "Product",
  "records": [
    {
      "id": "prod-001",
      "name": "Titanium Water Bottle",
      "category": "Outdoors",
      "price": 39.99,
      "features": ["BPA-free", "Keeps cold 24h", "Lightweight 200g"]
    },
    {
      "id": "prod-002",
      "name": "Noise-Cancelling Headphones",
      "category": "Electronics",
      "price": 199.99,
      "features": ["40h battery", "Foldable", "USB-C charging"]
    }
  ]
}
The Data Loader runs on every startup and deployment. It uses content hashing to skip records that haven't changed, so it's safe to redeploy without duplicating data or overwriting any manual edits.
Building the Description Generator
The DescriptionGenerator class is the upstream source for the ProductDescription cache. Its get method fetches the product from the Product table (which it extends), builds a prompt, and calls the OpenAI API.
// resources.js
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;

class DescriptionGenerator extends tables.Product {
  static async get(productId) {
    // Load the product data from our base class, the Product table
    const product = await super.get(productId);
    if (!product) {
      const error = new Error('Product not found');
      error.statusCode = 404;
      throw error;
    }

    // Build a prompt from the product's attributes
    const prompt = `Write a compelling, two-sentence product description for the following item.
Product name: ${product.name}
Category: ${product.category}
Price: $${product.price}
Features: ${product.features.join(', ')}
Keep the tone enthusiastic but professional.`;

    // Call the OpenAI chat completions API
    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: prompt }],
        max_tokens: 120,
      }),
    });
    if (!response.ok) {
      // Propagate upstream failures so Harper can surface the error
      // (or serve a stale entry when the client sends stale-if-error)
      throw new Error(`OpenAI API error: ${response.status}`);
    }

    const result = await response.json();
    const description = result.choices[0].message.content.trim();
    return {
      id: productId,
      description,
      generatedAt: Date.now(),
    };
  }
}

tables.ProductDescription.sourcedFrom(DescriptionGenerator);
With sourcedFrom in place, Harper now handles all caching behavior automatically:
- Cache miss: the first request for /ProductDescription/prod-001 calls DescriptionGenerator.get(), stores the result, and returns it.
- Cache hit: subsequent requests within 24 hours return the stored description instantly — no LLM call.
- Cache expiry: after 24 hours, the next request regenerates the description.
- Cache invalidation: because we are extending a table, Harper automatically invalidates the cache when the source table is updated.
Requesting a Generated Description
With Harper running, request a description for the first product:
curl -i 'http://localhost:9926/ProductDescription/prod-001'
const response = await fetch('http://localhost:9926/ProductDescription/prod-001');
const etag = response.headers.get('etag');
const data = await response.json();
console.log(data.description);
console.log('ETag:', etag);
The first request will take a moment — Harper is calling the OpenAI API. You will get back something like:
HTTP/1.1 200 OK
content-type: application/json
etag: "abCDefGHij"
...
{
"id": "prod-001",
"description": "Stay hydrated on every adventure with this ultralight titanium bottle that keeps your drinks cold for a full 24 hours. BPA-free and weighing just 200g, it's built for those who refuse to compromise between performance and sustainability.",
"generatedAt": 1712500000000
}
Make the same request again immediately:
curl -i 'http://localhost:9926/ProductDescription/prod-001'
const second = await fetch('http://localhost:9926/ProductDescription/prod-001');
console.log(second.status);
HTTP/1.1 200 OK
content-type: application/json
etag: "abCDefGHij"
server-timing: db;dur=1.2
...
{ ... same description ... }
The response is instant. Check the server-timing header — the database read time will be in single-digit milliseconds rather than the seconds the LLM call took.
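Server-Timing is a standard header with a metric;dur=milliseconds format, so checking it programmatically is straightforward. Here is a small parser sketch (the db metric name matches the example output above; other deployments may report different metric names):

```javascript
// Parse a Server-Timing header such as "db;dur=1.2, cache;dur=0.1"
// into a plain object: { db: 1.2, cache: 0.1 }.
function parseServerTiming(header) {
  const durations = {};
  if (!header) return durations;
  for (const metric of header.split(',')) {
    const [name, ...params] = metric.trim().split(';');
    for (const param of params) {
      const [key, value] = param.trim().split('=');
      if (key === 'dur') durations[name] = parseFloat(value);
    }
  }
  return durations;
}
```

Comparing durations.db across a cold and a warm request makes the cache hit visible in code rather than by eyeballing headers.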
Using ETags to Avoid Redundant Transfers
Harper automatically includes an ETag header with every record response. You can use If-None-Match to avoid re-transferring a description your client already has:
# Use the etag value from the previous response, double quotes included
curl -i 'http://localhost:9926/ProductDescription/prod-001' \
-H 'If-None-Match: "abCDefGHij"'
const first = await fetch('http://localhost:9926/ProductDescription/prod-001');
const etag = first.headers.get('etag'); // e.g. "abCDefGHij"
const second = await fetch('http://localhost:9926/ProductDescription/prod-001', {
headers: { 'If-None-Match': etag },
});
console.log(second.status);
HTTP/1.1 304 Not Modified
etag: "abCDefGHij"
A 304 Not Modified response means the cached description your client holds is still current. No data is serialized or transmitted. This layering — Harper's internal cache plus HTTP conditional requests for downstream caches — means a regeneration event only propagates to clients when their cached copy actually becomes stale.
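On the client side, that layering can be wrapped in a small helper that stores each body with its ETag and revalidates on every read. This is a sketch, not a library; fetchFn is injectable so the helper works with any fetch-compatible function:

```javascript
// Client-side conditional-request cache: keep the last body and ETag per
// URL, revalidate with If-None-Match, and reuse the stored body on a 304.
function createEtagCache(fetchFn = fetch) {
  const store = new Map(); // url -> { etag, body }
  return async function get(url) {
    const cached = store.get(url);
    const headers = cached ? { 'If-None-Match': cached.etag } : {};
    const response = await fetchFn(url, { headers });
    if (response.status === 304) {
      return cached.body; // still fresh: nothing re-transferred
    }
    const body = await response.json();
    store.set(url, { etag: response.headers.get('etag'), body });
    return body;
  };
}
```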
Querying a Description with Its Product
Because ProductDescription declares a @relationship to Product, you can fetch the description and the full product record together using the select query parameter:
curl -s 'http://localhost:9926/ProductDescription/prod-001?select(description,generatedAt,product(name,price,features))'
const response = await fetch(
'http://localhost:9926/ProductDescription/prod-001?select(description,generatedAt,product(name,price,features))'
);
const data = await response.json();
console.log(data);
{
"description": "Stay hydrated on every adventure...",
"generatedAt": 1712500000000,
"product": {
"name": "Titanium Water Bottle",
"price": 39.99,
"features": ["BPA-free", "Keeps cold 24h", "Lightweight 200g"]
}
}
Harper resolves the relationship and joins the Product record in a single database read — no extra round-trips.
Invalidating Descriptions When Products Change
Updating the product is all it takes. Because DescriptionGenerator extends the Product table, Harper detects the write and invalidates the cached description in the same operation; no separate invalidation request is needed. Send a PATCH with the changed fields:
curl -X PATCH 'http://localhost:9926/Product/prod-001' \
-H 'Content-Type: application/json' \
-d '{"features": ["BPA-free", "Keeps cold 48h", "Lightweight 180g"]}'
await fetch('http://localhost:9926/Product/prod-001', {
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ features: ['BPA-free', 'Keeps cold 48h', 'Lightweight 180g'] }),
});
HTTP/1.1 204 No Content
The next GET /ProductDescription/prod-001 triggers a new LLM call with the updated features. Every subsequent request serves the new cached description instantly.
Handling Errors Gracefully
LLM APIs can fail — rate limits, network errors, service outages. By default, Harper will surface a 500 error to the client if DescriptionGenerator.get() throws. For a production cache you may want to serve the stale description if the LLM is temporarily unavailable.
Add the stale-if-error Cache-Control directive to your request to accept a stale cached response when the source returns an error:
curl -i 'http://localhost:9926/ProductDescription/prod-001' \
-H 'Cache-Control: stale-if-error'
const response = await fetch('http://localhost:9926/ProductDescription/prod-001', {
headers: { 'Cache-Control': 'stale-if-error' },
});
With stale-if-error, Harper returns the most recently cached description rather than propagating the upstream error — a sensible default for AI-generated marketing copy where slightly stale content is better than a broken page.
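You can also reduce how often errors reach the cache layer at all by retrying transient failures inside the generator before throwing. A sketch (withRetry is a hypothetical helper; the attempt count and delays are arbitrary starting points):

```javascript
// Retry a flaky async call with exponential backoff before giving up.
// Transient LLM failures (429s, timeouts) often succeed on a second attempt.
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // Exponential backoff: 500 ms, 1000 ms, 2000 ms, ...
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Exhausted all attempts: let the error propagate so Harper can
  // surface it (or serve a stale entry when the client sent stale-if-error).
  throw lastError;
}
```

Inside DescriptionGenerator.get(), you would wrap the API call as const response = await withRetry(() => fetch(...)); errors that survive every attempt still flow into the stale-if-error path above.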
Going Further
This guide used the passive caching pattern: Harper fetches from the source on demand. For high-traffic applications, you may want to proactively populate the cache — for example, pre-generating descriptions for all products at startup or on a schedule. This is covered in the Active Caching and Subscriptions guide (coming soon).
You might also consider:
- Category-wide invalidation: add a Product patch handler that iterates tables.ProductDescription and calls invalidate on each affected record when a category-level change touches many products at once.
- Version-aware ETags: include a version field in DescriptionGenerator.get() so clients can detect stale descriptions proactively rather than waiting for a server-side invalidation.
- Cost tracking: log generatedAt changes to measure how often you are actually hitting the LLM versus serving from cache.
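A category-wide sweep has this general shape. The table methods below (search, invalidate) are stand-in names, passed in as parameters so the logic is testable in isolation; check Harper's Resource API reference for the exact calls before relying on them:

```javascript
// Shape of a category-wide invalidation sweep. `productTable.search` and
// `descriptionTable.invalidate` are stand-ins for the real Harper table
// APIs, injected so this sketch can run against mocks.
async function invalidateCategory(productTable, descriptionTable, category) {
  let invalidated = 0;
  // Iterate every product in the affected category...
  for await (const product of productTable.search({ category })) {
    // ...and invalidate its cached description (same ID as the product)
    await descriptionTable.invalidate(product.id);
    invalidated += 1;
  }
  return invalidated;
}
```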
Additional Resources
- Caching with Harper — foundational guide covering sourcedFrom, ETags, and TTL expiration
- Resource API — sourcedFrom, invalidate, getContext, static methods
- Database Schema — @table(expiration:) directive reference
- REST Headers — ETag, If-None-Match, Cache-Control directives
- harper-ecommerce-template — full ecommerce application using Harper with AI-generated content