Semantic Compression

Compresses context by preserving meaning and constraints while discarding surface wording to fit within token limits.

Also known as: Prompt Compression, Semantic Context Packing, Intent-Preserving Compression

Problem

When context has to be shortened to fit token limits, human-readable summaries often drop the constraints and edge-case details that downstream reasoning depends on.

Solution

Compress for semantic preservation rather than prose readability: retain decisions, constraints, and dependencies in dense form, and discard narrative. This carries functional meaning forward at lower token cost, with less risk of a constraint being silently dropped.

Example

One worked example. The book has 2–4 per technique.

Prompt

Compress the following architecture decisions. Preserve all constraints, rationale, and dependencies. Discard phrasing and narrative.

1. Microservices architecture with domain-bounded services.
2. All inter-service communication must be asynchronous via message bus, except health check endpoints.
3. Authentication via JWT with RS256 signing, 15-minute token expiry.
4. Each service owns its database (PostgreSQL); no shared database access.
5. Deployment on Kubernetes, one namespace per service.
6. Error responses follow RFC 7807 Problem Details format.
7. API versioning via URL path (/v1/, /v2/).
8. Rate limiting per tenant, 429 status with Retry-After header.

Response

arch: microservices, domain-bounded | comm: async-only via msg-bus (except health-check endpoints) | auth: JWT RS256 15min-exp | data: pg-per-svc, no shared-db | deploy: k8s ns-per-svc | errors: RFC7807 | versioning: URL-path /v1/ /v2/ | rate-limit: per-tenant 429+Retry-After
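One way to guard against silent loss is to check that the hard literals from the source (algorithm names, status codes, paths) still appear in the compressed form. The following is a minimal sketch, not part of the original example: `missing_literals` is a hypothetical helper, the list of required literals is chosen by hand, and the compressed string is a subset of the fields from the response above.

```python
def missing_literals(compressed: str, required: list[str]) -> list[str]:
    """Return the required literals that do not appear in the compressed text.

    Matching is a case-insensitive substring test, which is a deliberately
    crude assumption; it cannot detect a literal that survived but changed
    meaning.
    """
    haystack = compressed.lower()
    return [lit for lit in required if lit.lower() not in haystack]


# A subset of the compressed fields from the worked example.
compressed = (
    "auth: JWT RS256 15min-exp | errors: RFC7807 | "
    "versioning: URL-path /v1/ /v2/ | rate-limit: per-tenant 429+Retry-After"
)

# Hand-picked literals that downstream reasoning would need (an assumption).
required = ["RS256", "15min", "RFC7807", "/v1/", "/v2/", "429", "Retry-After"]

print(missing_literals(compressed, required))  # []
```

An empty list only shows that the literals are present. It does not show that the relationships between them were preserved, so it complements a review of the compressed output rather than replacing one.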

Techniques

Concrete ways to implement Semantic Compression. Each technique fits a different situation.

01 Semantic Condensation
Compress the full content into a dense representation that preserves intent, constraints, and relationships while discarding narrative and redundancy.
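When the decisions are already available as structured data, the dense representation can be rendered deterministically instead of being generated by a model. The sketch below covers only that rendering step and assumes the facts have already been extracted from the prose. The `condense` helper and the `category: fact, fact | category: ...` layout (borrowed from the example response) are illustrative assumptions.

```python
def condense(decisions: dict[str, list[str]]) -> str:
    """Render {category: [facts]} as 'category: fact, fact | category: ...'."""
    parts = []
    for category, facts in decisions.items():
        # Collapse internal whitespace and skip blank facts; order is kept
        # because later facts may depend on earlier ones.
        cleaned = [" ".join(fact.split()) for fact in facts if fact.strip()]
        if cleaned:
            parts.append(f"{category}: {', '.join(cleaned)}")
    return " | ".join(parts)


decisions = {
    "arch": ["microservices", "domain-bounded"],
    "auth": ["JWT RS256 15min-exp"],
    "data": ["pg-per-svc", "no shared-db"],
}

print(condense(decisions))
# arch: microservices, domain-bounded | auth: JWT RS256 15min-exp | data: pg-per-svc, no shared-db
```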
02 Selective Reduction
Produce a smaller task-scoped artifact by dropping entire semantic categories irrelevant to the downstream task while preserving what remains.
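Applied to a compressed string in the `category: value | ...` layout used in the example, selective reduction can be as simple as keeping only the fields a task needs. This is a rough sketch under that assumption. `reduce_for_task` is a hypothetical helper, and deciding which categories a task needs is left to the caller.

```python
def reduce_for_task(compressed: str, keep: set[str]) -> str:
    """Keep only 'category: value' fields whose category is in `keep`."""
    kept = []
    for field in compressed.split(" | "):
        # The category is everything before the first colon.
        category, _, _ = field.partition(":")
        if category.strip() in keep:
            kept.append(field)
    return " | ".join(kept)


compressed = (
    "arch: microservices, domain-bounded | auth: JWT RS256 15min-exp | "
    "deploy: k8s ns-per-svc | errors: RFC7807 | "
    "rate-limit: per-tenant 429+Retry-After"
)

# Hypothetical downstream task: writing an API error handler.
print(reduce_for_task(compressed, {"auth", "errors", "rate-limit"}))
# auth: JWT RS256 15min-exp | errors: RFC7807 | rate-limit: per-tenant 429+Retry-After
```

Whole categories are dropped, but the fields that remain are passed through unchanged, so no constraint inside a kept category is weakened.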
03 Progressive Compression
Periodically compress the older portion of an accumulating context while keeping recent material verbatim so long sessions stay within token limits.
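The loop described above can be sketched as follows. All names here are illustrative assumptions: `progressive_compress` is a hypothetical helper, `compress` stands in for whatever model call produces the dense form, and the default token counter is a whitespace word count rather than a real tokenizer.

```python
from typing import Callable


def progressive_compress(
    messages: list[str],
    keep_recent: int,
    max_tokens: int,
    compress: Callable[[str], str],
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> list[str]:
    """Compress all but the last `keep_recent` messages once over budget."""
    total = sum(count_tokens(m) for m in messages)
    if total <= max_tokens or len(messages) <= keep_recent:
        return messages  # within budget, or nothing old enough to compress

    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    # The older portion becomes one dense entry; recent turns stay verbatim.
    return [compress("\n".join(older))] + recent


# Placeholder compressor (an assumption): a real one would call a model.
def stub(text: str) -> str:
    return f"<summary of {len(text.splitlines())} msgs>"


history = [
    "decide: use pg",
    "decide: JWT RS256",
    "q: rate limits?",
    "a: per-tenant 429",
]

print(progressive_compress(history, keep_recent=2, max_tokens=5, compress=stub))
# ['<summary of 2 msgs>', 'q: rate limits?', 'a: per-tenant 429']
```

Calling the function after each new message keeps the session bounded. Note that an earlier compressed entry is itself compressed again on later passes, so any loss compounds over a long session.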
