Taxonomy & Data Model
The entity model behind the browse philosophy. This document covers the shapes and constraints; API shapes live in API.md; philosophy in AgnosticUI.md.
The model is content-centric. We have one content kind (Content) and the taxonomy that classifies it: Surface, Group, Channel, Tag, and Platform. The schema supports adding more content kinds later without structural change.
1. Overview
Surface (1) ← Layer 1, typically one: "Content"
├── Group (n) ← Layer 2: topic buckets
│ └── Channel (n) ← Layer 3 (optional): sub-topics
└── …
Tag (n) ← shared vocabulary, faceted
Platform (n) ← source platform badges (youtube, bluesky, …)
Content (n) ← the items themselves
├── surfaceSlug (denormalized ref)
├── groupSlug (denormalized ref)
├── channelSlug? (denormalized ref)
├── platformSlug (denormalized ref)
├── tagSlugs[] (multikey index)
└── approvalStatus (pending | approved | rejected)
Denormalization by slug is intentional: list/filter queries run against Content alone, without joins. Slugs are stable (rename is atomic; see API.md §3).
2. Surface (Layer 1)
Usually one. Supports future kinds without structural change.
@Schema({ timestamps: true, collection: 'surfaces' })
class Surface {
@Prop({ required: true, unique: true }) slug: string; // e.g. "content"
@Prop({ required: true }) label: string; // "Content"
@Prop({ required: true }) routePrefix: string; // "/browse"
@Prop({ required: true, default: 'topic-groups' }) browseMode: 'topic-groups' | 'alphabetical' | 'faceted-primary' | 'entity-linked';
@Prop() accentColor?: string; // hex
@Prop() icon?: string; // name or URL
@Prop({ default: 0 }) displayOrder: number;
@Prop({ default: true }) isVisible: boolean;
@Prop({ default: true }) isActive: boolean;
@Prop() minPermission?: string; // e.g. "content.moderate"
}
Indexes: { slug: 1 } unique.
3. Group (Layer 2)
@Schema({ timestamps: true, collection: 'groups' })
class Group {
@Prop({ required: true, ref: 'Surface' }) surfaceSlug: string;
@Prop({ required: true }) slug: string; // unique within surfaceSlug
@Prop({ required: true }) label: string;
@Prop() color?: string;
@Prop() icon?: string;
@Prop({ default: 0 }) displayOrder: number;
@Prop({ default: true }) isActive: boolean;
}
Indexes: { surfaceSlug: 1, slug: 1 } unique.
4. Channel (Layer 3, optional)
@Schema({ timestamps: true, collection: 'channels' })
class Channel {
@Prop({ required: true, ref: 'Group' }) groupSlug: string;
@Prop({ required: true, ref: 'Surface' }) surfaceSlug: string; // denormalized
@Prop({ required: true }) slug: string; // unique within groupSlug
@Prop({ required: true }) label: string;
@Prop({ default: 0 }) displayOrder: number;
@Prop({ default: true }) isActive: boolean;
}
Indexes: { groupSlug: 1, slug: 1 } unique; { surfaceSlug: 1, groupSlug: 1 } for catalog assembly.
5. Platform
A small, mostly-static list of source platforms that power the platform filter and badge color.
@Schema({ timestamps: true, collection: 'platforms' })
class Platform {
@Prop({ required: true, unique: true }) slug: string; // "youtube"
@Prop({ required: true }) label: string; // "YouTube"
@Prop() color?: string; // brand hex
@Prop({ default: 0 }) displayOrder: number;
@Prop({ default: true }) isActive: boolean;
}
Seeded at bootstrap with a starter set (youtube, twitter, bluesky, reddit, article, generic); operators can add more via the admin surface.
6. Tag
Shared vocabulary for all taggable content. Single collection; slug is the natural primary key; facet classifies meaning.
@Schema({ timestamps: true, collection: 'tags' })
class Tag {
@Prop({ required: true, unique: true }) slug: string;
@Prop({ required: true }) label: string; // display form
@Prop({ required: true, default: 'unclassified' })
facet: 'topic' | 'format' | 'level' | 'source-type' | 'unclassified';
@Prop() description?: string;
@Prop({ default: 0 }) usageCount: number; // denormalized, see below
@Prop({ default: true }) isActive: boolean;
}
usageCount denormalization
Updated on content state transitions:
- On approve: increment counts for each tag in the approved content.
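- On soft-delete, or on any reject of previously approved content: decrement. Rejecting a pending item leaves counts untouched, since it was never counted.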
- On tag edit of approved content: apply diff.
A nightly reconciliation job (countDocuments({ tagSlugs: slug, approvalStatus: 'approved', isActive: true }) per tag) corrects drift. It is optional in v1 and becomes necessary at scale.
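The "apply diff" rule for tag edits can be sketched as a pure function. This is a minimal illustration, not the project's implementation: the helper names tagUsageDelta and toBulkOps are hypothetical, and the bulkWrite-style operation shape is only one way the increments could be sent to the tags collection.

```typescript
// Hypothetical helper names; only the tag slugs and usageCount come from the schema above.
type UsageDelta = Record<string, number>;

/** Per-tag change in usageCount when an approved item's tags go from `before` to `after`. */
function tagUsageDelta(before: string[], after: string[]): UsageDelta {
  const prev = new Set(before);
  const next = new Set(after);
  const delta: UsageDelta = {};
  for (const slug of after) if (!prev.has(slug)) delta[slug] = 1; // newly added tag
  for (const slug of before) if (!next.has(slug)) delta[slug] = -1; // removed tag
  return delta;
}

/** Shape the delta as MongoDB bulkWrite-style updateOne operations with $inc. */
function toBulkOps(delta: UsageDelta) {
  return Object.keys(delta).map((slug) => ({
    updateOne: { filter: { slug }, update: { $inc: { usageCount: delta[slug] } } },
  }));
}

// Example edit: "react" is dropped, "a11y" is added, "css" is untouched.
const tagEditDelta = tagUsageDelta(["react", "css"], ["css", "a11y"]);
const tagEditOps = toBulkOps(tagEditDelta);
```

Unchanged tags produce no write, so an edit that only reorders tags costs nothing.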
Rename
Atomic, transactional, cascade to Content.tagSlugs. See API.md §3.
Never delete
Rename or deactivate (isActive=false). Deletion loses the audit trail of what a content item meant at a point in time.
Indexes: { slug: 1 } unique; { facet: 1, slug: 1 } for the tag explorer's facet-grouped view; { usageCount: -1 } for by-usage sort.
7. Content
The item being browsed. Existing User/Role schemas stay as-is.
@Schema({ timestamps: true, collection: 'content' })
class Content {
@Prop({ required: true, unique: true }) slug: string;
@Prop({ required: true }) title: string;
@Prop() browseTitle?: string; // curator override
@Prop() description?: string;
@Prop({ required: true }) url: string; // canonical link
@Prop({ required: true, ref: 'Platform' }) platformSlug: string;
// taxonomy links (denormalized by slug)
@Prop({ required: true, ref: 'Surface' }) surfaceSlug: string;
@Prop({ required: true, ref: 'Group' }) groupSlug: string;
@Prop({ ref: 'Channel' }) channelSlug?: string;
@Prop({ type: [String], default: [] }) tagSlugs: string[];
// media
@Prop() thumbnailUrl?: string;
@Prop({ type: Object }) thumbnailMip?: { small: string; medium: string; large: string };
@Prop({ type: Object }) embed?: { provider: string; html?: string; width?: number; height?: number };
// submission + approval
@Prop({ required: true, ref: 'User' }) submittedBy: Types.ObjectId;
@Prop({ required: true, default: 'pending' })
approvalStatus: 'pending' | 'approved' | 'rejected';
@Prop() approvedAt?: Date; // set on transition → approved
@Prop({ type: Object }) approvalMeta?: { actorUserId: Types.ObjectId; actorAt: Date; reason?: string };
// soft delete
@Prop({ default: true }) isActive: boolean;
}
Required indexes
| Index | Purpose |
|---|---|
| { slug: 1 } unique | Detail lookup |
| { approvalStatus: 1, isActive: 1, approvedAt: -1, _id: -1 } | Default feed (newest approved) |
| { tagSlugs: 1 } multikey | Tag filter (hot path) |
| { groupSlug: 1, approvedAt: -1 } | Group view |
| { groupSlug: 1, channelSlug: 1, approvedAt: -1 } | Channel view |
| { platformSlug: 1, approvedAt: -1 } | Platform filter |
| Text index on { title: 'text', description: 'text' } | q= text search |
| { submittedBy: 1, createdAt: -1 } | "My submissions" queries |
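MongoDB usually serves a query from a single index, and index intersection is best-effort, so do not design around it. The default-feed compound index covers the unfiltered case, and the tagSlugs multikey index handles tag-narrowed filters; the planner may intersect indexes where it judges that cheaper.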
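To show how list filters line up with the indexes above, here is a minimal sketch of a filter builder. The query-parameter names (tag, group, channel, platform, q) and the function name are assumptions for illustration; the real request shape is defined in API.md.

```typescript
// Hypothetical query-parameter names; the real request shape lives in API.md.
interface ListQuery {
  tag?: string;
  group?: string;
  channel?: string;
  platform?: string;
  q?: string;
}

// MongoDB-style filter over the Content collection only (no joins).
interface ContentFilter {
  approvalStatus: "approved";
  isActive: true;
  groupSlug?: string;
  channelSlug?: string;
  platformSlug?: string;
  tagSlugs?: string;
  $text?: { $search: string };
}

function buildContentFilter(query: ListQuery): ContentFilter {
  // Public lists only ever show approved, non-deleted items.
  const filter: ContentFilter = { approvalStatus: "approved", isActive: true };
  if (query.group) filter.groupSlug = query.group;
  if (query.channel) filter.channelSlug = query.channel;
  if (query.platform) filter.platformSlug = query.platform;
  // Equality against an array field matches any element, which is what the multikey index serves.
  if (query.tag) filter.tagSlugs = query.tag;
  if (query.q) filter.$text = { $search: query.q };
  return filter;
}

// Sort matching the tail of the default-feed index; _id breaks ties between equal approvedAt values.
const FEED_SORT = { approvedAt: -1, _id: -1 } as const;

const defaultFeedFilter = buildContentFilter({});
const tagInGroupFilter = buildContentFilter({ tag: "css", group: "frontend" });
```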
Approval state machine
pending ─approve→ approved
pending ─reject→ rejected
approved ─soft-delete→ (isActive=false; status unchanged)
rejected ─revive→ pending (moderator only)
Illegal transitions (approve an already-rejected item without re-queueing; approve a soft-deleted item) return 422 with error.code = "content.state_invalid".
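The status transitions can be written as a lookup table. The sketch below is illustrative only: the names TRANSITIONS and transition are hypothetical, soft-delete is left out because it flips isActive rather than the status, and idempotent repeats (see "Idempotency" below) are assumed to be handled before the table is consulted.

```typescript
type ApprovalStatus = "pending" | "approved" | "rejected";
type ModerationAction = "approve" | "reject" | "revive";

// Legal status changes, taken from the state machine above.
const TRANSITIONS: Record<ApprovalStatus, Partial<Record<ModerationAction, ApprovalStatus>>> = {
  pending: { approve: "approved", reject: "rejected" },
  approved: {},
  rejected: { revive: "pending" }, // moderator only; the permission check is not shown here
};

type TransitionResult =
  | { ok: true; next: ApprovalStatus }
  | { ok: false; httpStatus: 422; code: "content.state_invalid" };

function transition(from: ApprovalStatus, action: ModerationAction): TransitionResult {
  const next = TRANSITIONS[from][action];
  if (next === undefined) {
    return { ok: false, httpStatus: 422, code: "content.state_invalid" };
  }
  return { ok: true, next };
}

const approvePending = transition("pending", "approve");
const approveRejected = transition("rejected", "approve"); // must be revived first
```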
Submit flow
POST /content/submit with { url, title, platformSlug?, groupSlug?, channelSlug?, tagSlugs? }:
- Validate URL (valid URL, 2–2048 chars).
- Validate platformSlug if supplied; infer from URL hostname if missing (simple map: youtube.com → youtube, x.com/twitter.com → twitter, bsky.app → bluesky, reddit.com → reddit, else generic).
- Validate groupSlug is active; fall back to a configured default group if unspecified.
- Validate every tag slug exists and is active; reject with 400 if any doesn't (do not auto-create tags from submissions).
- Generate a slug from title; on conflict, append a short random suffix.
- Insert with approvalStatus: 'pending', submittedBy: <user>.
- Fire content.submitted event (in-process EventEmitter v1; queue later).
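The URL validation, platform inference, and slug steps above could look roughly like this. All function names are hypothetical. Stripping a leading "www." before the hostname lookup and the exact slug character set are assumptions the source does not specify.

```typescript
// Hostname map from the submit flow; anything unlisted falls back to "generic".
const HOST_TO_PLATFORM = new Map<string, string>([
  ["youtube.com", "youtube"],
  ["x.com", "twitter"],
  ["twitter.com", "twitter"],
  ["bsky.app", "bluesky"],
  ["reddit.com", "reddit"],
]);

/** Length bound (2 to 2048 chars) plus a parse check. */
function isValidSubmissionUrl(raw: string): boolean {
  if (raw.length < 2 || raw.length > 2048) return false;
  try {
    new URL(raw);
    return true;
  } catch {
    return false;
  }
}

/** Assumes `raw` already passed isValidSubmissionUrl. */
function inferPlatformSlug(raw: string): string {
  // Assumption: a leading "www." is ignored; other subdomains are not normalized.
  const host = new URL(raw).hostname.toLowerCase().replace(/^www\./, "");
  return HOST_TO_PLATFORM.get(host) ?? "generic";
}

/** Lowercase, collapse runs of non-alphanumerics to "-", trim stray dashes. */
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

/** On slug conflict, append a short base-36 suffix. `random` is injectable for tests. */
function withRandomSuffix(base: string, random: () => number = Math.random): string {
  const suffix = Math.floor(random() * Math.pow(36, 4)).toString(36);
  return `${base}-${suffix}`;
}
```

A suffixed slug can still collide, so a real insert path would retry or rely on the unique index on slug to reject the duplicate.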
Approval flow
POST /content/<slug>/approve:
- Load content; 404 if missing or inactive.
- 422 if not pending, except that an already-approved item is handled idempotently (see "Idempotency" below).
- Update approvalStatus: 'approved', approvedAt: now, approvalMeta.
- Fire content.approved event → async thumbnail mip generation, usage-count increment.
- Return updated content in envelope.
POST /content/<slug>/reject is analogous; takes optional reason.
Idempotency. Approving an already-approved item returns the current state with 200 and meta.unchanged: true rather than an error. Rejecting an already-rejected item follows the same pattern.
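The approval steps and the idempotency rule combine into a small decision function. This sketch works on a plain object instead of a database document. The envelope shape and the "content.not_found" code are assumptions, since only "content.state_invalid" and meta.unchanged are named in this document.

```typescript
type ApprovalStatus = "pending" | "approved" | "rejected";

interface ContentDoc {
  slug: string;
  approvalStatus: ApprovalStatus;
  isActive: boolean;
  approvedAt?: Date;
}

// Hypothetical response envelope; the real one is specified in API.md.
interface ApproveResult {
  httpStatus: 200 | 404 | 422;
  data?: ContentDoc;
  meta?: { unchanged: true };
  error?: { code: string };
}

function approve(doc: ContentDoc | undefined, now: Date): ApproveResult {
  // Step 1: missing or soft-deleted content is a 404.
  if (!doc || !doc.isActive) {
    return { httpStatus: 404, error: { code: "content.not_found" } }; // hypothetical code
  }
  // Idempotency: a repeat approval returns the current state, not an error.
  if (doc.approvalStatus === "approved") {
    return { httpStatus: 200, data: doc, meta: { unchanged: true } };
  }
  // Step 2: anything else that is not pending is an illegal transition.
  if (doc.approvalStatus !== "pending") {
    return { httpStatus: 422, error: { code: "content.state_invalid" } };
  }
  // Step 3: apply the transition (approvalMeta and event emission omitted).
  const updated: ContentDoc = { ...doc, approvalStatus: "approved", approvedAt: now };
  return { httpStatus: 200, data: updated };
}

const pendingDoc: ContentDoc = { slug: "intro-to-grid", approvalStatus: "pending", isActive: true };
const firstApproval = approve(pendingDoc, new Date());
const repeatApproval = approve(firstApproval.data, new Date());
```

Because the repeat call returns before any write, the content.approved event and the usage-count increment are not triggered twice.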
8. Catalog derivation
GET /catalog is a derived view. Assembly:
- Query surfaces.find({ isActive: true, isVisible: true }).sort({ displayOrder: 1 }).lean().
- For each surface, query groups (isActive: true, sorted by displayOrder).
- For each group, query channels (isActive: true, sorted by displayOrder).
- Query active platforms and build the platform list.
- Compute facet list from a config (or from distinct tags.facet values).
- Compute etag from a hash of max(updatedAt) across the queried collections.
Assembly happens in a service method that is cached (Nest CacheModule with ttl: 300s, keyed by 'catalog:v1'). Writes to any taxonomy collection invalidate the cache via a central CatalogCacheService.invalidate() hook called from each taxonomy controller's write paths.
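The nesting part of catalog assembly can be sketched over plain arrays. This is an illustration under assumptions: it takes documents that were already loaded instead of issuing per-surface and per-group queries, the function name assembleCatalog is hypothetical, and platforms, facets, etag, and caching are left out.

```typescript
interface TaxonomyDoc {
  slug: string;
  label: string;
  displayOrder: number;
  isActive: boolean;
}
interface SurfaceDoc extends TaxonomyDoc { isVisible: boolean; }
interface GroupDoc extends TaxonomyDoc { surfaceSlug: string; }
interface ChannelDoc extends TaxonomyDoc { surfaceSlug: string; groupSlug: string; }

const byOrder = (a: TaxonomyDoc, b: TaxonomyDoc): number => a.displayOrder - b.displayOrder;

/** Nest channels under groups under surfaces, keeping only active (and visible) entries. */
function assembleCatalog(surfaces: SurfaceDoc[], groups: GroupDoc[], channels: ChannelDoc[]) {
  return surfaces
    .filter((s) => s.isActive && s.isVisible)
    .sort(byOrder)
    .map((s) => ({
      slug: s.slug,
      label: s.label,
      groups: groups
        .filter((g) => g.isActive && g.surfaceSlug === s.slug)
        .sort(byOrder)
        .map((g) => ({
          slug: g.slug,
          label: g.label,
          channels: channels
            .filter((c) => c.isActive && c.surfaceSlug === s.slug && c.groupSlug === g.slug)
            .sort(byOrder)
            .map((c) => ({ slug: c.slug, label: c.label })),
        })),
    }));
}

// Sample data; "design", "news", and "archive" are made-up slugs.
const sampleCatalog = assembleCatalog(
  [{ slug: "content", label: "Content", displayOrder: 0, isActive: true, isVisible: true }],
  [
    { surfaceSlug: "content", slug: "design", label: "Design", displayOrder: 2, isActive: true },
    { surfaceSlug: "content", slug: "general", label: "General", displayOrder: 1, isActive: true },
  ],
  [
    { surfaceSlug: "content", groupSlug: "general", slug: "news", label: "News", displayOrder: 0, isActive: true },
    { surfaceSlug: "content", groupSlug: "general", slug: "archive", label: "Archive", displayOrder: 1, isActive: false },
  ],
);
```

Channels are matched on both surfaceSlug and groupSlug because a channel slug is unique only within its group.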
Handling admin vs public
Hidden surfaces (those with minPermission set) are omitted for callers who lack the permission. The cache is keyed on a derived permission bucket ('public' | 'moderator' | 'admin') so a single cache serves each bucket.
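A sketch of the bucket derivation and the per-bucket filtering follows. The role-to-bucket mapping, the 'catalog:v1:<bucket>' key format, and the permissions attached to each bucket are assumptions; only the bucket names, the base key, and "content.moderate" appear in this document.

```typescript
type Bucket = "public" | "moderator" | "admin";

// Assumption: permissions held by each bucket. "taxonomy.manage" is a made-up example.
const BUCKET_PERMISSIONS: Record<Bucket, string[]> = {
  public: [],
  moderator: ["content.moderate"],
  admin: ["content.moderate", "taxonomy.manage"],
};

/** Collapse a caller's roles into one of three cache buckets. */
function permissionBucket(roles: string[]): Bucket {
  if (roles.indexOf("admin") !== -1) return "admin";
  if (roles.indexOf("moderator") !== -1) return "moderator";
  return "public"; // anonymous callers and plain members share one entry
}

/** Assumed key format: the base catalog key with the bucket appended. */
function catalogCacheKey(bucket: Bucket): string {
  return `catalog:v1:${bucket}`;
}

interface SurfaceSummary {
  slug: string;
  minPermission?: string;
}

/** Drop surfaces whose minPermission the bucket does not hold. */
function surfacesFor(bucket: Bucket, surfaces: SurfaceSummary[]): SurfaceSummary[] {
  const granted = BUCKET_PERMISSIONS[bucket];
  return surfaces.filter(
    (s) => s.minPermission === undefined || granted.indexOf(s.minPermission) !== -1,
  );
}

// "review-queue" is a made-up surface slug for the example.
const sampleSurfaces: SurfaceSummary[] = [
  { slug: "content" },
  { slug: "review-queue", minPermission: "content.moderate" },
];
```

Keying on a bucket rather than on the individual user keeps the number of cached catalog variants fixed at three.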
9. Seeding
On first boot (if surfaces.countDocuments() === 0), the app seeds:
- One surface: { slug: 'content', label: 'Content', routePrefix: '/browse', browseMode: 'topic-groups' }.
- A starter group: { surfaceSlug: 'content', slug: 'general', label: 'General' }.
- The platform list (youtube, twitter, bluesky, reddit, article, generic).
- Roles (admin, moderator, member) per AUTH.md §5.
Seed script: api/scripts/seed.ts (TBD). Safe to re-run — each seed entry uses findOneAndUpdate with upsert: true.
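Since the seed script is still TBD, the following is only a sketch of how its upserts might be planned as data. Each entry carries the arguments a findOneAndUpdate(filter, update, { upsert: true }) call would receive. Using $setOnInsert (so re-runs never overwrite operator edits) is an assumption, as are the platform labels other than "YouTube". Roles are omitted because their shape is defined in AUTH.md.

```typescript
interface SeedOp {
  collection: "surfaces" | "groups" | "platforms";
  filter: Record<string, string>; // natural key; equality fields are copied into the inserted doc
  update: { $setOnInsert: Record<string, string> };
  options: { upsert: true };
}

function upsertOp(
  collection: SeedOp["collection"],
  filter: Record<string, string>,
  fields: Record<string, string>,
): SeedOp {
  return { collection, filter, update: { $setOnInsert: fields }, options: { upsert: true } };
}

// Labels other than "YouTube" are assumptions.
const PLATFORM_LABELS: Array<[string, string]> = [
  ["youtube", "YouTube"],
  ["twitter", "Twitter"],
  ["bluesky", "Bluesky"],
  ["reddit", "Reddit"],
  ["article", "Article"],
  ["generic", "Generic"],
];

function buildSeedOps(): SeedOp[] {
  return [
    upsertOp("surfaces", { slug: "content" }, {
      label: "Content",
      routePrefix: "/browse",
      browseMode: "topic-groups",
    }),
    upsertOp("groups", { surfaceSlug: "content", slug: "general" }, { label: "General" }),
    ...PLATFORM_LABELS.map(([slug, label]) => upsertOp("platforms", { slug }, { label })),
  ];
}

const seedOps = buildSeedOps();
```

Filtering on the natural key (slug, or surfaceSlug plus slug for groups) is what makes a second run a no-op.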
10. Migration posture
No migration framework in v1. When a schema change is needed:
- Mongoose schema change is forward-compatible (default: … on new required fields; type widening).
- A one-off script in api/scripts/migrations/ mutates existing documents.
- Never rename a field in place: add-new, backfill, deprecate, remove.
This is acceptable at current scale (no prod data). Revisit if operational maturity demands a formal tool (e.g. migrate-mongo).
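The add-new, backfill, deprecate, remove sequence can be illustrated with a made-up rename. The field displayTitle is hypothetical (no such rename is planned in this document). The sketch only shows the per-document backfill step and a reader that tolerates both fields during the transition.

```typescript
// Hypothetical rename of browseTitle to displayTitle, for illustration only.
interface TitledDoc {
  browseTitle?: string; // old field, deprecated once the backfill is complete
  displayTitle?: string; // new field, added alongside the old one
}

/** Backfill step: copy the old value into the new field, never overwriting an existing one. */
function backfillDisplayTitle<T extends TitledDoc>(doc: T): T {
  if (doc.displayTitle === undefined && doc.browseTitle !== undefined) {
    return { ...doc, displayTitle: doc.browseTitle };
  }
  return doc;
}

/** Transitional reader: prefer the new field, fall back to the old one. */
function readTitleOverride(doc: TitledDoc): string | undefined {
  return doc.displayTitle ?? doc.browseTitle;
}

const legacyDoc: TitledDoc = { browseTitle: "Grid basics" };
const migratedDoc = backfillDisplayTitle(legacyDoc);
```

Because the backfill skips documents that already have the new field, the one-off script can be re-run safely after a partial failure.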