
Related content

GET /content/<slug>/related returns a small list of items related to the current one. V1 uses tag and channel overlap; later versions add explicit pins and signal aggregation.


Purpose

A detail page needs a "see also" rail. The simplest useful relationship is structural: items sharing tags or channels. That's cheap, predictable, and hits ~80% of the UX value with ~20% of the complexity.

Contract reference: AgnosticUI.md §14.


V1 contract

GET /content/<slug>/related?limit=10

Response: { success: true, data: ContentDto[] } (no pagination; data holds at most limit items).

Max limit: 20. Default: 10.
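
The contract states a default and a maximum but not what happens to malformed or out-of-range values. As a minimal sketch, assuming over-limit values are clamped rather than rejected and unparseable values fall back to the default, the query parameter could be normalized like this (parseLimit and both constants are hypothetical names, not part of the contract):

```typescript
// Hypothetical helper; the clamping and fallback behaviour are assumptions.
const DEFAULT_LIMIT = 10;
const MAX_LIMIT = 20;

function parseLimit(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === '') return DEFAULT_LIMIT;
  const n = Number(raw);
  // Non-numeric, fractional, zero, or negative input falls back to the default.
  if (!Number.isInteger(n) || n < 1) return DEFAULT_LIMIT;
  // Anything above the contract maximum is clamped to it.
  return Math.min(n, MAX_LIMIT);
}
```

If the API should instead answer an invalid limit with a 400, the same checks can throw a validation error in place of the fallback.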


Scoring (v1)

For a content item X with tags T (the tagSlugs field) and channelSlug = C:

score(Y) = 2 × (Y.channelSlug === C ? 1 : 0)
         + (number of tags shared between X.tagSlugs and Y.tagSlugs)

The channel term is 0 when X has no channel.

Exclusions:

  • Y._id === X._id
  • Y.approvalStatus !== 'approved'
  • Y.isActive !== true

Candidates with score 0 are dropped. Ties broken by approvedAt descending.

Why this formula

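  • Channel overlap is a strong signal (same editorial cluster), so it is weighted 2.
  • Shared tags are a weaker signal but compositional: each shared tag adds 1.
  • The formula is a dot product of the weights (2, 1) with the features (same channel, shared-tag count). It is trivial to compute and trivial to index, since the candidate query filters only on channelSlug and tagSlugs.
  • Deliberately not personalized. Personalization is a different product.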
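
To make the dot-product reading concrete, here is a small sketch that builds the feature vector explicitly and multiplies it by the weights. The Item shape, WEIGHTS, features, dot, and relatedScore are illustrative names assumed for this example; only the weights 2 and 1 come from the formula above.

```typescript
// Illustrative shape; only the two fields the formula needs.
interface Item {
  channelSlug?: string;
  tagSlugs: string[];
}

// Weights from the v1 formula: channel match counts 2, each shared tag counts 1.
const WEIGHTS = [2, 1] as const;

// Feature vector for candidate y relative to x: [sameChannel, sharedTagCount].
function features(x: Item, y: Item): [number, number] {
  const sameChannel = !!x.channelSlug && x.channelSlug === y.channelSlug ? 1 : 0;
  const yTags = new Set(y.tagSlugs);
  const sharedTags = x.tagSlugs.filter(t => yTags.has(t)).length;
  return [sameChannel, sharedTags];
}

function dot(a: readonly number[], b: readonly number[]): number {
  return a.reduce((sum, v, i) => sum + v * (b[i] ?? 0), 0);
}

function relatedScore(x: Item, y: Item): number {
  return dot(WEIGHTS, features(x, y));
}

const current: Item = { channelSlug: 'guides', tagSlugs: ['mongo', 'api'] };
const candidate: Item = { channelSlug: 'guides', tagSlugs: ['api', 'ux'] };
// features = [1, 1], so the score is 2*1 + 1*1 = 3.
const exampleScore = relatedScore(current, candidate);
```

Written this way, a later signal (for example a co-occurrence count) becomes one more feature and one more weight rather than a change to the scoring structure.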

Implementation

Query strategy

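// Method on the content service. Assumes `import { FilterQuery } from 'mongoose'`.
async related(slug: string, limit = 10): Promise<ContentDto[]> {
  // Enforce the contract bounds: at least 1, at most 20.
  const take = Math.min(Math.max(1, limit), 20);

  const x = await this.model.findOne({ slug, isActive: true, approvalStatus: 'approved' }).lean();
  if (!x) throw new NotFoundError('content.not_found', { slug });

  // Match anything that shares the channel or at least one tag.
  const orClauses: FilterQuery<Content>[] = [];
  if (x.channelSlug) orClauses.push({ channelSlug: x.channelSlug });
  if (x.tagSlugs?.length) orClauses.push({ tagSlugs: { $in: x.tagSlugs } });
  if (!orClauses.length) return [];

  const candidates = await this.model
    .find({
      $and: [
        { _id: { $ne: x._id }, isActive: true, approvalStatus: 'approved' },
        { $or: orClauses },
      ],
    })
    // Assumption: newest-first, so the 100-document sample is deterministic
    // and agrees with the approvedAt tie-break below.
    .sort({ approvedAt: -1 })
    .limit(100) // oversample; scoring trims
    .lean();

  const scored = candidates
    .map(y => ({ y, s: score(x, y) }))
    .filter(({ s }) => s > 0)
    .sort((a, b) => b.s - a.s || toTime(b.y.approvedAt) - toTime(a.y.approvedAt))
    .slice(0, take);

  return scored.map(({ y }) => toContentDto(y));
}

function score(x: Content, y: Content): number {
  // The channel term only counts when x actually has a channel.
  const channelWin = x.channelSlug && y.channelSlug === x.channelSlug ? 2 : 0;
  // Tolerate documents without tagSlugs, matching the optional chaining above.
  const yTags = new Set(y.tagSlugs ?? []);
  const tagOverlap = (x.tagSlugs ?? []).filter(t => yTags.has(t)).length;
  return channelWin + tagOverlap;
}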

Oversample (limit(100) before scoring) so the scoring step has enough candidates to rank. Raise this if relevance looks thin at scale.

Why not $lookup / $facet aggregation

For current scale (thousands of items), an in-process score is fine and readable. An aggregation pipeline that computes the score with $addFields, then sorts and limits inside Mongo, would push the work to the database and return only the top limit documents instead of up to 100 candidates. Consider that refactor if related-content responses routinely exceed ~100ms.
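
For reference, a rough sketch of what that pipeline could look like. It is not the current implementation; buildRelatedPipeline, Stage, and Source are hypothetical names, and the sketch assumes tag slugs are unique within a document, because $setIntersection de-duplicates. The function only builds the stage array, so it can be inspected without a database connection.

```typescript
type Stage = Record<string, unknown>;

// Minimal view of the source item; field names follow the v1 query above.
interface Source {
  _id: unknown;
  channelSlug?: string;
  tagSlugs?: string[];
}

function buildRelatedPipeline(x: Source, limit: number): Stage[] | null {
  const tags = x.tagSlugs ?? [];
  const or: Stage[] = [];
  if (x.channelSlug) or.push({ channelSlug: x.channelSlug });
  if (tags.length) or.push({ tagSlugs: { $in: tags } });
  if (!or.length) return null; // nothing to match on; the caller returns []

  // $literal keeps slugs that start with "$" from being read as field paths.
  const channelTerm = x.channelSlug
    ? { $cond: [{ $eq: ['$channelSlug', { $literal: x.channelSlug }] }, 2, 0] }
    : 0;
  const tagTerm = {
    $size: { $setIntersection: [{ $ifNull: ['$tagSlugs', []] }, { $literal: tags }] },
  };

  return [
    { $match: { _id: { $ne: x._id }, isActive: true, approvalStatus: 'approved', $or: or } },
    { $addFields: { score: { $add: [channelTerm, tagTerm] } } },
    { $match: { score: { $gt: 0 } } },
    { $sort: { score: -1, approvedAt: -1 } },
    { $limit: limit },
  ];
}
```

The result would be passed to the model's aggregate call. Unlike the in-process version, this scores every matching document rather than a 100-item sample, and the extra score field would need to be dropped or ignored when mapping to ContentDto.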


V2 extensions (reserved)

  1. Explicit moderator pins. New collection related_pins with { contentId, relatedIds[], actorId }. Pinned items appear first, regardless of score.
  2. Signal aggregation. View events fed to a rollup table content_cooccurrence with (contentId, relatedContentId, count). Incorporate into score with a weight.
  3. Provenance indicator. Return data[i].reason: 'pinned' | 'channel' | 'tag' | 'signal' so the client can badge the rail.

Each is additive — v1 shape stays backward compatible.
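
Because these extensions are only reserved, the following is a speculative sketch of how pins and the reason field might compose with scored results. The RelatedEntry shape and mergePinned are hypothetical, and the sketch assumes pinned ids have already passed the same approval and activity checks as scored candidates.

```typescript
type RelatedReason = 'pinned' | 'channel' | 'tag' | 'signal';

// Hypothetical intermediate shape; the real DTO would carry full content fields.
interface RelatedEntry {
  id: string;
  reason: RelatedReason;
}

// Pinned ids come first in pin order; scored entries fill the remaining slots.
function mergePinned(pinnedIds: string[], scored: RelatedEntry[], limit: number): RelatedEntry[] {
  const seen = new Set<string>();
  const out: RelatedEntry[] = [];
  for (const id of pinnedIds) {
    if (!seen.has(id)) {
      seen.add(id);
      out.push({ id, reason: 'pinned' });
    }
  }
  for (const entry of scored) {
    // An item that is both pinned and scored keeps its 'pinned' reason.
    if (!seen.has(entry.id)) {
      seen.add(entry.id);
      out.push(entry);
    }
  }
  return out.slice(0, limit);
}
```

Since reason is a new optional field on each item, v1 clients that ignore it keep working, which is what keeps the change additive.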


Required variables and services

  • ContentModel
  • No env variables.

Gotchas

  • Cold content. An item with no tags and no channel returns []. The client renders the rail empty or hides it. Don't compute from other signals (recent, popular) — that would lie about the relationship.
  • Tag-sparse content. An item with one rare tag may have no candidates. Fine — empty is honest.
  • Oversample isn't bounded by score. We always fetch up to 100 candidates and score them all, even if the first 10 already score 3 or more. Early termination could save CPU at scale; not a concern at v1 scale.
  • Query builds on tagSlugs: { $in: ... }. An item with 50 tags would run a wide scan. Editorial discipline should keep tags per item in single digits; the submit DTO enforces a hard cap (@ArrayMaxSize(20)).

Testing

  • Unit: score — same-channel + one shared tag = 3; same-channel only = 2; no overlap = 0; item with no channel contributes 0 for the channel term.
  • Integration: seed three items where A shares a channel with B and one tag with C; /related on A returns B before C.
  • Integration: item with no tags and no channel → empty array.
  • Integration: ?limit=3 bounds the response even when many candidates score > 0.
  • Integration: rejected items never appear in results.
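
The cases above can be prototyped against in-memory fixtures before wiring up a database. The sketch below is an assumption-laden stand-in: rankRelated mirrors the service method's filtering and ordering without Mongo, and the Item shape, fixture names, and numeric approvedAt are invented for the example.

```typescript
interface Item {
  id: string;
  channelSlug?: string;
  tagSlugs: string[];
  approvalStatus: 'approved' | 'rejected' | 'pending';
  isActive: boolean;
  approvedAt: number; // epoch ms, used only for tie-breaks
}

function score(x: Item, y: Item): number {
  const channel = x.channelSlug && x.channelSlug === y.channelSlug ? 2 : 0;
  const tags = x.tagSlugs.filter(t => y.tagSlugs.includes(t)).length;
  return channel + tags;
}

// In-memory stand-in for the service method: same exclusions, same ordering.
function rankRelated(x: Item, all: Item[], limit = 10): string[] {
  return all
    .filter(y => y.id !== x.id && y.isActive && y.approvalStatus === 'approved')
    .map(y => ({ y, s: score(x, y) }))
    .filter(({ s }) => s > 0)
    .sort((a, b) => b.s - a.s || b.y.approvedAt - a.y.approvedAt)
    .slice(0, limit)
    .map(({ y }) => y.id);
}

const base = { approvalStatus: 'approved' as const, isActive: true, approvedAt: 0 };
const A: Item = { ...base, id: 'A', channelSlug: 'guides', tagSlugs: ['mongo'] };
const B: Item = { ...base, id: 'B', channelSlug: 'guides', tagSlugs: [] }; // shares channel
const C: Item = { ...base, id: 'C', channelSlug: 'news', tagSlugs: ['mongo'] }; // shares one tag
const R: Item = { ...base, id: 'R', channelSlug: 'guides', tagSlugs: ['mongo'], approvalStatus: 'rejected' };
const bare: Item = { ...base, id: 'E', tagSlugs: [] }; // no tags, no channel

const rankedForA = rankRelated(A, [A, B, C, R]); // B (score 2) before C (score 1); R excluded
const rankedForBare = rankRelated(bare, [A, B, C]); // nothing to relate on
const rankedLimited = rankRelated(A, [B, C], 1); // limit bounds the result
```

The real integration tests would seed the same fixtures into a test database and assert on the HTTP response instead of the returned ids.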