Layer 3 — Intelligence Graph
Layer 3 transforms story clusters into a living relationship network — mapping entity co-occurrences, detecting narrative threads that span weeks of coverage, and surfacing what's accelerating in real time.
What Layer 3 does
Layers 0–2 process individual articles and group them into stories. Layer 3 operates one level above that — it treats story clusters as nodes and asks: how are these stories connected? Which entities link them? Which narratives have been building for weeks?
The result is a graph of the information landscape: entity co-occurrence networks built from tens of thousands of article pairs, cluster relationships tracking which stories share key players, and narrative threads that follow a topic across months of coverage. The trends endpoint surfaces which entities are accelerating right now relative to their recent baseline.
How the graph is built
Layer 3 runs a full graph build once daily at 04:00 UTC. Each build executes three passes in order:
- Entity co-occurrence pass: for every article pair that shares a named entity, a relationship record is created or updated between those two entities. Co-occurrence count and relationship strength are recalculated from scratch each build, so the graph always reflects the current article corpus, not an accumulation of stale increments.
- Cluster relationship pass: story clusters that share significant entities are linked with a shared_entities relationship. Confidence is scored on entity overlap depth. The temporal gap between clusters is recorded but does not affect confidence, so two stories about the same entity separated by six months are linked with the same strength as two from the same week.
- Narrative thread pass: clusters connected by high-confidence relationships are grouped into narrative threads, which are coherent storylines that span multiple clusters over time. Each thread receives an importance score based on cluster count and a confidence score based on relationship consistency.
Trend detection runs separately. Entity velocity is calculated as
mentions_today / avg(mentions_last_7_days) and is updated continuously as new articles are
processed by Layer 1, not just on the daily build cycle.
Manual graph build
You can trigger a full graph rebuild on demand using POST /v1/graph/build. This runs the
same three-pass process as the scheduled build and returns statistics on completion. Builds are
synchronous — the request will hold open until the build finishes.
Cluster relationship types
Layer 3 records four relationship types between story clusters. Currently, builds populate
shared_entities by default. The remaining types are detected when temporal and semantic
signals are strong enough to support them.
| Type | Meaning | Primary signal |
|---|---|---|
| shared_entities | Two stories cover overlapping key players or locations | Entity overlap count |
| evolved_into | Story A is an earlier chapter of Story B | Temporal gap + entity continuity + semantic similarity |
| merged_with | Two parallel stories converged into one narrative | Entity overlap + concurrent timing |
| split_from | A story diverged into separate distinct threads | Semantic divergence from common ancestor |
Relationship confidence
Each cluster relationship carries a confidence score between 0.0 and 1.0. For
shared_entities relationships, confidence is a function of entity overlap depth — how many
named entities the two clusters have in common, weighted by entity type. Person and organization overlaps
score higher than location overlaps alone.
A confidence of 1.0 indicates complete overlap of primary entities. A confidence of
0.6 — the minimum threshold for a relationship to be recorded — indicates meaningful but
partial overlap. Relationships below 0.6 are discarded during the build pass.
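The exact scoring function is not documented here, so the following is only a sketch of how a type-weighted overlap score with a 0.6 cutoff could behave. The weighted-Jaccard form, the `TYPE_WEIGHTS` values, and the `overlap_confidence` name are all assumptions chosen for illustration, not Layer 3's actual implementation.

```python
# Hypothetical weights: the text only says person/organization overlaps
# score higher than location overlaps.
TYPE_WEIGHTS = {"person": 1.0, "organization": 1.0, "location": 0.5}
MIN_CONFIDENCE = 0.6  # relationships below this are discarded (from the text)


def overlap_confidence(a, b):
    """Weighted Jaccard overlap of two clusters.

    a, b: dicts mapping entity name -> entity type.
    """
    def weight(name):
        entity_type = a.get(name) or b.get(name)
        return TYPE_WEIGHTS.get(entity_type, 0.5)

    union = set(a) | set(b)
    total = sum(weight(name) for name in union)
    if total == 0:
        return 0.0
    shared = set(a) & set(b)
    return sum(weight(name) for name in shared) / total


story_a = {"Jane Doe": "person", "Acme Corp": "organization", "Paris": "location"}
story_b = {"Jane Doe": "person", "Acme Corp": "organization", "Berlin": "location"}
story_c = {"Paris": "location", "John Roe": "person"}

recorded = overlap_confidence(story_a, story_b) >= MIN_CONFIDENCE   # kept
discarded = overlap_confidence(story_a, story_c) < MIN_CONFIDENCE   # dropped
```

Under these assumed weights, two clusters sharing a person and an organization clear the threshold, while two clusters sharing only a location do not.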
Entity relationships
Separate from cluster relationships, Layer 3 maintains a direct entity-to-entity network.
relationship_strength is normalized across the corpus: the pair of entities with the highest
co-occurrence count in the dataset scores 1.0, and all other pairs are scaled relative to that maximum.
This means strength scores are corpus-relative, not absolute. Adding more articles will shift scores as
the maximum count (the denominator) grows.
When querying entity relationships, filter by relationship_strength > 0.5 for meaningful
signal. The majority of entity pairs in the graph reflect incidental co-occurrence in long articles rather
than a genuine editorial connection.
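A minimal sketch of corpus-relative normalization follows. It assumes co-occurrence is counted once per article for each unordered entity pair; the `relationship_strengths` helper and the sample articles are hypothetical.

```python
from collections import Counter
from itertools import combinations


def relationship_strengths(articles):
    """Count co-occurring entity pairs and scale each count by the maximum."""
    counts = Counter()
    for entities in articles:
        # Sort and de-duplicate so (a, b) and (b, a) share one key.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    if not counts:
        return {}
    top = max(counts.values())
    return {pair: n / top for pair, n in counts.items()}


articles = [
    ["NATO", "Ukraine", "Kyiv"],
    ["NATO", "Ukraine"],
    ["NATO", "Ukraine"],
    ["Kyiv", "Ukraine"],
]
strengths = relationship_strengths(articles)
# Keep only pairs above the suggested 0.5 cutoff.
strong = {pair: s for pair, s in strengths.items() if s > 0.5}
```

Because every score is divided by the current maximum, a new article that raises the top pair's count lowers every other pair's strength even if their own counts did not change.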
Trend velocity
Velocity measures how fast an entity is being mentioned right now relative to its recent baseline. The formula is:
velocity = mentions_today / avg(mentions_last_7_days)
An entity with a velocity of 35.0 is being mentioned 35 times more on the most recent
ingestion day than its average across the prior seven ingestion days — a strong signal of a breaking or
rapidly developing story. Velocity is calculated against actual ingestion days, not calendar days, so
gaps in the pipeline do not artificially collapse scores. Entities with no prior ingestion history
default to 1.0 (neutral) rather than inflating the trending feed.
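The calculation described above can be sketched as follows. The function name and input shape are assumptions; the list holds one mention count per prior ingestion day, so days with no ingestion are simply absent. Treating a zero baseline the same as no history is also an assumption, since the text does not cover that case.

```python
def velocity(latest_mentions, prior_daily_mentions):
    """Ratio of the latest ingestion day's mentions to the prior 7-day average.

    prior_daily_mentions: counts for earlier ingestion days, oldest first.
    """
    window = prior_daily_mentions[-7:]  # at most the seven most recent days
    if not window:
        return 1.0  # no prior history: neutral, as the text specifies
    baseline = sum(window) / len(window)
    if baseline == 0:
        return 1.0  # assumption: avoid division by zero, treat as neutral
    return latest_mentions / baseline


breaking = velocity(70, [2, 2, 2, 2, 2, 2, 2])
new_entity = velocity(5, [])
```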
| Velocity range | Momentum label | Interpretation |
|---|---|---|
| ≥ 3.0 | spiking | 3× or more above baseline: major breaking story |
| 1.5–2.99 | rising | Accelerating above baseline: developing story |
| 0.5–1.49 | stable | Normal coverage volume |
| < 0.5 | falling | Below baseline: story fading |
Set min_velocity to 2.0 or higher and cross-reference against your known entity list to remove noise from incidental high-frequency terms.
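The momentum table maps to a simple threshold check. This sketch assumes the ranges are contiguous, so a value such as 2.995 falls into rising; the function name is hypothetical.

```python
def momentum_label(velocity):
    """Map a velocity score to the momentum label from the table."""
    if velocity >= 3.0:
        return "spiking"
    if velocity >= 1.5:
        return "rising"
    if velocity >= 0.5:
        return "stable"
    return "falling"


labels = [momentum_label(v) for v in (35.0, 2.0, 1.0, 0.2)]
```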
Narrative threads
A narrative thread is a group of story clusters that form a coherent storyline over time. Where a single cluster represents one burst of coverage, a narrative thread represents the full arc — the story as it has developed across weeks or months.
Threads are built from chains of high-confidence cluster relationships. An importance score (0.0–1.0) is assigned based on cluster count, and a confidence score reflects the consistency of relationships within the chain. Threads with fewer than three clusters or a confidence below 0.3 are not surfaced.
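One way to picture thread construction is as connected components over the high-confidence relationship graph. The sketch below is an illustration under stated assumptions, not the documented algorithm: the 0.8 link cutoff for "high-confidence" is hypothetical, thread confidence is taken as the mean of the linking relationships, and importance scoring is left out because its formula is not specified.

```python
from collections import defaultdict


def build_threads(relationships, min_link=0.8, min_clusters=3, min_confidence=0.3):
    """Group clusters into threads using union-find over strong links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    links = [r for r in relationships if r["confidence"] >= min_link]
    for r in links:
        root_a, root_b = find(r["source"]), find(r["target"])
        if root_a != root_b:
            parent[root_a] = root_b

    members = defaultdict(set)
    confidences = defaultdict(list)
    for r in links:
        root = find(r["source"])
        members[root].update((r["source"], r["target"]))
        confidences[root].append(r["confidence"])

    threads = []
    for root, clusters in members.items():
        score = sum(confidences[root]) / len(confidences[root])
        # Apply the surfacing rules from the text.
        if len(clusters) >= min_clusters and score >= min_confidence:
            threads.append({"clusters": sorted(clusters), "confidence_score": score})
    return threads


relationships = [
    {"source": "clus_a", "target": "clus_b", "confidence": 0.9},
    {"source": "clus_b", "target": "clus_c", "confidence": 0.8},
    {"source": "clus_d", "target": "clus_e", "confidence": 0.95},
    {"source": "clus_c", "target": "clus_f", "confidence": 0.65},
]
threads = build_threads(relationships)
```

Here only the chain a, b, c becomes a thread: the d, e pair has too few clusters, and the link to f falls below the assumed cutoff.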
| Field | Description |
|---|---|
| importance_score | Composite score based on cluster count and coverage breadth. Higher = longer-running, more widely covered story arc. |
| confidence_score | Average confidence of the relationships linking clusters in this thread. High confidence means the clusters are tightly related; lower confidence means the thread is inferred from weaker connections. |
| status | active: thread has clusters updated within the past 48 hours. dormant: no recent activity but thread not concluded. concluded: no cluster activity in over 7 days. |
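The status rules can be read as thresholds on the time since the last cluster update. This sketch assumes dormant covers everything between 48 hours and 7 days, which the table implies but does not state outright; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def thread_status(last_cluster_update, now=None):
    """Classify a thread by the age of its most recent cluster update."""
    now = now or datetime.now(timezone.utc)
    age = now - last_cluster_update
    if age <= timedelta(hours=48):
        return "active"
    if age > timedelta(days=7):
        return "concluded"
    return "dormant"  # assumption: the gap between the two thresholds


reference = datetime(2024, 1, 10, tzinfo=timezone.utc)
statuses = [
    thread_status(reference - timedelta(hours=12), reference),
    thread_status(reference - timedelta(days=3), reference),
    thread_status(reference - timedelta(days=8), reference),
]
```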
Endpoints
All endpoints are served from the base URL https://layer3.api.polariapi.com. For example:
GET https://layer3.api.polariapi.com/v1/graph/stats
Graph statistics
Returns a summary of the current graph state — total relationship counts and trending entity count. Use this to verify the graph has been built and to monitor growth over time.
| Field | Description |
|---|---|
| entity_relationships | Total entity-to-entity co-occurrence pairs in the graph, including weak relationships. |
| cluster_relationships | Story cluster pairs with at least one recorded relationship (confidence ≥ 0.6). |
| narrative_threads | Active narrative threads with three or more clusters. |
| trending_entities | Entities with velocity_score > 1.5, meaning currently above their 7-day baseline. |
Cluster relationships
Returns all relationships for a given story cluster — both outbound (this cluster relates to others) and inbound (other clusters relate to this one). Use this to find which stories are connected to a story you are already tracking.
| Parameter | Type | Description |
|---|---|---|
| cluster_id (required) | string | Story cluster ID from Layer 2 (e.g. clus_9x3k2m8f). |
| Field | Description |
|---|---|
| source / target | Directed relationship. When source equals the queried cluster ID, the relationship is outbound. When target equals it, the relationship is inbound. Both directions are returned in the same array. |
| type | Relationship type. See Cluster relationship types above. |
| confidence | 0.0–1.0. Minimum recorded value is 0.6. Filter to ≥ 0.8 for the strongest connections. |
Relationships reflect the most recent graph build. Trigger POST /v1/graph/build to incorporate recent clusters.
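Since both directions arrive in one array, a client usually separates them after fetching. The sketch below assumes the response is a list of objects carrying the source, target, type, and confidence fields described for this endpoint; the helper name and sample IDs other than clus_9x3k2m8f are hypothetical.

```python
def split_relationships(cluster_id, relationships, min_confidence=0.8):
    """Separate outbound and inbound relationships above a confidence floor."""
    outbound, inbound = [], []
    for rel in relationships:
        if rel["confidence"] < min_confidence:
            continue
        if rel["source"] == cluster_id:
            outbound.append(rel)
        elif rel["target"] == cluster_id:
            inbound.append(rel)
    return outbound, inbound


sample = [
    {"source": "clus_9x3k2m8f", "target": "clus_a", "type": "shared_entities", "confidence": 0.92},
    {"source": "clus_b", "target": "clus_9x3k2m8f", "type": "shared_entities", "confidence": 0.85},
    {"source": "clus_9x3k2m8f", "target": "clus_c", "type": "shared_entities", "confidence": 0.64},
]
outbound, inbound = split_relationships("clus_9x3k2m8f", sample)
```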
Trending entities
Returns entities ordered by velocity score — how fast they are being mentioned relative to their recent baseline. Results reflect the current entity metrics snapshot, which is updated continuously as new articles are processed by Layer 1.
| Parameter | Type | Description |
|---|---|---|
| min_velocity | float | Minimum velocity score to include. Default: 2.0. Use 1.5 for a broader feed; 5.0 or higher for only strong breakout signals. |
| limit | integer | Maximum results to return. Default: 20. Max: 100. |
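A small helper can validate these parameters client-side before a request is sent. This is a sketch: the path is taken from the Performance table, the `trending_url` name is hypothetical, and the lower bound of 1 on limit is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://layer3.api.polariapi.com"


def trending_url(min_velocity=2.0, limit=20):
    """Build the trends query URL, rejecting values the API would refuse."""
    if not 1 <= int(limit) <= 100:  # documented max is 100; min of 1 is assumed
        raise ValueError("limit must be between 1 and 100")
    # float() raises ValueError for non-numeric input, mirroring the 422 case.
    query = urlencode({"min_velocity": float(min_velocity), "limit": int(limit)})
    return f"{BASE_URL}/v1/trends/entities?{query}"


url = trending_url(min_velocity=5.0, limit=50)
```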
| Field | Description |
|---|---|
| entity | Entity name as extracted by Layer 1. May include variants of the same real-world entity (e.g. "Supreme Court" and "the Supreme Court") until entity normalization consolidates them. |
| velocity | Ratio of the latest ingestion day's mentions to the average across the prior 7 ingestion days. A value of 35.0 means 35× the recent baseline rate. |
| mentions | Raw mention count for today's ingestion window. High velocity with low mentions may indicate an entity with a near-zero historical baseline, not necessarily a major story. |
Trigger graph build
Triggers a full synchronous graph rebuild from current story cluster data. The request holds open until the build completes and returns build statistics. No request body is required.
Scheduled builds
The graph rebuilds automatically at 04:00 UTC daily. This timing is chosen to run after overnight
ingestion has completed and before peak API usage hours. The scheduled build is identical to a manual
POST /v1/graph/build — it is not incremental.
If your use case requires fresher graph data than daily, trigger manual builds after large ingestion batches using the build endpoint. There is no rate limit on build triggers, but concurrent builds are not safe — wait for the previous build to complete before starting another.
Best practices
Filter entity relationships by strength
The entity relationship graph contains ~80,000 pairs. Strength is corpus-relative — the most
co-reported pair scores 1.0 and all others scale against it. The majority of pairs represent incidental
co-occurrence within long articles rather than a genuine editorial connection. Filter to
relationship_strength > 0.5 for meaningful signal (~1,200 pairs). Strength above 0.8
indicates entities that are consistently co-reported across many articles and clusters.
Cross-reference velocity with mention count
A velocity of 30.0 from 3 mentions (a rarely mentioned entity that surfaced a few times today) is not the same signal
as a velocity of 30.0 from 300 mentions. Always read velocity alongside
mentions. For most monitoring use cases, a minimum of 20–30 mentions combined with a velocity
above 5.0 produces the cleanest breakout signal.
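The combined filter might look like this on the client side. The entity, velocity, and mentions field names come from the trending response; the helper name and sample rows are hypothetical.

```python
def breakout_entities(entities, min_velocity=5.0, min_mentions=20):
    """Keep entities that are both accelerating and mentioned often enough."""
    hits = [
        e for e in entities
        if e["velocity"] > min_velocity and e["mentions"] >= min_mentions
    ]
    return sorted(hits, key=lambda e: e["velocity"], reverse=True)


trending = [
    {"entity": "A", "velocity": 30.0, "mentions": 3},
    {"entity": "B", "velocity": 30.0, "mentions": 300},
    {"entity": "C", "velocity": 6.5, "mentions": 45},
    {"entity": "D", "velocity": 2.1, "mentions": 500},
]
breakouts = breakout_entities(trending)
```

Entity A is dropped despite its high velocity because three mentions is too thin a sample, and D is dropped because heavy coverage at a near-normal rate is not a breakout.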
Use cluster relationships to expand story coverage
When tracking a specific story, call
GET /v1/graph/cluster/{cluster_id}/relationships to find related clusters. Follow the
high-confidence relationships (≥ 0.8) to find stories covering the same key entities from different angles.
This is more reliable than keyword search for finding adjacent coverage because it is grounded in actual
entity co-occurrence, not surface text similarity.
Rebuild after large ingestion batches
The graph reflects the state of clusters at the time of the last build. Articles ingested after the last build are fully processed through Layers 0–2 and available via the clustering endpoints, but they will not appear in graph relationships or narrative threads until the next build runs. Trend velocity is the exception: it updates continuously as Layer 1 processes new articles. If you ingest a large batch of historical articles, trigger a manual build to incorporate them.
Poll stats before querying relationships
Before building a graph visualization or running relationship queries, call
GET /v1/graph/stats to confirm the graph has been built. A response showing zero
cluster_relationships means the build has not yet run on your dataset — cluster relationship
queries will return empty arrays for all cluster IDs until the first build completes.
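A polling loop for this check could be sketched as below. The `fetch_stats` callable is a stand-in for whatever HTTP client you use to call GET /v1/graph/stats and parse the body; the attempt count and delay are arbitrary illustrative values.

```python
import time


def wait_for_graph(fetch_stats, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Poll graph stats until cluster relationships exist, or give up."""
    for attempt in range(attempts):
        stats = fetch_stats()
        if stats.get("cluster_relationships", 0) > 0:
            return stats
        if attempt < attempts - 1:
            sleep(delay_seconds)  # no need to sleep after the final attempt
    return None
```

A return value of None means the graph still looked empty after the last attempt, so relationship queries should be deferred or a manual build triggered.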
Performance
| Operation | Typical latency |
|---|---|
| GET /v1/graph/stats | ~80ms |
| GET /v1/graph/cluster/{id}/relationships | ~60ms |
| GET /v1/trends/entities (warm) | ~25ms |
| POST /v1/graph/build (~8k clusters) | 3–5 minutes |
Relationship lookup latency is consistent regardless of result count — an indexed query on
source_cluster_id and target_cluster_id means clusters with 15 relationships
return in the same time as clusters with 0. The trends endpoint warms to ~25ms on repeated calls as the
query plan is cached by PostgreSQL.
Error responses
| Status | Cause |
|---|---|
| 404 | Cluster ID not found in the Layer 2 story pool. |
| 422 | Invalid parameter, e.g. a non-numeric min_velocity value. |
| 503 | Graph build in progress. Retry after the build completes. |
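Of these statuses, only 503 is worth retrying; 404 and 422 indicate a request that will keep failing. A minimal retry wrapper, assuming the client raises `urllib.error.HTTPError` on non-2xx responses (as `urllib.request.urlopen` does), might look like this. The retry count and delay are illustrative, not recommended values from the API.

```python
import time
import urllib.error


def with_build_retry(call, retries=3, delay_seconds=60.0, sleep=time.sleep):
    """Invoke call(), retrying only when the API reports a build in progress."""
    for attempt in range(retries + 1):
        try:
            return call()
        except urllib.error.HTTPError as err:
            # 404 and 422 are caller errors; re-raise them immediately.
            if err.code != 503 or attempt == retries:
                raise
            sleep(delay_seconds)
```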