Semantic intelligence infrastructure

Multi-layer intelligence pipeline.

Submit any article. Get back entities, quality scores, story clusters, and relationship graphs — through a single async API built for developers.

Layer 0
Token Intelligence
Quality scoring, 384-dim embeddings, semantic deduplication
quality score · embeddings · semantic hash
Layer 1
Semantic Analysis
Named entity extraction, sentence-level embeddings, geo tagging
PERSON · ORG · GPE · sentences · timeline
Layer 2
Story Clustering
Cross-source grouping, lifecycle tracking, diversity scoring
breaking · developing · archived
Pro+
Layer 3
Intelligence Graph
Entity relationship networks, trend velocity, constellation view
co-occurrence · velocity · graph

01 · SUBMIT
POST your article
Send any text + metadata to /v1/process. Returns a job_id immediately.
02 · PROCESS
Pipeline runs async
All four layers execute in sequence. Most articles complete in 1–3 seconds.
03 · RETRIEVE
Poll or webhook
Check /v1/status/{job_id} or receive a push event when processing completes.
PYTHON — full end-to-end
import time
import requests

API_KEY = "pk_live_your_key"
BASE = "https://api.polariapi.com"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Submit
resp = requests.post(f"{BASE}/v1/process", headers=HEADERS, json={
    "text": "The Federal Reserve held interest rates steady on Wednesday...",
    "metadata": {"title": "Fed Holds Rates Steady", "source": "Reuters"},
})
resp.raise_for_status()
job = resp.json()

# 2. Poll (give up after about 30 seconds instead of looping forever)
for _ in range(60):
    s = requests.get(f"{BASE}/v1/status/{job['job_id']}", headers=HEADERS).json()
    if s["status"] == "completed":
        break
    time.sleep(0.5)
else:
    raise TimeoutError("job did not complete in time")

# 3. Retrieve
article = requests.get(f"{BASE}/v1/article/{s['article_id']}", headers=HEADERS).json()
print(article["quality_score"])   # → 0.81
print(article["semantic_hash"])   # → "a3f8c2e1..."
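If you take the webhook route from step 03 instead of polling, the receiving side might look roughly like the sketch below. The payload schema is not shown on this page, so the field names (`event`, `article_id`, `entity`) and the event name `job.completed` are assumptions for illustration; `entity.spike` is the only event name this page mentions. Check the docs for the real schema before relying on any of it.

```python
import json
from typing import Callable, Dict

# Hypothetical payload fields and event names: not confirmed by this page.
Handler = Callable[[dict], str]


def on_job_completed(payload: dict) -> str:
    # Assumes the push event carries the article_id needed for /v1/article/{id}.
    return f"fetch /v1/article/{payload['article_id']}"


def on_entity_spike(payload: dict) -> str:
    # Assumes the spike event names the entity that is spiking.
    return f"alert: {payload['entity']} is spiking"


HANDLERS: Dict[str, Handler] = {
    "job.completed": on_job_completed,  # assumed event name
    "entity.spike": on_entity_spike,    # event name mentioned on this page
}


def dispatch(raw_body: bytes) -> str:
    """Parse a webhook body and route it to the matching handler."""
    payload = json.loads(raw_body)
    handler = HANDLERS.get(payload.get("event"))
    if handler is None:
        return "ignored"  # unknown events are acknowledged but not acted on
    return handler(payload)
```

Keeping the routing in a plain function like `dispatch` means it can sit behind whatever web framework you already use and be unit-tested without a running server.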

Media Monitoring
Track entities across every source, in real time
Know the moment your brand, competitor, or key person spikes in coverage. Layer 1 extracts every named entity — Layer 3 shows you velocity and trend direction.
  • Entity mention timelines across all ingested sources
  • Trend velocity scores — detect spikes before they peak
  • Story lifecycle: breaking → developing → archived
  • Webhook alerts on entity.spike events
Trending entities · last 24h
# GET /v1/trends/entities?time_range=24h
{
  "trending_entities": [
    {
      "entity": "Federal Reserve",
      "mention_count": 89,
      "story_count": 7,
      "velocity_score": 0.92,
      "trend_direction": "rising"
    },
    {
      "entity": "Jerome Powell",
      "mention_count": 64,
      "velocity_score": 0.87,
      "trend_direction": "rising"
    }
  ]
}
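On the client side, a response shaped like the one above can be reduced to a short watch list. The sketch below is a minimal example, assuming only the fields shown in the sample; the helper name `rising_entities` and the `min_velocity` cutoff are made up for illustration and are not part of the API.

```python
from typing import List


def rising_entities(response: dict, min_velocity: float = 0.9) -> List[str]:
    """Return entity names that are rising at or above min_velocity, fastest first."""
    hits = [
        e for e in response.get("trending_entities", [])
        if e.get("trend_direction") == "rising"
        and e.get("velocity_score", 0.0) >= min_velocity
    ]
    hits.sort(key=lambda e: e["velocity_score"], reverse=True)
    return [e["entity"] for e in hits]


# Sample data copied from the response shown above.
sample = {
    "trending_entities": [
        {"entity": "Federal Reserve", "mention_count": 89, "story_count": 7,
         "velocity_score": 0.92, "trend_direction": "rising"},
        {"entity": "Jerome Powell", "mention_count": 64,
         "velocity_score": 0.87, "trend_direction": "rising"},
    ]
}

print(rising_entities(sample))        # ['Federal Reserve']
print(rising_entities(sample, 0.85))  # ['Federal Reserve', 'Jerome Powell']
```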
Financial Intelligence
Company and sector coverage, structured for analysis
Feed earnings stories, regulatory filings, and macro news through the pipeline. Get structured entity data, coverage diversity scores, and relationship graphs ready for downstream models.
  • ORG entity extraction with co-occurrence relationships
  • Source diversity scoring — how broadly is a story covered?
  • Story clustering groups duplicated wire copy automatically
  • Semantic search across processed corpus
Source diversity · story analysis
# GET /v1/story/{id}/sources
{
  "diversity_score": 0.68,
  "diversity_rating": "moderate",
  "source_distribution": {
    "financial": {
      "sources": ["Bloomberg", "WSJ", "FT"],
      "percentage": 42.8
    },
    "wire": {
      "sources": ["Reuters", "AP"],
      "percentage": 28.6
    }
  },
  "coverage_gaps": ["regional", "opinion"]
}
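This page does not say how `diversity_score` is computed. For intuition only, one common way to turn a distribution of coverage across source categories into a 0 to 1 score is normalized Shannon entropy, sketched below. This is an illustrative assumption, not Polari's formula, and it will not reproduce the 0.68 in the sample.

```python
import math
from typing import Dict


def normalized_entropy(shares: Dict[str, float]) -> float:
    """Score how evenly coverage is spread across categories (0 = one category, 1 = even).

    `shares` maps a category name to any non-negative weight (counts or percentages).
    Categories with zero weight still count toward the maximum, so gaps lower the score.
    """
    total = sum(shares.values())
    if len(shares) < 2 or total <= 0:
        return 0.0
    probs = [v / total for v in shares.values() if v > 0]
    entropy = sum(p * math.log(1.0 / p) for p in probs)
    return entropy / math.log(len(shares))


# Even coverage across four categories scores (approximately) 1.0;
# coverage from a single category scores 0.0.
even = normalized_entropy({"financial": 1, "wire": 1, "regional": 1, "opinion": 1})
single = normalized_entropy({"financial": 5, "wire": 0})
```

Including the empty categories in the denominator is a design choice: it makes a story with known coverage gaps score lower than one that covers every category it could.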
Story Deduplication
One story, twelve articles, collapsed into one
Layer 2 uses a two-stage algorithm — embedding similarity then entity overlap — to cluster the same event across Reuters, AP, Bloomberg, and 100+ other outlets into a single story object.
  • 92%+ clustering precision on cross-source stories
  • Confidence score on every article join
  • Tunable similarity threshold per request
  • Batch clustering for high-volume ingestion
Cluster result — article join
# POST /v1/cluster
# { "article_id": "art_8f7h2k9s", "similarity_threshold": 0.75 }
{
  "article_id": "art_8f7h2k9s",
  "story_id": "clus_31e6c353",
  "action": "joined",
  "confidence": 0.87,
  "story_size": 12
}
# story now has 12 articles from 7 sources
# all covering the same Fed decision
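The two-stage idea described above (embedding similarity first, entity overlap second) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Polari's implementation: cosine similarity and Jaccard overlap are stand-ins for whatever measures Layer 2 actually uses, and `min_entity_overlap` is a hypothetical parameter. Only `similarity_threshold` and its 0.75 value come from the request shown above.

```python
import math
from typing import Sequence, Set


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of entities the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def should_join(
    article_vec: Sequence[float],
    article_entities: Set[str],
    story_vec: Sequence[float],
    story_entities: Set[str],
    similarity_threshold: float = 0.75,
    min_entity_overlap: float = 0.3,  # hypothetical second-stage cutoff
) -> bool:
    # Stage 1: cheap embedding check rejects unrelated articles early.
    if cosine(article_vec, story_vec) < similarity_threshold:
        return False
    # Stage 2: entity overlap guards against topically similar but distinct events.
    return jaccard(article_entities, story_entities) >= min_entity_overlap
```

The second stage is what keeps two different central-bank stories apart when their embeddings are close: similar vocabulary passes stage 1, but disjoint entity sets fail stage 2.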
Research & Analysis
Semantic search across a structured news corpus
Layer 0's 384-dimensional embeddings power semantic search that goes beyond keyword matching. Find conceptually similar articles even when they share no exact terms.
  • Semantic similarity search — not keyword overlap
  • Quality score filtering removes low-signal content
  • Entity timelines show narrative evolution over time
  • Batch processing for large corpus ingestion
Semantic search results
# GET /v1/search?query=fed+monetary+policy&min_quality=0.6
{
  "results": [
    {
      "title": "Fed Holds Rates Steady",
      "source": "Reuters",
      "quality_score": 0.81,
      "similarity_score": 0.94
    },
    {
      "title": "Powell Signals Patience on Cuts",
      "source": "Bloomberg",
      "quality_score": 0.77,
      "similarity_score": 0.89
    }
  ],
  "total": 47
}
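To make the ranking behind a response like this concrete, here is a minimal local sketch of embedding search with a quality filter. It assumes cosine similarity as the measure and uses toy 2-dimensional vectors in place of Layer 0's 384-dimensional embeddings; the `embedding` field and the `search` helper are hypothetical and not part of the API.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(
    query_vec: Sequence[float],
    corpus: List[Dict],
    min_quality: float = 0.6,
    top_k: int = 10,
) -> List[Dict]:
    """Drop low-quality articles, then rank the rest by similarity to the query."""
    hits = []
    for doc in corpus:
        if doc["quality_score"] < min_quality:
            continue  # mirrors the min_quality query parameter
        hits.append({
            "title": doc["title"],
            "quality_score": doc["quality_score"],
            "similarity_score": cosine(query_vec, doc["embedding"]),
        })
    hits.sort(key=lambda h: h["similarity_score"], reverse=True)
    return hits[:top_k]


# Toy corpus: real embeddings would have 384 dimensions.
corpus = [
    {"title": "A", "quality_score": 0.8, "embedding": [1.0, 0.0]},
    {"title": "B", "quality_score": 0.7, "embedding": [0.6, 0.8]},
    {"title": "C", "quality_score": 0.3, "embedding": [1.0, 0.0]},  # filtered out
]
results = search([1.0, 0.0], corpus)
print([r["title"] for r in results])  # ['A', 'B']
```

Filtering on quality before scoring keeps low-signal articles out of the results even when their embeddings match the query exactly, as article C does here.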

92%+
Clustering precision across cross-source stories
18+ articles/s
Batch throughput at scale — Layer 1 semantic analysis
28+ days
Longest continuously tracked story cluster — cross-briefing narrative persistence
80k+
Entity relationship pairs in the intelligence graph — built from article-level co-occurrence
25ms
Trend velocity endpoint latency — warm path, 38k+ entity metrics queried
2,300+
Narrative evolution links detected — story clusters linked by temporal continuity and shared entities

Capability NewsAPI Aylien Polari
Article search & retrieval
Named entity extraction
Semantic embeddings partial ✓ 384-dim
Cross-source story clustering ✓ 92%+ precision
Multi-week story tracking ✓ 28+ days
Entity relationship graph ✓ 80k+ pairs
Narrative evolution (evolved_into) ✓ Layer 3
Trend velocity detection ✓ ratio-based
Developer-first pricing $449/mo Enterprise only from $99/mo

Start building.
Beta is open.

Early access customers get 6 months free on Professional. No credit card required.

Request early access →
Read the docs first →