The Pipeline

Layer 0 — Token Intelligence

Every article enters Polari through Layer 0. It filters noise, assigns a quality score, generates a 384-dimensional semantic embedding, and creates a fingerprint for deduplication. Articles that pass the quality threshold proceed to Layers 1–3.

What Layer 0 does

Layer 0 runs the following operations on every submitted article, in order:

Quality scoring — a weighted multi-factor score (0–1) that determines whether the article contains meaningful signal or noise.
Embedding generation — a 384-dimensional vector using BAAI/bge-small-en-v1.5, stored for semantic similarity search.
Semantic hashing — a content fingerprint for fast cross-source deduplication, independent of URL or title.

Quality scoring

The quality score is the most important output from Layer 0. It gates all downstream processing and is available on every article object.

Factor	Weight	What it measures
Content depth	40%	Length, information density, paragraph structure
Coherence	30%	Sentence count, avg sentence length, sentiment consistency
Source credibility	20%	Domain reputation scoring
Spam detection	10%	Excessive caps, repetition, URL density, special characters
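To make the weighting concrete, here is a minimal sketch that combines four per-factor sub-scores using the weights from the table. It assumes a plain weighted sum, that every sub-score is already normalized to 0–1, and that a spam detection value of 1.0 means no spam signals were found. The function and key names are hypothetical; how Layer 0 actually computes each factor is not described here.

```python
# Weights copied from the factor table; they sum to 1.0.
WEIGHTS = {
    "content_depth": 0.40,
    "coherence": 0.30,
    "source_credibility": 0.20,
    "spam_detection": 0.10,
}


def combine_quality(factors: dict) -> float:
    """Return the weighted sum of per-factor sub-scores.

    Assumptions (not confirmed by the API docs): the combination is linear,
    each sub-score lies in [0, 1], and spam_detection == 1.0 means "clean".
    """
    missing = WEIGHTS.keys() - factors.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    score = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    return round(score, 4)


example_score = combine_quality({
    "content_depth": 0.8,
    "coherence": 0.7,
    "source_credibility": 0.9,
    "spam_detection": 1.0,
})
```

With these inputs the contributions are 0.32, 0.21, 0.18, and 0.10, so content depth alone accounts for more of the result than source credibility and spam detection combined.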

Quality tiers

Score	Rating	Typical content
`0.8+`	Excellent	High-quality journalism, academic content
`0.65–0.8`	Good	Professional content, established outlets
`0.5–0.65`	Medium	Mixed quality, unknown sources
`0.3–0.5`	Low	Short content, social media, thin articles
`<0.3`	Noise	Spam, clickbait, malformed content
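If you want to label articles client-side, the tier table can be expressed as a small lookup. This sketch assumes each range includes its lower bound (so exactly 0.65 counts as Good); the table does not say which side the boundaries fall on, and `quality_tier` is a hypothetical helper, not part of the API.

```python
def quality_tier(score: float) -> str:
    """Map a quality score in [0, 1] to the rating names from the tier table.

    Assumption: lower bounds are inclusive (0.8 -> Excellent, 0.65 -> Good, ...).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.8:
        return "Excellent"
    if score >= 0.65:
        return "Good"
    if score >= 0.5:
        return "Medium"
    if score >= 0.3:
        return "Low"
    return "Noise"
```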

Quality score. Use min_quality on search endpoints to filter results to your desired signal level. All articles are forwarded downstream regardless of score.

Async job pattern

All Layer 0 processing endpoints return immediately with a job_id. Poll /v1/status/{job_id} until completion, then retrieve the result. Most articles complete in under 1 second.

FLOW
POST /v1/process
→ { "job_id": "job_a1b2c3d4", "article_id": "art_8f7h2k9s", "status": "queued" }

GET /v1/status/{job_id}    # poll every ~500ms, max 30s
→ { "status": "processing" }
→ { "status": "completed", "article_id": "art_8f7h2k9s" }

GET /v1/article/{article_id}
→ full result
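The flow above can be sketched as a small polling helper. This is an illustration under stated assumptions, not an official client: `http_get_json` and `wait_for_job` are hypothetical names, authentication is omitted because it is not covered in this section, and the 500 ms interval and 30 s ceiling simply follow the comment in the flow.

```python
import json
import time
import urllib.request

BASE_URL = "https://layer0.api.polariapi.com"


def http_get_json(path: str) -> dict:
    """GET a path under BASE_URL and decode the JSON body.

    Assumption: no auth headers are sent; add whatever your account requires.
    """
    with urllib.request.urlopen(BASE_URL + path, timeout=10) as resp:
        return json.load(resp)


def wait_for_job(job_id, get_json=http_get_json, interval=0.5,
                 timeout=30.0, sleep=time.sleep) -> str:
    """Poll /v1/status/{job_id} until the job completes; return its article_id."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_json(f"/v1/status/{job_id}")
        if status["status"] == "completed":
            return status["article_id"]
        if status["status"] == "failed":
            # The failed response carries an "error" message.
            raise RuntimeError(status.get("error", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

Once `wait_for_job` returns, the full result can be fetched with `http_get_json(f"/v1/article/{article_id}")`. The `get_json` and `sleep` parameters exist so the loop can be exercised without network access.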

Endpoints

Base URL. All Layer 0 endpoints are served from https://layer0.api.polariapi.com. For example: POST https://layer0.api.polariapi.com/v1/process

Submit article

POST /v1/process

Field	Type	Description
text (required)	string	Article body. Minimum 100 characters.
metadata.title	string	Article headline
metadata.url	string	Used as deduplication key if provided
metadata.source	string	Publisher name — factors into source credibility scoring
metadata.author	string	Byline
metadata.published_date	ISO 8601	Original publication datetime

REQUEST

{
  "text": "The Federal Reserve held interest rates steady on Wednesday...",
  "metadata": {
    "title": "Fed Holds Rates Steady",
    "url": "https://reuters.com/fed-rates-2026",
    "source": "Reuters",
    "author": "Jane Smith",
    "published_date": "2026-04-29T12:00:00"
  }
}

RESPONSE
{
  "job_id": "job_a1b2c3d4",
  "article_id": "art_8f7h2k9s",
  "status": "queued"
}
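A request body like the one above can be assembled and checked before it is sent. The following is a minimal sketch: `build_process_payload` is a hypothetical helper, the 100-character minimum is taken from the field table, and omitting unset metadata fields (rather than sending nulls) is an assumption about what the endpoint accepts.

```python
from datetime import datetime
from typing import Optional

MIN_TEXT_CHARS = 100  # minimum stated in the field table


def build_process_payload(text: str, *, title: Optional[str] = None,
                          url: Optional[str] = None,
                          source: Optional[str] = None,
                          author: Optional[str] = None,
                          published_date: Optional[datetime] = None) -> dict:
    """Build the JSON body for POST /v1/process, dropping unset metadata."""
    if len(text) < MIN_TEXT_CHARS:
        raise ValueError(f"text must be at least {MIN_TEXT_CHARS} characters")
    metadata = {
        "title": title,
        "url": url,
        "source": source,
        "author": author,
        # ISO 8601 string, matching the published_date field type.
        "published_date": published_date.isoformat() if published_date else None,
    }
    metadata = {key: value for key, value in metadata.items() if value is not None}
    payload = {"text": text}
    if metadata:
        payload["metadata"] = metadata
    return payload
```

Validating the length locally avoids queuing a job that would only fail later during processing.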
            

Submit batch

POST /v1/process/batch

Submit up to 50 articles in a single request. Articles process in parallel. A batch counts as one API call against your rate limit, regardless of how many articles it contains.

REQUEST

{
  "articles": [
    { "text": "First article...", "metadata": { "title": "Article One" } },
    { "text": "Second article...", "metadata": { "title": "Article Two" } }
  ]
}

RESPONSE
{
  "jobs": [
    { "job_id": "job_aaa111", "article_id": "art_xxx", "status": "queued" },
    { "job_id": "job_bbb222", "article_id": "art_yyy", "status": "queued" }
  ],
  "total": 2
}
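Because a batch is capped at 50 articles, larger collections need to be split across several requests. Here is a minimal sketch of that chunking; `batch_payloads` is a hypothetical helper, and each item is assumed to already be a valid article object with `text` and optional `metadata`.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 50  # per-request limit for POST /v1/process/batch


def batch_payloads(articles: List[dict], size: int = MAX_BATCH_SIZE) -> Iterator[dict]:
    """Yield request bodies for /v1/process/batch, at most `size` articles each."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(articles), size):
        yield {"articles": articles[start:start + size]}
```

Each yielded body is one request, so 120 articles become three API calls rather than 120.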
            

Poll job status

GET /v1/status/{job_id}

RESPONSES
# processing
{ "job_id": "job_a1b2c3d4", "status": "processing" }

# completed
{ "job_id": "job_a1b2c3d4", "article_id": "art_8f7h2k9s", "status": "completed" }

# failed
{ "job_id": "job_a1b2c3d4", "status": "failed", "error": "Text content too short (minimum 50 characters)" }

Retrieve article

GET /v1/article/{article_id}

Parameter	Type	Description
include_embedding	boolean	Return the raw 384-dimensional embedding vector. Default: `false`. Pass `?include_embedding=true` to include it.

RESPONSE
{
  "article_id": "art_8f7h2k9s",
  "title": "Fed Holds Rates Steady",
  "source": "Reuters",
  "published_date": "2026-04-29T12:00:00Z",
  "quality_score": 0.74,
  "semantic_hash": "a3f8c2e1...",
  "token_count": 842,
  "processed_at": "2026-04-29T12:00:03Z"
}

Field	Description
quality_score	0–1. Signal quality of the article. Use `min_quality` on search endpoints to filter by this value.
semantic_hash	Content fingerprint for cross-source deduplication — independent of URL or title.
token_count	Article length in tokens.
embedding	Array of 384 floats. Only present when `?include_embedding=true` is passed. Omitted by default.
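One way to use `semantic_hash` on the client is to collapse retrieved articles that carry the same fingerprint. The sketch below assumes that two articles are duplicates only when their hashes are exactly equal, and it keeps the copy with the highest `quality_score`; that tie-breaking policy and the helper name are illustrative choices, not documented behavior.

```python
from typing import List


def dedupe_by_semantic_hash(articles: List[dict]) -> List[dict]:
    """Keep one article per semantic_hash, preferring the highest quality_score.

    Output order follows the first appearance of each hash.
    Assumption: equal hashes mean duplicate content.
    """
    best = {}
    for article in articles:
        key = article["semantic_hash"]
        current = best.get(key)
        if current is None or article["quality_score"] > current["quality_score"]:
            best[key] = article  # replacing a value keeps the key's position
    return list(best.values())
```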

Semantic search

GET /v1/search

Searches all processed articles and ranks results by embedding similarity rather than keyword overlap, so it finds conceptually related articles even when they share no exact terms.

Parameter	Type	Description
query (required)	string	Natural language search query
limit	integer	Max results. Default: `10`. Max: `100`
min_quality	float	Minimum quality score filter (0.0–1.0)
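The parameters above can be turned into a request URL with the standard library. This sketch assumes they are sent as query-string parameters (the endpoint is a GET) and that `limit` must be at least 1; `build_search_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://layer0.api.polariapi.com"


def build_search_url(query: str, limit: int = 10,
                     min_quality: Optional[float] = None) -> str:
    """Build a GET /v1/search URL, validating the documented parameter ranges."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= limit <= 100:  # documented max is 100; lower bound is assumed
        raise ValueError("limit must be between 1 and 100")
    params = {"query": query, "limit": limit}
    if min_quality is not None:
        if not 0.0 <= min_quality <= 1.0:
            raise ValueError("min_quality must be between 0.0 and 1.0")
        params["min_quality"] = min_quality
    return f"{BASE_URL}/v1/search?{urlencode(params)}"
```

For example, passing `min_quality=0.65` restricts results to the Good and Excellent tiers.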

RESPONSE
{
  "query": "federal reserve interest rates",
  "results": [
    {
      "article_id": "art_8f7h2k9s",
      "title": "Fed Holds Rates Steady Amid Inflation Concerns",
      "source": "Reuters",
      "quality_score": 0.81,
      "similarity_score": 0.94
    }
  ],
  "total": 47
}
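To illustrate what ranking by embedding similarity means, here is a minimal sketch that scores candidate vectors against a query vector. The metric behind `similarity_score` is not stated in this section; cosine similarity, a common choice for sentence embeddings, is assumed here, and both function names are hypothetical. The same code applies to the 384-dimensional vectors returned with `?include_embedding=true`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: Sequence[float],
                       candidates: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (article_id, similarity) pairs, most similar first."""
    scored = [(article_id, cosine_similarity(query_vec, vec))
              for article_id, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the score depends only on vector direction, two articles can rank closely even when they share no exact terms.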
            
