Layer 0 — Token Intelligence
Every article enters Polari through Layer 0. It filters noise, assigns a quality score, generates a 384-dimensional semantic embedding, and creates a fingerprint for deduplication. Articles that pass the quality threshold proceed to Layers 1–3.
What Layer 0 does
Layer 0 runs four operations on every submitted article, in order:
- Quality scoring: a weighted multi-factor score (0–1) that determines whether the article contains meaningful signal or noise.
- Embedding generation: a 384-dimensional vector using BAAI/bge-small-en-v1.5, stored for semantic similarity search.
- Semantic hashing: a content fingerprint for fast cross-source deduplication, independent of URL or title.
- Quality gate: articles scoring below 0.53 are stored but not forwarded to Layer 1 or Layer 2.
Quality scoring
The quality score is the most important output from Layer 0. It gates all downstream processing and is available on every article object.
| Factor | Weight | What it measures |
|---|---|---|
| Content depth | 40% | Length, information density, paragraph structure |
| Coherence | 30% | Sentence count, avg sentence length, sentiment consistency |
| Source credibility | 20% | Domain reputation scoring |
| Spam detection | 10% | Excessive caps, repetition, URL density, special characters |
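The weights in the table can be illustrated with a short sketch. It assumes each factor is first reduced to a sub-score between 0 and 1 (with a higher spam-detection value meaning less spam) and that the sub-scores are combined as a plain weighted sum. The source states only that the score is weighted and multi-factor, so the factor keys and the linear combination here are illustrative assumptions.

```python
# Weights copied from the factor table; the dictionary keys are hypothetical names.
WEIGHTS = {
    "content_depth": 0.40,
    "coherence": 0.30,
    "source_credibility": 0.20,
    "spam_detection": 0.10,
}


def quality_score(factors: dict) -> float:
    """Combine per-factor sub-scores (each assumed to lie in 0-1) into one score.

    Assumption: a linear weighted sum. The real combination rule is not documented.
    """
    missing = WEIGHTS.keys() - factors.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = factors[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        total += weight * value
    return total


example = quality_score({
    "content_depth": 0.8,
    "coherence": 0.7,
    "source_credibility": 0.5,
    "spam_detection": 1.0,
})
print(round(example, 2))  # 0.32 + 0.21 + 0.10 + 0.10 = 0.73
```

Under this assumption, a deep and coherent article from an unknown source can still land in the Good tier, because source credibility carries only 20% of the weight.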
Quality tiers
| Score | Rating | Typical content |
|---|---|---|
| 0.8+ | Excellent | High-quality journalism, academic content |
| 0.65–0.8 | Good | Professional content, established outlets |
| 0.5–0.65 | Medium | Mixed quality, unknown sources |
| 0.3–0.5 | Low | Short content, social media, thin articles |
| <0.3 | Noise | Spam, clickbait, malformed content |
Articles scoring below 0.53 are stored and returned but are not forwarded to Layer 1 or Layer 2. Use min_quality on search endpoints to filter results to your desired signal level.
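A client-side sketch of the tier table and the 0.53 gate may help when interpreting a quality_score. The tier boundaries come from the table; which tier owns a score that sits exactly on a boundary is not specified, so the choice below (the lower bound is inclusive) is an assumption, and both function names are hypothetical.

```python
FORWARD_THRESHOLD = 0.53  # scores below this are stored but not forwarded


def quality_tier(score: float) -> str:
    """Map a 0-1 quality score to its rating (lower bounds assumed inclusive)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.8:
        return "Excellent"
    if score >= 0.65:
        return "Good"
    if score >= 0.5:
        return "Medium"
    if score >= 0.3:
        return "Low"
    return "Noise"


def is_forwarded(score: float) -> bool:
    """True when the article clears the gate and proceeds to Layers 1 and 2."""
    return score >= FORWARD_THRESHOLD


for score in (0.91, 0.60, 0.51, 0.12):
    print(score, quality_tier(score), is_forwarded(score))
```

Note that the gate falls inside the Medium tier: an article scoring 0.51 is rated Medium but is still held back from downstream layers.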
Async job pattern
All Layer 0 processing endpoints return immediately with a job_id. Poll /v1/status/{job_id} until completion, then retrieve the result. Most articles complete in under 1 second.
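The polling pattern can be sketched as a small loop. The response schema of /v1/status/{job_id} is not shown here, so the "status" field and the "completed" and "failed" values below are assumptions, as is the fetch_status callable, which stands in for whatever HTTP client issues the GET request.

```python
import time
from typing import Callable

# Hypothetical terminal states; check the actual status payload before relying on them.
TERMINAL_STATES = {"completed", "failed"}


def poll_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 0.25,
    timeout: float = 10.0,
) -> dict:
    """Call fetch_status(job_id) repeatedly until the job reaches a terminal state.

    fetch_status is expected to perform GET /v1/status/{job_id} and return the
    decoded JSON body. Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        time.sleep(interval)
```

Because most articles are reported to complete in under a second, a short interval with a modest timeout is a reasonable starting point. Passing the fetcher in as a callable keeps the loop independent of any particular HTTP library and easy to test.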
Endpoints
Endpoint paths are relative to the base URL https://layer0.api.polariapi.com. For example:
POST https://layer0.api.polariapi.com/v1/process
Submit article
| Field | Type | Description |
|---|---|---|
| text (required) | string | Article body. Minimum 100 characters. |
| metadata.title | string | Article headline |
| metadata.url | string | Used as deduplication key if provided |
| metadata.source | string | Publisher name — factors into source credibility scoring |
| metadata.author | string | Byline |
| metadata.published_date | ISO 8601 | Original publication datetime |
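As a rough sketch, a request body for the submit endpoint could be assembled as follows. It assumes the dotted field names (metadata.title and so on) denote keys of a nested metadata object and that the 100-character minimum is a plain character count; neither detail is confirmed by the field table. The helper name build_process_payload is hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Optional

MIN_TEXT_LENGTH = 100  # minimum article body length from the field table


def build_process_payload(
    text: str,
    *,
    title: Optional[str] = None,
    url: Optional[str] = None,
    source: Optional[str] = None,
    author: Optional[str] = None,
    published_date: Optional[datetime] = None,
) -> dict:
    """Build a JSON-serializable body, omitting metadata fields that were not supplied."""
    if len(text) < MIN_TEXT_LENGTH:
        raise ValueError(f"text must be at least {MIN_TEXT_LENGTH} characters")
    metadata = {
        "title": title,
        "url": url,
        "source": source,
        "author": author,
        # datetime.isoformat() produces an ISO 8601 string
        "published_date": published_date.isoformat() if published_date else None,
    }
    metadata = {key: value for key, value in metadata.items() if value is not None}
    payload = {"text": text}
    if metadata:
        payload["metadata"] = metadata
    return payload


body = build_process_payload(
    "Example article body. " * 10,
    title="Example headline",
    source="Example Publisher",
    published_date=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

The serialized body would then be sent with POST to https://layer0.api.polariapi.com/v1/process. Validating the length locally avoids spending a request on an article the service would reject.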
Submit batch
Submit up to 50 articles in a single request. Articles process in parallel. Counts as one API call against your rate limit regardless of article count.
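When a backlog holds more than 50 articles, it has to be split before submission. The sketch below only groups article payloads into batches of at most 50; the batch endpoint path and its request schema are not given here, so no request is built. The function name chunk_articles is hypothetical.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # per-request article limit stated above


def chunk_articles(
    articles: Sequence[dict], size: int = MAX_BATCH_SIZE
) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` articles, preserving order."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(articles), size):
        yield list(articles[start:start + size])


backlog = [{"text": f"article {i}"} for i in range(120)]
print([len(batch) for batch in chunk_articles(backlog)])  # [50, 50, 20]
```

Since each batch counts as a single API call, 120 articles would consume three calls against the rate limit rather than 120.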
Poll job status
Retrieve article
| Parameter | Type | Description |
|---|---|---|
| include_embedding | boolean | Return the raw 384-dimensional embedding vector. Default: false. Pass ?include_embedding=true to include it. |
| Field | Description |
|---|---|
| quality_score | 0–1. Scores below 0.53 are not forwarded to Layer 1/2 processing. |
| semantic_hash | Content fingerprint for cross-source deduplication — independent of URL or title. |
| token_count | Article length in tokens. |
| embedding_id | Reference to the 384-dimensional vector stored in the embedding layer. |
| embedding | Array of 384 floats. Only present when ?include_embedding=true is passed. Omitted by default. |
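One way a client might use semantic_hash is to collapse copies of the same story retrieved from several sources. The sketch below assumes that duplicate content yields an identical hash string, so exact equality is enough, and it keeps the copy with the highest quality_score, which is an illustrative policy rather than documented behavior. The "id" field in the sample data is hypothetical.

```python
from typing import Dict, List


def dedupe_by_semantic_hash(articles: List[dict]) -> List[dict]:
    """Keep one article per semantic_hash, preferring the highest quality_score."""
    best: Dict[str, dict] = {}
    for article in articles:
        key = article["semantic_hash"]
        current = best.get(key)
        if current is None or article["quality_score"] > current["quality_score"]:
            best[key] = article
    # dict preserves insertion order, so results follow first appearance of each hash
    return list(best.values())


articles = [
    {"id": "a1", "semantic_hash": "h1", "quality_score": 0.62},
    {"id": "a2", "semantic_hash": "h2", "quality_score": 0.81},
    {"id": "a3", "semantic_hash": "h1", "quality_score": 0.74},
]
print([a["id"] for a in dedupe_by_semantic_hash(articles)])  # ['a3', 'a2']
```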
Semantic search
Searches all processed articles and ranks results by embedding similarity rather than keyword overlap, so conceptually related articles are found even when they share no exact terms.
| Parameter | Type | Description |
|---|---|---|
| query (required) | string | Natural language search query |
| limit | integer | Max results. Default: 10. Max: 100 |
| min_quality | float | Minimum quality score filter (0.0–1.0) |
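To make the ranking idea concrete, here is a local sketch of similarity search over already-retrieved articles. It assumes cosine similarity as the metric (only "embedding similarity" is specified) and that each article dict carries the embedding and quality_score fields. The limit and min_quality arguments mirror the parameter table, and the three-dimensional vectors stand in for real 384-dimensional ones.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_articles(
    query_embedding: Sequence[float],
    articles: List[dict],
    limit: int = 10,
    min_quality: float = 0.0,
) -> List[Tuple[float, dict]]:
    """Return up to `limit` (similarity, article) pairs, most similar first."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if not 0.0 <= min_quality <= 1.0:
        raise ValueError("min_quality must be between 0.0 and 1.0")
    candidates = [a for a in articles if a["quality_score"] >= min_quality]
    scored = [(cosine_similarity(query_embedding, a["embedding"]), a) for a in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Toy 3-dimensional vectors for illustration only.
corpus = [
    {"id": "a", "quality_score": 0.9, "embedding": [1.0, 0.0, 0.0]},
    {"id": "b", "quality_score": 0.9, "embedding": [0.0, 1.0, 0.0]},
    {"id": "c", "quality_score": 0.2, "embedding": [0.9, 0.1, 0.0]},
]
for score, article in rank_articles([1.0, 0.0, 0.0], corpus, min_quality=0.5):
    print(article["id"], round(score, 3))
```

In this toy corpus, article c is close to the query but is removed by the min_quality filter before ranking, which is how the filter would trade recall for signal quality.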