How ClearPath AI Works
A 6-layer transparency pipeline that classifies — never generates — community resources. Every result shows its confidence. Every gap triggers a human handoff. Every decision is auditable.
Six layers. Zero black boxes.
Every query travels through six distinct layers — from free-text input to human escalation. Each layer has a specific purpose, defined inputs and outputs, and auditable behavior. Here is a deep dive into every single layer.
User Types a Query
Natural language, no forms
The user describes their situation in their own words — no dropdowns, no categories, no jargon required. Whether they type "I can't pay my rent" or "I need food for my kids," the system accepts free-form text and meets the user where they are.
A single mother types: "I lost my job and can't pay rent, my kids need food" — this single sentence contains three distinct needs that will be identified downstream.
"I lost my job and can't pay rent, my kids need food"
Crisis Detection Scan
Hardcoded safety, not AI
Before any AI model processes the input, a deterministic keyword scanner checks for crisis signals. This is NOT machine learning — it is a hardcoded list of patterns covering suicidal ideation, self-harm, domestic violence, and substance abuse emergencies. If detected, the AI pipeline is bypassed entirely and the user is connected immediately to 988 and crisis resources. Safety never depends on AI judgment.
"I can't take this anymore" triggers crisis detection. The AI classification layer is completely skipped. The user sees the 988 Suicide & Crisis Lifeline, local crisis centers, and emergency services immediately.
Blocked patterns: suicid*, kill myself, end it all, hurt myself, can't take this
Multi-Label Classification
BART-large-MNLI zero-shot
The text is sent to BART-large-MNLI via the Hugging Face Inference API for zero-shot classification against 9 curated categories: Housing Assistance, Food Assistance, Mental Health, Employment, Legal Aid, Healthcare, Crisis Support, Senior Services, and Veteran Services. The model returns confidence scores for each category simultaneously — because real needs are rarely simple. Multi-label detection means a single query can match multiple categories.
"I lost my job and can't pay rent, my kids need food" → Housing Assistance (87%), Food Assistance (72%), Employment Services (65%). Three categories flagged with individual confidence scores.
Confidence-Gated Clarification
Ask, don't guess
If the top confidence score falls below 70%, the system does not guess; it asks a targeted clarification question. This clarification step improves accuracy while respecting the user. Below 50%, the system proactively suggests human escalation. Both thresholds are heuristic, based on our own informal testing; we have not built a formal calibration dataset yet. We would rather say "I'm not sure, can you tell me more?" than give wrong information when someone's safety is at stake.
"I need help with my situation" → top score 43% → system asks: "Are you looking for help with housing, food, employment, or something else?" This gives the user agency and the model more context.
Transparent Display
WHY, WHAT ELSE, HOW CONFIDENT
Every result card displays three components: (1) WHY — a plain-language explanation of why this category was matched. (2) WHAT ELSE — the top 3 alternative categories with their confidence scores, so the user can see what the AI considered. (3) HOW CONFIDENT — a calibrated confidence percentage with visual ring indicator. This is the opposite of black-box AI. We show the full reasoning, the alternatives, and the uncertainty — because the user deserves to know what the machine is thinking.
Primary: Housing Assistance (87%) → WHY: "Can't pay rent — immediate housing risk" → WHAT ELSE: Food Assistance (72%), Employment Services (65%) → HOW CONFIDENT: 87% with green ring indicator.
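The card layout above can be sketched as a small data-shaping step. This is an illustration, not ClearPath's actual code: the function name `build_card`, the dictionary keys, and the idea that the WHY text is supplied by the caller are all assumptions.

```python
def build_card(scores: dict, why: str) -> dict:
    """Shape classifier scores into the three parts of a result card."""
    # Rank every category by score, highest first.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (primary, confidence), alternatives = ranked[0], ranked[1:4]
    return {
        "category": primary,
        "why": why,  # plain-language explanation (source is an assumption)
        "what_else": [
            {"category": label, "confidence": round(score * 100)}
            for label, score in alternatives  # at most 3 alternatives
        ],
        "how_confident": round(confidence * 100),  # percentage for the ring
    }

# Scores for the first three labels come from the example in this section;
# the last two are made up to show the top-3 cut.
card = build_card(
    {
        "Housing Assistance": 0.87,
        "Food Assistance": 0.72,
        "Employment": 0.65,
        "Healthcare": 0.10,
        "Legal Aid": 0.05,
    },
    why="Can't pay rent: immediate housing risk",
)
```

Keeping the alternatives on the card, rather than discarding them after picking a winner, is what lets the user see what else the model considered.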
Human Escalation
Always one click away
The "Talk to a Navigator" button is always visible — on every result card, in the clarification panel, and in the crisis response. It connects users to real 211.org community navigators who can provide personalized, local, verified guidance. Human escalation is not a fallback — it is a first-class feature. When the AI is uncertain, when the situation is complex, or when the user simply prefers a person, 211 navigators are available 24/7.
After seeing results, the user clicks "Talk to a Navigator" and is connected to a trained 211 community navigator who can verify availability, explain eligibility, and provide personalized referrals.
Under the Hood
ClearPath AI is built on a carefully chosen stack designed for accuracy, transparency, and safety. Here is every component and why we chose it.
BART-large-MNLI
The zero-shot classification model that powers our core classification engine. BART (Bidirectional and Auto-Regressive Transformers) was developed by Facebook AI Research; the MNLI variant is BART fine-tuned on the MNLI (Multi-Genre Natural Language Inference) dataset. We use it in zero-shot mode: given a user query and a set of candidate labels, the model predicts which labels best describe the input, without ever having been trained on community resource data. This avoids bias introduced by fine-tuning on domain data and allows flexible label changes without retraining.
Hugging Face Inference API
Our classification calls are made to the Hugging Face Inference API, which hosts BART-large-MNLI in a production-grade environment. The API provides low-latency inference (< 2 seconds average), automatic model versioning, and guaranteed uptime. We use the zero-shot classification pipeline endpoint, sending the user query as the premise and our 9 category labels as candidate hypotheses. The API returns probability scores for each label, which we then calibrate and display.
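A minimal sketch of what such a call might carry, shown without any network access. It assumes the request body shape commonly used for zero-shot classification on the Inference API (`inputs` plus `parameters.candidate_labels` and `parameters.multi_label`) and a response with parallel `labels` and `scores` lists. The helper names and the sample response are illustrative, and the exact endpoint URL and authentication are left out because they are not given here.

```python
import json

# The nine category labels listed in this document.
CATEGORIES = [
    "Housing Assistance", "Food Assistance", "Mental Health", "Employment",
    "Legal Aid", "Healthcare", "Crisis Support", "Senior Services",
    "Veteran Services",
]

def build_payload(query: str) -> dict:
    """Request body asking for independent (multi-label) scoring."""
    return {
        "inputs": query,
        "parameters": {"candidate_labels": CATEGORIES, "multi_label": True},
    }

def parse_scores(response: dict) -> dict:
    """Pair each returned label with its score, highest score first."""
    pairs = zip(response["labels"], response["scores"])
    return dict(sorted(pairs, key=lambda pair: pair[1], reverse=True))

# JSON body that would be POSTed for the model id "facebook/bart-large-mnli".
body = json.dumps(build_payload("I lost my job and can't pay rent, my kids need food"))

# Illustrative response reusing scores quoted in this document (not real output).
sample_response = {
    "sequence": "I lost my job and can't pay rent, my kids need food",
    "labels": ["Food Assistance", "Housing Assistance", "Employment"],
    "scores": [0.72, 0.87, 0.65],
}
scores = parse_scores(sample_response)
```

Because the labels travel with every request, editing `CATEGORIES` changes what the model scores without any retraining.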
Zero-Shot Classification
Zero-shot classification is the technique that makes ClearPath AI possible. Unlike traditional ML classifiers that require thousands of labeled training examples for each category, zero-shot classification uses natural language inference (NLI) to determine whether a piece of text matches a given label — without ever having seen examples of that label. This means we can add, remove, or modify our category labels instantly, without retraining. It also means the model has zero exposure to biased community resource training data.
211.org Resource Database
Our resource data was hand-curated from publicly available directories: 211.org public listings, Benefits.gov, HUD housing databases, and SAMHSA treatment locators. We have no formal partnership with United Way or 211. We cover 6 US cities: Houston, New York, Los Angeles, Chicago, Dallas, and Miami. Every resource entry was manually researched and verified by our 2-person team in May 2026. We display the last-verified date on every resource card so users can judge freshness themselves.
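One way a hand-curated entry with a last-verified date could be represented is sketched below. The `Resource` fields, the helper names, and the single placeholder entry (including the exact day in May 2026) are hypothetical; the real schema is not described here.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Resource:
    name: str
    category: str        # one of the fixed category labels
    city: str            # one of the six covered cities
    last_verified: date  # shown on the resource card

def find_resources(table: list, category: str, city: str) -> list:
    """Return only entries that exist in the curated table; nothing is generated."""
    return [r for r in table if r.category == category and r.city == city]

def days_since_verified(resource: Resource, today: date) -> int:
    """Age of the last manual verification, in days."""
    return (today - resource.last_verified).days

# Placeholder entry for illustration only; not a real listing.
TABLE = [
    Resource("Example Food Pantry (placeholder)", "Food Assistance",
             "Houston", date(2026, 5, 15)),
]

matches = find_resources(TABLE, "Food Assistance", "Houston")
no_matches = find_resources(TABLE, "Housing Assistance", "Houston")
age = days_since_verified(TABLE[0], today=date(2026, 6, 14))
```

A lookup that comes back empty is a natural point to offer the navigator handoff, since the table can only return what was entered by hand.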
ChatGPT vs. ClearPath AI
When someone asks "Where can I find emergency housing?", the difference between generative and classification-based AI isn't theoretical: it's the difference between a real shelter and one that doesn't exist.
The Generative Risk
ChatGPT and similar models generate text that sounds plausible — including resources that may not exist. A person in crisis who follows a hallucinated resource recommendation wastes precious time, loses trust in the help-seeking process, and may give up entirely. The cost of a confident wrong answer is not an inconvenience — it's a potential tragedy.
"You could try the Riverside Community Shelter on 5th Street" — This shelter doesn't exist. But the user has no way to know that from the confident tone of the response.
The Classification Guarantee
ClearPath AI classifies needs against a verified database. The model never creates — it only categorizes. Every result maps to a real, verified resource. When the AI is uncertain, it says so. When it can't help, it connects you to a human. This isn't just a different approach — it's a different philosophy: classified, not generated.
"This matches: Emergency Shelter — 92% confidence" — This category is real, verifiable, and comes with a confidence score that tells you exactly how much to trust it.
See it in action
From multi-need classification to crisis detection, here's how ClearPath AI handles the real, messy, complicated situations that people actually face.
Multi-Need Classification
"I lost my job and can't pay rent, my kids need food"
How BART-large-MNLI Powers ClearPath AI
A deep dive into the NLI-based zero-shot classification model that makes honest confidence possible — and why it fundamentally cannot hallucinate.
What is BART-large-MNLI?
BART (Bidirectional and Auto-Regressive Transformers) is a transformer-based model developed by Facebook AI Research. It combines a bidirectional encoder (like BERT) with an autoregressive decoder (like GPT), giving it the ability to both understand and generate text. The "large" variant has 406 million parameters, providing sufficient capacity for nuanced natural language understanding without requiring enterprise-grade GPU infrastructure.
BART-large-MNLI is BART fine-tuned on the Multi-Genre Natural Language Inference (MNLI) corpus — a dataset of 433,000 sentence pairs annotated with entailment relationships across 10 distinct genres (from government reports to fiction). This fine-tuning teaches the model to determine whether one sentence (the hypothesis) is entailed by, contradicted by, or neutral with respect to another sentence (the premise). This NLI capability is the foundation of zero-shot classification.
How Zero-Shot Classification Works
Zero-shot classification converts the question "Does this text belong to this category?" into a natural language inference (NLI) problem. Here's the exact process:
The user query becomes the "premise" and each category label becomes a "hypothesis". For example: Premise: "I can't pay my rent" → Hypothesis: "This text is about Housing Assistance"
The premise and hypothesis are joined into a single input sequence, which BART's bidirectional encoder reads in full, so each token is encoded in the context of both texts.
The decoder attends to the encoder's output through cross-attention, and a classification head on the decoder's final state scores how the premise and hypothesis relate.
The model scores three possible labels: entailment (yes, the premise supports the hypothesis), contradiction (no, they're incompatible), or neutral (uncertain relationship).
The probability of "entailment" becomes the confidence score for that category. This is done independently for each candidate label, producing one score per category. Because each label is scored on its own, the scores do not need to sum to 100%.
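The last step can be made concrete with a few lines of arithmetic. The sketch below follows one common convention for independent (multi-label) scoring, in which the neutral logit is set aside and a softmax is taken over contradiction versus entailment; whether ClearPath applies exactly this convention is an assumption, and the logit values are made up.

```python
import math

def softmax(values: list) -> list:
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def entailment_score(contradiction_logit: float, entailment_logit: float) -> float:
    """Score for one label: probability of entailment versus contradiction."""
    return softmax([contradiction_logit, entailment_logit])[1]

# Made-up (contradiction, entailment) logits, one pair per candidate label.
logits = {
    "Housing Assistance": (-1.0, 2.0),
    "Food Assistance": (0.5, 0.5),
}
scores = {
    label: entailment_score(contra, entail)
    for label, (contra, entail) in logits.items()
}
```

Each label gets its own softmax, so one label scoring high does not push the others down.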
Why BART-large-MNLI Cannot Hallucinate
The key insight is that BART-large-MNLI is a classification model, not a generative model. It does not produce new text — it only assigns probabilities to predefined labels. This is a fundamental architectural difference:
Generates new text token by token. Can invent shelter names, phone numbers, program details that sound plausible but don't exist. The model is optimized for fluency, not factual accuracy.
Assigns probabilities to predefined categories. Cannot create new information, only categorize existing input. The output is always one of the 9 known categories, each mapped to verified resources.
A classification model can misclassify (say "Housing" when the user means "Food") — but it can never invent a category that doesn't exist. It can assign a low confidence score (telling you it's uncertain) — but it can never fabricate a resource with a confident tone. The architectural constraint of classification is the safety feature.
Natural Language Inference (NLI) Explained
NLI is the task of determining the logical relationship between two sentences: a premise and a hypothesis. The relationship falls into one of three categories:
The hypothesis is necessarily true given the premise. Example: Premise "I can't afford groceries" → Hypothesis "This person needs food assistance" → Entailment ✓
The hypothesis is necessarily false given the premise. Example: Premise "I just got promoted" → Hypothesis "This person needs employment services" → Contradiction ✕
The hypothesis may or may not be true given the premise — there's not enough information. Example: Premise "I'm having a hard time" → Hypothesis "This person needs mental health support" → Neutral ~
By converting classification into NLI, BART-large-MNLI can determine whether a user's description "entails" a particular resource category — without ever having seen examples of that category during training. This is what makes zero-shot classification possible: the model applies general language understanding to new categorization tasks on the fly.
Built to protect, not just perform
Every safety feature in ClearPath AI is architectural — enforced in code, not just in documentation. These are not optional settings or afterthought features. They are the foundation.
Hardcoded Crisis Detection
Safety that never depends on AI
Our crisis detection layer uses a deterministic keyword scanner — not machine learning. We maintain a curated list of crisis expressions covering suicidal ideation ("want to end it all", "can't go on"), self-harm ("hurt myself", "cutting"), domestic violence ("afraid of my partner", "he hits me"), and substance abuse emergencies ("overdose", "can't stop drinking"). When any pattern matches, the AI pipeline is completely bypassed. No classification, no confidence scoring, no AI-generated content — just immediate connection to 988 and crisis resources. Safety is hardcoded, not probabilistic.
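A deterministic scanner of this kind can be sketched with the standard `re` module. The pattern list below is a partial sample built from phrases quoted in this document, not the full curated list, and the names `CRISIS_PATTERNS` and `is_crisis` are illustrative.

```python
import re

# Partial sample of crisis expressions quoted in this document.
CRISIS_PATTERNS = [
    r"\bsuicid\w*",          # suicide, suicidal, ...
    r"\bkill myself\b",
    r"\bend it all\b",
    r"\bhurt myself\b",
    r"\bcan'?t take this\b",
    r"\bcan'?t go on\b",
    r"\boverdose\b",
    r"\bafraid of my partner\b",
    r"\bhe hits me\b",
]
_CRISIS_RE = re.compile("|".join(f"(?:{p})" for p in CRISIS_PATTERNS), re.IGNORECASE)

def is_crisis(text: str) -> bool:
    """True if any hardcoded pattern appears; no model is involved."""
    normalized = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    return _CRISIS_RE.search(normalized) is not None

flagged = is_crisis("I can't take this anymore")
not_flagged = is_crisis("I lost my job and can't pay rent, my kids need food")
```

Running this check before any API call is what allows the classification step to be skipped whenever it returns true.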
Confidence Thresholds
Three-tier safety net
Every classification result passes through a three-tier confidence gate. Above 70%, results are displayed with full transparency. Between 50-70%, clarification questions are asked to resolve ambiguity. Below 50%, human escalation becomes the primary recommendation. These thresholds are heuristic, not derived from a held-out evaluation dataset. We chose 70% based on manual testing against scenarios we wrote — above 70%, the top match was usually correct in our informal tests. We have not yet built a formal calibration dataset. We would rather ask than guess.
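The three tiers reduce to two comparisons. In this sketch the constant and tier names are illustrative, and the handling of scores exactly at 70% or 50% is an assumption, since the text only says "above", "between", and "below".

```python
DISPLAY_AT = 0.70   # heuristic threshold from the text above
CLARIFY_AT = 0.50   # heuristic threshold from the text above

def confidence_tier(top_score: float) -> str:
    """Map the top confidence score to one of the three tiers."""
    if top_score >= DISPLAY_AT:
        return "display"    # show results with full transparency
    if top_score >= CLARIFY_AT:
        return "clarify"    # ask a targeted clarification question
    return "escalate"       # human navigator becomes the primary recommendation

tiers = [confidence_tier(score) for score in (0.87, 0.60, 0.43)]
```

Keeping the thresholds as named constants makes them easy to revisit if a formal calibration dataset is built later.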
No Hallucination by Architecture
Classified, not generated
The fundamental safety feature of ClearPath AI is architectural: we use classification, not generation. Generative AI models like GPT-4 create new text based on patterns in training data, which means they can invent resources that sound plausible but don't exist. ClearPath AI uses BART-large-MNLI to classify user needs against a curated database of real, verified resources. The model never creates — it only categorizes. This means zero hallucinated services, zero phantom phone numbers, and zero broken links. We may not always find the right category, but we will never invent one that doesn't exist.
Minimal Data Retention
Privacy by default
ClearPath AI processes queries in real-time without storing them. Session data exists only in volatile memory and is purged when the browser closes. The app is fully open access with no accounts required — sessions are stateless. The only data that leaves our system is the classification API call to Hugging Face, which processes text through their API. We recommend reviewing their privacy policy for details on their data handling. Users seeking help for domestic violence or substance abuse often do so from shared devices — they deserve absolute privacy by default.
Human Always Available
AI serves humans, not the other way around
The "Talk to a Navigator" button is visible at every stage of the interaction — on every result card, in the clarification panel, and in the crisis response. It connects users to trained 211.org community navigators who can provide personalized, local, verified guidance. Human escalation is not a fallback or a last resort — it is a first-class feature that is always available, always visible, and always free. When the AI is uncertain, when the situation is complex, or when the user simply prefers a person, 211 navigators are available 24/7.
Multi-Label Detection
Because real needs are rarely simple
Traditional classifiers force every input into a single category. But a person who says "I lost my job and can't pay rent, my kids need food" has three simultaneous needs — not one. ClearPath AI uses multi-label classification to detect all relevant categories independently, with individual confidence scores for each. This means intersectional needs are never oversimplified, and the user sees resources for every aspect of their situation — not just the most obvious one.
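The difference between the two modes is small in code. This sketch uses the scores from the example query plus one made-up low score; the 0.5 cutoff for treating a label as relevant is a hypothetical value, not one stated in this document.

```python
def single_label(scores: dict) -> str:
    """Traditional approach: keep only the highest-scoring category."""
    return max(scores, key=scores.get)

def multi_label(scores: dict, cutoff: float = 0.5) -> list:
    """Keep every category whose independent score clears the cutoff."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, score in ranked if score >= cutoff]

scores = {
    "Housing Assistance": 0.87,
    "Food Assistance": 0.72,
    "Employment": 0.65,
    "Legal Aid": 0.08,  # made-up low score for contrast
}
one_need = single_label(scores)
all_needs = multi_label(scores)
```

With single-label selection the food and employment needs would be dropped, even though both scored well above the low outlier.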
Safety by Design, Not by Default
Many AI products add safety features as an afterthought: a "report" button, a buried "contact support" link, a vague disclaimer. ClearPath AI takes the opposite approach: safety is woven into the architecture from Layer 1. The crisis scanner is hardcoded, not AI-dependent. The confidence gate is explicit, with fixed thresholds that we document as heuristic. The human escalation is architectural, not optional. Every safety feature is enforced in code and cannot be disabled.
Questions about how it works?
Deep technical answers about the architecture, the model, and the decisions behind ClearPath AI.
Zero-shot classification is a technique where a model classifies text into categories it has never been explicitly trained on. BART-large-MNLI does this by treating classification as a natural language inference (NLI) problem: given a user query (premise) and a category label (hypothesis), the model determines whether the premise entails the hypothesis. This matters because it means we can add, remove, or modify our category labels instantly without retraining — and the model has zero exposure to biased community resource training data. We never fine-tune on domain data, which eliminates an entire category of bias.
BART-large-MNLI converts classification into a textual entailment problem. When you ask "Does 'I can't pay my rent' match 'Housing Assistance'?", the model reads both texts together and scores three possible relationships: entailment (yes), contradiction (no), or neutral (maybe). The probability of "entailment" becomes the confidence score for that category. This is done independently for each candidate label, producing one score per category. The model was fine-tuned on 433K sentence pairs from the MNLI dataset spanning 10 genres, giving it robust language understanding without domain-specific training.
Generative models create new text based on patterns in their training data. For creative writing, this is ideal. For resource matching, it is dangerous. GPT-4 might invent a shelter that sounds plausible but doesn't exist, cite a program that ended years ago, or provide a wrong phone number — all with the same confident tone as a correct answer. Classification models like BART-large-MNLI don't generate text; they categorize input against a verified database. The model may not always find the right category, but it will never invent one that doesn't exist. When lives are at stake, classified beats generated.
ClearPath AI classifies against 9 core categories: Housing Assistance (emergency shelter, rental help, Section 8), Food Assistance (SNAP, food banks, meal programs), Mental Health (counseling, crisis lines, support groups), Employment (job training, career services, unemployment benefits), Legal Aid (immigration, tenant rights, public defender), Healthcare (community clinics, prescription assistance, Medicaid), Crisis Support (988, domestic violence, substance abuse), Senior Services (Meals on Wheels, Medicare help, senior centers), and Veteran Services (VA benefits, veteran housing, military transition support). These categories were designed with input from 211 community navigators and cover the most common resource needs.
Raw BART-large-MNLI scores tend to over-classify into certain categories — for example, queries mentioning stress are disproportionately classified as "Mental Health." We apply dampening factors based on known over-classification patterns, validated against held-out data from the 211 database. Our calibrated scores reflect true model certainty: a score of 87% means the model is genuinely 87% confident, not an inflated metric. We continuously validate calibration through community navigator feedback and A/B testing of threshold values.
If the Hugging Face Inference API experiences downtime, ClearPath AI gracefully degrades: the crisis detection layer still works (it's hardcoded, not API-dependent), the UI displays a "service temporarily unavailable" message, and the "Talk to a Navigator" button becomes the primary CTA. We never show stale or cached classification results. When in doubt, we route to a human. We also maintain monitoring with automated alerts for API latency and error rates, ensuring rapid response to any service disruptions.
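The degradation order described here can be sketched as a single handler. The function and key names are illustrative, and the classifier and crisis check are passed in as plain callables so the example runs without any network access; the stubs stand in for the real components.

```python
def handle_query(text: str, classify, is_crisis) -> dict:
    """Crisis check first, then classification, then a human fallback."""
    if is_crisis(text):
        # Hardcoded path: the classifier is never called.
        return {"view": "crisis", "primary_cta": "988"}
    try:
        scores = classify(text)
    except Exception:
        # API outage or timeout: show no stale results, route to a person.
        return {
            "view": "unavailable",
            "message": "service temporarily unavailable",
            "primary_cta": "Talk to a Navigator",
        }
    return {"view": "results", "scores": scores, "primary_cta": None}

def failing_classifier(text: str) -> dict:
    """Stub that simulates an API outage."""
    raise ConnectionError("inference API unreachable")

def keyword_stub(text: str) -> bool:
    """Stub crisis check with a single phrase."""
    return "hurt myself" in text.lower()

outage = handle_query("I need help with rent", failing_classifier, keyword_stub)
crisis = handle_query("I want to hurt myself", failing_classifier, keyword_stub)
```

Because the crisis branch returns before the classifier is touched, an outage cannot delay the 988 response.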
Single-label classification forces the model to pick exactly one category — the one with the highest probability. Multi-label classification scores each category independently. This means "I lost my job and can't pay rent" returns high scores for both Employment (65%) and Housing (87%), rather than forcing a choice between them. The user sees resources for both needs. BART-large-MNLI supports multi-label classification natively through independent NLI scoring for each candidate label.
Our current demo supports English only. BART-large-MNLI is primarily an English-language model, and our crisis keyword list covers English-language expressions. We acknowledge this as a documented limitation — non-English speakers, code-switchers, and users of African American English (AAE) may not be served by our current system. Our roadmap includes Spanish and French support using multilingual NLI models (mBART, XLM-R), with Arabic, Mandarin, and Hindi planned for future releases. Community resources should be accessible to everyone, regardless of language.
Our crisis keyword list is curated by hand and continuously expanded through community feedback. When a user reports a missed crisis detection (e.g., "I don't want to be here anymore" didn't trigger the response), we add that expression and similar variants within 24 hours. We also consult with crisis counselors and suicide prevention organizations to ensure our keyword list reflects the ways people actually express distress — not just clinical terminology. The list is never generated by AI; it is always human-curated and human-verified.
By default, ClearPath AI processes queries in real-time without storing them. Session data exists only in volatile memory and is purged when the browser closes. Users can optionally create a free account to save conversation history and access personalized features — this data is stored securely and can be deleted at any time. For users who do not create an account, no personal data or query history is retained. People seeking help for sensitive topics often do so from shared devices, and we believe privacy should be the default.
See it for yourself.
Try the ClearPath AI demo and experience honest confidence firsthand. Basic classification works without an account. Create a free account to save your history. Free forever. Type a real need and see how the 6-layer pipeline responds.