Build Log
DIY Helper

Teaching AI to Know Its Audience

In post 1 I talked about why we started building DIY Helper. In post 2 I covered the two-phase agent pipeline that generates project reports. This post is about the part of the system that has changed the way I think about AI products entirely: the intelligence layer.

The Over-Explainer Problem

Ask about GFCI outlet wiring and the AI opens with "A GFCI, or Ground Fault Circuit Interrupter, is a safety device that..." — and you, a person who owns a multimeter and has wired a subpanel, close the tab.

Flip the scenario. A first-time homeowner asks why their outlet stopped working and the AI starts talking about ampacity, neutral bus bars, and NEC code sections. Technically correct. Completely useless.

This is not a knowledge problem. It is a calibration problem. The AI gives everyone the same depth, the same vocabulary, the same assumed baseline. I know carpentry reasonably well. I know nothing about plumbing. The AI could not tell the difference. So I decided to fix it.

Intent Classification: What Does the User Actually Need?

The first piece is a router. Before the main model generates a response, a fast, cheap classification call figures out what kind of help the user needs. Not what topic — what mode.

Four intent types:

Quick question — "What size nail for baseboards?" Just wants an answer, not a workflow.

Troubleshooting — "My outlet sparks when I plug something in." Needs diagnosis.

Mid-project — "The mortar isn't sticking to my tile." Actively working, needs immediate help.

Full project — "I want to build a deck." Needs the complete planning pipeline from post 2.

The classification runs on Claude Haiku. Temperature 0, max 100 tokens, constrained to return JSON with the intent, a confidence score, and a one-line rationale.
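A minimal sketch of validating that JSON reply before anything downstream trusts it. The intent identifiers, the field names, and the 0.0 to 1.0 confidence scale are assumptions for illustration, not the app's actual schema.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The four modes listed above; these string identifiers are assumed names.
INTENTS = {"quick_question", "troubleshooting", "mid_project", "full_project"}


@dataclass(frozen=True)
class Classification:
    intent: str
    confidence: float  # assumed to be on a 0.0 to 1.0 scale
    reasoning: str


def parse_classification(raw: str) -> Optional[Classification]:
    """Return a validated Classification, or None if the reply is unusable."""
    try:
        data = json.loads(raw)
        intent = data["intent"]
        confidence = float(data["confidence"])
        reasoning = str(data.get("reasoning", ""))
    except (json.JSONDecodeError, KeyError, TypeError, ValueError, AttributeError):
        return None
    if not isinstance(intent, str) or intent not in INTENTS:
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return Classification(intent, confidence, reasoning)
```

Returning None instead of raising keeps the caller's job simple: anything that is not a clean classification can be treated like a failed call.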

Each intent type triggers a different system prompt. Quick questions get a focused prompt that produces 1-3 paragraphs with no workflow overhead. Troubleshooting enters diagnostic mode. Full project triggers the heavyweight guided flow from post 2.

The classification costs under $0.001 per call and Haiku typically responds in 100-200ms. I built in a 500ms timeout as a safety net for cold starts. If the call times out, errors out, or returns confidence below 70%, the system falls through to the default full-project behavior — exactly what the app did before the intelligence layer existed. Graceful degradation to the status quo.
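The fall-through logic can be sketched with `asyncio.wait_for`. The `classify` callable and its `(intent, confidence)` return shape are hypothetical stand-ins for the Haiku call; only the 500ms timeout, the 70% threshold, and the full-project default come from the description above.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

DEFAULT_INTENT = "full_project"  # what the app did before the intelligence layer
TIMEOUT_SECONDS = 0.5            # the 500ms safety net
MIN_CONFIDENCE = 0.70            # below this, fall through to the default

# Hypothetical signature: takes the user message, returns (intent, confidence).
Classifier = Callable[[str], Awaitable[Tuple[str, float]]]


async def route_intent(message: str, classify: Classifier) -> str:
    """Return the classified intent, or the default on any failure."""
    try:
        intent, confidence = await asyncio.wait_for(
            classify(message), timeout=TIMEOUT_SECONDS
        )
        if confidence >= MIN_CONFIDENCE:
            return intent
    except Exception:
        # Timeouts, network errors, and malformed replies all land here.
        pass
    return DEFAULT_INTENT
```

Every failure path returns the same value, so the router can never make the experience worse than the pre-router app.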

Classification is cached on the conversation record — the first message gets classified, every subsequent message reuses the cached intent.
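As a rough illustration of that caching, assuming the conversation record has a nullable `intent` field (a hypothetical name) and that `classify` is whatever function wraps the router:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Conversation:
    """Stand-in for the stored conversation record; the field name is assumed."""
    intent: Optional[str] = None


def intent_for(
    conversation: Conversation,
    message: str,
    classify: Callable[[str], str],
) -> str:
    """Classify only the first message; reuse the stored intent afterwards."""
    if conversation.intent is None:
        conversation.intent = classify(message)
    return conversation.intent
```

Only the first message in a conversation pays the classification latency and cost.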

Skill Profiling: The User Is Already Telling You Who They Are

Intent classification tells us what the user needs. Skill profiling tells us who they are. And the nice part is: we do not have to ask.

The system builds a skill profile across eight trade domains (electrical, plumbing, carpentry, HVAC, general, landscaping, painting, roofing) by analyzing three signals that already exist in the app.

Signal 1: Tool inventory. DIY Helper has a persistent inventory where users track their tools. If someone owns a miter saw, a Kreg jig, and a brad nailer, the system does not need a quiz to know they are not a carpentry beginner. Tool count per domain maps directly to familiarity level.

Signal 2: Trade terminology. This is the one I enjoyed building most. A curated dictionary of 200+ advanced terms across all eight trades. "Romex," "AFCI," "subpanel," "fish tape" for electrical. "PEX," "closet flange," "dielectric union," "water hammer" for plumbing. When a user drops these terms in conversation, the system infers familiarity. No AI call, just pattern matching.
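A small sketch of that kind of matching, using only the eight terms quoted above. The data layout, the word-boundary regex, and counting distinct terms rather than total mentions are all assumptions.

```python
import re
from typing import Dict, Set

# Only the terms quoted in the post; the real dictionary is described as 200+.
TRADE_TERMS: Dict[str, Set[str]] = {
    "electrical": {"romex", "afci", "subpanel", "fish tape"},
    "plumbing": {"pex", "closet flange", "dielectric union", "water hammer"},
}


def _compile(terms: Set[str]) -> re.Pattern:
    # \b keeps "pex" from matching inside "apex"; matching ignores case.
    alternatives = "|".join(
        re.escape(term) for term in sorted(terms, key=len, reverse=True)
    )
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


PATTERNS = {domain: _compile(terms) for domain, terms in TRADE_TERMS.items()}


def count_terms(message: str) -> Dict[str, int]:
    """Count distinct advanced terms per domain found in one message."""
    return {
        domain: len({m.group(0).lower() for m in pattern.finditer(message)})
        for domain, pattern in PATTERNS.items()
    }
```

For example, `count_terms("Ran Romex to the subpanel, then swapped a PEX fitting.")` yields two electrical hits and one plumbing hit. Exact word-boundary matching will miss plurals such as "subpanels", which a fuller version would need to handle.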

Signal 3: Completed projects. Past project history feeds the same inference. Three completed electrical projects is a stronger signal than any terminology match.

Thresholds are intentionally simple. Per domain: 0-2 signals means novice, 3-7 means familiar, 8+ means experienced. Three levels. Trying to distinguish twelve granular expertise levels would be false precision.
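Those cutoffs fit in a few lines. The function name is hypothetical, and what exactly counts as one "signal" (a tool, a matched term, a completed project) is left to the caller here.

```python
def level_for(signal_count: int) -> str:
    """Map a per-domain signal count to one of the three skill levels."""
    if signal_count >= 8:
        return "experienced"
    if signal_count >= 3:
        return "familiar"
    return "novice"
```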

The three signal sources merge with a "highest level wins" strategy:
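A minimal sketch of such a merge, assuming each source produces a domain-to-level mapping and that a domain missing from a source simply contributes nothing. The names are illustrative.

```python
from typing import Dict

LEVELS = ("novice", "familiar", "experienced")
RANK = {level: index for index, level in enumerate(LEVELS)}


def merge_profiles(*sources: Dict[str, str]) -> Dict[str, str]:
    """Merge per-source profiles, keeping the highest level seen per domain."""
    merged: Dict[str, str] = {}
    for source in sources:
        for domain, level in source.items():
            current = merged.get(domain, "novice")
            merged[domain] = max(current, level, key=RANK.__getitem__)
    return merged
```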

If your tool inventory says "familiar" with electrical but your terminology says "experienced," the system uses "experienced." Optimistic by design. Under-explaining is a better failure mode than over-explaining — people tolerate brevity much better than condescension.

Prompt Calibration: Where It All Comes Together

The skill profile feeds into the system prompt through a calibration function that appends context-specific instructions before every AI response:
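One way such a calibration function could look. The instruction wording and the "Skill calibration" header are placeholders, not the production prompt text.

```python
from typing import Dict

# Placeholder instruction wording, one entry per skill level.
STYLE = {
    "novice": "define trade terms on first use and explain each step",
    "familiar": "use common trade terms and explain only the unusual ones",
    "experienced": "use trade shorthand and skip basic definitions",
}


def calibrate(base_prompt: str, profile: Dict[str, str]) -> str:
    """Append one calibration line per domain in the user's skill profile."""
    lines = [
        f"- {domain}: user is {level}; {STYLE[level]}."
        for domain, level in sorted(profile.items())
    ]
    if not lines:
        return base_prompt  # no profile yet, so leave the prompt untouched
    return base_prompt + "\n\nSkill calibration:\n" + "\n".join(lines)
```

Because the instructions are emitted per domain, one prompt can carry "experienced" for carpentry and "novice" for plumbing at the same time.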

The same user, in the same conversation, can get carpentry advice in shorthand and plumbing advice with definitions. The AI is not uniformly smart or uniformly basic — it is calibrated per domain, per user.

One rule is non-negotiable. The calibrator always appends this regardless of skill level:

No expertise level earns the right to skip safety information. An experienced electrician still gets reminded about permits. A veteran roofer still gets told about fall protection. This is not optional and it is not calibrated away.
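A sketch of how that guarantee can be made structural: the function that appends the rule takes no skill-level argument, so nothing upstream can switch it off. The `SAFETY_RULE` text is placeholder wording, not the actual appended instruction.

```python
# Placeholder wording; the real appended instruction may differ.
SAFETY_RULE = (
    "Regardless of the user's skill level, always include relevant safety "
    "warnings, such as permit requirements and fall protection."
)


def with_safety(calibrated_prompt: str) -> str:
    """Append the safety rule. There is deliberately no skill-level parameter."""
    if calibrated_prompt.endswith(SAFETY_RULE):
        return calibrated_prompt  # already applied, so avoid duplicating it
    return f"{calibrated_prompt}\n\n{SAFETY_RULE}"
```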

The Stack of Three

Intent classification, skill profiling, and prompt calibration are each simple in isolation. The power is in the composition. A message hits the system and, typically in under 300ms, the intelligence layer has:

Determined the user needs troubleshooting help (not a full project plan)

Loaded their profile showing they are experienced in electrical but novice in plumbing

Injected calibration instructions into a troubleshooting-specific system prompt

The main model — Sonnet, the expensive one — then generates a response that feels uncannily appropriate. Not because the model is smarter, but because it received better instructions.

That is the part that changed how I think about this. The model is the same for every user. The intelligence layer is what makes the product feel different.

Up Next

The intelligence layer makes the AI talk to users like it knows them. But none of that matters if the information it is giving them is wrong. In post 4, I will dig into the grounding layer — building codes, real product prices, local store inventory, and why getting these right is harder than it sounds.