Build Log — February 3, 2026

Context / Focus for Today

Heavy bug fix day driven by user reports. The plan engine had multiple issues — wrong days, uncapped distances, flat taper — and the AI coach was describing workouts instead of creating them. Also: Strava API access approved.

Things I Got Done Today

Strava API Approved 🎉

Strava Developer Program submission approved
Combined with existing Garmin integration, covers the two biggest endurance platforms
Opens up deeper activity sync and analysis features

AI Coach: Force Tool Calling (PR #254)

Issue: AI responds saying it cleared workouts but never actually calls the delete_workouts tool
Root cause: tool_choice detection only matched creation keywords. Delete/clear/remove requests got tool_choice='auto', letting the model skip the tool call entirely
Fix: Added isActionRequest detection that forces tool_choice='required' for both create and action requests
Lesson: Prompting the model to use a tool is unreliable. On models that support tool calling, tool_choice='required' is the only setting that guarantees a call.
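
A minimal sketch of the detection described above, to show the shape of the fix rather than the shipped code. The keyword lists and the names containsKeyword, isCreateRequest and pickToolChoice are hypothetical; only isActionRequest and the two tool_choice values come from this log.

```typescript
// Hypothetical keyword lists; the real detection rules are not shown in this log.
const CREATE_KEYWORDS = ["create", "add", "schedule", "build", "generate"];
const ACTION_KEYWORDS = ["delete", "clear", "remove", "update", "move", "reschedule"];

type ToolChoice = "auto" | "required";

// Whole-word match so "address" does not trigger on "add".
function containsKeyword(message: string, keywords: string[]): boolean {
  const words = message.toLowerCase().split(/[^a-z]+/);
  return keywords.some((keyword) => words.includes(keyword));
}

function isCreateRequest(message: string): boolean {
  return containsKeyword(message, CREATE_KEYWORDS);
}

function isActionRequest(message: string): boolean {
  return containsKeyword(message, ACTION_KEYWORDS);
}

// Force a tool call for anything that should change the calendar;
// leave plain questions on "auto" so the model can answer in text.
function pickToolChoice(message: string): ToolChoice {
  return isCreateRequest(message) || isActionRequest(message) ? "required" : "auto";
}
```

Before the fix, only the create branch existed, so "Clear all my workouts" fell through to "auto".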

Plan Builder Workouts Not Clearing (PR #256)

Issue: "Clear all workouts" only deleted AI-generated workouts. Plan Builder workouts had external_source = null and were skipped.
Fix: Tagged Plan Builder workouts with external_source = 'plan' and updated the clearAll handler to use an isGeneratedWorkout() helper that catches ai, plan, and null sources
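
A sketch of what the helper might look like, assuming a workout row with an external_source column. The Workout shape, the 'garmin' value, and selectWorkoutsToClear are assumptions for illustration; only isGeneratedWorkout and the ai / plan / null sources come from this log.

```typescript
// Assumed row shape; only external_source matters for this check.
interface Workout {
  id: string;
  external_source: string | null;
}

// Generated workouts are the ones the app itself created: AI coach output,
// Plan Builder output, and legacy rows that were never tagged (null).
function isGeneratedWorkout(workout: Workout): boolean {
  return (
    workout.external_source === null ||
    workout.external_source === "ai" ||
    workout.external_source === "plan"
  );
}

// Hypothetical clearAll selection: synced activities from other sources are left alone.
function selectWorkoutsToClear(workouts: Workout[]): Workout[] {
  return workouts.filter(isGeneratedWorkout);
}
```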

Streaming Route: Plan Engine Integration (PR #257)

Issue: AI describes a full marathon plan in text but never creates the workouts on the calendar
Root cause: Streaming route (/api/coach/stream) had no handler for generate_training_plan tool calls — it only handled create_workouts, delete_workouts, update_workouts. The non-streaming route had a full handler but users hit the streaming route by default.
Fix: Added rawToolCalls to the response type, intercepted generate_training_plan in the streaming route, and ran buildPlan() server-side
Also fixed: Streaming route was missing full athlete profile — pace data, zones, race times, preferred days were only in the non-streaming fallback. Replicated full context to both routes.
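
One way to keep the two routes from drifting again is a single dispatcher both can call, which also reports tool names nobody handles. This is a sketch only: RawToolCall, ToolHandler, and dispatchToolCalls are hypothetical names, not the current route code.

```typescript
// Assumed shape of an entry in rawToolCalls: tool name plus JSON-encoded arguments.
interface RawToolCall {
  name: string;
  arguments: string;
}

type ToolHandler = (args: unknown) => string;

interface DispatchResult {
  results: string[];
  unhandled: string[]; // tool names with no registered handler
}

// Runs every tool call through the shared handler map. A missing handler is
// reported instead of silently dropped, which is how the streaming route lost
// generate_training_plan calls.
function dispatchToolCalls(
  calls: RawToolCall[],
  handlers: Map<string, ToolHandler>
): DispatchResult {
  const results: string[] = [];
  const unhandled: string[] = [];
  for (const call of calls) {
    const handler = handlers.get(call.name);
    if (handler === undefined) {
      unhandled.push(call.name);
      continue;
    }
    results.push(handler(JSON.parse(call.arguments)));
  }
  return { results, unhandled };
}
```

If both routes build their handler map from the same module, a tool added for one route is available to the other by construction.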

JSON Fallback: Plan Engine Support (PR #260)

Issue: Production model (xiaomi/mimo-v2-flash) doesn't support tool calling. All plan generation must work through JSON fallback — but JSON fallback only knew about create_workouts, not generate_training_plan.
Fix: Added generate_training_plan to JSON fallback prompt, added planEngineRequest parsing, and robust input normalization
Key insight: Having the model output 5 plan parameters is much more reliable than asking it to generate 50+ individual workout JSON objects
8 new tests
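
A sketch of the input normalization idea, assuming the model returns a small JSON object. The field names (raceType, weeks, baseWeeklyMiles) and the function names are hypothetical; the actual plan parameters are not listed in this log. The race types are the three the engine caps distances for.

```typescript
type RaceType = "marathon" | "half" | "tri";

// Hypothetical subset of the plan parameters.
interface PlanEngineRequest {
  raceType: RaceType;
  weeks: number;
  baseWeeklyMiles: number;
}

// Accepts 9 or "9"; rejects NaN, zero, negatives, and non-numeric input.
function toPositiveNumber(value: unknown): number | null {
  const n = typeof value === "string" ? Number(value.trim()) : value;
  return typeof n === "number" && Number.isFinite(n) && n > 0 ? n : null;
}

// Accepts loose spellings such as "Half Marathon" or "Triathlon".
function toRaceType(value: unknown): RaceType | null {
  if (typeof value !== "string") return null;
  const v = value.trim().toLowerCase();
  if (v.startsWith("half")) return "half";
  if (v.startsWith("tri")) return "tri";
  if (v.includes("marathon")) return "marathon";
  return null;
}

// Returns null instead of throwing so the caller can fall back to a text reply.
function parsePlanEngineRequest(modelOutput: string): PlanEngineRequest | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const fields = parsed as Record<string, unknown>;
  const raceType = toRaceType(fields["raceType"]);
  const weeks = toPositiveNumber(fields["weeks"]);
  const baseWeeklyMiles = toPositiveNumber(fields["baseWeeklyMiles"]);
  if (raceType === null || weeks === null || baseWeeklyMiles === null) return null;
  return { raceType, weeks: Math.round(weeks), baseWeeklyMiles };
}
```

A handful of scalar fields leaves the model little room to go wrong, and everything it does get wrong is caught in one place before buildPlan() runs.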

Plan Engine: Day Alignment (PR #264)

Issue: Long runs on Tuesdays, threshold on Sundays — completely wrong days
Root cause: generateWeekWorkouts() assigned workouts sequentially from startDate without checking actual day-of-week
Fix: Aligned the plan start to the nearest Monday and used getUTCDay() to map each workout to its actual calendar day. Added a WeekSchedule interface for future custom day preferences.
10 new tests
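
A sketch of the alignment logic, assuming all plan dates are handled in UTC. alignToNearestMonday and dateForDay are hypothetical names; getUTCDay() returns 0 for Sunday through 6 for Saturday.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Snaps a date to the closest Monday at 00:00 UTC.
// Mon-Thu move back to that week's Monday, Fri-Sun move forward to the next one.
function alignToNearestMonday(date: Date): Date {
  const daysSinceMonday = (date.getUTCDay() + 6) % 7; // Mon = 0 ... Sun = 6
  const offsetDays = daysSinceMonday <= 3 ? -daysSinceMonday : 7 - daysSinceMonday;
  const midnightUtc = Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate());
  return new Date(midnightUtc + offsetDays * MS_PER_DAY);
}

// Calendar date for a given weekday (getUTCDay numbering) in week `weekIndex`
// of a plan that starts on a Monday. Week 0 is the first week.
function dateForDay(planStartMonday: Date, weekIndex: number, utcDay: number): Date {
  const daysFromMonday = (utcDay + 6) % 7;
  return new Date(planStartMonday.getTime() + (weekIndex * 7 + daysFromMonday) * MS_PER_DAY);
}
```

With the start pinned to a Monday, "long run on Sunday" becomes a lookup by weekday instead of a position in a sequence, so it cannot drift when the plan starts midweek.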

Plan Engine: Distance Caps & Taper (PR #266)

Issue: 33-mile long run 3 weeks before race. 18-mile long run the day before race. Distances were uncapped, and the taper was a flat 60% for both taper weeks.
Root causes:
- Math.max(longRunDistance, weeklyMileage - totalMileage) dumps ALL leftover mileage into the long run with no ceiling
- Unbounded progressive overload (~5%/week, no cap) pushes build weeks to 90+ miles on a 70mpw base
- Flat taper, no race-week special handling
Fixes:
- Long run caps: marathon ≤22mi, half ≤15mi, tri ≤14mi
- Weekly mileage cap: +20% max above base
- Race week: no long run, shakeout runs only
- Progressive taper: 70% → 45% (was flat 60%)
- Off-by-one fix in getPhaseForWeek for taper detection
6 new tests
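
The caps and taper above, sketched as small pure functions. The cap values and percentages are the ones listed in the fixes. The function names are hypothetical, and I am assuming the taper percentages apply to peak weekly mileage, with 70% in the first taper week and 45% in race week.

```typescript
type RaceKind = "marathon" | "half" | "tri";

// Long run ceilings in miles.
const LONG_RUN_CAP_MILES: Record<RaceKind, number> = { marathon: 22, half: 15, tri: 14 };

// Weekly mileage may exceed the athlete's base by at most 20%.
const MAX_INCREASE_OVER_BASE = 0.2;

function capWeeklyMileage(targetMiles: number, baseMiles: number): number {
  return Math.min(targetMiles, baseMiles * (1 + MAX_INCREASE_OVER_BASE));
}

// Leftover weekly mileage can still lift the long run, but only up to the cap.
function cappedLongRun(
  race: RaceKind,
  plannedLongRun: number,
  weeklyMiles: number,
  otherMiles: number,
  isRaceWeek: boolean
): number {
  if (isRaceWeek) return 0; // race week: shakeout runs only, no long run
  const uncapped = Math.max(plannedLongRun, weeklyMiles - otherMiles);
  return Math.min(uncapped, LONG_RUN_CAP_MILES[race]);
}

// Assumption: percentages are relative to peak weekly mileage.
function taperWeekMiles(peakMiles: number, isRaceWeek: boolean): number {
  return peakMiles * (isRaceWeek ? 0.45 : 0.7);
}
```

The 33-mile case falls out directly: a 90-mile week with 57 miles already assigned leaves 33 for the long run, which the marathon cap now clamps to 22. On a 70mpw base, the weekly cap holds build weeks to 84 miles.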

AI Bypassing Plan Engine (PR #268)

Issue: Even after the engine fixes, the AI was still generating 30mi/26mi long runs
Root cause: AI uses create_workouts directly (LLM generates distances with zero validation) instead of generate_training_plan. System prompt had contradictory CRITICAL instructions.
Fix: Removed contradictory instructions, added distance validation safety net on create_workouts path, strengthened JSON fallback steering
Lesson: Fixing the engine doesn't matter if the AI never calls it. The system prompt is the real routing layer.

Stream Timeout Fix (PR #262)

Issue: "Response was cut off (server timeout)" when requesting 9-week marathon plan
Fix: Increased maxDuration from 60s → 180s, added dynamic system prompt hint steering toward generate_training_plan for multi-week requests
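
A sketch of how the dynamic hint could be derived from the user message. planHintFor, the regex, and the hint wording are all hypothetical; the real prompt text is not in this log.

```typescript
// Returns an extra system prompt line when the message asks for a multi-week
// plan, or null when no hint is needed.
function planHintFor(message: string): string | null {
  // Matches "9 weeks", "9-week", "12 week".
  const match = /(\d+)\s*-?\s*weeks?\b/i.exec(message);
  if (match === null) return null;
  const weeks = Number(match[1]);
  if (weeks < 2) return null; // a single week is fine to create workout by workout
  return `The user is asking for a ${weeks}-week plan. Call generate_training_plan instead of listing individual workouts.`;
}
```

Steering long requests to the engine keeps the model's output to a few parameters, which is what keeps the response inside the time limit in the first place.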

Twitter/X Setup

Configured OAuth 1.0a for @reach_flowstate
Posting script, auth verified, workflow established
Daily build logs + feature updates as they ship

Commits Today

14+ commits across 8 PRs
Major: Plan engine overhaul (day alignment, distance caps, progressive taper)
Major: AI coach forced tool calling + streaming route plan engine integration
Major: JSON fallback plan engine support for non-tool-calling models
Fix: Stream timeout, contradictory system prompt instructions
Infra: Twitter/X integration

In Progress

Strava deep integration (API approved, implementation next)
Plan engine: custom workout day preferences (kanban backlog)
Mobile UX iteration

Targets for Tomorrow

Pace math fix — AI miscalculating marathon goal pace (LLMs can't divide)
Chat input UX — Desktop sidebar clips long text
Monitor plan engine — Verify distance caps and taper working in production
Model evaluation — Test better models for tool calling reliability
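
For the pace math item, the likely direction is to do the division in code and hand the model the result. A sketch under that assumption; goalTimeToSeconds and pacePerMile are hypothetical names.

```typescript
const KM_PER_MILE = 1.609344;
const MARATHON_MILES = 42.195 / KM_PER_MILE;

// Parses "h:mm:ss" into total seconds.
function goalTimeToSeconds(hms: string): number {
  const parts = hms.split(":").map(Number);
  if (parts.length !== 3 || parts.some((p) => !Number.isFinite(p) || p < 0)) {
    throw new Error(`Expected h:mm:ss, got "${hms}"`);
  }
  // ((h * 60) + m) * 60 + s
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Goal pace as "m:ss" per mile. Rounds to whole seconds before splitting so
// the seconds field can never show 60.
function pacePerMile(goalTime: string, distanceMiles: number): string {
  const secondsPerMile = Math.round(goalTimeToSeconds(goalTime) / distanceMiles);
  const minutes = Math.floor(secondsPerMile / 60);
  const seconds = secondsPerMile % 60;
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}
```

For example, pacePerMile("3:30:00", MARATHON_MILES) gives "8:01".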

Notes / Observations

The streaming vs non-streaming route divergence caused 3 separate bugs today. Need to extract shared logic.
Production model matters enormously — features that work with GPT-4o break completely with models that don't support tool calling.
System prompt contradictions are real bugs. When two CRITICAL instructions conflict, there is no telling which one the model will follow.
The plan engine went from "mostly broken" to "solid" in one day: correct days, capped distances, progressive taper, race week handling.
Sophisticated taper protocols existed in taper-protocols.ts but were never wired into the engine — classic "code exists but isn't used" situation.

Momentum Score: 9 / 10

Massive day. The plan engine is fundamentally better — correct scheduling, sane distances, real taper progression. The AI coach actually creates workouts now instead of just describing them. Eight PRs shipped addressing real user-reported issues. Strava approval is a strategic win. The only thing keeping this from a 10 is the number of "the same bug in a different code path" issues, which points to architectural debt in the streaming/non-streaming split.