Back to Developer Logs

Build Log — February 3, 2026

Build Log — February 3, 2026

Context / Focus for Today

Heavy bug fix day driven by user reports. The plan engine had multiple issues — wrong days, uncapped distances, flat taper — and the AI coach was describing workouts instead of creating them. Also: Strava API access approved.


Things I Got Done Today

Strava API Approved 🎉

  • Strava Developer Program submission approved
  • Combined with existing Garmin integration, covers the two biggest endurance platforms
  • Opens up deeper activity sync and analysis features

AI Coach: Force Tool Calling (PR #254)

  • Issue: AI responds saying it cleared workouts but never actually calls the delete_workouts tool
  • Root cause: tool_choice detection only matched creation keywords. Delete/clear/remove requests got tool_choice='auto', letting the model skip the tool call entirely
  • Fix: Added isActionRequest detection that forces tool_choice='required' for both create and action requests
  • Lesson: Prompting the model to use a tool is unreliable. tool_choice='required' is the only guarantee.

Plan Builder Workouts Not Clearing (PR #256)

  • Issue: "Clear all workouts" only deleted AI-generated workouts. Plan Builder workouts had external_source = null and were skipped.
  • Fix: Tag Plan Builder workouts with external_source = 'plan', updated clearAll handler to use isGeneratedWorkout() helper that catches ai + plan + null sources

Streaming Route: Plan Engine Integration (PR #257)

  • Issue: AI describes a full marathon plan in text but never creates the workouts on the calendar
  • Root cause: Streaming route (/api/coach/stream) had no handler for generate_training_plan tool calls — it only handled create_workouts, delete_workouts, update_workouts. The non-streaming route had a full handler but users hit the streaming route by default.
  • Fix: Added rawToolCalls to response type, intercept generate_training_plan in streaming route, run buildPlan() server-side
  • Also fixed: Streaming route was missing full athlete profile — pace data, zones, race times, preferred days were only in the non-streaming fallback. Replicated full context to both routes.

JSON Fallback: Plan Engine Support (PR #260)

  • Issue: Production model (xiaomi/mimo-v2-flash) doesn't support tool calling. All plan generation must work through JSON fallback — but JSON fallback only knew about create_workouts, not generate_training_plan.
  • Fix: Added generate_training_plan to JSON fallback prompt, added planEngineRequest parsing, and robust input normalization
  • Key insight: Having the model output 5 plan parameters is much more reliable than asking it to generate 50+ individual workout JSON objects
  • 8 new tests

Plan Engine: Day Alignment (PR #264)

  • Issue: Long runs on Tuesdays, threshold on Sundays — completely wrong days
  • Root cause: generateWeekWorkouts() assigned workouts sequentially from startDate without checking actual day-of-week
  • Fix: Align plan start to nearest Monday, use getUTCDay() for actual calendar day mapping. Added WeekSchedule interface for future custom day preferences.
  • 10 new tests

Plan Engine: Distance Caps & Taper (PR #266)

  • Issue: 33-mile long run 3 weeks before race. 18-mile long run the day before race. Distances uncapped, taper was flat 60% for both weeks.
  • Root causes:
    • Math.max(longRunDistance, weeklyMileage - totalMileage) dumps ALL leftover mileage into the long run with no ceiling
    • Unbounded progressive overload (~5%/week, no cap) pushes build weeks to 90+ miles on a 70mpw base
    • Flat taper, no race-week special handling
  • Fixes:
    • Long run caps: marathon ≤22mi, half ≤15mi, tri ≤14mi
    • Weekly mileage cap: +20% max above base
    • Race week: no long run, shakeout runs only
    • Progressive taper: 70% → 45% (was flat 60%)
    • Off-by-one fix in getPhaseForWeek for taper detection
  • 6 new tests

AI Bypassing Plan Engine (PR #268)

  • Issue: Even after engine fixes, AI still generating 30mi/26mi long runs
  • Root cause: AI uses create_workouts directly (LLM generates distances with zero validation) instead of generate_training_plan. System prompt had contradictory CRITICAL instructions.
  • Fix: Removed contradictory instructions, added distance validation safety net on create_workouts path, strengthened JSON fallback steering
  • Lesson: Fixing the engine doesn't matter if the AI never calls it. The system prompt is the real routing layer.

Stream Timeout Fix (PR #262)

  • Issue: "Response was cut off (server timeout)" when requesting 9-week marathon plan
  • Fix: Increased maxDuration from 60s → 180s, added dynamic system prompt hint steering toward generate_training_plan for multi-week requests

Twitter/X Setup

  • Configured OAuth 1.0a for @reach_flowstate
  • Posting script, auth verified, workflow established
  • Daily build logs + feature updates as they ship

Commits Today

  • 14+ commits across 8 PRs
  • Major: Plan engine overhaul (day alignment, distance caps, progressive taper)
  • Major: AI coach forced tool calling + streaming route plan engine integration
  • Major: JSON fallback plan engine support for non-tool-calling models
  • Fix: Stream timeout, contradictory system prompt instructions
  • Infra: Twitter/X integration

In Progress

  • Strava deep integration (API approved, implementation next)
  • Plan engine: custom workout day preferences (kanban backlog)
  • Mobile UX iteration

Targets for Tomorrow

  1. Pace math fix — AI miscalculating marathon goal pace (LLMs can't divide)
  2. Chat input UX — Desktop sidebar clips long text
  3. Monitor plan engine — Verify distance caps and taper working in production
  4. Model evaluation — Test better models for tool calling reliability

Notes / Observations

  • The streaming vs non-streaming route divergence caused 3 separate bugs today. Need to extract shared logic.
  • Production model matters enormously — features that work with GPT-4o break completely with models that don't support tool calling.
  • System prompt contradictions are real bugs. When two CRITICAL instructions conflict, the model picks one randomly.
  • The plan engine went from "mostly broken" to "solid" in one day: correct days, capped distances, progressive taper, race week handling.
  • Sophisticated taper protocols existed in taper-protocols.ts but were never wired into the engine — classic "code exists but isn't used" situation.

Momentum Score: 9 / 10

Massive day. The plan engine is fundamentally better — correct scheduling, sane distances, real taper progression. The AI coach actually creates workouts now instead of just describing them. Eight PRs shipped addressing real user-reported issues. Strava approval is a strategic win. The only thing keeping this from a 10 is the number of "the same bug in a different code path" issues, which points to architectural debt in the streaming/non-streaming split.