I've been running Claude Code sessions that span 30+ tool calls - editing configs, running tests, analyzing logs. Around the 15-tool mark, I noticed something: Claude would "forget" patterns I'd mentioned earlier. Not hallucinate, just drift away from established context.
The problem wasn't token limits. It was attention drift. Claude's working memory is excellent, but in long sessions with constant tool results, relevant context from 10 minutes ago gets buried.
So I built a hook that fixes this automatically.
The Problem: Context Drift in Long Sessions
Here's what I observed across dozens of multi-hour sessions:
- Tool call 5: Claude remembers the delegation rules perfectly
- Tool call 15: Claude asks about delegation again
- Tool call 25: Claude suggests an approach I'd explicitly rejected earlier
The pattern was consistent. After 8-10 tool calls, context from the session start would fade. Not disappear - just become less salient than the immediate tool results.
Before:
- Context drift after 15+ tool calls
- Manual reminders every few interactions
- Repeated explanations of established patterns
- Lost thread of conversation in long sessions

After:
- Automatic context refresh after 8 tool calls
- 91-route knowledge base auto-matched
- Relevant patterns injected mid-session
- 37ms overhead per trigger
The Solution: Thinking Recall Hook
I built a PreToolUse hook that fires on every 8th tool call. Here's what it does:
- Reads the transcript back 4-8 exchanges
- Extracts keywords from Claude's thinking blocks
- Matches against a 91-route context router
- Injects up to 500 tokens of relevant context
- Returns in under 40ms
The key insight: Claude's <thinking> blocks contain the clearest signal of what's cognitively active. If Claude is thinking about "delegation" and "model selection", those keywords reveal what context would be valuable right now.
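The transcript-reading step can be sketched in a few lines. This is a minimal version under two assumptions: the hook receives a `transcript_path` on stdin (Claude Code hooks do pass one), and transcript entries are JSONL whose content blocks carry a `thinking` field. The exact field names are my guess at the layout, not the article's code, so verify them against your own transcript files.

```python
import json
from pathlib import Path

def recent_thinking_blocks(transcript_path, max_blocks=8):
    """Walk the transcript JSONL backwards and collect the text of recent
    thinking blocks. The message/content/thinking field names are one
    plausible transcript layout -- check against real transcripts."""
    blocks = []
    for raw in reversed(Path(transcript_path).read_text().splitlines()):
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip any malformed line rather than crash
        content = entry.get("message", {}).get("content", [])
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "thinking":
                blocks.append(block.get("thinking", ""))
        if len(blocks) >= max_blocks:
            break
    return blocks
```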
Implementation Details (Concept Level)
The hook is 370 lines of Python, stdlib only. No dependencies, no network calls, no external services. It has to be fast - anything over 200ms would feel like lag.
What Gets Analyzed
I only read assistant and thinking blocks. Tool results are ignored for security - I don't want the hook to amplify potential prompt injections from web scraping or file reads.
The keyword extraction is simple: frequency analysis with stopword filtering, weighted by recency. Recent thinking blocks count more than older ones.
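A minimal version of that extractor looks like this. The stopword list here is abbreviated for space, and the linear recency weighting is one simple choice, not necessarily the article's exact formula:

```python
import re
from collections import Counter

# Abbreviated stopword list -- a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is",
             "it", "that", "this", "for", "on", "with", "be"}

def extract_keywords(thinking_blocks, top_n=10):
    """Frequency analysis with stopword filtering, weighted by recency.
    thinking_blocks is ordered oldest -> newest; later blocks get a
    linearly larger weight, so the newest block counts fully."""
    if not thinking_blocks:
        return []
    scores = Counter()
    n = len(thinking_blocks)
    for i, block in enumerate(thinking_blocks):
        weight = (i + 1) / n  # most recent block weighs 1.0
        for word in re.findall(r"[a-z][a-z-]{2,}", block.lower()):
            if word not in STOPWORDS:
                scores[word] += weight
    return [word for word, _ in scores.most_common(top_n)]
```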
What Gets Matched
Three knowledge sources:
- context-router.json: 91 routes mapping keywords to rule files
- experience-router.json: Past session learnings with similarity matching
- _memory/projects/*.json: Active project state and goals
The router uses keyword set intersection - if 2+ keywords from a route match what Claude is thinking about, that route's context files get injected.
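Assuming each route lists `keywords` and `files` (my guess at the router schema, not the article's exact format), the intersection test is a few lines:

```python
import json

def match_routes(keywords, router_path="context-router.json", min_overlap=2):
    """Return the context files of every route whose keyword set shares
    at least `min_overlap` keywords with what Claude is thinking about.
    Assumed schema: {"routes": [{"keywords": [...], "files": [...]}]}."""
    with open(router_path) as f:
        router = json.load(f)
    active = set(keywords)
    matched_files = []
    for route in router.get("routes", []):
        if len(active & set(route.get("keywords", []))) >= min_overlap:
            matched_files.extend(route.get("files", []))
    return matched_files
```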
What Gets Injected
Up to 500 tokens, deduplicated across three dimensions:
- Content hashing (exact duplicates)
- Jaccard similarity (paraphrases)
- Topic-shift detection (is this genuinely new information?)
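The first two dimensions can be sketched directly; topic-shift detection is more involved and is left out here. The 0.8 Jaccard threshold is my placeholder, not the article's tuned value:

```python
import hashlib

def jaccard(a, b):
    """Jaccard similarity between two word sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dedupe(snippets, jaccard_threshold=0.8):
    """Drop exact duplicates via content hashing, then near-duplicates
    (paraphrases, case changes) via Jaccard similarity over word sets."""
    seen_hashes = set()
    kept = []
    for snippet in snippets:
        digest = hashlib.sha256(snippet.encode()).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate
        words = set(snippet.lower().split())
        if any(jaccard(words, set(k.lower().split())) >= jaccard_threshold
               for k in kept):
            continue  # paraphrase of something already kept
        seen_hashes.add(digest)
        kept.append(snippet)
    return kept
```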
The hook passes context via the additionalContext field in the PreToolUse response. Claude sees it as "additional context for this tool call" - natural, non-intrusive.
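Emitting that response is just a JSON document on stdout. The ~4 characters/token budget below is a rough heuristic of mine, not the hook's actual accounting:

```python
import json

def emit_context(snippets, max_tokens=500):
    """Write the PreToolUse hook response to stdout. The token budget
    is approximated as ~4 characters per token (a crude heuristic)."""
    text = "\n\n".join(snippets)[: max_tokens * 4]
    response = {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "additionalContext": text,
        }
    }
    print(json.dumps(response))
```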
Performance Numbers
- Median execution time: 37ms
- 95th percentile: 45ms
- Target was 200ms
- Fail-open: any error exits silently with code 0
The speed comes from aggressive caching. The context router is loaded once at startup and held in memory. File reads are only triggered on cache misses.
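Within a single invocation, `functools.lru_cache` gives you that load-once behaviour for free (note that each hook run is its own process, so this caches per invocation, not across them):

```python
import json
from functools import lru_cache

@lru_cache(maxsize=None)
def load_router(path="context-router.json"):
    """Parse the router once per process; repeat calls for the same
    path are served from memory, never re-reading the file."""
    with open(path) as f:
        return json.load(f)
```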
Real-World Example
Yesterday I was debugging a delegation issue. The session went like this:
- Tool calls 1-7: Setting up test cases, reading config files
- Tool call 8: Hook fires, injects delegation scoring rules
- Tool call 9: Claude references the scoring rules correctly
- Tool calls 10-16: Implementation and testing
- Tool call 17: Hook fires again, injects model selection criteria
- Tool call 18: Claude applies criteria without me prompting
I didn't mention delegation scoring after tool call 3. But at tool call 9, Claude had it available again - not because of its memory, but because the hook detected "delegation" in the thinking blocks and injected the right context.
Lessons Learned
Fail-open is critical. My first version would block the tool call if the hook crashed. Bad idea. Now any error - file not found, JSON parse failure, timeout - results in exit 0. Claude never sees the failure.
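The fail-open wrapper is a small pattern worth copying. The hook body here is a placeholder for the real pipeline:

```python
import sys

def run_failopen(hook_fn):
    """Run the hook body; on ANY failure, exit 0 with no output so the
    tool call proceeds exactly as if the hook weren't installed."""
    try:
        hook_fn()
    except Exception:
        sys.exit(0)  # fail open: never block the tool call

if __name__ == "__main__":
    # Replace the lambda with the real pipeline: read stdin, extract
    # keywords, match routes, emit context.
    run_failopen(lambda: None)
```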
Keyword extraction beats embeddings here. I tried semantic similarity first. Too slow (150ms) and too many false positives. Simple keyword matching with frequency weighting works better for this use case.
Deduplication saved me. Without it, the hook would inject the same 3 rules every time. Jaccard similarity (comparing word sets) catches paraphrases that content hashing misses.
The 8-call threshold was trial and error. Too frequent (every 5 calls) felt noisy. Too rare (every 15 calls) came too late. Eight is the sweet spot for my workflows.
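One wrinkle the threshold implies: each hook invocation is a fresh process, so the call count has to live on disk. A per-session counter file is one way to do it; the directory layout below is illustrative, not the article's actual scheme:

```python
import tempfile
from pathlib import Path

def should_fire(session_id, every=8, state_dir=None):
    """Increment a per-session counter file and return True on every
    `every`-th tool call. State survives across hook processes because
    it lives on disk, not in memory."""
    state_dir = Path(state_dir or tempfile.gettempdir()) / "thinking-recall"
    state_dir.mkdir(parents=True, exist_ok=True)
    counter_file = state_dir / f"{session_id}.count"
    count = int(counter_file.read_text()) if counter_file.exists() else 0
    count += 1
    counter_file.write_text(str(count))
    return count % every == 0
```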
Why This Matters for Claude Code Workflows
Most people treat Claude Code like a traditional IDE extension - a tool that responds to requests. But with hooks, you can build self-correcting systems that adapt mid-session.
This particular hook solves context drift. But the pattern generalizes:
- PreToolUse hooks can validate inputs before risky operations
- PostToolUse hooks can verify outputs and trigger rollbacks
- Hooks can coordinate between multiple agents in parallel workflows
The broader insight: Claude Code workflows don't have to be linear. You can build feedback loops, validation layers, and self-correction mechanisms that run automatically.
If you want to dive deeper into hook architecture, keyword extraction strategies, and the complete scoring algorithms, that's covered in Claude Code Mastery. The course includes the full Python implementation, config examples, and advanced patterns for multi-agent coordination.
Get Started
Want to see how this fits into a complete workflow automation system? I've written about hooks automation patterns and context management strategies that complement this approach.