Anatomy of a Claude Code Conversation Transcript
How Claude Code records every interaction as a replayable, auditable event stream — and why the design choices are worth stealing.
What is this?
Every time you use Claude Code -- whether from the CLI, the desktop app, or the web -- the session is recorded as a .jsonl file. One line per event. Not a chat log. An event stream that captures the full state machine of a human-AI collaboration: thinking, tool calls, tool results, context injection, error recovery, file snapshots, and system metadata.
I reverse-engineered one of my own transcripts to understand the structure. The session was 271 events across 37 minutes. The task: find all open Jira tickets older than 1 year in our project board, export them to Google Sheets, analyze the data, and post a cleanup proposal to Slack. What started as a simple query turned into 23 paginated API calls, a CSV export, a Google Sheet with hyperlinks, a Slack message draft-review-send cycle, and a follow-up re-filter -- all captured in a single replayable file.
Here is what I found.
The Event Model
Every line in the JSONL file is a self-contained JSON object. There are six distinct event types:
EVENT TYPES IN A TYPICAL SESSION (271 lines)
============================================
assistant ........... 124 (Claude's responses)
user ................ 99 (your messages + tool results)
attachment .......... 24 (injected context)
system .............. 15 (metadata & hooks)
file-history-snapshot 8 (undo checkpoints)
permission-mode ..... 1 (session permissions)
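Counts like these are easy to reproduce yourself. A minimal sketch in Python, assuming only that each JSONL line carries a top-level `type` field (a few sample lines stand in for a real transcript file):

```python
import json
from collections import Counter
from io import StringIO

# Sample lines standing in for a real ~/.claude transcript file.
sample = StringIO("\n".join([
    '{"type": "user", "uuid": "a1"}',
    '{"type": "assistant", "uuid": "a2"}',
    '{"type": "assistant", "uuid": "a3"}',
    '{"type": "attachment", "uuid": "a4"}',
]))

def count_event_types(lines):
    """Tally the top-level `type` field of every JSONL event."""
    return Counter(json.loads(line)["type"] for line in lines if line.strip())

counts = count_event_types(sample)
print(counts.most_common())  # [('assistant', 2), ('user', 1), ('attachment', 1)]
```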
But the real insight is not the types -- it is how they are connected.
1. The UUID-Linked Tree
Every event has a uuid and a parentUuid. This is not a flat chat log. It is a directed acyclic graph (tree in practice) where each event points to its parent.
SESSION START
|
system [bridge_status]
uuid: 8f7fe65e
|
user "find all open tickets..."
uuid: 86e5ce6b
parent: 8f7fe65e
|
+---------------+----------------+
| | |
attachment attachment attachment
deferred_tools mcp_instructions skill_listing
uuid: 08e25424 uuid: 72410e39 uuid: dcaa8ac6
parent: 86e5ce6b parent: 08e25424 parent: 72410e39
|
assistant
[thinking]
uuid: 17e8ed97
parent: dcaa8ac6
|
assistant
[tool: ToolSearch]
uuid: b258dc0c
parent: 17e8ed97
|
user
[tool_result]
uuid: 6ddac0a6
parent: b258dc0c
|
assistant
[tool: JQL search]
uuid: 29a8f35b
parent: 6ddac0a6
|
user
[tool_result: ERROR]
uuid: a0ed800a
parent: 29a8f35b
|
assistant
[thinking]
"TO is reserved..."
|
assistant
[tool: JQL search]
(fixed query)
|
...
Why this matters: You can reconstruct branching conversations, parallel tool chains, and sidechains. The isSidechain flag marks exploratory branches that do not affect the main conversation. Any consumer of this format can walk the tree to understand causality -- which tool call triggered which result, which error led to which correction.
A flat `messages[]` array loses all of this structure.
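Walking that tree takes only a few lines. A sketch, assuming nothing beyond the `uuid`/`parentUuid` fields described above (sample events stand in for a real file):

```python
def walk_to_root(events_by_uuid, uuid):
    """Follow parentUuid links from an event back to the session root."""
    chain = []
    while uuid is not None:
        event = events_by_uuid[uuid]
        chain.append(event)
        uuid = event.get("parentUuid")
    return chain

# Minimal events mirroring the tree diagram above.
events = [
    {"uuid": "8f7fe65e", "parentUuid": None, "type": "system"},
    {"uuid": "86e5ce6b", "parentUuid": "8f7fe65e", "type": "user"},
    {"uuid": "17e8ed97", "parentUuid": "86e5ce6b", "type": "assistant"},
]
by_uuid = {e["uuid"]: e for e in events}
chain = walk_to_root(by_uuid, "17e8ed97")
print([e["type"] for e in chain])  # ['assistant', 'user', 'system']
```

This is exactly the "walk the parent chain back to the root cause" operation: start at the failing event and follow `parentUuid` until you hit the session start.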
2. Lazy Context Injection via Attachment Chaining
When you send a message, Claude Code does not dump everything into a single system prompt. Instead, context is injected as a chain of attachment events between your message and Claude's first response:
CONTEXT INJECTION SEQUENCE (first user message)
================================================
user message
"find all open tickets..."
|
v
attachment: deferred_tools_delta
(270 tool names registered, no schemas loaded)
|
v
attachment: mcp_instructions_delta
(MCP server instructions: Atlassian, Slack, PostHog...)
|
v
attachment: skill_listing
(77 available skills listed)
|
v
assistant [thinking]
(Claude now has full context, begins reasoning)
Each attachment has its own UUID and parent, forming a chain. This means:
- Context accretes incrementally -- Claude "sees" tools, then MCP instructions, then skills, in order
- Each injection point is individually addressable -- you can diff what context was available at any point
- Late-arriving context (like `async_hook_response` or `task_reminder`) can be injected at any point in the chain without disrupting the conversation structure
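The chain is mechanically recoverable from the tree. A sketch of extracting it, using a hypothetical `attachmentType` field for the attachment's kind (the real field name in the transcript may differ):

```python
def attachment_chain(events, user_uuid):
    """Follow the chain of attachment events injected after a user
    message, stopping at the first non-attachment child."""
    children = {}
    for e in events:
        children.setdefault(e.get("parentUuid"), []).append(e)
    chain, cursor = [], user_uuid
    while True:
        next_attachments = [e for e in children.get(cursor, [])
                            if e["type"] == "attachment"]
        if not next_attachments:
            return chain
        chain.append(next_attachments[0])
        cursor = next_attachments[0]["uuid"]

# Events mirroring the injection sequence above; `attachmentType`
# is a hypothetical name for the attachment's kind field.
events = [
    {"uuid": "86e5ce6b", "parentUuid": None, "type": "user"},
    {"uuid": "08e25424", "parentUuid": "86e5ce6b", "type": "attachment",
     "attachmentType": "deferred_tools_delta"},
    {"uuid": "72410e39", "parentUuid": "08e25424", "type": "attachment",
     "attachmentType": "mcp_instructions_delta"},
    {"uuid": "17e8ed97", "parentUuid": "72410e39", "type": "assistant"},
]
chain = attachment_chain(events, "86e5ce6b")
print([a["attachmentType"] for a in chain])
# ['deferred_tools_delta', 'mcp_instructions_delta']
```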
3. The Context Layering Hierarchy
The attachment chain above shows how context is delivered. But what is in each layer, who owns it, and how do they compose? Claude Code assembles the runtime context from five distinct sources, each with its own scope, lifetime, and owner -- and not all of them belong to "Claude the model." Most belong to Claude Code the harness, the CLI that runs between you and the API.
CONTEXT LAYERS (most permanent -> least permanent)
==================================================
Layer 1: SYSTEM PROMPT
+--------------------------------------------------+
| Content: identity, tool-use rules, safety |
| guardrails, env metadata (cwd, git, model) |
| Owner: Claude Code harness (ships in the CLI) |
| Model: Every LLM supports a system role; the |
| *content* is Claude-Code-specific. |
+--------------------------------------------------+
|
Layer 2: CUSTOM INSTRUCTIONS / OUTPUT-STYLE
+--------------------------------------------------+
| Content: user-authored tone, length, overrides |
| Owner: User (settings.json / output-styles) |
| Model: Concatenated into the system prompt -- |
| the model sees one merged block. |
+--------------------------------------------------+
|
Layer 3: PROJECT INSTRUCTIONS (CLAUDE.md)
+--------------------------------------------------+
| Content: per-repo rules, team conventions |
| Owner: Harness reads the file; git stores it |
| Model: NOT in the system prompt. Arrives as a |
| <system-reminder> attached to your |
| first user message. |
+--------------------------------------------------+
|
Layer 4: MEMORY (MEMORY.md + per-topic files)
+--------------------------------------------------+
| Content: cross-session facts (user, feedback, |
| project, reference) |
| Owner: Harness-managed (~/.claude/projects/..) |
| Model: MEMORY.md loads as an index every turn; |
| individual memory files load on demand. |
| Delivered via the same claudeMd bundle. |
+--------------------------------------------------+
|
Layer 5: SYSTEM-REMINDERS (per-turn attachments)
+--------------------------------------------------+
| Content: deferred tools list, MCP server rules, |
| available skills, claudeMd bundle |
| Owner: Harness injects; model respects the tag |
| Model: <system-reminder> is an Anthropic- |
| trained convention -- Claude models are |
| tuned to treat it as authoritative. |
+--------------------------------------------------+
|
Layer 6: USER MESSAGE
Precedence when layers conflict:
- Safety rules in Layer 1 override everything.
- CLAUDE.md's explicit "ALWAYS APPLY" / "MUST DO" rules override defaults (the file header literally says so).
- The current user message overrides CLAUDE.md for that turn only.
- Memory is informational, not prescriptive -- the model should verify stale memories against current state before acting.
Model behavior vs. harness behavior
The distinction matters because some patterns transfer to other agents and some do not:
| Capability | Claude the model | Claude Code the harness |
|---|---|---|
| System-prompt role | Shared by every LLM | Content is Claude-Code-specific |
| `<system-reminder>` tag | Anthropic-trained convention | Harness uses it to inject CLAUDE.md + memory |
| Tool use / function calling | Shared by every major LLM | Harness decides which tools to expose |
| Deferred tool loading | — | Claude Code optimization (see next section) |
| MCP (Model Context Protocol) | Anthropic-authored protocol (Nov 2024) | Claude Code shipped support first |
| CLAUDE.md reading | — | Harness reads, concatenates, injects |
| Persistent memory | — | Harness-managed; the model is stateless between turns |
| File-history snapshots | — | Harness-managed undo layer |
| Prompt caching | API feature | Harness decides what to cache |
The blunt version: the model provides the system-role slot, tool-calling protocol, and the trained respect for `<system-reminder>` tags. Everything else in this hierarchy is the harness.
How other coding agents handle it
The concept of project instructions + memory is now universal. The mechanism differs:
- Cursor — `.cursorrules` (legacy) / `.cursor/rules/*.mdc` (new). Injected as system-prompt additions. No native memory; the community "memory bank" pattern fills the gap with markdown files the model re-reads each session.
- OpenAI Codex CLI / Copilot coding agent — `AGENTS.md` convention (now an emerging cross-tool standard at agents.md). Per-repo rules; no native memory.
- Aider — `CONVENTIONS.md` + `.aider.conf.yml`. Loaded into every prompt. Memory is DIY.
- Cline — `.clinerules` + "Memory Bank" convention (a folder of markdown files re-read every session).
- Windsurf (Codeium) — `.windsurfrules`, same shape as `.cursorrules`.
- Gemini CLI — `GEMINI.md` in project root. Supports MCP. Session-scoped memory.
- Devin — proprietary server-side long-term memory, closer to ChatGPT's memory feature than a file-based convention.
- ChatGPT / Claude.ai — platform-level memory managed by the provider; no file equivalent.
The common thread: every serious coding agent needs persistent per-repo rules plus some form of memory, but the delivery mechanism is always the harness, not the model. Claude Code's specific contribution is that each layer gets its own addressable event in the transcript — you can see exactly what context was injected at what point, rather than having it silently concatenated into one opaque system prompt.
4. Deferred Tool Loading (270 Tools, Zero Upfront Cost)
This is one of the smartest design choices. The session has access to 270 tools (Atlassian, Slack, Chrome, PostHog, Railway, Notion, Gmail, etc.), but their schemas are not loaded upfront. Only their names are registered:
DEFERRED TOOL LOADING
=====================
Start of session:
+------------------------------------------+
| 270 tool NAMES registered |
| (deferred_tools_delta attachment) |
| No schemas loaded = minimal context cost|
+------------------------------------------+
When Claude needs a tool:
+------------------------------------------+
| assistant calls ToolSearch |
| query: "select:mcp__atlassian__ |
| searchJiraIssuesUsingJql" |
+------------------------------------------+
|
v
+------------------------------------------+
| System returns full JSON schema |
| for ONLY that tool |
+------------------------------------------+
|
v
+------------------------------------------+
| assistant calls the actual tool |
| with correct parameters |
+------------------------------------------+
In my session, only 3 tools were loaded via ToolSearch out of 270 available:
- `mcp__atlassian__searchJiraIssuesUsingJql`
- `mcp__claude_ai_Slack__slack_read_channel`
- `mcp__claude_ai_Slack__slack_send_message`
The savings are massive. If each tool schema averages 500 tokens, loading all 270 upfront would burn ~135K tokens of context on every turn. Instead, the system pays ~3 tokens per tool name and loads full schemas only on demand. For a session with 124 assistant turns, that is the difference between viable and unusable.
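The arithmetic, with the assumed averages from the paragraph above made explicit (these are estimates, not measured values):

```python
# Assumed averages from the text: ~500 tokens per full tool schema,
# ~3 tokens per bare registered name.
TOOLS = 270
SCHEMA_TOKENS = 500      # assumed average schema size
NAME_TOKENS = 3          # assumed cost per registered name
LOADED_ON_DEMAND = 3     # tools actually fetched via ToolSearch

upfront = TOOLS * SCHEMA_TOKENS                              # load everything
deferred = TOOLS * NAME_TOKENS + LOADED_ON_DEMAND * SCHEMA_TOKENS  # names + 3 schemas
print(upfront, deferred, upfront - deferred)  # 135000 2310 132690
```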
5. Split Content Blocks
A single Claude API response can contain multiple content blocks (thinking, text, tool_use). In the transcript, each block becomes its own JSONL line, but they share the same message.id:
SINGLE API RESPONSE -> MULTIPLE JSONL LINES
============================================
API response msg_01Dvpk8tYJNc4hCEqhYZYt6W:
+-------------------------------------------+
| content: [ |
| { type: "thinking", ... }, |
| { type: "tool_use", name: "ToolSearch" |
| input: { query: "select:..." } } |
| ] |
+-------------------------------------------+
Becomes two JSONL lines:
Line 8: assistant | msg_01Dvp... | thinking
uuid: 17e8ed97
|
Line 9: assistant | msg_01Dvp... | tool:ToolSearch
uuid: b258dc0c
parent: 17e8ed97
This means:
- Thinking is logged separately from actions -- useful for auditing reasoning vs. execution
- Multiple tool calls in one response each get their own event with independent result tracking
- The thinking block includes a cryptographic signature for verification
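Reassembling a full API response from its split events is a simple group-by on the shared message id. A sketch, assuming the id lives at `event["message"]["id"]` (the exact nesting in real transcripts may differ):

```python
from collections import defaultdict

def group_by_message(events):
    """Regroup split content-block events by their shared message id."""
    groups = defaultdict(list)
    for e in events:
        msg_id = e.get("message", {}).get("id")
        if msg_id:
            groups[msg_id].append(e)
    return groups

# Sample events: two blocks from one response, one from another
# (ids are illustrative).
events = [
    {"uuid": "17e8ed97", "message": {"id": "msg_01Dvp"}},
    {"uuid": "b258dc0c", "message": {"id": "msg_01Dvp"}},
    {"uuid": "6ddac0a6", "message": {"id": "msg_02Xyz"}},
]
groups = group_by_message(events)
print({k: len(v) for k, v in groups.items()})  # {'msg_01Dvp': 2, 'msg_02Xyz': 1}
```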
6. Tool Call / Result Pairing
Every tool call is immediately followed by its result. The pairing is tracked via two fields:
TOOL CALL/RESULT PAIR
=====================
assistant (tool call) user (tool result)
+----------------------------+ +----------------------------+
| uuid: 29a8f35b | | uuid: a0ed800a |
| content: [{ | | parent: 29a8f35b |
| type: "tool_use", | ---> | content: [{ |
| id: "toolu_017rr53X..", | | tool_use_id: |
| name: "searchJira...", | | "toolu_017rr53X..", |
| input: { jql: "..." } | | is_error: true, |
| }] | | content: "Bad Request.." |
+----------------------------+ | }] |
| sourceToolAssistantUUID: |
| 29a8f35b |
+----------------------------+
Three linking mechanisms:
1. `parentUuid` on the result points to the tool call event
2. `tool_use_id` matches the specific tool invocation
3. `sourceToolAssistantUUID` explicitly back-references the assistant event
The `is_error` flag enables quick filtering: of the 99 user events in my session, only 2 were errors (a JQL syntax error and a `gws` CLI error). Both were self-corrected in the next assistant turn.
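These fields make call/result pairing mechanical. A sketch that matches on `tool_use_id`, with content blocks placed directly on the event for brevity (real transcripts may nest them under `message.content`):

```python
def pair_tool_results(events):
    """Match each tool_use block to its tool_result via tool_use_id.
    Returns (call_event, result_event, is_error) triples."""
    calls = {}
    pairs = []
    for e in events:
        for block in e.get("content", []):
            if block.get("type") == "tool_use":
                calls[block["id"]] = e
            elif block.get("type") == "tool_result":
                pairs.append((calls.get(block["tool_use_id"]), e,
                              block.get("is_error", False)))
    return pairs

# The errored JQL call from the diagram above.
events = [
    {"uuid": "29a8f35b", "content": [
        {"type": "tool_use", "id": "toolu_017", "name": "searchJira"}]},
    {"uuid": "a0ed800a", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_017", "is_error": True}]},
]
pairs = pair_tool_results(events)
print(pairs[0][2])  # True -> this call errored
```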
7. Cache Economics Per Turn
Every assistant event includes a usage object that reveals the prompt caching behavior:
CACHE ECONOMICS ACROSS THE SESSION
===================================
Turn 1 (cold start):
+-------------------------------------------+
| input_tokens: 3 |
| cache_creation_input: 32,393 <-- MISS |
| cache_read_input: 0 |
| output_tokens: 193 |
+-------------------------------------------+
Turn 2 (cache warm):
+-------------------------------------------+
| input_tokens: 1 |
| cache_creation_input: 521 |
| cache_read_input: 32,393 <-- HIT |
| output_tokens: 183 |
+-------------------------------------------+
Turn 50 (deep in pagination):
+-------------------------------------------+
| input_tokens: 1 |
| cache_creation_input: 830 |
| cache_read_input: 75,665 <-- HIT |
| output_tokens: 669 |
+-------------------------------------------+
Turn 124 (final Slack send):
+-------------------------------------------+
| input_tokens: 1 |
| cache_creation_input: 695 |
| cache_read_input: 98,158 <-- HIT |
| output_tokens: 42 |
+-------------------------------------------+
What this tells us:
- 32K tokens were cached on the first turn (system prompt, CLAUDE.md, memory). Every subsequent turn reads them for free.
- Each turn only creates ~500-800 new tokens of cache (the incremental conversation growth)
- By the end of the session, 98K tokens are cached and only 1 new input token is needed per turn
- The `ephemeral_1h_input_tokens` and `ephemeral_5m_input_tokens` fields show two cache tiers: one that lasts 5 minutes (for hot conversations) and one that lasts 1 hour (for the system prompt)
If you are building your own AI agent system, this is the metric to watch. Cache hit rate directly determines cost and latency.
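The hit rate falls out of each turn's `usage` object directly. A sketch, using the turn-124 numbers above:

```python
def cache_hit_rate(usage):
    """Fraction of this turn's input tokens served from cache."""
    read = usage.get("cache_read_input_tokens", 0)
    total = (read
             + usage.get("cache_creation_input_tokens", 0)
             + usage.get("input_tokens", 0))
    return read / total if total else 0.0

# Turn 124 from the session above: 98,158 cached of 98,854 total input.
turn_124 = {"input_tokens": 1,
            "cache_creation_input_tokens": 695,
            "cache_read_input_tokens": 98158}
print(round(cache_hit_rate(turn_124), 3))  # 0.993
```

Run this over every assistant event in the file and you get a cache hit-rate curve for the whole session; a sudden dip flags a cache bust.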
8. File History Snapshots (Time-Travel Undo)
Between conversation turns, file-history-snapshot events capture the state of modified files:
FILE HISTORY SNAPSHOTS
======================
Line 3: file-history-snapshot (before turn 1)
messageId: 86e5ce6b
trackedFileBackups: {}
timestamp: 2026-04-15T03:54:37
Line 180: file-history-snapshot (after Jira analysis turn)
messageId: ...
timestamp: 2026-04-15T04:03:09
Line 188: file-history-snapshot (after CSV export)
timestamp: 2026-04-15T04:17:06
Line 195: file-history-snapshot (after 2yr filter)
timestamp: 2026-04-15T04:21:38
Line 217: file-history-snapshot (after Google Sheets creation)
timestamp: 2026-04-15T04:27:42
...8 snapshots total across the session
Each snapshot records:
- Which files were tracked (`trackedFileBackups`)
- The exact `messageId` this snapshot corresponds to
- Whether this is a new snapshot or an update to an existing one (`isSnapshotUpdate`)
This is how Claude Code's "undo" works: every turn boundary gets a checkpoint, and you can roll back to any point in the conversation. The snapshot is cheap when no files changed (empty backups), and only stores diffs when files are modified.
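Rollback then reduces to picking the right checkpoint. A sketch of finding the most recent snapshot at or before a point in time, assuming ISO-8601 `timestamp` strings (which sort lexicographically):

```python
def checkpoint_before(events, timestamp):
    """Most recent file-history-snapshot at or before `timestamp` --
    the restore point an undo would roll back to."""
    snaps = [e for e in events
             if e["type"] == "file-history-snapshot"
             and e["timestamp"] <= timestamp]
    return max(snaps, key=lambda e: e["timestamp"], default=None)

# Snapshot timestamps from the session above, plus a later user turn.
events = [
    {"type": "file-history-snapshot", "timestamp": "2026-04-15T03:54:37"},
    {"type": "file-history-snapshot", "timestamp": "2026-04-15T04:03:09"},
    {"type": "user", "timestamp": "2026-04-15T04:10:00"},
]
snap = checkpoint_before(events, "2026-04-15T04:10:00")
print(snap["timestamp"])  # 2026-04-15T04:03:09
```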
9. System Events as Metadata Lanes
System events are interleaved in the transcript but do not pollute the conversation. They carry metadata about the session itself:
SYSTEM EVENT TYPES
==================
stop_hook_summary (x7)
+-------------------------------------------+
| hookInfos: [{ command: |
| "bash ~/.claude/hooks/stop-auto-name.sh"|
| }] |
| preventedContinuation: false |
+-------------------------------------------+
^ Runs after every turn. In this session,
a hook auto-names the session based on content.
turn_duration (x4)
+-------------------------------------------+
| durationMs: 512062 (8.5 minutes) |
| messageCount: 175 |
+-------------------------------------------+
^ The first turn (Jira pagination) took
8.5 minutes and generated 175 messages.
away_summary (x2)
+-------------------------------------------+
| "You asked me to find all open tickets |
| older than 1 year... Waiting for your |
| direction on what to do with the |
| results." |
+-------------------------------------------+
^ Generated when the user is idle. Provides
a quick recap when they return.
bridge_status (x1)
+-------------------------------------------+
| url: "https://claude.ai/code/session_..." |
+-------------------------------------------+
^ Remote control URL for web access.
The away_summary is particularly clever -- it fires when Claude detects the user has been idle, generating a natural-language recap of what happened. When you return to a session after a break, you get context without re-reading the entire conversation.
10. The Full Session Flow
Here is the complete flow of my 271-event session, showing how all these pieces compose into a real workflow:
FULL SESSION FLOW (271 events, 37 minutes)
==========================================
PHASE 1: JIRA QUERY (lines 1-176, ~8.5 min)
+-----------------------------------------------+
| |
| [1] permission-mode: bypassPermissions |
| [2] system: bridge_status (remote URL) |
| [3] file-history-snapshot (baseline) |
| [4] user: "find all open tickets > 1 year" |
| [5-7] attachments: tools, MCP, skills |
| [8-9] assistant: think -> ToolSearch |
| [10] tool_result: schema loaded |
| [11] assistant: JQL query |
| [12] tool_result: ERROR (TO is reserved) <--+-- Self-correction
| [13-14] assistant: think -> explain error |
| [15] assistant: fixed JQL query |
| [16] tool_result: 100 results (page 1) |
| |
| [17-172] PAGINATION LOOP: |
| Pattern repeats 22 more times: |
| - JQL query (startAt: N) |
| - Bash: extract/merge results to /tmp |
| - Check if more pages |
| - Progress update to user |
| |
| Total: 2,071 tickets fetched |
| 23 JQL queries, 62 Bash calls |
| |
| [173-176] assistant: final analysis |
| "1,414 truly open, 87% stale, 56% unasgn" |
| [177-179] system: hook, duration, away |
| |
+-----------------------------------------------+
PHASE 2: CSV EXPORT (lines 180-194, ~3 min)
+-----------------------------------------------+
| [181] user: "add 1 table breakdown" |
| [183] assistant: Bash (generate CSV) |
| [186] assistant: "Done. /tmp/*.csv" |
| |
| [189] user: "make version > 2 years" |
| [191] assistant: Bash (filter & export) |
| [193] assistant: "108 tickets > 2yr" |
+-----------------------------------------------+
PHASE 3: GOOGLE SHEETS (lines 195-214, ~5 min)
+-----------------------------------------------+
| [196] user: "make a gogole sheet" |
| [198] assistant: gws sheets create |
| [200] assistant: build payload w/ hyperlinks |
| [202-211] assistant: upload data, format |
| header, auto-resize columns |
| [214] assistant: "Done. [sheets URL]" |
+-----------------------------------------------+
PHASE 4: SLACK DRAFT & SEND (lines 217-256, ~6 min)
+-----------------------------------------------+
| [218] user: "draft a message to #channel" |
| [220-224] assistant: think, read channel, |
| compose draft |
| [228] assistant: shows draft for review |
| [231] user: "change it to 1+ year old" |
| [233] assistant: updated draft |
| [236] user: "yes" |
| [239-251] assistant: update Sheet data, |
| then prepare Slack message |
| [252-253] assistant: ToolSearch for Slack |
| [254] assistant: slack_send_message |
| [256] assistant: "Sent. [message link]" |
+-----------------------------------------------+
PHASE 5: RE-FILTER (lines 260-271, interrupted)
+-----------------------------------------------+
| [262] user: "exclude admin-related tickets" |
| [267-269] assistant: think, start re-query |
| [270] tool_result: ERROR |
| [271] user: [interrupted] |
+-----------------------------------------------+
11. What to Steal for Your Own System
If you are building an AI agent system, conversation logging system, or just want better observability into your AI interactions, here are the patterns worth adopting:
The non-obvious wins
- UUID-linked tree, not flat array. Most chat systems store `messages[]`. Claude Code stores a graph. This enables branching, sidechains, and causal tracing. When something goes wrong 50 turns in, you can walk the parent chain back to the root cause.
- Deferred tool loading. If your agent has access to many tools, do not load all schemas upfront. Register names, load schemas on demand. In a 270-tool system, this saves ~130K tokens per turn.
- Split content blocks into separate events. Thinking, text output, and tool calls from a single API response each get their own event. This makes auditing trivial: grep for `type: "thinking"` to see all reasoning, grep for `type: "tool_use"` to see all actions.
- Cache economics as first-class telemetry. Every turn records `cache_creation_input_tokens` vs `cache_read_input_tokens`. You can plot cache hit rate over time, detect cache busts, and optimize your context management.
- File snapshots at turn boundaries. Cheap when nothing changed, valuable when it did. Enables time-travel undo without versioning the entire workspace.
- Away summaries. Auto-generate a recap when the user returns after idle time. This solves the "where was I?" problem that kills long sessions.
- System events as a parallel metadata lane. Hook results, turn duration, and session status are interleaved in the timeline but separated by type. The conversation stays clean; the observability data is always there.
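The core pattern -- every event gets a `uuid`, a `parentUuid`, and its own JSONL line -- fits in a few dozen lines. A minimal sketch of such an event log (linear chain only, not Claude Code's implementation):

```python
import json
import uuid
from datetime import datetime, timezone

class EventLog:
    """Append-only JSONL event stream with UUID parent links."""

    def __init__(self):
        self.lines = []       # one JSON string per event
        self.last_uuid = None

    def append(self, type_, **payload):
        event = {
            "uuid": uuid.uuid4().hex[:8],
            "parentUuid": self.last_uuid,   # link to the previous event
            "type": type_,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            **payload,
        }
        self.lines.append(json.dumps(event))
        self.last_uuid = event["uuid"]
        return event

log = EventLog()
log.append("user", text="find all open tickets")
log.append("attachment", kind="deferred_tools_delta")
e = log.append("assistant", blocks=["thinking"])
print(e["parentUuid"] is not None)  # True
```

A real implementation would also support branching (an explicit parent argument instead of always chaining from the last event) and an `isSidechain` flag, but the invariant is the same: every line is self-contained, and causality lives in the links.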
The architecture in one diagram
+------------------------------------------------------------------+
| JSONL EVENT STREAM |
| |
| +-- permission-mode (session config) |
| | |
| +-- system [bridge_status] (session URL, version, git branch) |
| | |
| +-- file-history-snapshot (baseline) |
| | |
| +-- user (your message) |
| | | |
| | +-- attachment: deferred_tools_delta (270 tool names) |
| | | |
| | +-- attachment: mcp_instructions_delta (server rules) |
| | | |
| | +-- attachment: skill_listing (77 skills) |
| | | |
| | +-- assistant [thinking] (reasoning, signed) |
| | | |
| | +-- assistant [tool_use] (ToolSearch) |
| | | |
| | +-- user [tool_result] (schema returned) |
| | | |
| | +-- assistant [tool_use] (actual API call) |
| | | |
| | +-- user [tool_result] |
| | | |
| | +-- assistant [text] (answer) |
| | |
| +-- system [stop_hook_summary] (post-turn hooks) |
| | |
| +-- system [turn_duration] (timing metadata) |
| | |
| +-- file-history-snapshot (turn checkpoint) |
| | |
| +-- user (next message) |
| | |
| +-- ...cycle repeats... |
| |
+------------------------------------------------------------------+
The Numbers
From this single 37-minute session:
| Metric | Value |
|---|---|
| Total events | 271 |
| Assistant turns | 124 |
| Tool calls | 90 (62 Bash, 23 Jira, 3 ToolSearch, 1 Slack read, 1 Slack send) |
| Errors encountered | 2 (both self-corrected) |
| Jira tickets fetched | 2,071 |
| Cache hit rate (by turn 124) | 99.3% (98,158 cached / 98,854 total input) |
| Context growth | 32K (turn 1) -> 98K (turn 124) |
| Output artifacts | 1 CSV, 1 Google Sheet, 1 Slack message |
| File snapshots | 8 checkpoints |
Closing Thought
The transcript format is not just a log -- it is a replayable state machine. Every decision, every error, every correction is captured with enough structure to reconstruct the full session. The UUID tree gives you causality. The cache telemetry gives you economics. The file snapshots give you safety.
If you are building AI agent infrastructure, this is the level of observability you should aim for. Not `{"role": "assistant", "content": "..."}` flat arrays, but a structured event stream that tells you not just what happened, but why it happened, what it cost, and how to undo it.