
Building an Audit Trail for Claude Code Agents: What You Actually Need

By Sentrely Team · April 26, 2026 · 10 min read

Last updated June 18, 2026

Your security team is going to ask about your Claude Code agents. Not if — when. If your answer is “we have Anthropic API usage logs,” you’re going to have a bad meeting.

API usage logs tell you tokens were consumed. They don’t tell you what the agent did with those tokens. They don’t show the shell commands executed, the files modified, the Git commits pushed, or the AWS resources created. The actual actions — the things auditors care about — happen downstream of the API call.

What Needs to Be in the Log

A complete agent audit trail captures five categories:

Session lifecycle. When the session started, who launched it, what project, what permissions were granted, when it ended, and why (completed, timed out, killed, budget exhausted).

Tool invocations. Every tool call — shell commands, file reads and writes, MCP server calls, API requests. For each: tool name, input parameters, output or error, timestamp.

Infrastructure operations. AWS API calls, Git operations, database queries, HTTP requests to external services. These have side effects on external systems — they need special attention because they’re the ones that cause damage.

Policy decisions. Every time an action was checked against a policy — what was requested, what rule applied, whether it was allowed or denied. Denials are especially important: they show your controls are working.

Cost metrics. Token consumption per tool call, cumulative session cost, budget thresholds crossed. Cost events are a leading indicator — a session that suddenly spikes in cost is often stuck in a loop.
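The cost signal in that last category can be turned into a simple check. What follows is a minimal sketch, assuming you already record a token count per tool call; the function name `find_cost_spikes`, the window size, and the multiplier are hypothetical choices for illustration, not tuned recommendations.

```python
from statistics import mean


def find_cost_spikes(tokens_per_call, window=5, factor=3.0):
    """Return indexes of calls whose token count exceeds `factor` times
    the average of the preceding `window` calls.

    `window` and `factor` are illustrative defaults (assumptions).
    """
    spikes = []
    for i in range(window, len(tokens_per_call)):
        baseline = mean(tokens_per_call[i - window:i])
        if baseline > 0 and tokens_per_call[i] > factor * baseline:
            spikes.append(i)
    return spikes
```

A flagged index is only a prompt to look at the session: a legitimately large file read can trip the same check as a loop.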

What Format Satisfies SOC 2 Auditors

SOC 2 auditors care about three things: completeness, immutability, and retrievability.

Completeness: Every relevant event is captured. If an agent modified a production database, that event must be in the log. Gaps where the agent was active but no logs exist will be flagged.

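Immutability: Logs can’t be altered after the fact. This means write-once storage: S3 with Object Lock, CloudWatch Logs (events cannot be edited once ingested, though delete permissions and retention settings still need to be locked down), or a dedicated log management service that enforces immutability.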

Retrievability: You can find what you need. If the auditor asks for everything session X did on April 15th, you produce that in minutes, not days. This requires structured logs (JSON, not free-text), consistent session IDs across all entries, and a query mechanism.
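The completeness requirement can be spot-checked before an auditor does it for you. Here is a rough sketch, assuming you know each session's start and end time and have the timestamps of its logged events; `find_gaps` and the five-minute threshold are hypothetical, and a quiet period is not proof of missing logs (the agent may simply have been waiting on a long response).

```python
from datetime import datetime, timedelta


def parse_ts(value):
    """Parse an ISO 8601 UTC timestamp such as 2026-04-15T14:32:07Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_gaps(session_start, session_end, event_timestamps,
              max_silence=timedelta(minutes=5)):
    """Return (gap_start, gap_end) pairs where the session was active
    but nothing was logged for longer than `max_silence`.

    The threshold is an illustrative assumption, not a standard.
    """
    points = [parse_ts(session_start)]
    points += sorted(parse_ts(t) for t in event_timestamps)
    points.append(parse_ts(session_end))
    # Compare each point with the next one and keep the long silences.
    return [(a, b) for a, b in zip(points, points[1:]) if b - a > max_silence]
```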

A practical event format:

{
  "timestamp": "2026-04-15T14:32:07Z",
  "sessionId": "sess_a8f3b21c",
  "developerId": "dev_jordan",
  "projectId": "proj_backend",
  "eventType": "tool_invocation",
  "tool": "bash",
  "input": "aws s3 cp config.json s3://prod-config/app/config.json",
  "output": "upload: ./config.json to s3://prod-config/app/config.json",
  "policyResult": "allowed",
  "policyRule": "s3:PutObject on prod-config/app/*"
}

Every event ties back to a session, developer, and project. Every event has a policy result. This is what auditors want to see.
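One way to keep every event in that shape is to build it through a single type that refuses incomplete records. This is a minimal sketch, not a prescribed schema: the class name `AuditEvent`, the choice of which fields are mandatory, and the `allowed`/`denied` vocabulary are assumptions drawn from the example above.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Fields that must be non-empty on every event (an assumption for this sketch).
REQUIRED = ("sessionId", "developerId", "projectId", "eventType", "policyResult")


@dataclass(frozen=True)
class AuditEvent:
    sessionId: str
    developerId: str
    projectId: str
    eventType: str
    policyResult: str
    tool: str = ""
    input: str = ""
    output: str = ""
    policyRule: str = ""
    timestamp: str = ""  # filled in with the current UTC time when empty

    def to_json_line(self):
        """Validate the event and serialize it as one JSON line."""
        record = asdict(self)
        missing = [name for name in REQUIRED if not record[name]]
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        if record["policyResult"] not in ("allowed", "denied"):
            raise ValueError("policyResult must be 'allowed' or 'denied'")
        if not record["timestamp"]:
            now = datetime.now(timezone.utc)
            record["timestamp"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
        return json.dumps(record, sort_keys=True)
```

Rejecting an event with no session ID at write time is cheaper than discovering the hole during an audit.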

What a Bad Audit Trail Looks Like

The most common failure mode is logging at the wrong level of abstraction.

Bad:

2026-04-15 14:32:00 INFO anthropic.api POST /v1/messages 200 1247 tokens
2026-04-15 14:32:05 INFO anthropic.api POST /v1/messages 200 893 tokens

This tells you nothing about what the agent did. What happened between those API calls? What files did it read? What commands did it execute? The Anthropic API logs are the conversation. The tool execution logs are the actions. Auditors care about the actions.

Another failure: logging without session correlation. If you have logs from Claude Code, CloudTrail, and your application but they share no common identifier, correlating them during an incident is manual detective work. Every log entry needs a session ID.
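With a shared session ID, that detective work collapses into a filter and a sort. The sketch below assumes each source has already been normalized to JSON Lines carrying a `sessionId` field and a UTC ISO 8601 `timestamp` in the same format; the function name `timeline_for_session` and the source labels are hypothetical.

```python
import json


def timeline_for_session(session_id, *sources):
    """Merge JSON Lines from several log sources into one chronological
    list of events for a single session.

    Each source is a (name, lines) pair. Sorting on the timestamp string
    is only valid because all timestamps are assumed to be UTC ISO 8601
    in an identical format.
    """
    events = []
    for name, lines in sources:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            if event.get("sessionId") == session_id:
                # Tag each event with where it came from.
                events.append({**event, "source": name})
    return sorted(events, key=lambda e: e["timestamp"])
```

Called as `timeline_for_session("sess_a8f3b21c", ("gateway", gateway_lines), ("app", app_lines))`, it returns a single ordered list. Sources that do not carry the session ID simply drop out of the result.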

Retention and Export

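SOC 2 does not prescribe a fixed retention period, but auditors typically expect at least 1 year of logs so the full audit window is covered. HIPAA requires compliance documentation to be retained for 6 years, and that period is commonly applied to audit logs as well.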

Practical approach: store recent logs (90 days) in a queryable system like CloudWatch Logs or Elasticsearch. Archive older logs to S3 with Object Lock for the compliance retention period. Build an index to find archived logs by session ID, developer, or date range.
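The archive step might look like the following. Treat it as a sketch under assumptions: the bucket must have been created with Object Lock enabled, `s3_client` is expected to be a boto3 S3 client that you construct yourself, and the function names are hypothetical. The keyword arguments are the ones boto3's `put_object` accepts for Object Lock.

```python
from datetime import datetime, timedelta, timezone


def build_locked_put_request(bucket, key, body, retain_days):
    """Build keyword arguments for an S3 PutObject call that applies an
    Object Lock retention period to the uploaded object."""
    retain_until = datetime.now(timezone.utc) + timedelta(days=retain_days)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # COMPLIANCE mode: the object version cannot be deleted or
        # overwritten by any user until the retention date passes.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until,
        # Object Lock uploads need a checksum; request one explicitly.
        "ChecksumAlgorithm": "SHA256",
    }


def archive_log_batch(s3_client, bucket, key, body, retain_days):
    """Upload one batch of log lines with a retention lock.

    `s3_client` is assumed to be a boto3 S3 client, for example
    boto3.client("s3"); it is passed in so this module has no AWS dependency.
    """
    request = build_locked_put_request(bucket, key, body, retain_days)
    return s3_client.put_object(**request)
```

Because COMPLIANCE mode cannot be shortened or removed once set, it is worth testing against a scratch bucket with a short `retain_days` before pointing this at the real archive.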

Your audit system should support three export paths:

Real-time streaming to your SIEM for security monitoring and alerting
Periodic export to your compliance archive (daily or weekly)
On-demand export for incident response — when someone asks for everything related to a production incident, you produce a complete, chronological report filtered by session IDs

Where to Log

The critical decision is where in the stack to log. Logging inside the agent process is unreliable — if the agent crashes, the last few events might be lost. Logging at the gateway layer is more reliable because the logging infrastructure is independent of the agent lifecycle.

This is one reason gateway architectures exist. The gateway sees every operation the agent performs and logs it before the operation reaches the infrastructure. If the agent crashes, the logs are intact. If the agent tries something outside its policy, the denial is logged even though the operation didn’t execute.
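The ordering is the whole point: decide, log, then execute. Below is a deliberately small sketch of that sequence, not a description of how any particular gateway is built. `run_through_gateway`, the glob-style allow-list, and the in-memory `log_sink` are all hypothetical, and matching shell strings with glob patterns is far too weak for a real policy engine.

```python
import fnmatch
import json
from datetime import datetime, timezone


def run_through_gateway(session_id, command, allowed_patterns, execute, log_sink):
    """Check `command` against allow-list patterns, log the decision,
    and call `execute(command)` only if a pattern matched.

    The decision is appended to `log_sink` before execution, so a denial
    (or a crash during execution) still leaves a record behind.
    """
    rule = next(
        (p for p in allowed_patterns if fnmatch.fnmatchcase(command, p)),
        None,
    )
    log_sink.append(json.dumps({
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "sessionId": session_id,
        "eventType": "policy_decision",
        "input": command,
        "policyResult": "allowed" if rule else "denied",
        "policyRule": rule,
    }))
    if rule is None:
        return None  # denied: the operation never reaches infrastructure
    return execute(command)
```

In a real deployment `log_sink` would be an append-only store outside the agent's reach rather than a Python list.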

Your auditors will thank you. More importantly, the first time something goes wrong — and it will — you’ll thank yourself.

The bar is also rising specifically for MCP-based agents: as agents reach tools and data through MCP servers, auditors increasingly expect MCP tool calls to be logged in their own right under regimes such as HIPAA, GDPR, SOX, and PCI-DSS, and a generic API log won’t satisfy that. What holds up under scrutiny is an immutable trail with per-agent identity: not just “we log things,” but “here is the tamper-proof record of exactly which agent did what, six months ago.”

FAQ

What is an AI agent audit trail? An immutable, chronological record of everything an agent did — every tool call, command, file change, infrastructure operation, policy decision, and approval — each tied to a session, agent identity, and timestamp. It is not API usage logs, which only show tokens consumed; the actions downstream of the API call are what auditors care about.

Does Claude Code produce an audit log? Not a complete one. Anthropic API logs show token usage, not the shell commands, file writes, git pushes, or AWS calls the agent actually made. A real audit trail is captured at the gateway layer, independent of the agent process.

What do SOC 2 and HIPAA auditors require for AI agents? Completeness (every relevant action logged), immutability (write-once storage, e.g. S3 Object Lock), and retrievability (structured, session-correlated, queryable). Retention: typically at least 1 year for SOC 2 and 6 years for HIPAA. The same expectations now extend to MCP-based agent actions.

Why log at the gateway instead of inside the agent? If the agent crashes, in-process logs can be lost, and a compromised agent can tamper with its own logs. A gateway sits outside the agent process and logs every operation (including denied ones) before it reaches infrastructure, independent of the agent lifecycle. That independence is what makes the trail trustworthy.
