Overview
GitHub Agentic Workflows act like a diligent maintenance crew, continuously sweeping your repository free of small issues. While they dramatically improve repo hygiene and code quality, every automated agent interaction consumes tokens—and when these workflows run on a schedule or trigger automatically, costs can silently spiral. The good news: optimizing agentic workflows is far more predictable than optimizing interactive developer sessions. Because every step is defined in YAML and repeats identically across runs, you can systematically measure and improve token usage.

At GitHub, we use these workflows daily across hundreds of repositories. In April 2026, we began a concerted effort to reduce token consumption. This guide walks you through the exact instrumentation, automation, and optimization techniques we applied—so you can apply them to your own workflows.
Prerequisites
Before you start, ensure you have the following:
- A GitHub account with repositories running agentic workflows (Claude CLI, Copilot CLI, Codex CLI, or similar).
- Familiarity with GitHub Actions and workflow YAML syntax.
- Access to the API proxy your workflows use (this is the key enabler for uniform logging).
- Basic understanding of token types: input, output, cache-read, cache-write.
- Permission to create new workflows and manage repository settings.
Step-by-Step Instructions
1. Instrument Token Usage via API Proxy
The first obstacle is that each agent framework logs token consumption in a different format—and historical runs often have incomplete data. You can solve this by leveraging your API proxy, which sits between the agent and the LLM providers. The proxy can capture every API call in a normalized format.
- Modify your proxy code to log the following fields for each request: input tokens, output tokens, cache-read tokens, cache-write tokens, model name, provider name, and timestamp.
- Write logs to a JSON Lines (JSONL) file named token-usage.jsonl. Each line is one API call object.
- Upload this file as a workflow artifact at the end of every agentic workflow run. For example, add a step after your agent finishes:

```yaml
- name: Upload token usage
  uses: actions/upload-artifact@v4
  with:
    name: token-usage
    path: token-usage.jsonl
```
Now every run produces a consistent, machine-readable token audit trail.
2. Build a Daily Token Usage Auditor
With historical data flowing, create an auditor workflow that runs daily. Its job is to aggregate token consumption across recent runs and flag anomalies.
- Create a new workflow triggered on a cron schedule (e.g., every morning).
- Download artifacts from recent runs of all agentic workflows. You can use the actions/download-artifact@v4 action with appropriate filters.
- Parse and aggregate the JSONL files. Calculate total input tokens, output tokens, and cache metrics per workflow.
- Detect anomalies: compare current usage against a rolling average of the last 7 days. Flag any workflow that has more than a 20% increase. Also note runs with unusually high turn counts (e.g., a workflow that normally finishes in 4 LLM turns taking 18).
- Generate a structured report and post it as a comment on a designated issue or as a repository summary.
Here’s a minimal example of the core aggregation logic (pseudocode):
```javascript
// Aggregate total tokens per workflow run from the downloaded JSONL records.
const logs = downloadArtifacts('token-usage.jsonl');
const usage = {};
for (const record of logs) {
  const key = `${record.workflow}-${record.runNumber}`;
  usage[key] = usage[key] || 0n;
  // BigInt guards against overflow when summing very large token counts.
  usage[key] += BigInt(record.inputTokens) + BigInt(record.outputTokens);
}
```
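The anomaly check can be sketched in the same style. Assuming `dailyTotals` holds the workflow's last seven daily totals (an illustrative input shape, oldest first), a minimal version of the 20% rule looks like this:

```javascript
// Flag a workflow whose latest daily total exceeds its own rolling
// average by more than `threshold` (20% by default). Comparing against
// the workflow's own history avoids penalizing legitimately heavy jobs.
function isAnomalous(dailyTotals, today, threshold = 0.2) {
  if (dailyTotals.length === 0) return false; // no baseline yet
  const avg = dailyTotals.reduce((a, b) => a + b, 0) / dailyTotals.length;
  return today > avg * (1 + threshold);
}
```

The same pattern extends to turn counts: keep a rolling average of turns per run and flag runs that blow well past it.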
3. Build a Daily Token Optimizer
When the Auditor flags a workflow, the Optimizer takes over. This agentic workflow reads the flagged workflow’s YAML source and its recent logs, then suggests concrete improvements.

- Configure the Optimizer to trigger on the Auditor’s output (e.g., via a repository_dispatch event).
- Use an LLM agent (like Claude or Copilot) to analyze the workflow definition and execution logs. Prompt it to find inefficiencies such as:
- Unnecessary context loaded in every turn
- Missing caching directives
- Overly detailed error messages that bloat output tokens
- Redundant tool calls
- Open a GitHub issue with a clear description of each inefficiency and a proposed fix.
Tip: Include token cost estimates in the issue to prioritize fixes.
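One way to wire the handoff is to have the Auditor send a repository_dispatch event and have the Optimizer listen for it. The event type name below is illustrative, not a convention:

```yaml
# Optimizer trigger: fires when the Auditor dispatches this event type
# for a flagged workflow. "token-anomaly-detected" is a placeholder name.
on:
  repository_dispatch:
    types: [token-anomaly-detected]
```

The Auditor can attach the flagged workflow's name and token delta in the dispatch payload so the Optimizer knows what to analyze.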
4. Apply Common Optimizations Yourself
While the Optimizer automates detection, you can also apply these tried-and-true optimizations manually:
- Reduce context length: Trim the system prompt to only essential instructions. Remove example outputs after the first few turns.
- Leverage caching: Ensure your proxy uses the provider’s prompt caching features. Mark cacheable sections explicitly.
- Limit output tokens: Set a max_tokens parameter in every LLM call—don’t let the model ramble.
- Batch small operations: If your workflow makes multiple independent LLM calls, see if they can be combined into one.
- Use cheaper models for simple tasks (e.g., classification) and reserve powerful models for complex reasoning.
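To make the last two points concrete, here is a hedged sketch of a request builder that caps output tokens and routes simple tasks to a cheaper model. The model names, task labels, and request fields are placeholders; the exact shape varies by provider:

```javascript
// Build an LLM request body that caps output tokens and routes simple
// tasks to a cheaper model. All names here are illustrative placeholders.
function buildRequest(task, prompt) {
  const isSimple = task === 'classification' || task === 'triage';
  return {
    model: isSimple ? 'small-fast-model' : 'large-reasoning-model',
    max_tokens: isSimple ? 256 : 2048, // never let the model ramble
    messages: [{ role: 'user', content: prompt }],
  };
}
```

Centralizing this choice in one helper also gives you a single place to adjust caps when the Auditor shows a workflow's output tokens creeping up.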
Common Mistakes
- Ignoring cache-read and cache-write tokens: These are often excluded from cost calculations, but they represent work that could be wasted if caching is misconfigured. Always log them.
- Manual inspection without automation: Relying on humans to trawl through logs is slow and error-prone. Invest in the Auditor and Optimizer workflows early.
- Not normalizing across agent frameworks: If you mix Claude CLI with Copilot CLI, each logs differently. The API proxy is your single source of truth—use it.
- Overlooking long-running workflows: A workflow that runs for 50 turns but completes a complex task might be fine. The anomaly detector should compare against its own history, not a global average.
Summary
Token efficiency for GitHub Agentic Workflows is achievable and essential to control costs. By instrumenting your API proxy to produce a uniform token-usage.jsonl artifact, you create the foundation for data-driven optimization. A daily Auditor workflow tracks consumption and flags anomalies, while an Optimizer workflow automatically suggests fixes. Combined with manual best practices like context trimming and caching, these techniques have helped us reduce token costs significantly. Start implementing today—your wallet will thank you.