Configuring AI Coding Tools: A Practical Guide for Developers Who Want Control, Not Autopilot

My current mindset towards AI is best put by Paul Ford (CEO of Postlight): “All of the people I love hate this stuff, and all the people I hate love it. And yet, likely because of the same personality flaws that drew me to technology in the first place, I am annoyingly excited.”

That tension is real, and most developers are sitting in it right now. The advice out there doesn’t help. You’ve got people treating AI like a junior dev you can hand the keys to and walk away from, and people who refuse to engage because they saw a hallucinated function name once.

This guide is for the people in the middle. You see the potential, but you’re not willing to sacrifice control or code quality to get there. I think you can have both with the right configuration.

The tools in this space, such as Cursor, Claude Code, Windsurf, GitHub Copilot, and Cline, have converged on a surprisingly similar configuration surface: permissions, automation hooks, reusable prompt templates, context files, sandboxing, model selection. The names differ, the file formats differ, but the underlying concepts are the same.

This post gives you a safe, effective starting point that keeps you in control without drowning you in approval prompts.

This is part one of two. Here I cover the global configuration layer: settings, permissions, hooks, and tooling that apply across every project. In part two, I’ll walk through how this setup fits into an actual development workflow from idea to deployed code.

Also, these settings are a best effort at controlling your agent. There’s evidence that even with guardrails in place, a determined agent can find workarounds.

Principles

These drive every configuration decision below. They’re tool-agnostic by nature.

  1. Architect, not spectator. You set direction, review output, and make decisions. The agent executes. If you can’t explain why a line of code exists, you don’t ship it.
  2. Minimal trust, earned incrementally. Permissions start at “ask” and get promoted to “allow.” Trust is earned per-project, not assumed globally.
  3. Persistent knowledge over chat context. Research, specs, and status go to files, not chat summaries. Files give context durability.
  4. Spec-driven development. Code doesn’t start until specs and plans exist. The agent needs the same context a human developer would.
  5. Adversarial review. The writer should never be the only reviewer. Use different models or agents for writing vs. reviewing.
Agentic coding patterns

I am still exploring effective agentic coding patterns, so this image will change over time. There probably isn’t a “one process fits all” solution.

[Diagram: Personal Development Project Workflow (excalidraw)]

Configuration: levels and scope

AI coding tools offer configuration at multiple scopes. The names and file paths vary, but the concept is consistent:

  • Enterprise level — applies to all users in an organization (I won’t cover this here)
  • User level — applies to all projects on your machine
  • Project level — applies to the specific repo where the tool is running
  • Local level — applies to just your machine on that project (not committed to git)
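In Claude Code, for example, these scopes map to concrete file paths (other tools use analogous locations; check your tool’s documentation for its equivalents):

```
~/.claude/settings.json          # user level: all projects on this machine
.claude/settings.json            # project level: committed to the repo
.claude/settings.local.json      # local level: gitignored, machine-specific
```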

Everything below covers my preferred user-level features: the things that help my overall workflow regardless of what I’m building.

Permissions: least privilege with layered override

Get this wrong and your agent has free rein to push code, delete files, or access credentials without ever asking you.

The goal is a global security baseline that protects against catastrophic and irreversible actions across all projects. The tools I’ve used all support some version of three tiers:

Allow — Safe, read-only, or non-destructive operations that would be tedious to approve every time. These run without prompting.

Ask — Legitimate development operations that modify state. These prompt for confirmation by default, but can be promoted to allow at the project level when you trust the context. Start strict globally, loosen per-project as needed.

Deny — Destructive or dangerous operations that should never be auto-approved. Deny rules should be absolute and not overridden by project-level settings. Use this for anything where a single mistake could cause data loss, credential exposure, or infrastructure damage.

Why this layered approach? Deny rules act as a hard security floor. Ask rules are the default friction point for state-changing operations, designed to be promoted to allow at the project level once you’re comfortable. This avoids two failure modes: too permissive globally (leading to accidental damage) and too restrictive everywhere (leading to prompt fatigue where you stop reading confirmations).

Here’s what I put in each tier. The ask tier covers every state-changing operation listed below, which probably makes some AI enthusiasts grind their teeth thinking about all the approval prompts I get. Remember, it’s a starting point.

Allow tier — read-only operations: file reads, search, web lookups, git status/log/diff, version checks, help flags.

Ask tier — state-changing but recoverable: git add/commit/push/checkout/rebase/merge, package installs, package publishing, PR merges, container operations, database clients.

Deny tier — destructive or dangerous: sudo, recursive deletes (rm -rf), force pushes, hard resets, git clean, infrastructure teardown (terraform destroy, kubectl delete), recursive permission changes, killing processes, reading SSH keys or credentials, editing .env files.

Over time, this will evolve. Use as-is for a week or two. When you find yourself always approving the same thing, promote it if you think it can be trusted long-term.

Here’s what this looks like in Claude Code’s settings.json:

{
  "permissions": {
    "allow": [
      "Read",
      "Glob",
      "Grep",
      "WebSearch",
      "Bash(git status)",
      "Bash(git log *)",
      "Bash(git diff *)",
      "Bash(git branch *)",
      "Bash(ls *)",
      "Bash(* --version)",
      "Bash(* --help)",
      "Bash(gh pr view *)",
      "Bash(gh pr create *)",
      "Bash(gh pr list *)",
      "Bash(gh issue *)"
    ],
    "ask": [
      "Bash(git add *)",
      "Bash(git commit *)",
      "Bash(git push *)",
      "Bash(git checkout *)",
      "Bash(git switch *)",
      "Bash(git rebase *)",
      "Bash(git merge *)",
      "Bash(git stash *)",
      "Bash(npm install *)",
      "Bash(npm publish *)",
      "Bash(gh pr merge *)",
      "Bash(docker *)",
      "Bash(psql *)",
      "Bash(mysql *)",
      "Bash(mongosh *)",
      "Bash(sqlite3 *)",
      "Bash(redis-cli *)"
    ],
    "deny": [
      "Bash(sudo *)",
      "Bash(rm -rf *)",
      "Bash(git push --force *)",
      "Bash(git push * --force *)",
      "Bash(git reset --hard *)",
      "Bash(git clean *)",
      "Bash(terraform destroy *)",
      "Bash(kubectl delete *)",
      "Bash(find * -delete *)",
      "Bash(find * -exec rm *)",
      "Bash(xargs rm *)",
      "Bash(chmod -R *)",
      "Bash(chown -R *)",
      "Bash(npm unpublish *)",
      "Bash(kill -9 *)",
      "Bash(killall *)",
      "Bash(aws s3 rm *)",
      "Bash(aws s3 rb *)",
      "Bash(gcloud * delete *)",
      "Bash(systemctl stop *)",
      "Bash(launchctl unload *)",
      "Read(~/.ssh/**)",
      "Read(~/.aws/**)",
      "Read(.env)",
      "Edit(.env)",
      "Write(.env)"
    ]
  }
}

The syntax is Claude Code-specific, but the categories translate directly. Whatever tool you use, walk through your deny list first. What should never happen without you physically typing it yourself? Start there.

Attribution

By default, AI coding tools tag their contributions. Claude Code appends Co-Authored-By trailers to commits and Generated with Claude Code footers to PRs. Copilot does similar. I disable this globally.
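In Claude Code, disabling attribution is a single flag in settings.json (the key name may differ between versions and tools, so verify against your tool’s settings reference):

```json
{
  "includeCoAuthoredBy": false
}
```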

Output style

I prefer an output style that explains concepts more deeply. It strikes the right balance: the agent explains its reasoning and decisions without flooding the terminal with every intermediate thought. This matters most when you’re reviewing what happened after a long autonomous run.

If your tool supports custom output styles, that’s useful when you’re building agentic workflows where you want structured output rather than conversational responses.

Memory: context files

In the least hot take of all hot takes: context is king. Everyone knows this by now. Memory is how you control the context your agent receives.

Instruction files

Every major AI coding tool has some version of a persistent instruction file that shapes agent behavior:

  • Cursor: .cursorrules or .cursor/rules/ directory
  • Claude Code: CLAUDE.md (global at ~/.claude/CLAUDE.md, project-level at repo root)
  • Windsurf: .windsurfrules or rules directory
  • GitHub Copilot: .github/copilot-instructions.md
  • Cline: .clinerules

The file format differs but the purpose is identical: tell the agent how to behave, what conventions to follow, what to never do.

My global instruction file is intentionally short and opinionated. It covers four things:

Working relationship. Tone and decision-making style. No sycophancy, be direct, challenge my reasoning, present tradeoffs instead of silently picking the easy path. Without this, agents default to agreeable and verbose, which erodes the architect-developer dynamic.

Working style. The summary here is: make the agent be thorough. Large Language Models (LLMs) want to take the optimal path to being right, where optimal means the most probable tokens, not thinking through the right solution. I emphasize the correct fix over the quick fix.

Hard rules. Things that should never happen. Never publish secrets, never commit .env files, never take git actions (commit, amend, push) without explicit permission. These overlap with the deny list in permissions, but redundancy is the point.

New project setup. Standards that apply when initializing any repo (required .gitignore entries, creating a project-level instruction file).

The project-level instruction file (covered in part two) is where stack-specific instructions, API context, and project conventions live. The global file stays lean so it doesn’t burn tokens on irrelevant context.
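As a sketch of what those four sections look like in practice, here’s the shape of a minimal global instruction file (the wording is illustrative, not my actual file):

```markdown
# Global instructions

## Working relationship
- Be direct. No sycophancy. Challenge my reasoning when you disagree.
- Present tradeoffs instead of silently picking the easy path.

## Working style
- Prefer the correct fix over the quick fix. Investigate root causes.

## Hard rules
- Never publish secrets or commit .env files.
- Never commit, amend, or push without explicit permission.

## New project setup
- Add standard .gitignore entries and create a project-level instruction file.
```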

Persistent memory

Some tools also support a writable memory file where the agent persists learnings across sessions. Patterns it discovered, debugging insights, preferences it picked up. Not every tool has this yet, but the concept is spreading. If yours doesn’t, you can approximate it with a NOTES.md or similar file that the agent is instructed to read and update.
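The approximation amounts to a standing instruction in your global instruction file, along these lines (filename and wording are just one way to do it):

```markdown
## Memory
- At session start, read NOTES.md if it exists.
- When you discover a non-obvious pattern or fix, append it to NOTES.md.
```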

Memory management tools

There’s a growing ecosystem of tools being built to manage agent memory. Everything from daily processes that create “short-term memory” documents to vector databases that implement Retrieval Augmented Generation (RAG) on top of your agent’s context.

At the time of writing, these are interesting to explore. Some are probably beneficial, but realistically overkill for most people. And for a real hot take: don’t build your own, or spend too much time adopting someone else’s. At the rate these tools are shipping new features, I think default memory management will improve dramatically over the next six months.

Skills: reusable prompt templates

Skills (or rules, or prompt templates, depending on your tool) are reusable prompt files that your agent can load on demand, either triggered by a slash command or auto-loaded when the agent recognizes a matching context.

The concept that makes skills work is progressive disclosure. Rather than stuffing everything into your instruction file and burning tokens on context that’s irrelevant to the current task, skills let you define specialized knowledge and workflows that only load when called. A code review checklist lives in its own file and only enters the context window when you invoke it.

Most of my skills are defined at the user level because they’re part of my general development process, not specific to any one project. I’ll cover the full set in part two. For now, here are three meta-skills: skills whose job is to create and improve other configuration.

I won’t cover the exact contents for any of these, just how I use them and the principles behind them. You should make skills you fully understand and that meet your needs.

Create instruction files

This skill guides the creation of project-level instruction files. It targets 50-100 lines (max 150) and follows a structured section order: project overview, directory structure, commands, patterns, testing, git conventions, critical rules, and reference docs. Content that’s path-specific goes to scoped rule files so it only loads when relevant. Large reference material gets imported on demand rather than upfront. The skill explicitly excludes anything the agent can infer from the code, style rules (those belong in linter configs), and embedded code snippets (which go stale; point to file:line instead).

Refine and maintain instruction files

This skill audits existing instruction files for drift and quality. It runs through five phases: discovery (finding all instruction and rule files), drift detection (verifying that documented paths, commands, and directory structures still match the actual codebase), quality assessment (scoring against a rubric covering commands, architecture clarity, conciseness, currency, and actionability), a quality report, and then targeted updates with user approval.

Create new skills

This skill follows an eval-driven development loop: capture intent, interview for edge cases, draft the skill, create test prompts, run them (with-skill vs. baseline), evaluate results, and iterate until satisfied.

A few principles baked into it:

Keep skill files under 500 lines. Detailed content goes in reference files and only loads when needed.

The description field is the primary trigger mechanism. It should include specific phrases users would say, written so the agent reliably triggers on them.

Explain the why behind instructions. LLMs respond better to reasoning than rigid ALWAYS/NEVER constraints.
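For concreteness, in Claude Code a skill is a markdown file with YAML frontmatter, and the description field carries those trigger phrases. The name and content below are hypothetical:

```markdown
---
name: code-review-checklist
description: Run my code review checklist. Use when the user says
  "review this PR", "code review", or "check this diff".
---

# Code review checklist
1. Can every change be explained by the author? Flag anything that can't.
2. Check error handling, tests, and naming against project conventions.
```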

Hooks: deterministic execution

Hooks are one of my favorite features. They add deterministic execution into a largely nondeterministic world of agentic coding. A hook is a shell command that fires on a specific event: a tool call, a notification, task completion. Not every tool supports this yet, but the concept is powerful enough that I expect it to spread.

I’ve only found one global hook I actually use (project-level hooks are a different story, and that’s part two):

Notifications

My hook triggers a notification script on two events: when the agent asks a question and when a task is complete. The script sends macOS notifications with distinct sounds so I can tell the difference without looking. Clicking a notification brings the terminal to focus.
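In Claude Code, those two events map to the Notification and Stop hooks in settings.json. The script path below is a placeholder for wherever you keep yours:

```json
{
  "hooks": {
    "Notification": [
      {
        "hooks": [
          { "type": "command", "command": "~/.claude/notify.sh question" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "~/.claude/notify.sh done" }
        ]
      }
    ]
  }
}
```

On macOS the script itself can be a thin wrapper around osascript’s `display notification`, passing a different `sound name` per event so the sounds are distinguishable.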

MCP servers: external tool integrations

The Model Context Protocol (MCP) is becoming the standard way AI coding tools connect to external services. Claude Code, Cursor, Windsurf, and others all support MCP servers. If you need your agent to interact with a system that isn’t covered by built-in tools (a documentation service, a database explorer, a deployment API), MCP is typically the extension point.

MCP servers add tool definitions to every API call, which means they consume tokens whether you use them or not. A handful of globally installed MCP servers can quietly eat thousands of tokens per turn just from the schema overhead. Be deliberate about what you install globally versus per-project. Most should live at the project level where they’re actually relevant.

The one exception I make is Context7 (@upstash/context7-mcp), installed globally. It pulls up-to-date library documentation directly into the agent’s context. Doesn’t matter if you’re working with a Python package or a JavaScript framework — being able to say “look up the docs for X” without leaving the session is useful enough to justify the constant token cost.
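With Claude Code’s CLI, a user-scoped install looks roughly like this (other tools register MCP servers through their own settings UI or JSON; check your tool’s docs for the exact invocation):

```shell
# Register Context7 at user scope so it's available in every project
claude mcp add --scope user context7 -- npx -y @upstash/context7-mcp
```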

Model configuration

I run the most capable model available for everything. It produces the best results and I’d rather pay for quality than debug mediocre output. But if you’re doing extended coding sessions, you’ll hit token limits faster than you’d like.

A practical alternative is to split by task type: use your best model for planning and reasoning (architecture decisions, spec writing, code review, debugging complex issues) and a faster model for implementation (writing code, running tests, routine file edits). The faster model is capable enough for execution work, and the token savings are significant over a full day of development.
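Some tools support this split natively. Claude Code, for instance, offers a model setting that uses its strongest model during planning and a faster one for execution (model option names change often, so check what your version actually offers):

```json
{
  "model": "opusplan"
}
```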

Tools and capabilities

AI coding tools ship with a fixed set of built-in capabilities: file operations, shell execution, search, web access, and sub-agent orchestration. You generally can’t define custom tools within the tool itself. MCP servers are the extension point for that (covered above).

Take 10 minutes to read through your tool’s documentation on what’s available out of the box. Knowing the full capability set prevents you from reaching for external solutions when a built-in tool already handles it.

Sandboxing

Sandboxing wraps commands the agent executes in a restricted environment that limits filesystem reads/writes and network access. Think of it as a layer beneath permissions. Even if a command is in the allow list, the sandbox constrains where it can read, write, and connect. Permissions control whether a command runs. Sandboxing controls what it can touch when it does.

Sub-agents: multi-agent workflows

Sub-agents are autonomous agent instances that the main session can spin off to handle isolated tasks. They get their own context window, execute independently, and return results back to the parent.

The most common use is isolating high-volume operations (test runs, doc fetches, log processing) so they don’t pollute the main context. You can also chain them, where one sub-agent completes a task and passes results to the next.

I don’t have any global-level sub-agents configured yet. For my current workflow, I prefer using a combination of fresh context and skills to achieve similar outcomes. It keeps me closer to the work and gives me more control over what context each task receives. As I get more comfortable and start pushing toward more autonomous operation, I can see multi-agent workflows becoming a bigger part of the setup.
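For reference, tools that support sub-agents typically define them as markdown files with frontmatter; in Claude Code they live under ~/.claude/agents/. The example below is hypothetical:

```markdown
---
name: test-runner
description: Runs the test suite and reports only failures, keeping
  logs out of the main context.
tools: Bash, Read, Grep
---

Run the project's test suite. Summarize failures with file:line
references. Do not paste full logs back to the parent session.
```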

Tips and tricks

Dictation

If your AI coding tool lives in the terminal or an editor, you’ll spend a lot of time typing prompts. Dictation makes this significantly faster, especially for longer instructions where you’re explaining context or thinking through an approach. I use VoiceInk for transcription (noticeably better than macOS built-in dictation) mapped to a Hyper key via Karabiner-Elements (Caps Lock remapped to Ctrl + Option + Command + Shift). One key press to start dictating, one to stop. No menus, no mouse.

Visibility into agent behavior

AI coding tools have been progressively cleaning up their UIs, which is great for aesthetics and terrible for understanding what’s actually happening. When the agent is running a long task, orchestrating sub-agents, or making multiple tool calls, you want to see what’s going on.

Look for verbose or debug mode in your tool. There are also third-party solutions like Claude Code Dev Tools (matt1398/claude-devtools) that give you visibility into all agents and their tool calls in a separate UI.

Where this is headed

Everything in this post reflects where AI-assisted coding is as of early 2026. This will change, probably faster than either of us expects. But I’d argue this is the right starting point regardless of where capabilities go next. Working through the configuration, understanding the architecture, seeing firsthand what agents do well and where they fall apart… that foundation only becomes more valuable as these tools mature.

I think of this as a phased progression:

Phase 1: Human-in-the-loop. Where I am now, and what this series covers. You’re actively reviewing every meaningful decision the agent makes. Stage gates everywhere, tight feedback loops, full understanding of the code being produced.

Phase 2: Mobile integration / remote control. Where I want to go next. The agent operates with more autonomy on well-defined tasks, but you stay in the loop through notifications and lightweight approvals. You’re not watching every keystroke, but you’re still steering.

Phase 3: Full autonomy. The long game. I don’t think this is ready for production work today, but it’s worth experimenting with on side projects. A few frameworks I’m watching:

  • wshobson/agents — Multi-agent orchestration for Claude Code
  • boshu2/agentops — DevOps layer for coding agents with flow, feedback, and persistent memory
  • BMAD-METHOD — Agile framework designed for AI-driven development
  • GSD — lightweight meta-prompting, context engineering, and spec-driven development system for Claude Code