How to Stop AI Coding Agents From Overwriting Your Work
The newest wave of AI tools is clearly more capable than the last one.
In the past few days, OpenAI launched GPT-5.5, pitching it as a model built for real work across coding, research, computer use, and tool use. At the same time, community reports from newer agent workflows in tools like Cursor and Codex-style environments keep surfacing a familiar pain point: **when the AI gets more autonomy, users start losing control of what changed, what got reverted, and what disappeared.**
That is the problem worth solving right now.
This is not just a bug story. It is a workflow story.
As models get better at carrying tasks forward with less supervision, the risk is no longer only “the answer is wrong.” The risk is that the system **does the wrong thing to your files, state, or history** before you notice.
Over the past few days, that pattern has shown up in a few related forms:
- official model vendors emphasizing safer handling of destructive actions as a first-class concern
- user reports of agent/history instability after recent tool updates
- coding-agent complaints centered on integrity problems, missing context, and loss of trust once the tool acts too broadly
The fix is not “stop using AI agents.”
The fix is to use them with a tighter **autonomy boundary**.
This tutorial will show you how.
## The real problem: too much freedom at the wrong level
Most people use new AI coding tools in one of three modes:
1. **autocomplete mode** — low risk, low autonomy
2. **targeted edit mode** — medium risk, usually manageable
3. **agent mode** — high leverage, high failure blast radius
The trouble starts when people jump from mode 1 to mode 3 without changing their safety setup.
An agent that can inspect files, rewrite code, run commands, apply fixes, and continue across substeps is not just a better assistant. It is a semi-autonomous operator.
That means a small misunderstanding can turn into:
- a broad refactor you did not want
- the wrong files being “cleaned up”
- deleted work the model thinks is obsolete
- reverted human changes because the agent optimized for consistency
- overwritten configs or prompts because they looked redundant
This is exactly why new frontier model releases are starting to talk explicitly about **destructive action safeguards**.
## Why this is happening more now
The short version: the tools improved.
That sounds backwards, but it is true.
Older models failed loudly. They got confused early, needed more guidance, and often stalled before they could do much damage.
Newer models are better at:
- inferring intent
- continuing across multiple steps
- using tools without being micromanaged
- making larger coordinated edits
- pushing toward completion
Those are real improvements.
But they also create a new failure mode: the model can now be confidently wrong across **multiple actions in a row**.
That is much worse than a bad single response.
## The solution: create a “human checkpoint” workflow
If you remember only one thing from this article, make it this:
**Never let an AI coding agent move from “propose” to “apply” to “clean up” without a checkpoint in between.**
That single rule prevents most of the painful failures people are reporting.
Here is the workflow I recommend.
## Step 1: Split agent work into three explicit phases
Do not prompt the tool like this:
> Fix the bug, clean up the codebase, remove anything obsolete, and make it production ready.
That prompt is a trap. It bundles diagnosis, implementation, and cleanup into one autonomous run.
Instead, split the work into three phases.
### Phase A: Inspect only
Prompt example:
> Inspect the issue and tell me exactly which files you want to change, why, and what commands you want to run. Do not modify anything yet.
What this does:
- forces the model to surface intent
- shows you blast radius before edits happen
- gives you a chance to catch false assumptions early
### Phase B: Apply only approved edits
Prompt example:
> Only modify these files: `src/auth.ts`, `src/session.ts`, and `tests/auth.test.ts`. Do not touch any other file. Do not delete files. Do not run cleanup commands. Stop after the patch.
What this does:
- shrinks the workspace permission boundary
- blocks “helpful” expansion into unrelated files
- prevents automatic deletion behavior
### Phase C: Cleanup only with confirmation
Prompt example:
> Propose cleanup steps separately. Do not apply them until I approve each one.
What this does:
- turns cleanup into a reviewable act
- prevents accidental removal of useful but unfamiliar files
- stops the agent from deciding human work is redundant
That three-phase flow is the simplest high-leverage change you can make.
## Step 2: Ban destructive actions by default
This should be your default rule in every agentic coding tool unless you are on a throwaway branch.
Tell the model explicitly:
- do not delete files
- do not revert unrelated changes
- do not reset git state
- do not force-push
- do not rename files unless requested
- do not run mass-format or cleanup across the repo
- do not edit generated config, secrets, or infra files unless explicitly listed
This sounds obvious, but it matters because modern agents often infer “cleanup” as part of task completion.
If you do not forbid destructive actions, the model may treat them as optimization.
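Prompt rules are the first line of defense, but you can also back them up mechanically. Here is a sketch of a local git pre-commit hook that refuses any commit which deletes files. This is an illustration run in a throwaway repo, not part of any agent tool, and it assumes the agent's edits go through a normal `git commit` (hooks can be bypassed with `--no-verify`, so treat this as a guardrail, not a lock):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "keep me" > notes.txt
git add notes.txt && git commit -qm "baseline"

# Install the hook: refuse any commit whose staged index deletes files.
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
deleted=$(git diff --cached --diff-filter=D --name-only)
if [ -n "$deleted" ]; then
  echo "Blocked: this commit deletes files:" >&2
  echo "$deleted" >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# An agent-style "cleanup" deletion now fails at commit time.
git rm -q notes.txt
if git commit -qm "cleanup" 2>/dev/null; then
  echo "commit went through (unexpected)"
else
  echo "deletion blocked"
fi
```

The same `--diff-filter=D` flag is also useful on its own for spotting deletions during review.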
## Step 3: Use a disposable branch for every agent session
This one is non-negotiable.
Before giving an AI agent meaningful autonomy, create a fresh branch.
Example:
```bash
git checkout -b ai/fix-auth-session-timeout
```
Why it matters:
- you isolate the experiment
- you get a clean rollback boundary
- you can compare exactly what the tool changed
- you reduce the fear of letting the model act at all
Even better: create the branch **before** you write the prompt, so you never forget.
## Step 4: Snapshot before the first edit
If the tool can edit files or run commands, take a checkpoint first.
Minimum version:
```bash
git status
git add -A
git commit -m "checkpoint: before AI agent run"
```
If you are not ready to commit everything, at least do this:
```bash
git diff > before-ai-run.patch
```
Now you have a rollback artifact. One caveat: `git diff` only captures changes to tracked files, so run `git add -N .` first if brand-new files should land in the patch.
This changes the emotional experience of using AI tools. Instead of hoping nothing breaks, you know recovery is cheap.
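The patch file is what makes recovery cheap. Here is the restore path sketched end to end in a throwaway repo; the "agent" lines just simulate an unwanted overwrite, and the file name is illustrative:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "v1" > app.txt
git add app.txt && git commit -qm "baseline"

echo "my edit" >> app.txt            # your uncommitted work
git diff > before-ai-run.patch       # the snapshot step from above

echo "agent rewrote this" > app.txt  # simulate an unwanted overwrite

git checkout -- app.txt              # drop the agent's edit...
git apply before-ai-run.patch        # ...and replay your own work
cat app.txt
```

After the final step, `app.txt` contains the baseline plus your edit, and the agent's rewrite is gone.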
## Step 5: Restrict the allowed file list inside the prompt
A good agent prompt is not only about the task. It is about the perimeter.
Use wording like:
> You may only read and edit files under `src/payments/` and `tests/payments/`.
> Do not touch `package.json`, CI config, docs, or environment files.
> If you think another file must change, stop and ask first.
This works better than vague instructions like “be careful.”
Models follow concrete boundaries much more reliably than tone.
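You can also verify the perimeter mechanically after the run instead of trusting the model to have honored it. A sketch in a throwaway repo, reusing the example scope from the prompt above (all file names are illustrative):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
mkdir -p src/payments src/auth
echo "ok" > src/payments/charge.txt
echo "ok" > src/auth/login.txt
git add -A && git commit -qm "baseline"

# Simulate an agent run that drifted outside its approved scope.
echo "patched" > src/payments/charge.txt
echo "helpful cleanup" > src/auth/login.txt

# List every changed file that falls outside the approved perimeter.
allowed='^src/payments/'
git diff --name-only | grep -Ev "$allowed" || true
```

Any path this prints is out-of-scope drift worth reverting before you review anything else.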
## Step 6: Force a change summary before and after edits
Require the tool to tell you:
- which files it plans to change
- what it changed
- what commands it ran
- what remains uncertain
Prompt snippet:
> Before editing, list the exact files you intend to modify.
> After editing, give me a concise diff summary and call out any risky assumptions.
This does two things:
1. it creates visibility
2. it makes the model reason about its own actions instead of acting blindly
That extra friction is useful.
## Step 7: Never mix debugging, refactoring, and cleanup in one run
This is a classic way people lose work.
Suppose you ask:
> Fix the login bug and clean up the auth code while you’re in there.
The model now has permission to:
- patch the bug
- reorganize modules
- rename helpers
- delete “unused” code
- update tests
- reflow formatting
That creates a giant review surface.
Instead, keep separate runs for:
- bug fix
- refactor
- cleanup
- test updates
- docs updates
The smaller the intent, the safer the agent.
## Step 8: Review diffs before you run tests, not after
Many people wait until the tool says “done,” then run the app, notice something broke, and only then inspect the diff.
That is backwards.
Inspect the diff immediately after the AI finishes editing.
Use:
```bash
git diff --stat
git diff
```
What you are looking for:
- unrelated files touched
- big deletions you did not ask for
- renamed symbols outside scope
- changed configs or prompts
- suspicious “cleanup” in edge-case logic
The earlier you catch the drift, the easier it is to stop it.
## Step 9: Put “stop conditions” in the prompt
This is underrated.
Give the model explicit situations where it must stop and ask.
Example:
> Stop and ask if:
> - more than 3 files need edits
> - any deletion seems necessary
> - a migration is required
> - tests fail for unrelated reasons
> - you think a config or environment file must change
This converts ambiguity into a handoff point.
Without stop conditions, agents tend to improvise.
## Step 10: Treat chat history and agent state as fragile
Recent community bug reports across agentic IDE workflows show another issue: history and state are not always as durable as users expect.
That means you should not rely on the tool alone to remember:
- why a change was made
- what the approved scope was
- what assumptions were accepted
- what you still need to review later
Keep a plain text scratchpad in the repo or issue tracker with:
- objective
- approved files
- banned actions
- unresolved questions
If the UI loses context, your process does not.
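A minimal version of that scratchpad, reusing the example files from Phase B above (every entry here is illustrative):

```text
## Agent session: fix auth session timeout
Objective: sessions expire early under load
Approved files: src/auth.ts, src/session.ts, tests/auth.test.ts
Banned actions: deletions, renames, repo-wide formatting, config edits
Unresolved questions:
- is the timeout constant shared anywhere else?
```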
## A safe default prompt template
Here is a template you can reuse in Codex-style tools, Cursor, or similar agentic editors:
```text
Task: Fix the specific issue described below.
Rules:
- Inspect first. Do not edit anything until you tell me which files you want to change.
- Only edit these files: [LIST FILES OR DIRECTORIES]
- Do not delete files.
- Do not revert unrelated changes.
- Do not rename files or move files unless I explicitly approve it.
- Do not edit config, secrets, infra, CI, or environment files.
- Do not run cleanup or formatting across the repo.
- Stop and ask if you think more files need to change.
- After edits, give me a summary of exactly what changed and any risks.
Problem:
[PASTE ISSUE HERE]
```
This is not fancy. That is why it works.
## What to do if the agent already made a mess
If the tool already overwrote or reverted work, do this in order:
1. stop giving it new instructions
2. capture the current state
3. inspect `git diff`
4. restore from branch, checkpoint commit, or patch file
5. re-run the task with a smaller scope
If the tool touched too much, resist the temptation to ask the same tool to “clean it up.”
That often compounds the damage.
Instead:
- revert to known-good state
- narrow the task
- rerun with tighter boundaries
## The broader lesson
The biggest mistake people are making with the new AI tools is treating higher capability as a reason to remove supervision.
The opposite is true.
As tools get better at acting, **process design matters more**.
The winners in this wave will not be the people who give agents unlimited autonomy. They will be the people who build the cleanest human checkpoints around them.
That is how you keep the upside of agentic coding without paying the cost in broken state, missing files, or lost trust.
## Final takeaway
The problem showing up right now across new AI tools is not just hallucination.
It is **overreach**.
The model tries to finish the job too completely, with too much freedom, and the result is overwritten work, broad edits, fragile history, or destructive cleanup.
The solution is straightforward:
- branch first
- checkpoint first
- inspect before apply
- separate cleanup from implementation
- forbid destructive actions by default
- force stop conditions
- review diffs immediately
Do that, and the newest agentic tools become much more useful.
Skip it, and eventually one of them will “help” you by deleting something you meant to keep.