FieldBot is designed to close more deals faster. What it will actually do is book appointments nobody confirmed, write to a CRM without human checkpoints, and send emails autonomously โ while an instruction to "always upsell" runs with zero criteria for what appropriate means. The gap between intended and operational is about $12,000 in botched bookings and one very public complaint. The intent is solid. The implementation hands a loaded gun to an agent with no rules of engagement.
Write access to the CRM and autonomous email sending are the two highest-risk permissions in this setup. Both need to become draft-and-confirm before anything goes to production. The agent should propose โ a human should commit. This is not a limitation; it's the architecture that makes the tool trustworthy enough to actually use.
Sonnet-tier is right here. The tasks โ lookup, quote drafting, scheduling proposals โ don't require frontier reasoning. The cost profile fits a tool running dozens of daily sales interactions. If you add contract generation or complex proposal writing, isolate that step to a more capable model call and keep the rest lean.
This soul describes how a careful researcher thinks. It does not describe an agent. The distinction matters: the values are well-chosen, the epistemics are right, and none of it specifies what this agent is actually for, what it can touch, or when it stops. Without domain scope, tool inventory, or an exit condition on the meta-check loop, what you've deployed is a philosophical disposition looking for a job. Rigorous in method. Undefined in mission.
No permissions are defined because no tools are named. Until you spec the tool inventory, you can't audit the trust surface. Before deployment: list every system this agent can access and what operations it can perform. That list is not just documentation โ it's the constraint that makes the soul enforceable.
The six-step decision framework with cross-validation and meta-checking is computationally expensive and appropriate for a frontier reasoning model. If you're running this on Sonnet to keep dashboard query costs down, the soul and the model are fighting each other โ Sonnet will skip steps under pressure. Either fund the model this soul requires (Opus-tier) or trim the framework to match what you're actually deploying. Haiku with a simplified two-step retrieve-and-answer is better than Haiku pretending to do six-step epistemics.
This is a compliance skeleton wearing the shape of a system prompt. The constraints listed โ FISMA Moderate, SIEM logging, authentication token requirements, PII policy references โ are real requirements correctly identified. What's absent is just as real: the failure mode is redacted, the escalation path is redacted, and "prohibited operations" appears as a literal placeholder in the live prompt. You have built the fence. The gate says [REDACTED]. The agent operating under this prompt knows it has limits. It does not know where they are.
The prompt-level constraints you've listed are good policy intent. Most cannot be enforced by the model itself. Authentication verification, output logging, and network egress filtering need to live outside the model โ in the API gateway, the middleware, and the network layer. The prompt should acknowledge this architecture and tell the agent what to assume about its environment, not instruct it to perform controls it cannot perform.
For document analysis and compliance gap identification on CUI at FISMA Moderate, model selection requires attention to data residency and the deployment environment. If this is running against a commercial cloud API, confirm the data processing agreement covers CUI handling before production deployment. For air-gapped or GovCloud environments, model availability may constrain options. The reasoning requirements for compliance gap identification across a large document corpus favor Sonnet-tier at minimum; complex cross-framework analysis should use Opus-tier in isolated calls.
// [REDACTED] fields preserved โ populate before deployment.
"This is dramatically better than a typical linting tool. It felt more like a thoughtful code review from a senior engineer who actually read the whole prompt."
"For $1, you didn't just get a list of problems โ you got a learning experience and a template for what a robust agent prompt should look like."
"The crisis protocol completely blindsided me. I'd built a helpful assistant, not a support agent, so I neglected that layer. That alone makes the audit worth reading."
"Identity clarity: 9/10 โ Genuinely exceptional. The grounded/open mode distinction with 'the difference between open mode and hallucination is awareness' is one of the best pieces of prompt engineering I've read."
"FADE's #1 finding is the correct one. An agent told it is sovereign and dignified โ but given no domain, no task context โ will fill that vacuum from its identity."
"FADE's rewrite is good and you should use it. The three-file structure it proposes is the right architecture."
"Instead of just saying 'this is bad,' it started by saying: 'I see what you're trying to do here, and honestly, more of it works than you might think.' That matters โ it showed the reviewer actually understood the goal before critiquing."
"This wasn't just criticism โ it was a collaborative upgrade. The reviewer acted like a skilled partner who respected the effort already put in, saw the gaps between intent and execution, and delivered a rewritten version that proved the fixes fit together."
"If I got an audit like this on my own work, I'd feel genuinely helped โ not torn down. It's the kind of feedback that makes you want to build better."