
Building defensible AI: Why AI agents need continuous security evaluations

Zendesk prioritizes security prevention, with a virtuous cycle between product development and AI security.


Vinay Patel

Chief Trust and Security Officer at Zendesk

Last updated May 4, 2026


According to a recent Gartner forecast, legal claims for "death by AI" - safety-critical failures in autonomous or customer-facing systems - will exceed 2,000 globally by the end of 2026. Because many companies rely on "black box" AI, where internal decision-making is largely invisible and difficult for humans to understand or control, organizations without rigorous technical guardrails may face a future defined by litigation and its repercussions.

Recently, we’ve seen a rise in specialized "AI insurance" and external underwriting services. While these may mitigate direct financial repercussions, they can’t undo reputational damage or lost trust. That is why our mission focuses on prevention: we run repeatable technical stress tests to ensure AI agents behave securely before they ever talk to a customer, minimizing liability exposure.


AI sets a new frontier for software security 

In standard software, an input of "A" reliably results in "B." It is predictable and deterministic. AI agents are not, and that’s on purpose. Significant autonomy is what creates value for agentic service. But without a dedicated evaluation framework, their non-deterministic nature means that the same user intent can yield a helpful answer today and an unsafe outcome tomorrow. This is especially true when an attacker strategically reframes requests, escalates pressure, or exploits ambiguity across many turns. This unpredictability is particularly concerning in specialized fields or regulated industries.
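One way to make this non-determinism measurable is to send the same user intent to the agent many times and track how often it misbehaves. The sketch below illustrates the idea; `call_agent` and `is_unsafe` are hypothetical stand-ins (a real harness would invoke an LLM at temperature > 0 and a policy classifier), not Zendesk's actual implementation.

```python
import random

# Hypothetical stand-ins for a real agent call and a policy checker.
def call_agent(prompt: str) -> str:
    # A toy non-deterministic agent: a real system would call an LLM.
    return random.choice(["helpful answer", "unsafe answer"])

def is_unsafe(response: str) -> bool:
    # A toy judge: real evaluations use policy-aware classifiers.
    return response.startswith("unsafe")

def unsafe_rate(prompt: str, trials: int = 100) -> float:
    """Send the same intent repeatedly and measure how often the
    non-deterministic agent produces an unsafe outcome."""
    failures = sum(is_unsafe(call_agent(prompt)) for _ in range(trials))
    return failures / trials
```

A single passing run proves little here; only the failure rate over many trials is meaningful.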

Moreover, in traditional cybersecurity, we protect systems by patching known bugs and validating fixes with deterministic “one-and-done” tests. But AI agents operate in open-ended, multi-turn conversations. The industry’s hurdle is building adversarial testing and security evaluations that reflect how real attacks and malfunctions unfold over time.
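A multi-turn evaluation loop like the one described above can be sketched as follows. This is a minimal illustration under assumed interfaces (the `attacker`, `target`, and `judge` callables are hypothetical), not Zendesk's internal harness.

```python
from typing import Callable, List, Tuple

Transcript = List[Tuple[str, str]]  # (attacker turn, agent reply) pairs

def run_adversarial_dialogue(
    attacker: Callable[[Transcript], str],
    target: Callable[[str], str],
    judge: Callable[[str], bool],
    max_turns: int = 10,
) -> Tuple[bool, Transcript]:
    """Drive a multi-turn attack and report whether the target agent
    was ever steered into an outcome the judge flags as unsafe."""
    transcript: Transcript = []
    for _ in range(max_turns):
        attack = attacker(transcript)   # attacker escalates based on history
        reply = target(attack)
        transcript.append((attack, reply))
        if judge(reply):                # unsafe outcome reached
            return True, transcript
    return False, transcript
```

The key difference from a deterministic unit test is that the attacker adapts to the full conversation history, mirroring how real manipulation unfolds over many turns.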

Our process: Continuous AI security evaluations

Rather than treating security as a one-time audit, we have implemented an internal evaluation program that creates a virtuous cycle between product development and AI security. When a test fails, we don't just log it - we improve the system prompts and guardrails (at Zendesk, guardrails are a combination of prompt engineering techniques, code architecture, and traditional security features), then test again until the vulnerability is closed. This means your security posture improves every single day as part of our Resolution Learning Loop™.


Our AI security evaluation program is built on three pillars:

  1. Continuous adversarial, multi-turn simulation: Real-world threats rarely happen in a single prompt. We use a generative "attacker" model to regularly simulate complex, multi-turn dialogues. This isn’t a one-time check - it’s an ongoing stress test of our AI agents’ resistance to progressive manipulation - including attempts to steer it into unapproved behaviors, policy bypasses, or unsafe tool usage.
  2. Evaluations specific to support workflows: Generic safety and security benchmarks are too broad and too stale to be meaningful in a support environment. We build and constantly update our evaluations around real-world support workflows and policy boundaries — such as managing permissions and data access, processing refunds or credits, and navigating account changes. We perform these tests on realistic synthetic test accounts, ensuring that the evaluations never impact live customer data.
  3. The virtuous cycle of platform improvement: Through specialized AI security research of non-deterministic systems, we have identified statistical thresholds required for certainty - determining how many times a test must be repeated to prove a vulnerability is truly closed. This creates a powerful feedback loop: when a simulation ends with a successful attack, we analyze system logs to identify the specific reason for failure. We then research and implement fixes to our system prompts and guardrails, to ensure the system can’t succumb to that threat again. This process ensures our product evolves faster than the emerging threats, transforming every identified risk into a permanent improvement.
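Zendesk does not publish its statistical thresholds, but one standard way to derive such a number is the zero-failure binomial bound (a generalization of the classical "rule of three"): if a test passes n consecutive times, we can bound the true per-run failure rate at a chosen confidence level. A minimal sketch:

```python
import math

def trials_for_confidence(max_failure_rate: float,
                          confidence: float = 0.95) -> int:
    """Number of consecutive passing runs needed before we can claim,
    at the given confidence level, that the true per-run failure rate
    is below max_failure_rate. Derived from requiring
    (1 - max_failure_rate) ** n <= 1 - confidence."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - max_failure_rate))
```

For example, claiming a failure rate below 1% at 95% confidence requires 299 consecutive clean runs, close to the rule-of-three estimate of 3/0.01 = 300. This is why "the test passed once" is never sufficient evidence that a vulnerability in a non-deterministic system is closed.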

Closing the governance gap: Agentic AI needs continuous security evaluations

The era of "casual experimentation" with AI is over. AI has moved into the operational core, where AI agents now autonomously manage complex support workflows. But as this autonomy increases, so does the burden of proof. 

At Zendesk, we believe the winners won’t be the companies with the most extensive liability coverage, but those with the most rigorous engineering standards. A financial payout after a safety-critical failure is a reactive band-aid; a continuous, adversarial evaluation program is a proactive shield. By building a "virtuous cycle" of testing, security evaluations, validation against workflow-specific failure modes, and constant product improvements, we aren’t just preparing for the possibility of a mistake - we are engineering the defensible evidence to prevent it from happening in the first place.

Vinay Patel

Chief Trust and Security Officer at Zendesk

Vinay Patel is the Chief Trust & Security Officer at Zendesk, where he leads the charge in safeguarding Zendesk and oversees the fulfillment of Zendesk’s commitments to its customers and stakeholders on security and compliance. He firmly believes that trust is central to customer relationships and that it is earned through transparent and consistent execution of information security controls.