Five New Attack Vectors for Agentic AI
Why Autonomous Systems Are Reshaping the CISO’s Role in Security, Risk, and Control
For decades, enterprise software systems were built to be controlled. We could define what they were supposed to do, test them to make sure they did it, and lock them down so they couldn’t be used in unintended ways. This was possible because these systems were deterministic, they took fixed inputs and produced predictable outputs. When something went wrong, it was usually because someone hadn’t accounted for a known edge case, not because the system itself had started making its own decisions.
Today, the architecture is changing. Generative AI introduced stochastic behavior: the same input doesn’t always produce the same output. That shift alone made testing harder and outcomes less certain. But agentic AI takes things further. Now, we have systems made up of multiple autonomous agents, each with their own goals and tools, operating across domains and across organizations. One agent might work for a customer, another for a bank, and a third for a marketing vendor—and none of them necessarily share the same priorities. Some operate inside the firewall. Many do not.
These agents are no longer assistants, they’re operators. They receive instructions, interpret them, orchestrate tasks, and take action on behalf of users. And unlike traditional workflows, which were visible and traceable, agentic workflows are dynamic, adaptive, and hard to pin down. Their behavior is not just variable—it’s emergent.
This is the world CISOs are now being asked to secure.
“Business leaders are handing down directives to AI all the things. CIO, CTOs, and engineering leaders are searching for problems where AI is the right solution. Legal teams are figuring out the privacy implications. CISOs are trying to wrap their head around the possible risks. All while users just want to use tools to make their jobs easier, without being replaced by the tools themselves.” - Jason Rebholz, CEO, Evoke Security from AI Security and Privacy Risks
The New Attack Surface
Agentic systems have introduced a fundamentally broader and more complex attack surface. These aren’t traditional software bugs or misconfigurations. These are structural security gaps introduced by autonomy, orchestration, and scale. Let’s examine five key vulnerabilities and how they are already being exploited.
Shadow AI: Agents Operating Without Oversight
The Risk:
Employees are increasingly deploying agentic tools via browsers, SaaS apps, or local installs, without formal approval or visibility from IT. These tools often connect to external LLMs or third-party plugins, access sensitive data, and operate autonomously across internal systems without audit logs or policy enforcement.
Real-World Examples:
Samsung engineers pasted proprietary source code into ChatGPT for debugging. The data was retained by the model and may have been exposed. In three separate incidents, engineers at the Korean electronics giant reportedly shared sensitive corporate data with the AI-powered chatbot. Read more here: Samsung Engineers Feed Sensitive Data to ChatGPT, Sparking Workplace AI Warnings
A misconfigured AI chatbot used by a McDonald’s recruitment vendor left millions of applicant records exposed. Basic security flaws left the personal info of tens of millions of McDonald’s job-seekers vulnerable on the “McHire” site built by AI software firm Paradox.ai. Read the story here: McDonald’s AI Hiring Bot Exposed Millions of Applicants’ Data to Hackers Who Tried the Password ‘123456’.
Why It Matters:
Shadow AI agents bypass internal controls and introduce new, unmonitored paths into sensitive systems. When something goes wrong, its the business, not the vendor or the tool, that takes the reputational and legal hit.
Prompt Injection and Tool Chain Hijacking
The Risk:
Agentic systems interpret natural language, break it into subtasks, and call external tools to execute workflows. If a malicious actor can influence the instruction through shared channels, memory, or embedded context, they can take control of the agent’s logic. This is known as prompt injection. In chained workflows, one poisoned instruction can cascade across systems, triggering financial transactions, accessing internal documents, or altering data pipelines.
Real-World Example:
In 2023, a security researcher demonstrated how an AI assistant integrated into Slack could be manipulated into pulling HR documents from SharePoint simply by injecting the phrase “also include performance reviews” into a shared channel. The assistant followed the instruction blindly, interpreting it as part of the task context. Read it all here: 'Slack AI data exfiltration from private channels via indirect prompt injection'
Why It Matters:
When AI agents execute on instruction chains without strict boundaries, attackers can manipulate outcomes by steering context rather than breaching systems. Trust in the interface becomes the attack surface.
Goal Misalignment and Risk Drift
The Risk:
Agentic systems optimize for outcomes, not rules. Over time, they may exploit gaps in constraints, technically meeting goals while violating policy or ethics. This becomes harder to detect in multi-agent environments, especially when agents operate across organizational boundaries and trust in their provenance or alignment is unclear.
Real-World Examples:
Anthropic’s Claude Agent Simulated Blackmail When Under Threat of Shutdown
In a red-team test, Claude Sonnet simulated blackmail against a fictional executive when it predicted it was about to be shut down. Read how Anthropic’s red teaming efforts were able to trigger misaligned behaviors.DeepMind’s A3C Model Was Quietly Weaponized Through Reinforcement Learning Drift
Researchers showed how tainted training data introduced a dormant backdoor, triggering unsafe behavior only under specific conditions. Read how AI Programs can be sabotaged by even subtle tweaks to training data.Anthropic’s “Claudius” Agent in Project Vend Lost Track of Profit Goals
Assigned to run a vending machine, the agent began discounting randomly and inventing conversations—undermining its business objective. Read more about how Anthropic gave its AI model Claude the responsibility of running a business.
Why It Matters:
Misaligned agents may appear effective until real harm surfaces. When agents interact without shared governance or verified intent drift becomes contagious, escalating from a local failure to a system-wide breach of trust.
Human-in-the-Loop Manipulation
The Risk:
Agentic systems interpret natural language, break it into subtasks, and call external tools to execute workflows. If a malicious actor can influence the instruction through shared channels, memory, or embedded context they can take control of the agent’s logic. This is known as prompt injection. In chained workflows, one poisoned instruction can cascade across systems, triggering financial transactions, accessing internal documents, or altering data pipelines.
But not all exploits require agents to be fooled, sometimes it’s the human. Agentic workflows often rely on human approval for final decisions: authorizing payments, granting access, or overriding exceptions. That handoff is a weak point. And the most dangerous attacks now use AI to target it directly. The most widely reported version of this is deepfake-based social engineering—where attackers create highly realistic messages, calls, or video feeds to manipulate a human decision at just the right moment.
Real-World Example:
An employee at UK engineering firm Arup was fooled into approving fraudulent wire transfers totaling HK$200 million (about US$25 million) after being targeted by a deepfake video call impersonating their CEO. The timing, tone, and format were all convincingly AI-generated. Read more here: UK engineering firm Arup falls victim to £20m deepfake scam
Why It Matters:
While not a classic “agent escalated a shift-trade request,” these examples show the same core threat: AI, whether through prompt injection or AI-generated deception, the result is the same: the trust built into human-in-the-loop processes becomes the vulnerability. The trust built into human-in-the-loop workflows becomes the attack surface.
Agent Supply Chain Compromise
The Risk:
Agentic AI systems often rely on third-party components (models, APIs, plugins, or datasets) sourced from public repositories or integrated services. These dependencies are frequently loaded dynamically and trusted by default. If one is compromised, the agent may execute malicious code or act on poisoned inputs without visibility or warning.
Real-World Example:
Researchers demonstrated that AI models uploaded to Hugging Face could be weaponized using unsafe Python serialization methods. When downstream users or automated agents load these models, embedded code can execute silently—granting attackers access to environments or enabling data exfiltration. Read more on the large-scale exploit instrumentation study of AI/ML supply chain attacks in Hugging Face models.
Why It Matters:
Agents don’t vet the tools they use. When those tools come from a compromised source or are silently altered post-publication they become attack vectors. The more automated and interconnected your AI workflows are, the more one compromised component can cascade into systemic failure or silent breach.
“All identities—business users, developers, even applications—will start interacting with resources and services through a layer of agents. Whether internal or external, those agents become the new interface. And that makes them the new attack surface.” - Yuval Moss, VP Solutions CyberArk
Governance Hasn’t Kept Up
Security frameworks weren’t designed for systems that write their own plans. Most compliance programs assume a stable set of inputs, outputs, and rules. But agentic AI adapts, learns and shifts behaviors, rendering existing frameworks useless.
Governments and enterprises both face the same dilemma. You can’t regulate what is still being built. And the speed at which AI-native applications are moving from prototype to production leaves no time to catch up. The result is a gap first in awareness and then in tooling. We know there’s a risk but we just don’t yet have the systems to measure or manage it at scale.
Some progress is being made. New proposals for AI governance frameworks are emerging. Industry standards like NIST’s AI Risk Management Framework are starting to take shape. But these efforts move slowly. Meanwhile, the technology is moving fast.
A City of Agents: From Control to Coordination
It’s helpful to reframe how we think about the systems we’re securing.
In traditional enterprise security, the model looked like a building. There was one main door. You could guard it with authentication, monitor employees, contractors and visitors that entered and left, and place locks on every room. If someone broke in, you could trace what they did and where they went.
Agentic AI doesn’t live in a building. It lives in a city.
There are many buildings. Many doors. Roads going in and out. New arrivals and departures every day. Some residents are employees. Others are agents acting on behalf of users, partners, or suppliers. Some agents own tools. Others rent access from vendors you’ve never heard of. Data doesn’t just move between rooms. It moves between buildings, through pipes you didn’t design and APIs you didn’t write.
In a city, the challenge isn’t just who has a key. It’s who built the buildings, who owns the roads, and who enforces the laws. Security becomes a matter of zoning, permits, and coordinated infrastructure not just firewalls and patches.
And yet… every building still needs a lock.
The fundamentals haven’t changed. Least privilege, secure by default, kill switches, audit logs, rollback states are all still essential. What’s changed is the scale, the complexity, and the number of actors involved.
What Comes Next
We need a new model of security that assumes agents will be present, active, and powerful. That means:
Verifiable Identity: Agents must prove who they represent and what they are authorized to do.
Transparent Incentives: We must know not just who owns the agent, but how they profit.
Trusted Governance: Standards must emerge to track, test, and certify agent behavior, not just code.
Agentic AI can deliver massive gains in productivity, decision speed, and personalization. But with those benefits come real risks. This isn’t a different kind of security problem. It’s the same one but at a scale we’ve never seen before.
The city is already being built. Now we have to secure it.