Runtime protection
AI Defense runtime protection secures your LLM chat applications by inspecting user prompts and LLM responses in real time. When runtime protection detects content that violates your security, privacy, or safety policies, it raises an alert in the Events log and, if configured, blocks the content from reaching the user or the LLM.
Applications and policies
In AI Defense, each chat application is represented as an application in the Applications panel. Each application contains one or more connections, each representing an LLM API being protected. Once you've created an application and its connections, you apply a runtime protection policy to each connection to secure it.
Enforcement point
To protect an AI application and its users with your policy, you must set up a runtime enforcement point. This can be Multicloud Defense with AI Guardrails, an AI Defense Gateway, or the AI Defense Inspection API. The different options serve different use cases:
- The Multicloud Defense approach allows you to enforce policies without any change to your AI application or AI models. Runtime protection enforced at the Cisco Multicloud Defense Egress Gateway provides a transparent deployment, as the egress gateway is typically the default route for cloud workloads. No major changes are required from your AI application team. TLS interception is required, and workloads must trust the configured egress gateway CA. See Set up Runtime for Protection via Multicloud Defense.
- The AI Defense Gateway approach allows you to enforce policies without requiring code changes in your AI applications. In this approach, you configure your AI applications to direct prompts to the AI Defense Gateway URL instead of the standard model URL (such as an OpenAI URL). See Set up Runtime for Gateway Interception.
- The Inspection API approach lets you inspect prompts and responses on demand via the AI Defense Inspection API and handle violations as you like, based on the detection output from the API. This option suits scenarios where the AI application team wants only a disposition (good/bad) for each prompt and/or response, which they then handle at the AI application level. In this scenario, AI application developers build Inspection API calls into the AI application they're developing, and then deploy that application to production with its built-in AI Defense protections. See Set up Runtime for the Inspection API.
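For the Gateway approach, the only application change is the model endpoint URL. As a sketch, if your application uses the OpenAI Python SDK, you could redirect it without touching code by setting the `OPENAI_BASE_URL` environment variable (the gateway URL below is a placeholder, not a real AI Defense address; use the URL shown for your connection in the AI Defense console):

```shell
# Point the application at the AI Defense Gateway instead of the
# model provider. The URL below is a placeholder -- substitute the
# gateway URL generated for your connection.
export OPENAI_BASE_URL="https://gateway.example.invalid/v1"
```

Prompts then flow through the gateway, where your policy is enforced before the request reaches the model.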
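For the Inspection API approach, the enforcement logic lives in your application: call the API, read the disposition, and decide what to do. The sketch below illustrates the pattern only; the endpoint URL, request body, auth scheme, and the `is_safe` response field are all assumptions for illustration, so consult the AI Defense Inspection API reference for the real contract.

```python
import json
from urllib import request

# Placeholder endpoint -- the real URL, auth scheme, and payload shape
# come from the AI Defense Inspection API documentation.
INSPECT_URL = "https://example.invalid/api/v1/inspect"

def inspect(text: str, api_key: str) -> dict:
    """POST text to the (hypothetical) inspection endpoint and return
    the parsed JSON verdict."""
    req = request.Request(
        INSPECT_URL,
        data=json.dumps({"content": text}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

def is_safe(verdict: dict) -> bool:
    """Interpret a verdict dict; 'is_safe' is an assumed field name.
    Fail closed: treat a missing field as unsafe."""
    return bool(verdict.get("is_safe", False))

# The application enforces the disposition itself, e.g.:
#   if not is_safe(inspect(user_prompt, api_key)):
#       return refusal_message
```

Because enforcement happens in your code, you control the handling: block the message, redact it, log it, or route it for review. Note the fail-closed default in `is_safe`, which treats an unrecognized response as a violation.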