
Set up Runtime for the Inspection API

Runtime Protection with the AI Defense Inspection API

AI Defense Runtime provides the AI Defense Inspection API to enable you to add guardrails to your AI applications to protect them from evolving threats such as prompt injection attempts, denial-of-service attacks, and data leakage. This approach is useful for AI application developers who want to build the AI Defense Inspection API calls into their AI applications. As an AI application developer, this approach allows you to specify how your application will handle violations detected in AI Defense.

See https://developer.cisco.com/docs/ai-defense/introduction/ for Inspection API documentation.

API-invoked runtime protection does not actively monitor your applications in real time. Instead, it evaluates prompts and responses when your application sends them to the Inspection API endpoint. This means that enforcement and decision-making remain within your application, allowing you to process AI-generated content based on the evaluation results.
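In Python, this call-and-decide flow might look like the following sketch. The endpoint URL, the header name, the request payload shape, and the `is_safe` response field are illustrative assumptions, not taken from this document; confirm the exact schema against the Inspection API documentation before use.

```python
# Sketch of application-side enforcement with the Inspection API.
# Endpoint URL, header name, payload shape, and response fields below
# are assumptions for illustration -- verify them in the API docs.
import json
import urllib.request

INSPECT_URL = "https://<region>.api.example/api/v1/inspect/chat"  # assumed URL
API_KEY = "<your-connection-api-key>"

def build_payload(prompt: str) -> dict:
    """Wrap a user prompt in a chat-style request body (assumed shape)."""
    return {"messages": [{"role": "user", "content": prompt}]}

def inspect(prompt: str) -> dict:
    """POST the prompt to the Inspection API and return the parsed verdict."""
    req = urllib.request.Request(
        INSPECT_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "X-AI-Defense-API-Key": API_KEY,  # assumed header name
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def handle(prompt: str, verdict: dict) -> str:
    """Enforcement stays in your application: act on the verdict yourself."""
    if verdict.get("is_safe", False):   # "is_safe" field is an assumption
        return prompt                    # forward to the model as usual
    return "[blocked by policy]"         # substitute, log, or refuse
```

Because AI Defense only reports the evaluation, the `handle` step is where your application decides what "blocking" means: dropping the prompt, returning a canned refusal, or routing to review.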

To monitor and automatically block unsafe interactions without modifying your AI application, use the AI Defense Gateway or Multicloud Defense approach to enforce AI Defense Runtime policies.

Understanding Runtime Policy Behavior

When using API-invoked Runtime, AI Defense applies your rules or policy to check for content violations, but it does not block prompts or responses. Instead, AI Defense returns an evaluation in response to your API call, and your application can take action based on that result. Important: when you evaluate specific rules rather than a policy, violations are reported only in the API response; no events are posted to the event log. See "Invoke API-based Runtime protection," below, for details.

If you wish to have AI Defense automatically monitor prompts and responses inline and block those deemed unsafe, see instead the Runtime Gateway and Multicloud Defense approaches for runtime protection.

Set up API-based Runtime protection

To use API-invoked Runtime, you will set up an application and connection that correspond to your LLM's endpoint, set a policy on that connection, get the API token for the connection, and use the API token to call the Inspection API to evaluate each prompt and response that you wish to inspect.

To set this up:

  1. Create an API-style application in AI Defense UI:

    1. In AI Defense, click Applications.

    2. In the Applications page, click Add Application.

    3. Specify a Name for the application, and choose API to indicate that this runtime protection will be invoked per prompt by calling the AI Defense Inspection API endpoint. Add a Description if desired, and click Continue.



    The application serves as a centralized entity that organizes and manages your protected apps within the Applications tab, providing a unified view across all deployment approaches in AI Defense.

  2. In the application, create a connection:

    1. Open the API Connections page for your application (if needed, click the pencil icon in the Applications list to edit the application and show the API Connections page).

    2. Click Add Connection.

    3. Give your connection a name and click Add connection.

  3. Get the AI Defense API key for your connection:

    1. The Add API key page appears. Enter an API key name for the token, and either choose an Expire on date or set it to Never Expire.

    2. Click Generate API Key.

    3. Copy this token and save it to a secure location. You will not be able to see the token again after you close this page. You will use this token to make requests of the AI Defense Inspection API endpoint.

  4. Add a policy to the connection. This is required. The policy specifies which AI safety rules will be enforced for this connection. Note that this policy will only be used if no rules are specified in the enabled_rules parameter of the API call.
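The precedence between the connection's policy and per-call rules can be sketched as a small helper. The rule names and the list-based representation here are illustrative only; the actual enabled_rules format is defined by the Inspection API schema.

```python
def effective_rules(connection_policy_rules, enabled_rules=None):
    """Return the rules evaluated for one inspect/chat call.

    The connection's policy applies only when the call supplies no
    enabled_rules; otherwise the per-call rules take precedence.
    (Rule names and list shape are illustrative, not from the API.)
    """
    if enabled_rules:
        return list(enabled_rules)
    return list(connection_policy_rules)
```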



If your AI application calls a model via a Multicloud Defense Gateway, then it will use only the policy defined in Multicloud Defense as a Guardrails Profile, and not an AI Defense policy. See Policy for more details.

Your setup is complete. See the next section for tips on inspecting prompts and responses.

Invoke API-based Runtime protection

You can evaluate prompts and responses by calling the AI Defense Inspection API endpoint, inspect/chat, using the API key you generated above:

  • To evaluate content's compliance with your policy, leave the enabled_rules parameter empty in the inspect/chat API call. Violations will be reported in the API response and will generate an event in the event log. See Monitor AI Threats and Events.

  • To evaluate content's compliance with specific rules, specify the rules in the enabled_rules parameter of the inspect/chat API call. Violations will be reported in the API response only. Important: no event will be generated in the event log.
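The two call modes above differ only in whether the request body carries an enabled_rules parameter. The sketch below shows one plausible way to build each body; the exact field names and nesting are assumptions, so check the Inspection API reference for the real schema.

```python
# Two ways to build an inspect/chat request body. Field names and the
# "config" nesting are assumptions -- verify against the API reference.

def policy_check_body(prompt: str) -> dict:
    """No enabled_rules: the connection's policy is applied, and a
    violation also generates an event in the event log."""
    return {"messages": [{"role": "user", "content": prompt}]}

def rule_check_body(prompt: str, rules: list) -> dict:
    """enabled_rules set: only these rules are evaluated, and violations
    appear in the API response only -- no event-log entry."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "config": {"enabled_rules": rules},  # nesting is an assumption
    }
```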

For more information, see the API documentation at Inspect conversations - AI Defense - Cisco DevNet.