feat(orchestration): add content filtering and prompt shield module#174
Open
lenin-ribeiro wants to merge 4 commits into
Open
feat(orchestration): add content filtering and prompt shield module#174lenin-ribeiro wants to merge 4 commits into
lenin-ribeiro wants to merge 4 commits into
Conversation
Activates Azure Content Safety filtering and prompt attack detection
automatically for all SAP AI Core model calls. Filtering is enabled
by default when set_aicore_config() is called — no code change required
by the developer.
- New module sap_cloud_sdk.orchestration with:
- FilteringModuleConfig: configures input/output filtering thresholds
and prompt shield via ORCH_FILTER_* env vars (defaults: threshold 4,
prompt_shield=True on input)
- set_filtering(): programmatic override for thresholds at runtime
- ContentFilteredError: raised when input or output is rejected by
the content filter
- extract_filter_blocked(): unwraps filter rejections embedded in
LiteLLM APIConnectionError exceptions
- set_aicore_config() now calls _activate_filtering() at the end,
applying FilteringModuleConfig.from_env() to LiteLLM's SAP provider
- Observability preserved: LiteLLM still makes the HTTP call;
Traceloop/OTel instrumentation is unaffected
- 41 unit tests covering serialisation, env parsing, LiteLLM patch,
response detection, and set_filtering() behaviour
- User guides updated in aicore/ and orchestration/; README breaking
change notice added
- Version bump 0.27.1 → 0.28.0
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Adds a new
sap_cloud_sdk.orchestrationmodule that activates Azure Content Safety filtering and prompt attack detection (prompt shield) automatically for all SAP AI Core model calls made through LiteLLM.Filtering is enabled by default when
set_aicore_config()is called — no code change is required by the developer. The default policy applies threshold 4 (block medium+ severity content) and prompt shield on input for allsap/*model calls.How it works
set_aicore_config()now calls_activate_filtering()at the end, which patcheslitellm.GenAIHubOrchestrationConfigwith a subclass (FilteringOrchestrationConfig) that:modules.filtering(Azure Content Safety config) into every v2 completion request body viatransform_requestContentFilteredErrorviatransform_responseAPIConnectionErrorexceptions viaextract_filter_blocked()LiteLLM still makes the HTTP call and Traceloop/OTel instrumentation is fully preserved.
Developer experience
Zero code change for the common case — existing agent code is unchanged:
Thresholds configurable via env vars (set before
set_aicore_config()):Programmatic override at runtime:
Handling blocked requests:
Related Issue
N/A — new feature proposed and implemented by the App Foundation agent team.
Type of Change
Breaking Change Detail
set_aicore_config()now activates content filtering as a side effect. Agents upgrading to0.28.0will have filtering applied to theirsap/*model calls.What breaks: Any agent relying on unfiltered LLM output (e.g. testing with deliberately harmful prompts, or using a deployment without Azure Content Safety provisioned) will see different behaviour.
Migration path: Set
ORCH_FILTER_ENABLED=falsein the environment before callingset_aicore_config()to preserve previous unfiltered behaviour.How to Test
Run the orchestration unit tests:
uv run pytest tests/orchestration/ -v # Expected: 41 tests passVerify auto-activation:
Verify wire format:
Checklist
README.md,aicore/user-guide.md, neworchestration/user-guide.md)Additional Notes
New files
src/sap_cloud_sdk/orchestration/__init__.pyset_filtering(),ContentFilteredError, config classessrc/sap_cloud_sdk/orchestration/_models.pyContentFilterConfig,PromptShieldConfig,FilteringModuleConfigwithfrom_env()andto_dict()src/sap_cloud_sdk/orchestration/_litellm_patch.pyFilteringOrchestrationConfigsubclass,_install(),extract_filter_blocked()src/sap_cloud_sdk/orchestration/exceptions.pyContentFilteredError(direction, details, request_id),OrchestrationErrorsrc/sap_cloud_sdk/orchestration/user-guide.mdtests/orchestration/unit/test_models.pyfrom_env()parsing teststests/orchestration/unit/test_patch.pytests/orchestration/unit/test_set_filtering.pyset_filtering()behaviour testsEnv vars reference
ORCH_FILTER_ENABLEDtruefalseto disable filtering entirelyORCH_FILTER_DIRECTIONSinput,outputORCH_FILTER_HATE4ORCH_FILTER_VIOLENCE4ORCH_FILTER_SEXUAL4ORCH_FILTER_SELF_HARM4ORCH_FILTER_PROMPT_SHIELDtrue