Skip to content

LCORE-2336: Unified llama_stack.config schema + synthesizer#1963

Open
max-svistunov wants to merge 3 commits into
lightspeed-core:mainfrom
max-svistunov:lcore-2336-unified-config-synthesizer
Open

LCORE-2336: Unified llama_stack.config schema + synthesizer#1963
max-svistunov wants to merge 3 commits into
lightspeed-core:mainfrom
max-svistunov:lcore-2336-unified-config-synthesizer

Conversation

@max-svistunov

@max-svistunov max-svistunov commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Description

Implements LCORE-2336, the first implementation ticket of the unified
lightspeed-stack.yaml feature (LCORE-836 spike). It collapses the two-file
configuration (lightspeed-stack.yaml + an external Llama Stack run.yaml)
into a single file: operators describe inference providers at a high level and
LCORE synthesizes the full run.yaml at startup. run.yaml becomes an
implementation detail LCORE owns rather than an operator-facing artifact.

What this PR adds:

  • Schema (src/models/config.py): UnifiedInferenceProvider,
    InferenceConfiguration.providers (a root-level, backend-agnostic synthesis
    input per Decision S5), UnifiedLlamaStackConfig + LlamaStackConfiguration.config
    (baseline / profile / native_override). The nested check_llama_stack_model
    validator now accepts a config block in lieu of library_client_config_path;
    a new root check_unified_vs_legacy validator enforces mutual exclusion
    between unified synthesis inputs and the legacy path, pointing operators at
    --migrate-config.
  • Synthesizer (src/llama_stack_configuration.py): synthesize_configuration
    (baseline -> enrichment -> high-level inference -> native_override deep-merge),
    deep_merge_list_replace, apply_high_level_inference (with PROVIDER_TYPE_MAP
    and the sentence_transformers -> sentence-transformers provider_id fix from
    the PoC), load_default_baseline, and synthesize_to_file (writes mode 0600).
  • Shipped baseline (src/data/default_run.yaml): a thinner default whose
    external_providers_dir carries a default so it resolves when
    EXTERNAL_PROVIDERS_DIR is unset. src/data is a package so the file ships
    in built wheels (verified).
  • Wiring: library-mode synthesis in src/client.py; a
    --synthesized-config-output flag in src/lightspeed_stack.py propagated to
    the uvicorn workers; shared constants; .gitignore for ./.generated/.

Out of scope (separate tickets): migration tool (LCORE-2337), container/deploy
artifacts (LCORE-2338), deprecation WARN (LCORE-2339), e2e (LCORE-2341/2343),
docs and examples (LCORE-2345/2346).

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude Opus 4.8
  • Generated by: Claude Opus 4.8

Related Tickets & Documents

  • Related Issue # LCORE-2336
  • Closes # LCORE-2336

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

Manual verification (library mode, end-to-end)

  1. Create a unified lightspeed-stack.yaml with no external run.yaml. The
    minimal form needs only a top-level inference.providers list — no
    llama_stack.config block:

    llama_stack:
      use_as_library_client: true
    inference:
      default_model: gpt-4o-mini
      default_provider: openai
      providers:
        - type: openai
          api_key_env: OPENAI_API_KEY
          allowed_models: [gpt-4o-mini]
        - type: sentence_transformers

    (Add an optional llama_stack.config block with baseline / profile /
    native_override when you need the escape hatch — verified separately.)

  2. export OPENAI_API_KEY=<key> and start library mode:
    uv run lightspeed-stack -c lightspeed-stack-unified.yaml

  3. Confirm the service boots and serves:

    $ curl -s localhost:8080/liveness
    {"alive":true}
    $ curl -s localhost:8080/readiness
    {"ready":true,"reason":"All providers are healthy","providers":[]}
    $ curl -s -X POST localhost:8080/v1/query -H 'Content-Type: application/json' \
        -d '{"query": "Name three primary colors. One sentence."}'
    {"response":"The three primary colors are red, blue, and yellow.", ...}
    
  4. Confirm the synthesized file (R6 / R10):

    $ ls -l ./.generated/run.yaml
    -rw-------. 1 user user 2545 ... ./.generated/run.yaml      # mode 0600
    $ grep -E 'provider_id:|api_key:' ./.generated/run.yaml
          api_key: ${env.OPENAI_API_KEY}                        # env ref, never resolved
          provider_id: openai
        - provider_id: sentence-transformers                    # hyphenated id
    

Tests specific to this change

uv run python -m pytest tests/unit/test_llama_stack_synthesize.py \
    tests/unit/models/config/test_llama_stack_configuration.py \
    tests/unit/test_client.py -q
# 55 passed

Full unit suite (Pydantic-config change)

uv run python -m pytest tests/unit/ -q
# 2714 passed, 1 skipped

uv run make verify passes for all changed files (black, pylint 10.00/10,
ruff, pyright, pydocstyle). NOTE: make verify currently also fails on a
pre-existing, unrelated mypy issue in
tests/integration/container_lifecycle/test_container_lifecycle.py (missing
type annotations on the current main), untouched by this PR.

Summary by CodeRabbit

  • New Features

    • Unified-mode configuration synthesis: configure inference providers and Llama Stack settings using a simplified, backend-agnostic format.
    • High-level inference provider definitions: specify providers with type, API keys, and allowed models.
    • CLI option --synthesized-config-output to customize the output path for synthesized configurations.
    • Built-in baseline configuration for unified-mode synthesis.
  • Documentation

    • Updated API schema to reflect unified-mode configuration options and new provider definitions.

…a_stack.config)

Introduce the Pydantic schema for unified-mode Llama Stack configuration, the
first piece of the single-file lightspeed-stack.yaml feature (LCORE-836 spike).
Operators describe inference providers at a high level and let LCORE synthesize
the Llama Stack run.yaml, instead of maintaining a separate run.yaml file.

- UnifiedInferenceProvider: a backend-agnostic inference provider entry
  (type, api_key_env, allowed_models, extra).
- InferenceConfiguration.providers: the high-level synthesis input
  (Decision S5 - lives at the configuration root so it survives a future
  backend change); default_model/default_provider keep their query-time
  routing meaning and are independent of this list.
- UnifiedLlamaStackConfig + LlamaStackConfiguration.config: backend-specific
  knobs (baseline, profile, native_override).

Mode detection lives entirely on the root Configuration model
(check_unified_vs_legacy), since the deciding inputs span the root
(inference.providers) and the nested llama_stack block:

- A synthesis input (a non-empty inference.providers or a llama_stack.config
  block) and the legacy library_client_config_path are mutually exclusive,
  with an error pointing operators at --migrate-config.
- Library mode requires some run source; inference.providers alone is
  sufficient, so no llama_stack.config block is required for the minimal
  unified config. The nested LlamaStackConfiguration no longer rejects a
  pathless library config (it cannot see inference.providers); it only
  file-checks a legacy path. Legacy behavior is otherwise unchanged.

Regenerate docs/openapi.json for the new schemas and fields. Add unit tests for
unified-mode validation, mutual exclusion, the inference.providers-only shape,
and the library-mode-without-run-source error; update the dump-config expected
dicts for the new config/providers fields.
Add the synthesis pipeline that builds a complete Llama Stack run.yaml from a
unified lightspeed-stack.yaml, so run.yaml becomes an implementation detail
LCORE owns rather than an operator-facing artifact.

- synthesize_configuration: baseline selection (built-in default, a profile
  file, or empty) -> existing enrichment (Azure Entra ID, BYOK RAG, Solr/OKP,
  for parity with legacy mode) -> high-level inference expansion -> a raw
  native_override deep-merged last.
- deep_merge_list_replace: maps merge recursively, lists and scalars replace
  wholesale (Decision T2).
- apply_high_level_inference: maps each provider type to a Llama Stack
  provider_type via PROVIDER_TYPE_MAP, hyphenates the emitted provider_id so
  inline embedders match the baseline convention (sentence_transformers ->
  sentence-transformers), replaces a baseline entry of the same id rather than
  duplicating it, and emits ${env.<VAR>} references for secrets.
- synthesize_to_file: writes the result with mode 0600, re-chmodded on each
  boot, since native_override may carry literal secrets.
- load_default_baseline + src/data/default_run.yaml: a thinner shipped baseline
  whose external_providers_dir carries a default so it resolves when
  EXTERNAL_PROVIDERS_DIR is unset. src/data is a package so the file ships in
  built wheels.

Add unit tests covering merge semantics, inference expansion, the full
pipeline, profile resolution, and the 0600 write.
Drive the synthesizer from the running service and let operators inspect or
relocate its output.

- client.py: _load_library_client treats a library_client_config_path with no
  unified config block as legacy (enrich the operator-supplied run.yaml) and
  every other library-mode shape as unified (synthesize a fresh run.yaml to a
  persistent path, overwritten each boot at mode 0600). Because the root
  Configuration validator guarantees a run source and forbids a path
  coexisting with unified inputs, a config driven only by the root-level
  inference.providers (no config block) correctly falls through to synthesis,
  and the client needs no access to the root configuration.
- lightspeed_stack.py: add --synthesized-config-output, propagated to the
  uvicorn workers (separate processes that perform the synthesis) via an
  environment variable, mirroring how the config path is already propagated.
- constants.py: shared env-var names and the default synthesized-config path
  (./.generated/run.yaml).
- .gitignore: ignore the generated run.yaml directory.
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Adds a unified-mode Llama Stack configuration synthesis pipeline. New UnifiedInferenceProvider and UnifiedLlamaStackConfig Pydantic models capture backend-agnostic provider entries and synthesis controls. A synthesis pipeline in llama_stack_configuration.py builds a complete run.yaml from a lightspeed-stack.yaml input, with a shipped default baseline in src/data/default_run.yaml. The client and CLI are updated to route between legacy enrichment and unified synthesis paths.

Changes

Unified-mode Llama Stack config synthesis

Layer / File(s) Summary
Unified-mode data contracts and constants
src/constants.py, src/models/config.py, docs/openapi.json
Adds UnifiedInferenceProvider and UnifiedLlamaStackConfig models; adds LlamaStackConfiguration.config and InferenceConfiguration.providers fields; adds check_unified_vs_legacy root validator enforcing mutual exclusivity with legacy path; updates check_llama_stack_model to allow unified-mode inputs; introduces three env-var name and default-path constants; reflects all new schemas in OpenAPI docs.
Built-in default baseline YAML
src/data/__init__.py, src/data/default_run.yaml
Adds the shipped baseline run.yaml used when baseline == "default", defining apis, providers (inference/embedding/files/safety/tool_runtime/vector_io/agents), storage backends, registered resources, vector_stores, and default safety shield.
Synthesis pipeline
src/llama_stack_configuration.py
Adds PROVIDER_TYPE_MAP, load_default_baseline(), deep_merge_list_replace(), apply_high_level_inference(), _resolve_profile_path(), synthesize_configuration(), and synthesize_to_file() implementing the full synthesis flow with mode-0600 file writes.
Client and CLI wiring
src/client.py, src/lightspeed_stack.py, .gitignore
Branches _load_library_client between legacy enrichment and unified synthesis via _synthesize_library_config; extends the CLI with --synthesized-config-output; propagates SYNTHESIZED_CONFIG_PATH_ENV_VAR to uvicorn workers; adds .generated/ to .gitignore.
Tests
tests/unit/test_llama_stack_synthesize.py, tests/unit/models/config/test_llama_stack_configuration.py, tests/unit/models/config/test_dump_configuration.py, tests/unit/test_client.py
Adds synthesis pipeline unit tests covering baseline loading, deep-merge semantics, inference expansion, full synthesis composition, and 0600 file-write/overwrite permissions; updates config model tests for unified-mode validation; updates dump tests for new default-null/empty-list fields; updates client test for synthesis error path.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as lightspeed_stack CLI
  participant Client as _load_library_client
  participant Synth as _synthesize_library_config
  participant Pipeline as synthesize_configuration
  participant FS as File System (run.yaml, mode 0600)
  participant LlamaStack as AsyncLlamaStackAsLibraryClient

  CLI->>CLI: set CONFIG_PATH_ENV_VAR, optionally SYNTHESIZED_CONFIG_PATH_ENV_VAR
  Client->>Client: check library_client_config_path and config.config
  alt legacy mode (library_client_config_path set, config.config is None)
    Client->>Client: enrich legacy config path
  else unified mode
    Client->>Synth: invoke _synthesize_library_config
    Synth->>Synth: read operator YAML from CONFIG_PATH_ENV_VAR
    Synth->>Pipeline: synthesize_configuration(lcs_config, config_file_dir)
    Pipeline->>Pipeline: load baseline (default/profile/empty)
    Pipeline->>Pipeline: apply_high_level_inference
    Pipeline->>Pipeline: deep_merge_list_replace with native_override
    Pipeline-->>Synth: synthesized config dict
    Synth->>FS: synthesize_to_file (write + chmod 0600)
    Synth-->>Client: output_path
  end
  Client->>LlamaStack: initialize with _config_path
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • lightspeed-core/lightspeed-stack#1580: Implements the same unified-mode lightspeed-stack.yaml → synthesized run.yaml flow including the llama_stack.config / inference.providers models, config synthesis pipeline in src/llama_stack_configuration.py, and client/CLI wiring in src/client.py.

Suggested reviewers

  • asimurka
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'LCORE-2336: Unified llama_stack.config schema + synthesizer' directly summarizes the main change: introducing unified configuration schema and synthesis logic for the Llama Stack.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/data/default_run.yaml`:
- Around line 102-105: The provider_shield_id value in the shield_id:
llama-guard configuration is incorrectly set to openai/gpt-4o-mini, which is a
chat/generative model rather than a guard model. This prevents the safety shield
from properly gating queries. Update the provider_shield_id field under the
llama-guard shield to reference the actual Llama Guard model identifier such as
meta-llama/Llama-Guard-3-8B instead of the chat model identifier.

In `@src/lightspeed_stack.py`:
- Around line 169-175: The current code only sets the
SYNTHESIZED_CONFIG_PATH_ENV_VAR environment variable when
args.synthesized_config_output is provided, but does not unset it when the flag
is omitted, causing stale values to persist in the environment and preventing
workers from using the intended default fallback behavior. Add an else clause to
the existing if condition (after args.synthesized_config_output is not None
check) that explicitly removes constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR from
os.environ using the pop method with a default value to safely handle cases
where the variable may not be set, ensuring workers always use the correct path
whether the override is provided or not.

In `@src/models/config.py`:
- Around line 708-713: In the Pydantic model in config.py, replace the `Any`
type annotation in the `extra` field (around line 708) and the `native_override`
field (around line 751) with specific type annotations instead of `dict[str,
Any]`. Review the usage of these fields throughout the codebase to determine the
appropriate concrete type for the dictionary values (such as `str`, `int`, or a
union of specific types) and replace `Any` with that specific type to maintain
strong schema typing and comply with the coding guidelines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 84ac8fd7-6cbb-411f-b460-32eb3664dff5

📥 Commits

Reviewing files that changed from the base of the PR and between ee37367 and eef0738.

📒 Files selected for processing (13)
  • .gitignore
  • docs/openapi.json
  • src/client.py
  • src/constants.py
  • src/data/__init__.py
  • src/data/default_run.yaml
  • src/lightspeed_stack.py
  • src/llama_stack_configuration.py
  • src/models/config.py
  • tests/unit/models/config/test_dump_configuration.py
  • tests/unit/models/config/test_llama_stack_configuration.py
  • tests/unit/test_client.py
  • tests/unit/test_llama_stack_synthesize.py
📜 Review details
⏰ Context from checks skipped due to timeout. (6)
  • GitHub Check: integration_tests (3.13)
  • GitHub Check: ruff
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

  • src/data/__init__.py
  • src/constants.py
  • src/lightspeed_stack.py
  • src/client.py
  • src/models/config.py
  • src/llama_stack_configuration.py
src/**/__init__.py

📄 CodeRabbit inference engine (AGENTS.md)

Package __init__.py files must contain brief package descriptions

Files:

  • src/data/__init__.py
src/constants.py

📄 CodeRabbit inference engine (AGENTS.md)

Use constants.py for shared constants with descriptive comments and type hints using Final[type]

Files:

  • src/constants.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/unit/test_client.py
  • tests/unit/test_llama_stack_synthesize.py
  • tests/unit/models/config/test_dump_configuration.py
  • tests/unit/models/config/test_llama_stack_configuration.py
src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

  • src/models/config.py
🧠 Learnings (5)
📚 Learning: 2026-04-15T18:53:54.785Z
Learnt from: Lifto
Repo: lightspeed-core/lightspeed-stack PR: 1510
File: src/models/requests.py:742-769
Timestamp: 2026-04-15T18:53:54.785Z
Learning: In `src/models/requests.py`, keep `ResponsesRequest.validate_body_size` using `json.dumps(values)` with Python defaults (including ASCII escaping and the default separators). This default formatting is intentional to make the 65,536-character limit conservatively strict (accounting for small encoding/format overhead). Do not recommend changing it to `ensure_ascii=False` or using compact separators for this validator, since an exact wire-format byte match with the client payload is not achievable via `json.dumps` anyway.

Applied to files:

  • .gitignore
📚 Learning: 2026-04-20T15:09:45.119Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1548
File: src/app/endpoints/rlsapi_v1.py:56-56
Timestamp: 2026-04-20T15:09:45.119Z
Learning: In `src/app/endpoints/rlsapi_v1.py`, treat the line `_get_rh_identity_context = get_rh_identity_context` as an intentional temporary backward-compatibility shim (introduced for PR `#1548`, Splunk HEC telemetry work). Do not flag it as dead/unnecessary/unused code during review until the planned part 3 is merged and the responses endpoint is fully wired up such that no tests or external consumers reference the underscore-prefixed name.

Applied to files:

  • .gitignore
📚 Learning: 2026-05-20T08:09:30.641Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

  • src/data/default_run.yaml
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

  • src/models/config.py
🪛 ast-grep (0.44.0)
tests/unit/test_llama_stack_synthesize.py

[warning] 156-156: Do not make http calls without encryption
Context: "http://x"
Note: [CWE-319].

(requests-http)


[warning] 163-163: Do not make http calls without encryption
Context: "http://x"
Note: [CWE-319].

(requests-http)

🔇 Additional comments (15)
src/data/__init__.py (1)

1-8: LGTM!

src/data/default_run.yaml (1)

1-30: LGTM!

Also applies to: 31-99, 100-101, 106-124

src/llama_stack_configuration.py (8)

1-22: LGTM!


35-52: LGTM!


657-676: LGTM!


678-701: LGTM!


704-764: LGTM!


766-784: LGTM!


786-851: LGTM!


854-892: LGTM!

src/constants.py (1)

12-24: LGTM!

src/models/config.py (1)

656-707: LGTM!

Also applies to: 716-750, 832-898, 1718-1729, 2627-2677

docs/openapi.json (3)

13594-13601: LGTM!


14064-14075: LGTM!


19994-20087: LGTM!

Comment thread src/data/default_run.yaml
Comment on lines +102 to +105
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Incorrect provider_shield_id value — safety shield may not gate queries.

The provider_shield_id is set to openai/gpt-4o-mini, which is a chat/generative model, not a Llama Guard model. Based on learnings, provider_shield_id should reference the actual guard model identifier (e.g., meta-llama/Llama-Guard-3-8B). Using a chat model ID here indicates the safety shield may not be effectively gating queries.

🛡️ Proposed fix
 registered_resources:
   models: []
   shields:
   - shield_id: llama-guard
     provider_id: llama-guard
-    provider_shield_id: openai/gpt-4o-mini
+    provider_shield_id: meta-llama/Llama-Guard-3-8B
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: meta-llama/Llama-Guard-3-8B
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/data/default_run.yaml` around lines 102 - 105, The provider_shield_id
value in the shield_id: llama-guard configuration is incorrectly set to
openai/gpt-4o-mini, which is a chat/generative model rather than a guard model.
This prevents the safety shield from properly gating queries. Update the
provider_shield_id field under the llama-guard shield to reference the actual
Llama Guard model identifier such as meta-llama/Llama-Guard-3-8B instead of the
chat model identifier.

Source: Learnings

Comment thread src/lightspeed_stack.py
Comment on lines +169 to +175
# Propagate the synthesized-config-output override to the workers (separate
# processes), which perform unified-mode library synthesis. Unset means the
# workers fall back to constants.DEFAULT_SYNTHESIZED_CONFIG_PATH.
if args.synthesized_config_output is not None:
os.environ[constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR] = (
args.synthesized_config_output
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unset inherited synthesized-output env var when no CLI override is provided.

When --synthesized-config-output is omitted, an already-set LIGHTSPEED_STACK_SYNTHESIZED_CONFIG_PATH remains in the process environment, so workers may use a stale path instead of the intended default fallback behavior.

🔧 Proposed fix
-    if args.synthesized_config_output is not None:
-        os.environ[constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR] = (
-            args.synthesized_config_output
-        )
+    if args.synthesized_config_output is not None:
+        os.environ[constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR] = (
+            args.synthesized_config_output
+        )
+    else:
+        os.environ.pop(constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR, None)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Propagate the synthesized-config-output override to the workers (separate
# processes), which perform unified-mode library synthesis. Unset means the
# workers fall back to constants.DEFAULT_SYNTHESIZED_CONFIG_PATH.
if args.synthesized_config_output is not None:
os.environ[constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR] = (
args.synthesized_config_output
)
# Propagate the synthesized-config-output override to the workers (separate
# processes), which perform unified-mode library synthesis. Unset means the
# workers fall back to constants.DEFAULT_SYNTHESIZED_CONFIG_PATH.
if args.synthesized_config_output is not None:
os.environ[constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR] = (
args.synthesized_config_output
)
else:
os.environ.pop(constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR, None)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lightspeed_stack.py` around lines 169 - 175, The current code only sets
the SYNTHESIZED_CONFIG_PATH_ENV_VAR environment variable when
args.synthesized_config_output is provided, but does not unset it when the flag
is omitted, causing stale values to persist in the environment and preventing
workers from using the intended default fallback behavior. Add an else clause to
the existing if condition (after args.synthesized_config_output is not None
check) that explicitly removes constants.SYNTHESIZED_CONFIG_PATH_ENV_VAR from
os.environ using the pop method with a default value to safely handle cases
where the variable may not be set, ensuring workers always use the correct path
whether the override is provided or not.

Comment thread src/models/config.py
Comment on lines +708 to +713
extra: dict[str, Any] = Field(
default_factory=dict,
title="Extra provider config",
description="Additional provider-config keys merged verbatim into the "
"synthesized provider's config block.",
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Replace Any in Pydantic model fields.

extra and native_override are declared as dict[str, Any], which weakens schema typing and violates the models guideline.

♻️ Proposed fix
-    extra: dict[str, Any] = Field(
+    extra: dict[str, object] = Field(
         default_factory=dict,
         title="Extra provider config",
         description="Additional provider-config keys merged verbatim into the "
         "synthesized provider's config block.",
     )
@@
-    native_override: dict[str, Any] = Field(
+    native_override: dict[str, object] = Field(
         default_factory=dict,
         title="Native override",
         description="Raw Llama Stack schema deep-merged last (maps merge "
         "recursively; lists and scalars replace).",
     )

As per coding guidelines, src/models/**/*.py: “Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type”.

Also applies to: 751-756

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/models/config.py` around lines 708 - 713, In the Pydantic model in
config.py, replace the `Any` type annotation in the `extra` field (around line
708) and the `native_override` field (around line 751) with specific type
annotations instead of `dict[str, Any]`. Review the usage of these fields
throughout the codebase to determine the appropriate concrete type for the
dictionary values (such as `str`, `int`, or a union of specific types) and
replace `Any` with that specific type to maintain strong schema typing and
comply with the coding guidelines.

Source: Coding guidelines

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant