Add PHP Health Metrics (PHM) with /proc-based worker collection #145

Open
songzhendong wants to merge 2 commits into apache:master from songzhendong:feature/php-metrics-dev
@songzhendong

This description is provided for review reference. If later verification differs from what is stated here, corrections and feedback are welcome.

Summary

  • Add PHP Health Metrics (PHM): the reporter worker samples the parent PHP-FPM process via Linux /proc and reports six instance_php_* meters through native MeterReportService / collectBatch.

  • Meters (aligned with OAP php-runtime.yaml and Horizon UI widgets):

    | Agent meter name | Source |
    | --- | --- |
    | instance_php_process_cpu_utilization | /proc/{pid}/stat utime+stime delta |
    | instance_php_memory_used_mb | /proc/{pid}/status VmRSS |
    | instance_php_memory_peak_mb | /proc/{pid}/status VmHWM |
    | instance_php_virtual_memory_mb | /proc/{pid}/status VmSize |
    | instance_php_thread_count | /proc/{pid}/status Threads |
    | instance_php_open_fd_count | /proc/{pid}/fd count |

  • Linux only (/proc); requires a forked reporter worker (reporter_type=grpc or kafka, not standalone).

  • New INI settings:

    • skywalking_agent.metrics_enable — default On when the agent is active (aligned with Python PVM / Ruby runtime meters); set Off to disable.
    • skywalking_agent.metrics_report_period — default 30 seconds.
  • skywalking_agent.enable remains Off by default (unchanged PHP agent behavior). New deployments without the agent enabled are unaffected.

  • Bump workspace version to 1.2.0; documentation in docs/en/ and README.

  • CI: explicit ppa:ondrej/php before php-*-fpm install in rust.yml (see Appendix; independent of PHM logic).
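
The `/proc/{pid}/status` and `/proc/{pid}/fd` sources listed in the meter table can be illustrated with a minimal sketch. This is not the code in worker/src/phm.rs; the names `StatusSample`, `parse_status`, `kb_to_mb`, `read_status`, and `count_open_fds` are hypothetical, and the kB to MB conversion by 1024 is an assumption about how the `*_mb` meters are scaled.

```rust
use std::fs;
use std::io;

/// Memory and thread figures taken from the text of `/proc/{pid}/status`.
/// Hypothetical type for illustration only.
#[derive(Debug, Default, PartialEq)]
pub struct StatusSample {
    pub vm_rss_kb: Option<u64>,
    pub vm_hwm_kb: Option<u64>,
    pub vm_size_kb: Option<u64>,
    pub threads: Option<u64>,
}

/// Parses the text of a status file. Unknown or malformed lines are skipped.
pub fn parse_status(text: &str) -> StatusSample {
    let mut sample = StatusSample::default();
    for line in text.lines() {
        let Some((key, rest)) = line.split_once(':') else { continue };
        // Values look like "   40960 kB" or "3"; the first token is the number.
        let value = rest
            .split_whitespace()
            .next()
            .and_then(|v| v.parse::<u64>().ok());
        match key {
            "VmRSS" => sample.vm_rss_kb = value,
            "VmHWM" => sample.vm_hwm_kb = value,
            "VmSize" => sample.vm_size_kb = value,
            "Threads" => sample.threads = value,
            _ => {}
        }
    }
    sample
}

/// Converts kernel-reported kB to MB (assumption: 1 MB = 1024 kB).
pub fn kb_to_mb(kb: u64) -> f64 {
    kb as f64 / 1024.0
}

/// Reads and parses `/proc/{pid}/status` (Linux only).
pub fn read_status(pid: u32) -> io::Result<StatusSample> {
    let text = fs::read_to_string(format!("/proc/{pid}/status"))?;
    Ok(parse_status(&text))
}

/// Counts the entries under `/proc/{pid}/fd` (Linux only).
pub fn count_open_fds(pid: u32) -> io::Result<usize> {
    Ok(fs::read_dir(format!("/proc/{pid}/fd"))?.count())
}
```

Keeping the parser separate from the file read lets the parsing be unit-tested with fixture text on any platform, while only `read_status` and `count_open_fds` touch `/proc`.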

Design notes

  • PHM uses the same gRPC transport as trace reporting; meters are not collected in PHP execute hooks.
  • Collector runs in the forked worker subprocess; target PID is the parent PHP-FPM worker (getppid()).
  • CPU utilization uses /proc/{pid}/stat delta over the report period; the first interval emits no CPU point (baseline sample required).
  • Meters are flushed via collectBatch (short-lived gRPC stream), matching periodic reporting rather than a long-lived meter stream.
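
The CPU delta and the baseline rule described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `parse_cpu_ticks` and `CpuSampler` are hypothetical names, the result is expressed as a percentage of one core (the real meter's unit is not stated here), and the clock tick rate is passed in as a parameter (on Linux it normally comes from `sysconf(_SC_CLK_TCK)` and is commonly 100).

```rust
/// Extracts `utime + stime` (in clock ticks) from one `/proc/{pid}/stat` line.
pub fn parse_cpu_ticks(stat: &str) -> Option<u64> {
    // Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    // or ')', so everything up to the LAST ')' is skipped.
    let after_comm = &stat[stat.rfind(')')? + 1..];
    let mut fields = after_comm.split_whitespace();
    // `after_comm` starts at field 3 (state), so field 14 (utime) is index 11
    // and field 15 (stime) is the token right after it.
    let utime: u64 = fields.nth(11)?.parse().ok()?;
    let stime: u64 = fields.next()?.parse().ok()?;
    Some(utime + stime)
}

/// Keeps the previous tick count so that each sample reports a delta.
pub struct CpuSampler {
    prev_ticks: Option<u64>,
}

impl CpuSampler {
    pub fn new() -> Self {
        Self { prev_ticks: None }
    }

    /// Returns CPU utilization over the elapsed interval, as a percentage of
    /// one core. The first call only records a baseline and returns `None`.
    pub fn sample(&mut self, ticks: u64, elapsed_secs: f64, ticks_per_sec: f64) -> Option<f64> {
        let prev = self.prev_ticks.replace(ticks)?;
        if elapsed_secs <= 0.0 || ticks_per_sec <= 0.0 {
            return None;
        }
        // saturating_sub guards against a counter that appears to go backwards,
        // for example if the sampled PID was reused by another process.
        let delta_ticks = ticks.saturating_sub(prev) as f64;
        Some(delta_ticks / ticks_per_sec / elapsed_secs * 100.0)
    }
}
```

With 100 ticks per second, a process that accumulates 250 ticks across a 5 second interval has used 2.5 CPU seconds, which this sketch reports as 50.0.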

Development branch CI verification (commit e6a279f)

All pipelines passed on the fork:

| Workflow | Result |
| --- | --- |
| Rust | ✅ fmt, clippy, build, e2e (PHP 7.2–8.5 matrix + kafka-reporter) |
| License | ✅ |
| PECL | ✅ |

Testing

  • Agent e2e (tests/e2e.rs):
    • FPM #1 only: -d metrics_enable=On, -d metrics_report_period=5.
    • After plugin/trace requests and an 8s wait (longer than the 5s report period), mock collector /dataValidate validates tests/data/expected_context.yaml.
    • Asserts six meters for serviceName: skywalking-agent-test-1 (ge 0 / ge 1 for thread and FD counts).
    • PHM values are real /proc samples from the CI Ubuntu runner's php-fpm worker, not mocked.

Related work (separate PRs, not in this change)

  • OAP: meter-analyzer-config/php-runtime.yaml, PHP e2e — open after this merges; pin SW_AGENT_PHP_COMMIT to the apache merge SHA.
  • UI: PHM widgets on General → Instance dashboard in skywalking-horizon-ui.

Agent-first order matches Python PVM and Go runtime meter: no proto change; safe to merge agent before OAP/UI.

Test plan

  • Fork CI: Rust / License / PECL green (e6a279f)
  • Upstream CI green after PR opened
  • Linux + enable=On: six meters reported to OAP (after OAP PR merges)
  • metrics_enable=Off: no PHM meters; tracing unchanged
  • standalone reporter: no PHM (documented)
  • Non-Linux: documented Linux-only; no /proc sampling

Notes for reviewers

  • Safe to merge agent first: enable default Off means no behavior change until the agent is explicitly enabled.
  • Please focus on:
    • worker/src/phm.rs: /proc parsing and CPU delta math
    • Worker lifecycle and parent PID resolution
    • worker/src/reporter/meter_batch.rs — batch flush and retry semantics
  • 17 files, +620 / −11 vs master.
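
For reviewers weighing the batch flush and retry semantics, one common shape for bounded retention is sketched below. It is only an assumption about the general technique, not a description of worker/src/reporter/meter_batch.rs: `BoundedBatch`, `max_retained`, and the drop-oldest policy are all hypothetical.

```rust
use std::collections::VecDeque;

/// Holds items that have not been delivered yet, up to a fixed cap.
/// Hypothetical type for illustration only.
pub struct BoundedBatch<T> {
    pending: VecDeque<T>,
    max_retained: usize,
}

impl<T> BoundedBatch<T> {
    pub fn new(max_retained: usize) -> Self {
        Self { pending: VecDeque::new(), max_retained }
    }

    /// Queues an item; when the cap is reached the oldest item is dropped, so
    /// memory stays bounded while the collector is unreachable.
    pub fn push(&mut self, item: T) {
        if self.max_retained == 0 {
            return;
        }
        if self.pending.len() == self.max_retained {
            self.pending.pop_front();
        }
        self.pending.push_back(item);
    }

    pub fn len(&self) -> usize {
        self.pending.len()
    }

    pub fn is_empty(&self) -> bool {
        self.pending.is_empty()
    }

    /// Sends everything pending as one batch. On success the buffer is cleared
    /// and the number of sent items is returned; on failure the items are kept
    /// so that the next flush retries them.
    pub fn flush<E, F>(&mut self, send: F) -> Result<usize, E>
    where
        F: FnOnce(&[T]) -> Result<(), E>,
    {
        if self.pending.is_empty() {
            return Ok(0);
        }
        let batch: &[T] = self.pending.make_contiguous();
        send(batch)?;
        let sent = self.pending.len();
        self.pending.clear();
        Ok(sent)
    }
}
```

The trade-off in this sketch is that a long outage loses the oldest samples first, which is usually acceptable for periodic gauges because newer values supersede older ones.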

Appendix: CI fix — explicit ppa:ondrej/php in Setup php-fpm

Scope: .github/workflows/rust.yml only · Independent of PHM feature code

Background

On the fork, the Rust workflow began failing at Setup php-fpm for Linux after GitHub runner image updates (logs show ubuntu24/20260615.205), with errors such as:

E: Unable to locate package php7.2-fpm
E: Package 'php7.4-fpm' has no installation candidate

| When | Runner image | Observation |
| --- | --- | --- |
| 2026-06-17 | ubuntu24/20260607.184 | Setup php-fpm and PHP matrix jobs succeeded |
| From 2026-06-19 | ubuntu24/20260615.205 | Above apt errors recur |

Some runs on 6/17 failed overall due to cargo fmt / clippy, unrelated to php-fpm.

Other workflows at the same time: License and PECL still passed; Rust failed at apt install before reaching cargo clippy / cargo test.

Apache upstream (inferred, not re-tested): Rust last succeeded on master push around 2026-03-12; no master pushes since. Re-running the same rust.yml on master or a PR today may hit the same fpm issue (same workflow, rolling ubuntu-24.04 image). This is an inference only.

Relation to PHM: Even without PHM, new runners may exhibit this CI failure. The Rust CI fix and PHM business logic are independent.

Likely causes (analysis)

  1. runs-on: ubuntu-24.04 is a rolling label; apt environment can change after image rollout (~6/15).
  2. Matrix covers PHP 7.2–8.5; matching php*-fpm packages may not be in Ubuntu 24.04 default repos and typically require ppa:ondrej/php (same family as shivammathur/setup-php for CLI).
  3. The current flow runs setup-php and then apt install php${version}-fpm, implicitly assuming the ondrej PPA is already configured. setup-php configures ondrej for the CLI, but the fpm step did not explicitly refresh apt sources; this worked on older runners and may fail on newer ones.
  4. Not caused by PHM/agent changes; transient ondrej/apt issues cannot be fully ruled out.

Change in this PR

Before installing fpm, Setup php-fpm for Linux now runs:

sudo apt-get install -y software-properties-common
sudo add-apt-repository -y ppa:ondrej/php
sudo apt-get update

Then the existing apt install php${version}-fpm and symlink.

Intent: this does not introduce ondrej for the first time. It makes the fpm install explicitly depend on the ondrej PPA and a refreshed apt index, instead of relying on implicit state left by setup-php.

Unchanged: matrix, Rust toolchain, docker compose, cargo tests; each job still installs only one matrix PHP fpm version.

Full step as committed:

- name: Setup php-fpm for Linux
  if: matrix.os == 'ubuntu-24.04'
  run: |
    sudo apt-get update
    sudo apt-get install -y software-properties-common
    sudo add-apt-repository -y ppa:ondrej/php
    sudo apt-get update
    sudo apt-get install -y php${{ matrix.flag.php_version }}-fpm
    sudo ln -sf /usr/sbin/php-fpm${{ matrix.flag.php_version }} /usr/sbin/php-fpm

Verification (fork)

After push to feature/php-metrics-dev (commit e6a279f):

| Workflow | Result |
| --- | --- |
| Rust | ✅ (incl. PHP 7.2 / 7.4 / 8.5 matrix) |
| License | ✅ |
| PECL | ✅ |

Single fork validation; does not guarantee all upstream runner regions/times. Not proven to be the only possible fix.

Summary for review

| Dimension | Note |
| --- | --- |
| Symptom | php*-fpm install fails on new runners; fork change restores CI |
| Hypothesis | Runner update + implicit ondrej dependency; upstream re-run may reproduce |
| Review focus | Suitable for upstream? Consistent with setup-php practice? Reproduce on upstream first? |

Report six instance_php_* process meters from the reporter worker via Linux /proc. PHM is enabled by default when the agent is active (aligned with Python PVM and Ruby runtime meters). Includes ondrej/php PPA fix in rust.yml CI.
@wu-sheng wu-sheng requested review from Copilot and jmjoy June 21, 2026 08:22
@wu-sheng wu-sheng added this to the 1.2.0 milestone Jun 21, 2026
@wu-sheng wu-sheng added the enhancement New feature or request label Jun 21, 2026

Copilot AI left a comment

Pull request overview

Adds PHP Health Metrics (PHM) collection and reporting from the forked reporter worker, using Linux /proc to sample the parent PHP-FPM process and exporting the data via SkyWalking’s native meter protocol.

Changes:

  • Add /proc-based PHM collector and wire it into the worker lifecycle/config.
  • Add gRPC collectBatch-based meter batching path and filter meter items out of the existing trace/log collect stream.
  • Update e2e expectations/docs and adjust Rust CI to reliably install php*-fpm via ppa:ondrej/php on Ubuntu 24.04.
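
The idea of filtering meter items out of the main collect stream can be sketched with standard-library channels. This is a simplified illustration only: the `Item` enum stands in for the real `CollectItem` type, `divert_meter` is a hypothetical name, and the actual worker may use a different channel implementation.

```rust
use std::sync::mpsc::Sender;

/// Stand-in for the real collect item type (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
pub enum Item {
    Trace(String),
    Log(String),
    Meter(String),
}

/// Forwards meters to the batch channel and hands everything else back to the
/// caller, so that only non-meter items continue on the main stream.
pub fn divert_meter(item: Item, meter_tx: &Sender<Item>) -> Option<Item> {
    match item {
        meter @ Item::Meter(_) => {
            // If the batch side has shut down, the meter is dropped instead of
            // being put back on the main stream.
            let _ = meter_tx.send(meter);
            None
        }
        other => Some(other),
    }
}
```

A consumer loop would call `divert_meter` on each received item and forward only the `Some(..)` results to the existing trace/log stream.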

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| worker/src/reporter/reporter_grpc.rs | Spawns meter batch reporter and wraps consumer to divert meters to batch path. |
| worker/src/reporter/mod.rs | Registers new reporter submodules. |
| worker/src/reporter/meter_filter.rs | Filters CollectItem::Meter out of main stream and forwards to meter batch channel. |
| worker/src/reporter/meter_batch.rs | Implements collect_batch flush + bounded retry/retention logic for meters. |
| worker/src/phm.rs | Implements Linux /proc sampling for PHM meters and reports them into the worker channel. |
| worker/src/lib.rs | Adds PHM config to worker config and starts collector alongside heartbeat. |
| src/worker.rs | Builds PHM worker configuration based on INI settings. |
| src/module.rs | Adds lazy INI reads for new PHM settings and ensures logger uses cloned reporter. |
| src/lib.rs | Registers new INI settings for PHM enablement and report period. |
| tests/common/mod.rs | Enables PHM in one FPM fixture instance with shorter reporting period. |
| tests/e2e.rs | Extends wait time to allow PHM meters to be reported before validation. |
| tests/data/expected_context.yaml | Adds expected meter items assertions for PHM. |
| docs/en/setup/service-agent/php-agent/README.md | Documents PHM feature, platform constraints, and meter list. |
| docs/en/configuration/ini-settings.md | Documents new INI settings for PHM. |
| README.md | Mentions PHM in the project description. |
| Cargo.toml | Bumps workspace version to 1.2.0. |
| .github/workflows/rust.yml | Adds explicit ondrej PPA setup before installing php*-fpm on Ubuntu 24.04. |

Comment thread src/worker.rs Outdated
Comment on lines +84 to +93
phm: if *METRICS_ENABLE {
    Some(PhmConfiguration {
        service_name: SERVICE_NAME.clone(),
        service_instance: SERVICE_INSTANCE.clone(),
        report_period_secs: *METRICS_REPORT_PERIOD,
        php_process_pid: libc::getpid() as i32,
    })
} else {
    None
},

Author

Thanks for catching this. Addressed in 5ed3fb9: phm_configuration() now stores the parent PHP-FPM PID via getppid() for the fallback path, and PHM is only enabled on Linux when metrics_enable is On.

Comment thread src/lib.rs
"".to_string(),
Policy::System,
);
module.add_ini(SKYWALKING_AGENT_METRICS_ENABLE, true, Policy::System);

@songzhendong songzhendong Jun 21, 2026

Author

Agreed. Addressed in 5ed3fb9: metrics_enable now defaults to true on Linux and false on other platforms via #[cfg(target_os = "linux")] on the INI registration.
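
As a generic illustration of a platform-dependent default in Rust (not the code in 5ed3fb9), both the `#[cfg]` attribute and the `cfg!` macro can express it. The names `DEFAULT_METRICS_ENABLE` and `default_metrics_enable` are hypothetical.

```rust
/// Attribute form: exactly one of these two constants is compiled in.
#[cfg(target_os = "linux")]
pub const DEFAULT_METRICS_ENABLE: bool = true;

#[cfg(not(target_os = "linux"))]
pub const DEFAULT_METRICS_ENABLE: bool = false;

/// Macro form: `cfg!` expands to a boolean literal at compile time, so both
/// branches of any surrounding code must still type-check on every platform.
pub const fn default_metrics_enable() -> bool {
    cfg!(target_os = "linux")
}
```

The attribute form suits cases where the non-Linux code would not compile at all; the macro form is shorter when only a value differs.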

Comment thread docs/en/configuration/ini-settings.md Outdated
| skywalking_agent.instance_name | Instance name. You can set `${HOSTNAME}`, refer to [Example #1](https://www.php.net/manual/en/install.fpm.configuration.php) | |
| skywalking_agent.standalone_socket_path | Unix domain socket file path of standalone skywalking php worker. Only available when `reporter_type` is `standalone`. | |
| skywalking_agent.psr_logging_level | The log level reported to SkyWalking, based on PSR-3, one of `Off`, `Debug`, `Info`, `Notice`, `Warning`, `Error`, `Critical`, `Alert`, `Emergency`. | Off |
| skywalking_agent.metrics_enable | Enable PHP Health Metrics (PHM) meter reporting via native MeterReportService. **Linux only** (requires `/proc`). Enabled by default when the agent is active; set to `Off` to disable. Reports six process meters: CPU utilization, memory used/peak, virtual memory, thread count, and open FD count. See [PHP agent README](../setup/service-agent/php-agent/README.md#php-health-metrics-phm). | On |

Author

Updated in 5ed3fb9: the default value column now reads On (Linux); Off (other), matching the platform-specific INI default.

Use getppid for PhmConfiguration fallback PID, enable PHM only on Linux
at worker startup, and default metrics_enable to Off on non-Linux platforms.
Update ini-settings default column accordingly.
3 participants