Add PHP Health Metrics (PHM) with /proc-based worker collection#145
Add PHP Health Metrics (PHM) with /proc-based worker collection#145songzhendong wants to merge 2 commits into
Conversation
Report six instance_php_* process meters from the reporter worker via Linux /proc. PHM is enabled by default when the agent is active (aligned with Python PVM and Ruby runtime meters). Includes ondrej/php PPA fix in rust.yml CI.
There was a problem hiding this comment.
Pull request overview
Adds PHP Health Metrics (PHM) collection and reporting from the forked reporter worker, using Linux /proc to sample the parent PHP-FPM process and exporting the data via SkyWalking’s native meter protocol.
Changes:
- Add
/proc-based PHM collector and wire it into the worker lifecycle/config. - Add gRPC
collectBatch-based meter batching path and filter meter items out of the existing trace/log collect stream. - Update e2e expectations/docs and adjust Rust CI to reliably install
php*-fpmviappa:ondrej/phpon Ubuntu 24.04.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| worker/src/reporter/reporter_grpc.rs | Spawns meter batch reporter and wraps consumer to divert meters to batch path. |
| worker/src/reporter/mod.rs | Registers new reporter submodules. |
| worker/src/reporter/meter_filter.rs | Filters CollectItem::Meter out of main stream and forwards to meter batch channel. |
| worker/src/reporter/meter_batch.rs | Implements collect_batch flush + bounded retry/retention logic for meters. |
| worker/src/phm.rs | Implements Linux /proc sampling for PHM meters and reports them into the worker channel. |
| worker/src/lib.rs | Adds PHM config to worker config and starts collector alongside heartbeat. |
| src/worker.rs | Builds PHM worker configuration based on INI settings. |
| src/module.rs | Adds lazy INI reads for new PHM settings and ensures logger uses cloned reporter. |
| src/lib.rs | Registers new INI settings for PHM enablement and report period. |
| tests/common/mod.rs | Enables PHM in one FPM fixture instance with shorter reporting period. |
| tests/e2e.rs | Extends wait time to allow PHM meters to be reported before validation. |
| tests/data/expected_context.yaml | Adds expected meter items assertions for PHM. |
| docs/en/setup/service-agent/php-agent/README.md | Documents PHM feature, platform constraints, and meter list. |
| docs/en/configuration/ini-settings.md | Documents new INI settings for PHM. |
| README.md | Mentions PHM in the project description. |
| Cargo.toml | Bumps workspace version to 1.2.0. |
| .github/workflows/rust.yml | Adds explicit ondrej PPA setup before installing php*-fpm on Ubuntu 24.04. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| phm: if *METRICS_ENABLE { | ||
| Some(PhmConfiguration { | ||
| service_name: SERVICE_NAME.clone(), | ||
| service_instance: SERVICE_INSTANCE.clone(), | ||
| report_period_secs: *METRICS_REPORT_PERIOD, | ||
| php_process_pid: libc::getpid() as i32, | ||
| }) | ||
| } else { | ||
| None | ||
| }, |
There was a problem hiding this comment.
Thanks for catching this. Addressed in 5ed3fb9: phm_configuration() now stores the parent PHP-FPM PID via getppid() for the fallback path, and PHM is only enabled on Linux when metrics_enable is On.
| "".to_string(), | ||
| Policy::System, | ||
| ); | ||
| module.add_ini(SKYWALKING_AGENT_METRICS_ENABLE, true, Policy::System); |
There was a problem hiding this comment.
Agreed. Addressed in 5ed3fb9: metrics_enable now defaults to true on Linux and false on other platforms via #[cfg(target_os = linux)] on the INI registration.
| | skywalking_agent.instance_name | Instance name. You can set `${HOSTNAME}`, refer to [Example #1](https://www.php.net/manual/en/install.fpm.configuration.php) | | | ||
| | skywalking_agent.standalone_socket_path | Unix domain socket file path of standalone skywalking php worker. Only available when `reporter_type` is `standalone`. | | | ||
| | skywalking_agent.psr_logging_level | The log level reported to SkyWalking, based on PSR-3, one of `Off`, `Debug`, `Info`, Notice`, Warning`, Error`, Critical`, Alert`, Emergency`. | Off | | ||
| | skywalking_agent.metrics_enable | Enable PHP Health Metrics (PHM) meter reporting via native MeterReportService. **Linux only** (requires `/proc`). Enabled by default when the agent is active; set to `Off` to disable. Reports six process meters: CPU utilization, memory used/peak, virtual memory, thread count, and open FD count. See [PHP agent README](../setup/service-agent/php-agent/README.md#php-health-metrics-phm). | On | |
There was a problem hiding this comment.
Updated in 5ed3fb9: the default value column now reads On (Linux); Off (other), matching the platform-specific INI default.
Use getppid for PhmConfiguration fallback PID, enable PHM only on Linux at worker startup, and default metrics_enable to Off on non-Linux platforms. Update ini-settings default column accordingly.
Summary
Add PHP Health Metrics (PHM): the reporter worker samples the parent PHP-FPM process via Linux
/procand reports sixinstance_php_*meters through nativeMeterReportService/collectBatch.Meters (aligned with OAP
php-runtime.yamland Horizon UI widgets):instance_php_process_cpu_utilization/proc/{pid}/statutime+stime deltainstance_php_memory_used_mb/proc/{pid}/statusVmRSSinstance_php_memory_peak_mb/proc/{pid}/statusVmHWMinstance_php_virtual_memory_mb/proc/{pid}/statusVmSizeinstance_php_thread_count/proc/{pid}/statusThreadsinstance_php_open_fd_count/proc/{pid}/fdcountLinux only (
/proc); requires a forked reporter worker (reporter_type=grpcorkafka, notstandalone).New INI settings:
skywalking_agent.metrics_enable— default On when the agent is active (aligned with Python PVM / Ruby runtime meters); setOffto disable.skywalking_agent.metrics_report_period— default 30 seconds.skywalking_agent.enableremains Off by default (unchanged PHP agent behavior). New deployments without the agent enabled are unaffected.Bump workspace version to 1.2.0; documentation in
docs/en/and README.CI: explicit
ppa:ondrej/phpbeforephp-*-fpminstall inrust.yml(see Appendix; independent of PHM logic).Design notes
executehooks.getppid())./proc/{pid}/statdelta over the report period; the first interval emits no CPU point (baseline sample required).collectBatch(short-lived gRPC stream), matching periodic reporting rather than a long-lived meter stream.Development branch CI verification (commit
e6a279f)All pipelines passed on the fork:
Testing
tests/e2e.rs):-d metrics_enable=On,-d metrics_report_period=5./dataValidatevalidatestests/data/expected_context.yaml.serviceName: skywalking-agent-test-1(ge 0/ge 1for thread and FD counts)./procsamples from the CI Ubuntu runner's php-fpm worker, not mocked.Related work (separate PRs, not in this change)
meter-analyzer-config/php-runtime.yaml, PHP e2e — open after this merges; pinSW_AGENT_PHP_COMMITto the apache merge SHA.Agent-first order matches Python PVM and Go runtime meter: no proto change; safe to merge agent before OAP/UI.
References
Test plan
e6a279f)enable=On: six meters reported to OAP (after OAP PR merges)metrics_enable=Off: no PHM meters; tracing unchangedstandalonereporter: no PHM (documented)/procsamplingNotes for reviewers
enabledefault Off means no behavior change until the agent is explicitly enabled.worker/src/phm.rs—/procparsing and CPU delta mathworker/src/reporter/meter_batch.rs— batch flush and retry semanticsmaster.Appendix: CI fix — explicit
ppa:ondrej/phpin Setup php-fpmBackground
On the fork, the Rust workflow began failing at Setup php-fpm for Linux after GitHub runner image updates (logs show
ubuntu24/20260615.205), with errors such as:ubuntu24/20260607.184ubuntu24/20260615.205Some runs on 6/17 failed overall due to
cargo fmt/clippy, unrelated to php-fpm.Other workflows at the same time: License and PECL still passed; Rust failed at apt install before reaching
cargo clippy/cargo test.Apache upstream (inferred, not re-tested): Rust last succeeded on master push around 2026-03-12; no master pushes since. Re-running the same
rust.ymlon master or a PR today may hit the same fpm issue (same workflow, rollingubuntu-24.04image). This is an inference only.Relation to PHM: Even without PHM, new runners may exhibit this CI failure. The Rust CI fix and PHM business logic are independent.
Likely causes (analysis)
runs-on: ubuntu-24.04is a rolling label; apt environment can change after image rollout (~6/15).php*-fpmpackages may not be in Ubuntu 24.04 default repos and typically requireppa:ondrej/php(same family asshivammathur/setup-phpfor CLI).setup-phpthenapt install php${version}-fpm, implicitly assuming ondrej is ready.setup-phpconfigures ondrej for CLI, but the fpm step did not explicitly refresh apt sources; worked on older runners, may fail on newer ones.Change in this PR
Before installing fpm, Setup php-fpm for Linux now runs:
Then the existing
apt install php${version}-fpmand symlink.Intent: Not introducing ondrej for the first time, but making fpm install explicitly depend on ondrej + refreshed apt index, instead of implicit state left by
setup-php.Unchanged: matrix, Rust toolchain, docker compose, cargo tests; each job still installs only one matrix PHP fpm version.
Full step as committed:
Verification (fork)
After push to
feature/php-metrics-dev(commite6a279f):Single fork validation; does not guarantee all upstream runner regions/times. Not proven to be the only possible fix.
Summary for review
php*-fpminstall fails on new runners; fork change restores CIsetup-phppractice? Reproduce on upstream first?