AmdSmiPlugin update #230

Merged
alexandraBara merged 4 commits into development from alex_amdsmi_fix
Jun 22, 2026

alexandraBara (Collaborator) commented Jun 22, 2026

Summary

  • Extend StaticFrequencyLevels to accept DPM levels 0–15
  • Update AmdSmiPlugin for amd-smi 26.x
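
The "Failed to parse static clock frequency_levels" warning in the Before log is consistent with a parser that rejected some level indices. As a rough illustration of what accepting DPM levels 0–15 could look like, here is a minimal sketch. It is not the plugin's actual code: the `Level N` key format, the `<number> MHz` value format, and the names `parse_frequency_levels` and `MAX_DPM_LEVEL` are all assumptions made for this example.

```python
from __future__ import annotations

import re

# Upper bound taken from the PR summary ("DPM levels 0-15").
MAX_DPM_LEVEL = 15

# Assumed input shapes (hypothetical, not confirmed by the PR):
# keys like "Level 3", values like "1500 MHz".
_LEVEL_KEY = re.compile(r"^Level (\d+)$")
_FREQ_VALUE = re.compile(r"^(\d+)\s*MHz$")


def parse_frequency_levels(raw: dict[str, str]) -> dict[int, int]:
    """Map 'Level N' keys to integer MHz values, rejecting N above 15."""
    levels: dict[int, int] = {}
    for key, value in raw.items():
        key_match = _LEVEL_KEY.match(key.strip())
        if key_match is None:
            raise ValueError(f"unrecognized level key: {key!r}")
        index = int(key_match.group(1))
        if index > MAX_DPM_LEVEL:
            raise ValueError(f"DPM level {index} is outside 0-{MAX_DPM_LEVEL}")
        value_match = _FREQ_VALUE.match(value.strip())
        if value_match is None:
            raise ValueError(f"unrecognized frequency value: {value!r}")
        levels[index] = int(value_match.group(1))
    return levels
```

Under these assumptions, `parse_frequency_levels({"Level 0": "500 MHz", "Level 15": "2100 MHz"})` returns `{0: 500, 15: 2100}`, while a `Level 16` key raises `ValueError`.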

Test plan

  • pytest test/unit
  • pytest test/functional (if applicable)
  • pre-commit run --all-files

Checklist

  • Added/updated tests (or explained why not)
  • Updated docs/README if behavior changed
  • No secrets or credentials committed

Before:

alexbara@banff-cyxtera-s72-1:~/node-scraper$ node-scraper run-plugins AmdSmiPlugin
  2026-06-22 09:52:41 CDT       INFO               nodescraper | Log path: ./scraper_logs_banff_cyxtera_s72_1_2026_06_22-09_52_41_AM
...
  2026-06-22 09:52:41 CDT       INFO               nodescraper | Running data collector: AmdSmiCollector
  2026-06-22 09:52:42 CDT       INFO               nodescraper | amd-smi version: 26.2.2
  2026-06-22 09:52:42 CDT       INFO               nodescraper | ROCm version: 7.2.1
  2026-06-22 09:52:47 CDT    WARNING               nodescraper | (AmdSmiPlugin) task completed with warnings (8 warnings: Failed to parse static clock frequency_levels (x8))
  2026-06-22 09:52:47 CDT       INFO               nodescraper | Running data analyzer: AmdSmiAnalyzer
  2026-06-22 09:52:47 CDT      ERROR               nodescraper | GPU: 7 has 4 L0 recoveries
  2026-06-22 09:52:47 CDT       INFO               nodescraper | Expected XGMI link speed not set; skipping XGMI link speed analysis
  2026-06-22 09:52:47 CDT      ERROR               nodescraper | (AmdSmiPlugin) task detected errors (1 errors: GPU: 7 has 4 L0 recoveries)
  2026-06-22 09:52:47 CDT       INFO               nodescraper | Closing connections
  2026-06-22 09:52:47 CDT       INFO               nodescraper | Running result collators
  2026-06-22 09:52:47 CDT       INFO               nodescraper | Running TableSummary result collator
  2026-06-22 09:52:47 CDT       INFO               nodescraper |

+-------------------------+--------+-----------------------------+
| Connection              | Status | Message                     |
+-------------------------+--------+-----------------------------+
| InBandConnectionManager | OK     | task completed successfully |
+-------------------------+--------+-----------------------------+

+--------------+--------+-------------------------------------------------------------------------------+
| Plugin       | Status | Message                                                                       |
+--------------+--------+-------------------------------------------------------------------------------+
| AmdSmiPlugin | ERROR  | Collection warning: task completed with warnings (8 warnings: Failed to parse |
|              |        | static clock frequency_levels (x8)); Analysis error: task detected errors (1  |
|              |        | errors: GPU: 7 has 4 L0 recoveries)                                           |
+--------------+--------+-------------------------------------------------------------------------------+

After:

node-scraper$ node-scraper run-plugins AmdSmiPlugin
...
  2026-06-22 10:29:51 CDT       INFO               nodescraper | Running plugin AmdSmiPlugin
  2026-06-22 10:29:51 CDT       INFO               nodescraper | Initializing connection: InBandConnectionManager
  2026-06-22 10:29:51 CDT       INFO               nodescraper | Using local shell
  2026-06-22 10:29:51 CDT       INFO               nodescraper | Checking OS family
  2026-06-22 10:29:51 CDT       INFO               nodescraper | OS Family: LINUX
  2026-06-22 10:29:51 CDT       INFO               nodescraper | Running data collector: AmdSmiCollector
  2026-06-22 10:29:51 CDT       INFO               nodescraper | amd-smi version: 26.2.2
  2026-06-22 10:29:51 CDT       INFO               nodescraper | ROCm version: 7.2.1
  2026-06-22 10:29:57 CDT       INFO               nodescraper | (AmdSmiPlugin) task completed successfully
  2026-06-22 10:29:57 CDT       INFO               nodescraper | Running data analyzer: AmdSmiAnalyzer
  2026-06-22 10:29:57 CDT      ERROR               nodescraper | GPU: 7 has 4 L0 recoveries
  2026-06-22 10:29:57 CDT       INFO               nodescraper | Expected XGMI link speed not set; skipping XGMI link speed analysis
  2026-06-22 10:29:57 CDT      ERROR               nodescraper | (AmdSmiPlugin) task detected errors (1 errors: GPU: 7 has 4 L0 recoveries)
  2026-06-22 10:29:57 CDT       INFO               nodescraper | Closing connections
  2026-06-22 10:29:57 CDT       INFO               nodescraper | Running result collators
  2026-06-22 10:29:57 CDT       INFO               nodescraper | Running TableSummary result collator
  2026-06-22 10:29:57 CDT       INFO               nodescraper |

+-------------------------+--------+-----------------------------+
| Connection              | Status | Message                     |
+-------------------------+--------+-----------------------------+
| InBandConnectionManager | OK     | task completed successfully |
+-------------------------+--------+-----------------------------+

+--------------+--------+-----------------------------------------------------------------------------+
| Plugin       | Status | Message                                                                     |
+--------------+--------+-----------------------------------------------------------------------------+
| AmdSmiPlugin | ERROR  | Analysis error: task detected errors (1 errors: GPU: 7 has 4 L0 recoveries) |
+--------------+--------+-----------------------------------------------------------------------------+

  2026-06-22 10:29:57 CDT       INFO               nodescraper | Data written to csv file: ./scraper_logs_banff_cyxtera_s72_1_2026_06_22-10_29_51_AM/nodescraper.csv

alexandraBara merged commit 896f5c9 into development Jun 22, 2026
6 checks passed
alexandraBara deleted the alex_amdsmi_fix branch June 22, 2026 18:33