🚀 Evaluate base LLMs' agent capabilities in software engineering and deep research with APTBench for efficient, predictive performance insights.
cybersecurity penetration-testing testing-tools code-quality command-line-tool malware-analysis network-analysis open-source-project performance-test vulnerability-assessment system-security source-code-analysis automation-tools security-auditing apt-benchmarking
Updated Jun 16, 2026 - Python