Raining CVEs On WordPress Plugins With Semgrep

By Shreya Pohekar, Syed Sheeraz Ali on 08 Sep 2022 @ Nullcon
#static-analysis #code-review #secure-coding #sqli #xss #security-testing
Focus Areas: Application Security, DevSecOps, Malware Analysis, Penetration Testing, Web Application Security

Abstract

Every organization has its own unique coding style and strategies. This can make it difficult for a static code analyzer to effectively find bugs in every codebase. The customizations available with these analyzers are prone to a lot of false positives.

In this research, we used the open-source tool Semgrep to write custom rules and ran them over 80k WordPress plugins to find vulnerabilities such as SQLi, XSS, and LFI in bulk. The first challenge was the large number of false positives. We studied the coding patterns behind them and wrote better rules, which reduced the percentage of false positives drastically. With good rules, we could identify vulnerable code just by looking at Semgrep's output, which removed the overhead of installing each plugin to validate findings manually.

The SQLi bugs we found were all time-based blind, but by studying the code we converted many of them to union-based SQLi. We bypassed filters to achieve SQLi and XSS, and created custom rules for code that contains the bypassed filters. The XSS ruleset produced thousands of possible XSS results, which led to the creation of an automated XSS validator: XSSBomb.

The talk will include a demo of basic Semgrep usage, writing custom rules, and running them over the list of vulnerable plugin repos. We will also demo the tool XSSBomb. In this research, we identified some really good real-world examples of writing secure code and of WordPress's way of preventing attacks. As a result of this research, we collectively found 47 confirmed bugs and were assigned CVEs for them.

AI Generated Summary

The research presented focuses on applying static code analysis to identify security vulnerabilities in WordPress plugins, aiming to improve the security of widely used open-source software. The primary tool used was Semgrep, an open-source static analysis engine supporting multiple languages, for which custom rules were written to detect reflected cross-site scripting (XSS) and SQL injection (SQLi) patterns.

Key findings revealed common developer misconceptions, particularly the misuse of WordPress sanitization functions like sanitize_text_field. This function was found insufficient for preventing XSS in attribute contexts (e.g., <img src>) or SQLi, as it does not escape single quotes or consider output context. Effective mitigation was observed when developers employed nested filters and context-aware output encoding. A significant protective feature, WordPress magic quotes, was noted as a default defense, though known bypasses (e.g., GBK encoding) exist.
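The attribute-context problem described above can be illustrated with a minimal Python sketch. It assumes a hypothetical stand-in sanitizer, `strip_tags_only`, that removes tags but leaves quotes alone; it is not WordPress's actual `sanitize_text_field` implementation, and the payload and helper names are invented for illustration. Python's standard `html.escape` plays the role of context-aware attribute encoding.

```python
import html
import re


def strip_tags_only(value: str) -> str:
    """Hypothetical stand-in for a tag-stripping sanitizer (NOT WordPress code)."""
    # Removes anything that looks like an HTML tag and trims whitespace,
    # but leaves quote characters untouched.
    return re.sub(r"<[^>]*>", "", value).strip()


def render_img_unsafe(src: str) -> str:
    # Attribute context: the value lands inside single quotes with no
    # attribute escaping, so a quote in the input ends the attribute early.
    return "<img src='" + strip_tags_only(src) + "'>"


def render_img_escaped(src: str) -> str:
    # Context-aware encoding: quotes and angle brackets are escaped
    # before the value is placed inside the attribute.
    return "<img src='" + html.escape(src, quote=True) + "'>"


# Illustrative payload: contains no tags, so tag stripping changes nothing.
payload = "x' onerror='alert(1)"

print(render_img_unsafe(payload))   # the injected onerror attribute survives
print(render_img_escaped(payload))  # the quotes are encoded as &#x27;
```

The first call shows the payload breaking out of the `src` attribute even though no tag was ever supplied; the second keeps the whole value inside the attribute because the quotes are encoded for that context.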

The research process involved an iterative approach to refine Semgrep rules, reducing false positives by accounting for common sanitization filters and coding patterns. A major challenge was the manual validation of thousands of potential vulnerabilities. To address this, the team developed “XSSBomb,” an automation framework. This tool integrated a Semgrep output parser, a WordPress test environment manager, and the GXSS fuzzer to automatically install plugins, test for exploitability, and generate reports. Key innovations included resetting a base WordPress instance instead of recreating it to improve speed and using differential analysis to identify plugin-specific endpoints.
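Two of the pieces mentioned above, the Semgrep output parser and the differential endpoint analysis, might look roughly like the following Python sketch. It is not the authors' tool. It relies on the `results`, `check_id`, `path`, and `start.line` fields of Semgrep's JSON output, and it assumes a `plugins/<slug>/...` directory layout; the rule IDs, file paths, and endpoints in the sample data are invented for illustration.

```python
import json
from collections import defaultdict
from pathlib import PurePosixPath


def plugin_slug(path: str) -> str:
    """Return the plugin directory name, assuming a plugins/<slug>/... layout."""
    parts = PurePosixPath(path).parts
    if "plugins" in parts:
        i = parts.index("plugins")
        # The slug must be a directory, i.e. not the final path component.
        if i + 1 < len(parts) - 1:
            return parts[i + 1]
    return "unknown"


def findings_by_plugin(semgrep_json: str) -> dict:
    """Group (rule id, file, line) tuples by plugin slug."""
    report = json.loads(semgrep_json)
    grouped = defaultdict(list)
    for result in report.get("results", []):
        grouped[plugin_slug(result["path"])].append(
            (result["check_id"], result["path"], result["start"]["line"])
        )
    return dict(grouped)


def plugin_endpoints(before: set, after: set) -> set:
    # Differential analysis: endpoints that appear only after the plugin
    # is installed are attributed to that plugin.
    return after - before


# Hypothetical sample shaped like Semgrep's JSON output.
sample = json.dumps({"results": [
    {"check_id": "wp-reflected-xss", "path": "plugins/foo/admin.php", "start": {"line": 12}},
    {"check_id": "wp-sqli", "path": "plugins/bar/db.php", "start": {"line": 40}},
    {"check_id": "wp-reflected-xss", "path": "plugins/foo/view.php", "start": {"line": 7}},
]})

grouped = findings_by_plugin(sample)
for slug in sorted(grouped):
    print(slug, grouped[slug])

base = {"/wp-admin/index.php"}
with_plugin = {"/wp-admin/index.php", "/wp-admin/admin.php?page=foo"}
print(plugin_endpoints(base, with_plugin))
```

Grouping by plugin means each plugin only needs to be installed once per validation run, and the set difference narrows fuzzing to the endpoints the plugin itself introduced.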

Practical implications include the release of custom Semgrep rules and the demonstration that large-scale, automated validation is essential for projects with vast codebases like WordPress's 80,000+ plugins. The work underscores that relying solely on basic sanitization functions creates a false sense of security, and proper context-sensitive output encoding is critical. Future work involves expanding rule sets to other vulnerability classes (LFI, SSRF, RCE) and enhancing the automation framework into a general “Vulnerability Bomb.”

Disclaimer: This summary was auto-generated from the video transcript using AI and may contain inaccuracies. It is intended as a quick overview; always refer to the original talk for authoritative content.