TY - GEN
T1 - PATCHOULI
T2 - 32nd Asia-Pacific Software Engineering Conference, APSEC 2025
AU - Li, Binchang
AU - Li, Qingyuan
AU - Gao, Cuiyun
AU - Liao, Qing
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Software vendors often distribute vulnerability fixes silently, exposing users to the threat of N-day attacks. Therefore, security patch detection (SPD) is crucial for software security maintenance. Recent research has increasingly focused on learning-based SPD, achieving promising results. However, challenges remain in this field: (1) patch identification is coarse-grained, typically at the file level, and (2) support for multiple programming languages is limited, owing to the reliance on project-level dependencies extracted by language-dependent tools. To tackle these challenges, we present PATCHOULI, a security patch detection tool featuring fine-grained detection, multi-language capability, and interpretability supported by classification of the addressed vulnerability. PATCHOULI provides a user-friendly interface and accepts code changes as its sole input. It leverages Qwen2.5-Coder-0.5B-Instruct to identify security-related code changes at both patch- and block-level granularities, and UniXcoder to recognize the repaired vulnerability types. PATCHOULI is trained on a multilingual dataset containing C/C++, Java, and Python, enabling multi-language patch analysis. Moreover, the small size of the base models allows PATCHOULI to be deployed on CPU-only devices, further enhancing its usability. We compare PATCHOULI with six state-of-the-art foundation models on this task across multiple programming languages. Experimental results demonstrate that PATCHOULI achieves higher accuracy, F1 score, and MCC than mainstream foundation models. A demo is available at https://huggingface.co/spaces/traveler514/patchouli, and a demonstration video at https://youtu.be/Spaa_k50slE.
AB - Software vendors often distribute vulnerability fixes silently, exposing users to the threat of N-day attacks. Therefore, security patch detection (SPD) is crucial for software security maintenance. Recent research has increasingly focused on learning-based SPD, achieving promising results. However, challenges remain in this field: (1) patch identification is coarse-grained, typically at the file level, and (2) support for multiple programming languages is limited, owing to the reliance on project-level dependencies extracted by language-dependent tools. To tackle these challenges, we present PATCHOULI, a security patch detection tool featuring fine-grained detection, multi-language capability, and interpretability supported by classification of the addressed vulnerability. PATCHOULI provides a user-friendly interface and accepts code changes as its sole input. It leverages Qwen2.5-Coder-0.5B-Instruct to identify security-related code changes at both patch- and block-level granularities, and UniXcoder to recognize the repaired vulnerability types. PATCHOULI is trained on a multilingual dataset containing C/C++, Java, and Python, enabling multi-language patch analysis. Moreover, the small size of the base models allows PATCHOULI to be deployed on CPU-only devices, further enhancing its usability. We compare PATCHOULI with six state-of-the-art foundation models on this task across multiple programming languages. Experimental results demonstrate that PATCHOULI achieves higher accuracy, F1 score, and MCC than mainstream foundation models. A demo is available at https://huggingface.co/spaces/traveler514/patchouli, and a demonstration video at https://youtu.be/Spaa_k50slE.
KW - Foundation Model
KW - Security Patch Detection
KW - Software Vulnerability
UR - https://www.scopus.com/pages/publications/105035230884
U2 - 10.1109/APSEC66846.2025.00122
DO - 10.1109/APSEC66846.2025.00122
M3 - Conference contribution
AN - SCOPUS:105035230884
T3 - Proceedings - Asia-Pacific Software Engineering Conference, APSEC
SP - 1025
EP - 1028
BT - Proceedings - 2025 32nd Asia-Pacific Software Engineering Conference, APSEC 2025
A2 - Zhang, Tao
A2 - Luo, Xiapu
A2 - Keung, Jacky
A2 - Choi, Eunjong
PB - IEEE Computer Society
Y2 - 2 December 2025 through 5 December 2025
ER -