Evaluating State-of-the-Art Open-Source LLMs for Memory Corruption Vulnerability Research in Binary Analysis
Category: Vulnerability Research
Location: any
Contact:
Damian Pfammatter
Introduction
The rapid advancement of Large Language Models (LLMs) has opened new avenues for automated security analysis, particularly in the domain of vulnerability research. This project aims to systematically evaluate the capabilities of state-of-the-art open-source LLMs in identifying memory corruption vulnerabilities in binary-only software. By focusing on decompiled binaries, the project seeks to explore how these models can be leveraged to enhance the security analysis of software without access to source code.
Objectives
The primary objectives of this project are:
- Evaluation and Comparison:
- Assess and compare the performance of various open-source LLMs in identifying memory corruption vulnerabilities in binary-only software.
- Benchmark these models against existing tools and techniques used in binary vulnerability analysis.
- Research and Reflection:
- Conduct a comprehensive review of current research and benchmarks related to the application of LLMs in binary vulnerability analysis.
- Investigate and implement techniques to enhance the effectiveness of LLMs in vulnerability detection, including:
- Chain-of-Thought (CoT) reasoning
- Advanced prompting strategies
- Agentic behavior
- Fine-tuning on specific datasets
- Increasing context size
- Scaling model size (parameter count)
- Integrating with traditional Static Application Security Testing (SAST) tools
- Providing additional contextual information
- Identify and compare existing software solutions that offer similar capabilities, such as Binary Ninja's Sidekick.
- Focus specifically on memory corruption vulnerabilities to identify best practices and areas for improvement.
- Tool Integration:
- Develop a proposal for integrating the recommended models and techniques into popular decompilers such as Binary Ninja or IDA Pro.
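As a starting point for the prompting-strategy work above, the two-stage idea suggested by the naming reference below (first rename opaque identifiers, then analyse with chain-of-thought reasoning) can be sketched as plain prompt construction. This is a minimal illustration only: the prompt wording, function names, and pipeline shape are assumptions, and no specific LLM API is implied.

```python
# Hypothetical two-stage prompting pipeline for a single decompiled function:
# stage 1 asks the model to rename opaque identifiers; stage 2 runs a
# chain-of-thought memory-corruption analysis on the renamed code.

RENAME_TEMPLATE = (
    "The following decompiled C function uses opaque names (e.g. var_8, "
    "sub_401000). Rename variables and callees to reflect their apparent "
    "purpose and return the rewritten function.\n\n{code}"
)

ANALYSIS_TEMPLATE = (
    "Analyse the following decompiled C function for memory corruption "
    "vulnerabilities (stack/heap overflow, use-after-free, out-of-bounds "
    "read/write). Reason step by step: 1) identify buffers and their sizes, "
    "2) trace each write and its bound, 3) conclude with a verdict and a "
    "CWE ID if vulnerable.\n\n{code}"
)

def build_rename_prompt(decompiled_fn: str) -> str:
    """Stage 1: prompt asking the model to assign meaningful names."""
    return RENAME_TEMPLATE.format(code=decompiled_fn)

def build_analysis_prompt(renamed_fn: str) -> str:
    """Stage 2: chain-of-thought analysis prompt, fed the stage-1 output."""
    return ANALYSIS_TEMPLATE.format(code=renamed_fn)
```

In an actual evaluation, `build_analysis_prompt` would receive the model's renamed output from stage 1 rather than the raw decompilation, so the effect of naming on detection accuracy can be measured in isolation.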
Requirements
- Experience in binary analysis and reverse engineering
- Solid understanding of assembly languages and decompilation techniques
- Familiarity with Large Language Models (LLMs) and machine learning concepts
References
- AI-powered bug hunting
- Project Naptime: Evaluating Offensive Security Capabilities of Large Language Models
- From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code
- LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
- Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
- How Does Naming Affect LLMs on Code Analysis Tasks? This suggests first naming the variables and functions within a function before analysing it.