Commit 13797f5

update USENIXSec 2024 papers
1 parent ba1909c commit 13797f5

13 files changed, 227 additions and 8 deletions

README.md

Lines changed: 6 additions & 6 deletions
@@ -32,7 +32,7 @@ We have systematically selected papers from the following venues, which are top-
 
 - Security (Sec)
 - [S&P2023](data/papers/venues/S&P2023/README.md), [USENIXSec2023](data/papers/venues/USENIXSec2023/README.md), [CCS2023](data/papers/venues/CCS2023/README.md), [NDSS2023](data/papers/venues/NDSS2023/README.md)
-- [S&P2024](data/papers/venues/S&P2024/README.md), [NDSS2024](data/papers/venues/NDSS2024/README.md), [CCS2024](data/papers/venues/CCS2024/README.md)
+- [S&P2024](data/papers/venues/S&P2024/README.md), [USENIXSec2024](data/papers/venues/USENIXSec2024/README.md), [NDSS2024](data/papers/venues/NDSS2024/README.md), [CCS2024](data/papers/venues/CCS2024/README.md)
 
 - Natural Language Processing (NLP)
 - [ACL2023](data/papers/venues/ACL2023/README.md), [EMNLP2023](data/papers/venues/EMNLP2023/README.md), [NAACL2023](data/papers/venues/NAACL2023/README.md)
@@ -71,9 +71,9 @@ This category focuses on typical tasks in Software Engineering (SE) and Programm
 - [Code Completion](data/papers/labels/code_completion.md) (22)
 - [Program Repair](data/papers/labels/program_repair.md) (41)
 - [Program Transformation](data/papers/labels/program_transformation.md) (31)
-- [Program Testing](data/papers/labels/program_testing.md) (54)
+- [Program Testing](data/papers/labels/program_testing.md) (55)
 - [General Testing](data/papers/labels/general_testing.md) (1)
-- [Fuzzing](data/papers/labels/fuzzing.md) (23)
+- [Fuzzing](data/papers/labels/fuzzing.md) (24)
 - [Library Testing](data/papers/labels/library_testing.md) (1)
 - [DBMS Testing](data/papers/labels/DBMS_testing.md) (1)
 - [Compiler Testing](data/papers/labels/compiler_testing.md) (4)
@@ -84,16 +84,16 @@ This category focuses on typical tasks in Software Engineering (SE) and Programm
 - [Debugging](data/papers/labels/debugging.md) (9)
 - [Bug Reproduction](data/papers/labels/bug_reproduction.md) (2)
 - [Vulnerability Exploitation](data/papers/labels/vulnerability_exploitation.md) (6)
-- [Static Analysis](data/papers/labels/static_analysis.md) (133)
+- [Static Analysis](data/papers/labels/static_analysis.md) (136)
 - [Syntactic Analysis](data/papers/labels/syntactic_analysis.md) (1)
 - [Pointer Analysis](data/papers/labels/pointer_analysis.md) (3)
 - [Call Graph Analysis](data/papers/labels/call_graph_analysis.md) (2)
 - [Data-flow Analysis](data/papers/labels/data-flow_analysis.md) (8)
 - [Type Inference](data/papers/labels/type_inference.md) (3)
-- [Specification Inference](data/papers/labels/specification_inference.md) (9)
+- [Specification Inference](data/papers/labels/specification_inference.md) (12)
 - [Equivalence Checking](data/papers/labels/equivalence_checking.md) (1)
 - [Code Similarity Analysis](data/papers/labels/code_similarity_analysis.md) (5)
-- [Bug Detection](data/papers/labels/bug_detection.md) (64)
+- [Bug Detection](data/papers/labels/bug_detection.md) (67)
 - [Program Verification](data/papers/labels/program_verification.md) (19)
 - [Program Optimization](data/papers/labels/program_optimization.md) (4)
 - [Program Decompilation](data/papers/labels/program_decompilation.md) (8)

data/labeldata/labeldata.json

Lines changed: 63 additions & 0 deletions
@@ -7442,6 +7442,37 @@
 ],
 "url": "https://www.usenix.org/system/files/usenixsecurity24-zhao.pdf"
 },
+"When Threads Meet Interrupts: Effective Static Detection of Interrupt-Based Deadlocks in Linux": {
+"type": "inproceedings",
+"key": "chengfeng2024",
+"title": "When Threads Meet Interrupts: Effective Static Detection of Interrupt-Based Deadlocks in Linux",
+"author": "Chengfeng Ye, Yuandao Cai, and Charles Zhang,",
+"booktitle": "33rd USENIX Security Symposium (USENIX Security 24)",
+"year": "2024",
+"venue": "USENIXSec2024",
+"abstract": "Deadlocking is an unresponsive state of software that arises when threads hold locks while trying to acquire other locks that are already held by other threads, resulting in a circular lock dependency. Interrupt-based deadlocks, a specific and prevalent type of deadlocks that occur within the OS kernel due to interrupt preemption, pose significant risks to system functionality, performance, and security. However, existing static analysis tools focus on resource-based deadlocks without characterizing the interrupt preemption. In this paper, we introduce Archerfish, the first static analysis approach for effectively identifying interrupt-based deadlocks in the large-scale Linux kernel. At its core, Archerfish utilizes an Interrupt-Aware Lock Graph (ILG) to capture both regular and interrupt-related lock dependencies, reducing the deadlock detection problem to graph cycle discovery and refinement. Furthermore, Archerfish incorporates four effective analysis components to construct ILG and refine the deadlock cycles, addressing three core challenges, including the extensive interrupt-involving concurrency space, identifying potential interrupt handlers, and validating the feasibility of deadlock cycles. Our experimental results show that Archerfish can precisely analyze the Linux kernel (19.8 MLoC) in approximately one hour. At the time of writing, we have discovered 76 previously unknown deadlocks, with 53 bugs confirmed, 46 bugs already fixed by the Linux community, and 2 CVE IDs assigned. Notably, those found deadlocks are long-latent, hiding for an average of 9.9 years.",
+"labels": [
+"static analysis",
+"bug detection",
+"specification inference"
+],
+"url": "https://www.usenix.org/system/files/usenixsecurity24-zhao.pdf"
+},
+"Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing": {
+"type": "inproceedings",
+"key": "Asmita2024",
+"title": "Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing",
+"author": "Asmita, Yaroslav Oliinyk, Michael Scott, Ryan Tsang, Chongzhou Fang, and Houman Homayoun",
+"booktitle": "33rd USENIX Security Symposium (USENIX Security 24)",
+"year": "2024",
+"venue": "USENIXSec2024",
+"abstract": "BusyBox, an open-source software bundling over 300 essential Linux commands into a single executable, is ubiquitous in Linux-based embedded devices. Vulnerabilities in BusyBox can have far-reaching consequences, affecting a wide array of devices. This research, driven by the extensive use of BusyBox, delved into its analysis. The study revealed the prevalence of older BusyBox versions in real-world embedded products, prompting us to conduct fuzz testing on BusyBox. Fuzzing, a pivotal software testing method, aims to induce crashes that are subsequently scrutinized to uncover vulnerabilities. Within this study, we introduce two techniques to fortify software testing. The first technique enhances fuzzing by leveraging Large Language Models (LLM) to generate target-specific initial seeds. Our study showed a substantial increase in crashes when using LLM-generated initial seeds, highlighting the potential of LLM to efficiently tackle the typically labor-intensive task of generating target-specific initial seeds. The second technique involves repurposing previously acquired crash data from similar fuzzed targets before initiating fuzzing on a new target. This approach streamlines the time-consuming fuzz testing process by providing crash data directly to the new target before commencing fuzzing. We successfully identified crashes in the latest BusyBox target without conducting traditional fuzzing, emphasizing the effectiveness of LLM and crash reuse techniques in enhancing software testing and improving vulnerability detection in embedded systems. Additionally, manual triaging was performed to identify the nature of crashes in the latest BusyBox.",
+"labels": [
+"program testing",
+"fuzzing"
+],
+"url": "https://www.usenix.org/system/files/usenixsecurity24-asmita.pdf"
+},
 "Gptscan: Detecting logic vulnerabilities in smart contracts by combining gpt with program analysis": {
 "type": "inproceedings",
 "key": "sun2024gptscan",
@@ -10045,6 +10076,8 @@
 ]
 },
 "Hierarchical Repository-Level Code Summarization for Business Applications Using Local LLMs": {
+"type": "INPROCEEDINGS",
+"key": "nilesh2025",
 "author": "Nilesh Dhulshette, Sapan Shah, Vinay Kulkarni",
 "title": "Hierarchical Repository-Level Code Summarization for Business Applications Using Local LLMs",
 "url": "https://arxiv.org/pdf/2501.07857",
@@ -10059,6 +10092,8 @@
 "venue": "arXiv2025"
 },
 "Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation": {
+"type": "INPROCEEDINGS",
+"key": "jinbao2024",
 "author": "Jinbao Chen, Hongjing Xiang, Luhao Li, Yu Zhang, Boyao Ding, Qingwei Li",
 "title": "Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation",
 "url": "https://arxiv.org/pdf/2411.03079",
@@ -10069,6 +10104,34 @@
 ],
 "venue": "arXiv2024"
 },
+"Hermes: Unlocking Security Analysis of Cellular Network Protocols by Synthesizing Finite State Machines from Natural Language Specifications": {
+"type": "INPROCEEDINGS",
+"key": "hermes2024",
+"author": "Abdullah Al Ishtiaq, Sarkar Snigdha Sarathi Das, Syed Md Mukit Rashid, Ali Ranjbar, Kai Tu, Tianwei Wu, Zhezheng Song, Weixuan Wang, Mujtahid Akon, Rui Zhang, Syed Rafiul Hussain",
+"title": "Hermes: Unlocking Security Analysis of Cellular Network Protocols by Synthesizing Finite State Machines from Natural Language Specifications",
+"url": "https://arxiv.org/abs/2310.04381",
+"abstract": "In this paper, we present Hermes, an end-to-end framework to automatically generate formal representations from natural language cellular specifications. We first develop a neural constituency parser, NEUTREX, to process transition-relevant texts and extract transition components (i.e., states, conditions, and actions). We also design a domain-specific language to translate these transition components to logical formulas by leveraging dependency parse trees. Finally, we compile these logical formulas to generate transitions and create the formal model as finite state machines. To demonstrate the effectiveness of Hermes, we evaluate it on 4G NAS, 5G NAS, and 5G RRC specifications and obtain an overall accuracy of 81-87%, which is a substantial improvement over the state-of-the-art. Our security analysis of the extracted models uncovers 3 new vulnerabilities and identifies 19 previous attacks in 4G and 5G specifications, and 7 deviations in commercial 4G basebands.",
+"labels": [
+"static analysis",
+"bug detection",
+"specification inference"
+],
+"venue": "USENIXSec2024"
+},
+"CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications": {
+"type": "INPROCEEDINGS",
+"key": "CellularLint2024",
+"author": "Mirza Masfiqur Rahman, Imtiaz Karim, and Elisa Bertino",
+"title": "CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications",
+"url": "https://www.usenix.org/system/files/usenixsecurity24-rahman.pdf",
+"abstract": "In recent years, there has been a growing focus on scrutinizing the security of cellular networks, often attributing security vulnerabilities to issues in the underlying protocol design descriptions. These protocol design specifications, typically extensive documents that are thousands of pages long, can harbor inaccuracies, underspecifications, implicit assumptions, and internal inconsistencies. In light of the evolving landscape, we introduce CellularLint—a semi-automatic framework for inconsistency detection within the standards of 4G and 5G, capitalizing on a suite of natural language processing techniques. Our proposed method uses a revamped few-shot learning mechanism on domain-adapted large language models. Pre-trained on a vast corpus of cellular network protocols, this method enables CellularLint to simultaneously detect inconsistencies at various levels of semantics and practical use cases. In doing so, CellularLint significantly advances the automated analysis of protocol specifications in a scalable fashion. In our investigation, we focused on the Non-Access Stratum (NAS) and the security specifications of 4G and 5G networks, ultimately uncovering 157 inconsistencies with 82.67% accuracy. After verification of these inconsistencies on open-source implementations and 17 commercial devices, we confirm that they indeed have a substantial impact on design decisions, potentially leading to concerns related to privacy, integrity, availability, and interoperability.",
+"labels": [
+"static analysis",
+"bug detection",
+"specification inference"
+],
+"venue": "USENIXSec2024"
+},
 "C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques": {
 "author": "Vikram Nitin, Rahul Krishna, Luiz Lemos do Valle, Baishakhi Ray",
 "title": "C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques",

data/papers/labels/bug_detection.md

Lines changed: 18 additions & 0 deletions
@@ -30,6 +30,12 @@
 - **Labels**: [static analysis](static_analysis.md), [bug detection](bug_detection.md)
 
 
+- [CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications](../venues/USENIXSec2024/paper_6.md), ([USENIXSec2024](../venues/USENIXSec2024/README.md))
+
+- **Abstract**: In recent years, there has been a growing focus on scrutinizing the security of cellular networks, often attributing security vulnerabilities to issues in the underlying protocol design descriptions. These protocol design specifications, typically extensive documents that are thousands of pages long, can harbor inaccuracies, underspecifications, implicit assumptions, and internal inconsistencies. In light of the evolving landscape, we introduce CellularLint—a semi-automatic framework for inconsi...
+- **Labels**: [static analysis](static_analysis.md), [bug detection](bug_detection.md), [specification inference](specification_inference.md)
+
+
 - [Closing the Gap: A User Study on the Real-world Usefulness of AI-powered Vulnerability Detection & Repair in the IDE](../venues/ICSE2025/paper_1.md), ([ICSE2025](../venues/ICSE2025/README.md))
 
 - **Abstract**: This paper presents the first empirical study of a vulnerability detection and fix tool with professional software developers on real projects that they own. We implemented DeepVulGuard, an IDE-integrated tool based on state-of-the-art detection and fix models, and show that it has promising performance on benchmarks of historic vulnerability data. DeepVulGuard scans code for vulnerabilities (including identifying the vulnerability type and vulnerable region of code), suggests fixes, provides na...
@@ -144,6 +150,12 @@
 - **Labels**: [static analysis](static_analysis.md), [bug detection](bug_detection.md)
 
 
+- [Hermes: Unlocking Security Analysis of Cellular Network Protocols by Synthesizing Finite State Machines from Natural Language Specifications](../venues/USENIXSec2024/paper_5.md), ([USENIXSec2024](../venues/USENIXSec2024/README.md))
+
+- **Abstract**: In this paper, we present Hermes, an end-to-end framework to automatically generate formal representations from natural language cellular specifications. We first develop a neural constituency parser, NEUTREX, to process transition-relevant texts and extract transition components (i.e., states, conditions, and actions). We also design a domain-specific language to translate these transition components to logical formulas by leveraging dependency parse trees. Finally, we compile these logical for...
+- **Labels**: [static analysis](static_analysis.md), [bug detection](bug_detection.md), [specification inference](specification_inference.md)
+
+
 - [How Far Have We Gone in Vulnerability Detection Using Large Language Models](../venues/arXiv2023/paper_5.md), ([arXiv2023](../venues/arXiv2023/README.md))
 
 - **Abstract**: As software becomes increasingly complex and prone to vulnerabilities, automated vulnerability detection is critically important, yet challenging. Given the significant successes of large language models (LLMs) in various tasks, there is growing anticipation of their efficacy in vulnerability detection. However, a quantitative understanding of their potential in vulnerability detection is still missing. To bridge this gap, we introduce a comprehensive vulnerability benchmark VulBench. This bench...
@@ -360,6 +372,12 @@
 - **Labels**: [static analysis](static_analysis.md), [bug detection](bug_detection.md), [benchmark](benchmark.md)
 
 
+- [When Threads Meet Interrupts: Effective Static Detection of Interrupt-Based Deadlocks in Linux](../venues/USENIXSec2024/paper_3.md), ([USENIXSec2024](../venues/USENIXSec2024/README.md))
+
+- **Abstract**: Deadlocking is an unresponsive state of software that arises when threads hold locks while trying to acquire other locks that are already held by other threads, resulting in a circular lock dependency. Interrupt-based deadlocks, a specific and prevalent type of deadlocks that occur within the OS kernel due to interrupt preemption, pose significant risks to system functionality, performance, and security. However, existing static analysis tools focus on resource-based deadlocks without characteri...
+- **Labels**: [static analysis](static_analysis.md), [bug detection](bug_detection.md), [specification inference](specification_inference.md)
+
+
 - [Where is it? Tracing the Vulnerability-relevant Files from Vulnerability Reports](../venues/ICSE2024/paper_18.md), ([ICSE2024](../venues/ICSE2024/README.md))
 
 - **Abstract**: With the widely usage of open-source software, supply-chain-based vulnerability attacks, including SolarWind and Log4Shell, have posed significant risks to software security. Currently, people rely on vulnerability advisory databases or commercial software bill of materials (SBOM) to defend against potential risks. Unfortunately, these datasets do not provide finer-grained file-level vulnerability information, compromising their effectiveness. Previous works have not adequately addressed this is...

data/papers/labels/fuzzing.md

Lines changed: 6 additions & 0 deletions
@@ -24,6 +24,12 @@
 - **Labels**: [program testing](program_testing.md), [fuzzing](fuzzing.md)
 
 
+- [Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing](../venues/USENIXSec2024/paper_4.md), ([USENIXSec2024](../venues/USENIXSec2024/README.md))
+
+- **Abstract**: BusyBox, an open-source software bundling over 300 essential Linux commands into a single executable, is ubiquitous in Linux-based embedded devices. Vulnerabilities in BusyBox can have far-reaching consequences, affecting a wide array of devices. This research, driven by the extensive use of BusyBox, delved into its analysis. The study revealed the prevalence of older BusyBox versions in real-world embedded products, prompting us to conduct fuzz testing on BusyBox. Fuzzing, a pivotal software te...
+- **Labels**: [program testing](program_testing.md), [fuzzing](fuzzing.md)
+
+
 - [Fuzzing JavaScript Interpreters with Coverage-Guided Reinforcement Learning for LLM-Based Mutation](../venues/ISSTA2024/paper_22.md), ([ISSTA2024](../venues/ISSTA2024/README.md))
 
 - **Abstract**: JavaScript interpreters, crucial for modern web browsers, require an effective fuzzing method to identify security-related bugs. However, the strict grammatical requirements for input present significant challenges. Recent efforts to integrate language models for context- aware mutation in fuzzing are promising but lack the necessary coverage guidance to be fully effective. This paper presents a novel technique called CovRL (Coverage-guided Reinforcement Learning) that combines Large Language Mo...