Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/248151
Title: ANALYSIS ON LARGE LANGUAGE MODEL VULNERABLE CODE GENERATION AND SELF-REPAIR ABILITY
Authors: KIM SUNG YONG
ORCID iD: orcid.org/0009-0008-6885-4965
Keywords: large language model, security, static application security testing tools, code generation
Issue Date: 19-Dec-2023
Citation: KIM SUNG YONG (2023-12-19). ANALYSIS ON LARGE LANGUAGE MODEL VULNERABLE CODE GENERATION AND SELF-REPAIR ABILITY. ScholarBank@NUS Repository.
Abstract: This thesis investigates Large Language Models' (LLMs) propensity to generate vulnerable code and their ability to repair it. Using a novel dataset built from real-world prompts, including 751 instances of vulnerable code generated by ChatGPT from 90 prompts, the study applies Static Application Security Testing (SAST) tools to detect these issues. It introduces two strategies for reducing vulnerabilities: "iteration repair," which iteratively corrects generated code, and "preshot repair," which anticipates vulnerabilities to prevent insecure code from being generated in the first place. Implemented in "Codexity," a tool with a VS Code extension, these methods significantly reduced vulnerable code production: "iteration repair" achieved a 60% reduction and "preshot repair" up to 36.5%. Comparisons with existing tools highlight the effectiveness of these strategies and demonstrate LLMs' potential to improve coding security and efficiency.
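The "iteration repair" strategy described in the abstract can be sketched as a scan-and-repair loop: generated code is checked with a SAST tool, and any findings are fed back for another repair attempt until the code is clean or a retry budget is exhausted. The sketch below is a minimal illustration of that loop only; the SAST scanner and the LLM repair call are hypothetical stubs (a hard-coded `strcpy` check and a fixed rewrite), not the actual interfaces of Codexity or the tools used in the thesis.

```python
def sast_scan(code: str) -> list[str]:
    """Stub SAST scanner (assumption): flags one known-insecure C idiom."""
    findings = []
    if "strcpy(" in code:
        findings.append("CWE-120: unbounded strcpy")
    return findings


def llm_repair(code: str, findings: list[str]) -> str:
    """Stub repair step: a fixed rewrite stands in for an LLM call."""
    return code.replace("strcpy(dst, src)", "strncpy(dst, src, sizeof dst - 1)")


def iteration_repair(code: str, max_rounds: int = 3) -> tuple[str, int]:
    """Re-scan and re-repair until no findings remain or the budget runs out."""
    for round_no in range(max_rounds):
        findings = sast_scan(code)
        if not findings:
            return code, round_no  # clean after round_no repair rounds
        code = llm_repair(code, findings)
    return code, max_rounds  # budget exhausted; code may still be flagged


fixed, rounds = iteration_repair("strcpy(dst, src);")
```

"Preshot repair," by contrast, would move the mitigation before generation (e.g. warning the model about likely weaknesses in the prompt), so no analogous feedback loop is needed.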
URI: https://scholarbank.nus.edu.sg/handle/10635/248151
Appears in Collections:Master's Theses (Open)

Files in This Item:
File: KimSY.pdf (553.14 kB, Adobe PDF, Open access)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.