Towards a Block-Level ML-Based Python Vulnerability Detection Tool

Amirreza Bagheri; Péter Hegedűs

doi:10.14232/actacyb.299667

Authors

Amirreza Bagheri Institute of Informatics, University of Szeged, Hungary https://orcid.org/0000-0001-9691-7937
Péter Hegedűs Institute of Informatics, University of Szeged, Hungary https://orcid.org/0000-0003-4592-6504

DOI:

https://doi.org/10.14232/actacyb.299667

Keywords:

deep learning, vulnerability detection, source code embedding, data mining

Abstract

Software drives our everyday lives, so its security is pivotal. Unfortunately, security flaws are common in software systems and can have serious repercussions, including data loss, disclosure of secret information, data manipulation, or system failure. Although techniques for detecting vulnerable code exist, improving their accuracy and effectiveness to a practically applicable level remains a challenge. Many existing methods require substantial effort from human experts to craft the features that indicate vulnerabilities. In previous work, we showed that machine learning can address this issue automatically by learning features from a large collection of real-world code and predicting vulnerable code locations. Using a BERT-based code embedding, LSTM models with the best hyperparameters identified seven different types of security flaws in Python source code with high precision (91% on average) and recall (83% on average). Building on these encouraging first empirical results, we go beyond that work: we discuss the challenges of applying these models in practice and outline a method that addresses them. Our goal is to develop a hands-on tool that developers can use to pinpoint potentially vulnerable spots in their code.
