Abstract
Code clone detection is a fundamental task in software engineering that
underpins refactoring, debugging, plagiarism detection, and vulnerability
analysis. Existing methods often rely on a single representation, such as an
abstract syntax tree (AST), control flow graph (CFG), or data flow graph
(DFG), which captures only partial aspects of code semantics. Hybrid approaches
have emerged, but their fusion strategies are typically handcrafted and
ineffective. In this study, we propose MAGNET, a multi-graph attentional
framework that jointly leverages AST, CFG, and DFG representations to capture
syntactic and semantic features of source code. MAGNET integrates residual
graph neural networks with node-level self-attention to learn both local and
long-range dependencies, introduces a gated cross-attention mechanism for
fine-grained inter-graph interactions, and employs Set2Set pooling to fuse
multi-graph embeddings into unified program-level representations. Extensive
experiments on BigCloneBench and Google Code Jam demonstrate that MAGNET
achieves state-of-the-art performance, with overall F1 scores of 96.5% and
99.2% on the two datasets, respectively. Ablation studies confirm the critical
contributions of multi-graph fusion and each attentional component. Our code is
available at https://github.com/ZixianReid/Multigraph_match
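The gated cross-attention idea described in the abstract can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the function name `gated_cross_attention`, the scalar sigmoid gate, and the convex mix of a node's own embedding with its attended context are all assumptions, since the abstract does not specify the form of the gate.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def gated_cross_attention(queries, keys, gate_weights, gate_bias=0.0):
    """Let nodes of one graph (queries, e.g. AST nodes) attend over the nodes
    of another graph (keys, e.g. CFG nodes).

    Hypothetical gate (an assumption, not taken from the paper):
        g   = sigmoid(gate_weights . [q ; context] + gate_bias)
        out = g * context + (1 - g) * q
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    updated = []
    for q in queries:
        # Scaled dot-product attention of this node over the other graph.
        weights = softmax([dot(q, k) / scale for k in keys])
        context = [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
        # Scalar gate computed from the node concatenated with its context.
        logit = dot(gate_weights, q + context) + gate_bias
        gate = 1.0 / (1.0 + math.exp(-logit))
        updated.append([gate * c + (1.0 - gate) * x for x, c in zip(q, context)])
    return updated


# Toy 2-dimensional node embeddings for two graphs of the same program.
ast_nodes = [[1.0, 0.0], [0.0, 1.0]]
cfg_nodes = [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]]
fused = gated_cross_attention(ast_nodes, cfg_nodes, gate_weights=[0.1, 0.1, 0.1, 0.1])
print(fused)
```

With the gate close to 0 a node keeps its own embedding, and with the gate close to 1 it is replaced by the cross-graph context. In a trained model the gate parameters would be learned rather than fixed as they are here.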
Authors (2)
Zixian Zhang
Takfarinas Saber
Submitted
October 28, 2025
Key Contributions
MAGNET is a multi-graph attentional framework that jointly leverages AST, CFG, and DFG representations for code clone detection. By integrating residual GNNs, node-level self-attention, and a gated cross-attention mechanism, it captures richer syntactic and semantic features and their inter-dependencies, outperforming existing hybrid approaches that rely on handcrafted fusion strategies.
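To make the "residual GNN" ingredient concrete, here is a minimal sketch of one message-passing step with a skip connection. It is an illustration under stated assumptions, not MAGNET's actual layer: the name `residual_gnn_layer`, mean aggregation over neighbours, the ReLU non-linearity, and the square weight matrix are all choices made for this example.

```python
def residual_gnn_layer(features, edges, weight):
    """One message-passing step with a residual (skip) connection:

        h_i' = h_i + relu(W @ mean_{j in N(i)} h_j)

    features: list of node embeddings (each a list of floats)
    edges:    undirected edges as pairs of node indices
    weight:   square dim x dim matrix (assumed, so the skip connection fits)
    """
    num_nodes, dim = len(features), len(features[0])
    neighbours = [[] for _ in range(num_nodes)]
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = []
    for i, h in enumerate(features):
        if neighbours[i]:
            # Average the embeddings of adjacent nodes.
            mean = [
                sum(features[j][d] for j in neighbours[i]) / len(neighbours[i])
                for d in range(dim)
            ]
        else:
            # Isolated nodes receive no message.
            mean = [0.0] * dim
        # Linear transform followed by ReLU.
        message = [
            max(0.0, sum(weight[r][c] * mean[c] for c in range(dim)))
            for r in range(dim)
        ]
        # Residual connection: add the message to the node's own embedding.
        updated.append([x + m for x, m in zip(h, message)])
    return updated


identity = [[1.0, 0.0], [0.0, 1.0]]
nodes = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(residual_gnn_layer(nodes, [(0, 1)], identity))
```

The skip connection means a node's original embedding is preserved even when its neighbourhood contributes nothing, which is why the isolated third node passes through unchanged.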
Business Value
Improves software development efficiency and security by automating the detection of redundant or potentially vulnerable code, aiding in refactoring, debugging, and vulnerability analysis.