The material below supplements the original research on Extract Method refactoring performed with AntiCopyPaster.
Experiment Data:
The data collected for our experiments is available on our website. The dataset also includes details on the projects and metrics. The Convolutional Neural Network (CNN) used in our study's precision and recall experiment is available as well.
Code Metrics:
The goal of metric selection is to identify patterns in metric values that distinguish between the two classes of fragments. In total, we selected 78 metrics, which fall into four main categories: Keywords, Design Size, Complexity, and Coupling. The list of metrics is available here [to be replaced with the web page that will be linked to the advanced setting].
Tool Correctness:
AntiCopyPaster correctly extracts code duplicates with a precision of 82% and a recall of 82%, for an average F-score of 82%.
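
For readers who want to check how the three scores relate, here is a minimal sketch. It assumes the F-score is the standard F1 measure (the harmonic mean of precision and recall), which this page does not state explicitly. The function name `f_score` and its `beta` parameter are illustrative and are not part of AntiCopyPaster.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the usual F1 (harmonic mean).

    Assumption: the F-score reported above is F1. This helper is hypothetical
    and only illustrates the arithmetic.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both scores are zero, so the F-score is defined as zero here.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Scores reported in this section, as fractions.
precision = 0.82
recall = 0.82
print(f"F1 = {f_score(precision, recall):.2f}")  # prints "F1 = 0.82"
```

When precision and recall are equal, their harmonic mean equals the same value, which is consistent with all three scores being reported as 82%.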