Projects

Software Engineering Tools & Research Projects

A collection of software engineering tools, research projects, and collaborative initiatives focused on code analysis, testing, and software quality improvement.

Current and Past Research Projects

MAINTAIN - Intelligent Maintenance of Software Systems

Team: Matheus Paixao, Paulo Henrique Maia, Ismayle De Sousa Santos, Jerffeson Teixeira De Souza, Allysson Allex de Paula Araújo, Anderson Uchôa, Thiago Ferreira, Chaiyong Ragkhitwetsagul

Funding: CNPq/MCTI No 10/2023 - Universal 2023

This research is led by Dr. Matheus Paixao and his team at UECE, Brazil. We plan to apply the state of the art in machine learning and software engineering to develop a novel approach to software maintenance.

Code Similarity Applications for Improving Software Quality

Team: Chaiyong Ragkhitwetsagul, Prof. Peter Haddawy (Mentor)

Funding: Research Grant for New Scholar (RGNS 2021) - Ministry of Higher Education, Science, Research and Innovation (MHESI)

This research project aims to propose a novel approach for improving software quality using code similarity techniques, and to create an automated tool that detects similar code and gives relevant recommendations.

Automated Software Engineering for Thailand Software Industry (ASETSI)

Team: Chaiyong Ragkhitwetsagul, Jens Krinke, Morakot Choetkiertikul, Thanwadee Sunetnanta, Federica Sarro, ProGaming, Trinity Roots, Zwiz.AI, CourseSquare

This Thai-UK collaboration aims to improve the quality of software development processes and products in Thailand's software industry by transferring knowledge of state-of-the-art automated software engineering (ASE) tools and techniques from UK experts.

Toxic Code Snippets on Stack Overflow

Team: C. Ragkhitwetsagul, J. Krinke, M. Paixao, G. Bianco, R. Oliveto

Journal-First Presentation at ESEC/FSE 2019

Our clone detection found online clone pairs between 72,365 Java code snippets on Stack Overflow and 111 open source projects. Of the clones copied from those projects to Stack Overflow, 100 (66%) were outdated and potentially harmful for reuse.

Image-based Code Clone Detection

Team: C. Ragkhitwetsagul, J. Krinke, B. Marnette

🏆 Won the People's Choice Award at IWSC 2018!

We introduce a new code clone detection technique based on image similarity. The technique captures the visual perception of code as seen by humans in an IDE by applying syntax highlighting and image conversion to raw source code text.

A Comparison of Code Similarity Analysers

Team: C. Ragkhitwetsagul, J. Krinke, D. Clark

Empirical Software Engineering Journal

We evaluate 30 code similarity detection techniques and tools using five experimental scenarios for Java source code. Our study strongly validates the use of compilation/decompilation as a normalisation technique.

Using Compilation-Decompilation to Enhance Clone Detection

Team: C. Ragkhitwetsagul, J. Krinke

🏆 Won the People's Choice Award at IWSC 2017!

We study the effects of compilation and decompilation on code clone detection in Java. By combining clones found before and after decompilation, one can achieve higher recall without losing precision.
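The combination step can be illustrated with a minimal sketch: take the union of the clone pairs reported on the original code and on the decompiled code, then compare precision and recall against a ground truth. The file names, the ground truth, and the helper functions below are hypothetical and are not data or code from the study. In this toy case the second run adds only true clones, so precision is unaffected.

```python
def precision_recall(reported, truth):
    """Return (precision, recall) of reported clone pairs against a ground truth."""
    true_positives = len(reported & truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def pair(a, b):
    """Normalise a clone pair so that (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


# Hypothetical ground truth and tool output (illustration only).
truth = {pair("A.java", "B.java"), pair("C.java", "D.java"), pair("E.java", "F.java")}
clones_original = {pair("A.java", "B.java")}
clones_decompiled = {pair("B.java", "A.java"), pair("C.java", "D.java")}

# Combine the clones found before and after decompilation.
combined = clones_original | clones_decompiled

print(precision_recall(clones_original, truth))
print(precision_recall(combined, truth))
```

Because the union never removes a reported pair, recall cannot decrease; whether precision holds depends on how many false positives the second run contributes.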

Measuring Code Similarity in Large-scaled Code Corpora

Team: C. Ragkhitwetsagul

Accepted at ICSME 2016 Doctoral Symposium!

We propose a scalable search system specifically designed for source code. Our proposed code search framework is based on information retrieval, tokenisation, code normalisation, and variable-length n-grams.
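As a rough illustration of how these ingredients can fit together, the sketch below tokenises code, normalises identifiers and numbers, and builds an inverted index over token n-grams. It is a simplified assumption of the general idea, not the proposed framework: the keyword list, the regular expression, the function names, and the fixed default of n = 3 are all hypothetical (the gram length is a parameter rather than truly variable per query).

```python
import re
from collections import defaultdict

KEYWORDS = {"int", "return", "if", "else", "for", "while"}  # tiny illustrative subset
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalise(code):
    """Tokenise code; replace identifiers with 'ID' and numbers with 'NUM'."""
    tokens = []
    for tok in TOKEN_RE.findall(code):
        if tok in KEYWORDS:
            tokens.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return tokens


def ngrams(tokens, n):
    """Return the set of contiguous token n-grams."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_index(corpus, n=3):
    """Map each n-gram to the names of the snippets that contain it."""
    index = defaultdict(set)
    for name, code in corpus.items():
        for gram in ngrams(normalise(code), n):
            index[gram].add(name)
    return index


def search(index, query, n=3):
    """Rank snippets by the number of n-grams they share with the query."""
    hits = defaultdict(int)
    for gram in ngrams(normalise(query), n):
        for name in index.get(gram, ()):
            hits[name] += 1
    return sorted(hits, key=lambda name: (-hits[name], name))


corpus = {
    "add": "int add(int a, int b) { return a + b; }",
    "loop": "for (int i = 0; i < 10; i++) { total = total + i; }",
}
index = build_index(corpus)
query = "int sum(int x, int y) { return x + y; }"
print(search(index, query))
```

Normalising identifiers lets the renamed query match the "add" snippet exactly, which is why it ranks first even though no identifier is shared.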

Similarity of Source Code in the Presence of Pervasive Modifications

Team: C. Ragkhitwetsagul, J. Krinke, D. Clark

Accepted at SCAM 2016!

We compare 30 similarity detection techniques and tools against pervasive code modifications. This broad, thorough study is the largest in existence and potentially an invaluable guide for future users.

Searching for Configurations in Clone Evaluation: A Replication Study

Team: C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake

Accepted at SSBSE 2016!

We replicate a study of EvaClone, a genetic-algorithm-based framework that optimises tool parameters for clone agreement. We observe that the optimised parameters outperform the tools' default parameters by 19.91% to 66.43%.

Software Tools & Applications

Achilles: Automated Tool for Detecting and Visualizing npm Dependency Vulnerabilities

Team: V. Jarukitpipat, W. Wanprasert, K. Chhun, C. Ragkhitwetsagul, R. G. Kula, T. Sunetnanta, M. Choetkiertikul

Achilles helps programmers understand and analyze the potential vulnerability risks of npm packages through a dependency graph visualization and an analysis report. The tool can detect vulnerabilities that currently exist in several of the most-starred open source projects on GitHub.

JARVAN: An Automatic Checker for Reused Stack Overflow Code Snippets in Java Software Projects

Team: P. Poolthong, P. Sirilertworakul, K. Wonwien, C. Ragkhitwetsagul

JARVAN is an automated checker for reused code snippets from Stack Overflow in Java software projects. It helps analyze code clones from Stack Overflow within GitHub repositories and can detect software license violations caused by reusing code from Stack Overflow.

Teddy: Automatic Recommendation of Pythonic Idiom Usage For Pull-Based Software Projects

Team: P. Phan-udom, N. Wattanakul, T. Sakulniwat, C. Ragkhitwetsagul, T. Sunetnanta, M. Choetkiertikul, R. G. Kula

🏆 Won the Best Tool Demo Award at ICSME 2020!

Teddy is an automated tool that helps check Pythonic idiom usage. It offers a prevention mode with Just-In-Time analysis and a detection mode with historical analysis to run thorough scans of idiomatic and non-idiomatic code.
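To give a flavour of what checking for a non-idiomatic pattern can look like, the sketch below uses Python's standard ast module to flag loops of the form "for i in range(len(x))", which are usually better written with direct iteration or enumerate. This is an assumed illustration only: the function name and the single rule are hypothetical, and nothing here reflects how Teddy itself is implemented.

```python
import ast


def find_range_len_loops(source):
    """Return line numbers of `for ... in range(len(...))` loops in the source."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.For):
            continue
        call = node.iter
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name) and call.func.id == "range"
                and len(call.args) == 1
                and isinstance(call.args[0], ast.Call)
                and isinstance(call.args[0].func, ast.Name)
                and call.args[0].func.id == "len"):
            lines.append(node.lineno)
    return sorted(lines)


sample = """\
for i in range(len(items)):
    print(items[i])
for item in items:
    print(item)
"""

print(find_range_len_loops(sample))
```

Only the first loop in the sample is reported; the second already iterates over the items directly.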

Siamese: Scalable and Incremental Code Clone Search via Multiple Code Representations

Team: C. Ragkhitwetsagul, J. Krinke

Journal-First Presentation at ICSE 2020

Compared with seven state-of-the-art clone detection tools, Siamese achieves the highest mean average precision, 95% and 99%, on two clone benchmarks. It can return cloned code snippets within 8 seconds for a code corpus of 365 million lines of code.
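For readers unfamiliar with the metric, mean average precision (MAP) can be computed as in the sketch below. It uses one common definition (average precision divided by the number of relevant items for each query); the exact variant used in the evaluation may differ, and the queries shown are made-up examples rather than benchmark data.

```python
def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k at which a relevant item appears."""
    hits = 0
    precisions = []
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """queries: list of (ranked_results, relevant_set) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Hypothetical search results for two queries (illustration only).
queries = [
    (["a", "x", "b"], {"a", "b"}),  # relevant items at ranks 1 and 3
    (["y", "c"], {"c"}),            # relevant item at rank 2
]
print(mean_average_precision(queries))
```

The first query has an average precision of (1/1 + 2/3) / 2 = 5/6 and the second 1/2, so the mean over both queries is 2/3.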

My Master's Degree Projects

Honeydew: Predicting Meeting Date Using Machine Learning Algorithm

Team: C. Ragkhitwetsagul

Master's degree project

Honeydew is an intelligent agent that uses machine-learning algorithms to extract all required information from a meeting email and create suggestions for the user. The agent can substantially reduce the user's workload of extracting information from emails and entering it into calendar applications.

FoxBeacon: Web Bug Detector Implementing P3P Compact Policy for Mozilla Firefox

Team: C. Ragkhitwetsagul

Master's degree project

FoxBeacon is a web bug detector designed as an extension for Mozilla Firefox. It reads every incoming web page and looks for hidden web bugs, classifying an image as a web bug when it matches pre-defined rules. It also includes the compact policy component of the Platform for Privacy Preferences (P3P).
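Rule-based web bug detection of this kind can be sketched as follows. The two rules used here (an image of at most 1x1 pixels that is served from a host other than the page's own) are assumptions chosen for illustration, as are the class and function names; they are not FoxBeacon's actual rule set, and the sketch is written in Python rather than as a Firefox extension.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collect <img> tags that match simple, hypothetical web bug rules."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.web_bugs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        host = urlparse(src).hostname
        # Rule 1 (assumed): the image is at most 1x1 pixels.
        tiny = attr.get("width") in ("0", "1") and attr.get("height") in ("0", "1")
        # Rule 2 (assumed): the image is served from a different host.
        third_party = host is not None and host != self.page_host
        if tiny and third_party:
            self.web_bugs.append(src)


def find_web_bugs(html, page_host):
    """Return the src of every image on the page that matches both rules."""
    finder = WebBugFinder(page_host)
    finder.feed(html)
    return finder.web_bugs


page = (
    '<img src="/logo.png" width="120" height="40">'
    '<img src="https://tracker.example/p.gif" width="1" height="1">'
)
print(find_web_bugs(page, "site.example"))
```

The site logo is ignored because it is large and served from the page's own host, while the 1x1 third-party image is reported.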

Evaluating Genetic Algorithm for Selection of Similarity Functions for Record Linkage

Team: F. Shaikh and C. Ragkhitwetsagul

Master's degree project

We evaluate an approach that uses genetic algorithms to search for the optimal combination of similarity functions for a given type of input data records, casting the record linkage problem as a classification problem.

Hercules File System: A Scalable Fault Tolerant Distributed File System

Team: F. Shaikh, C. Ragkhitwetsagul, A.R. Pathan

Master's degree project

We introduce the design of the Hercules File System (HFS), a distributed file system with a scalable MDS cluster and a scalable, fault-tolerant DS cluster. The file system allows servers to be added dynamically while the system is running, without disrupting normal operations.