Software Engineering Tools & Research Projects
A collection of software engineering tools, research projects, and collaborative initiatives focused on code analysis, testing, and software quality improvement.
Team: Matheus Paixao, Paulo Henrique Maia, Ismayle De Sousa Santos, Jerffeson Teixeira De Souza, Allysson Allex de Paula Araújo, Anderson Uchôa, Thiago Ferreira, Chaiyong Ragkhitwetsagul
Funding: CNPq/MCTI No 10/2023 - Universal 2023
This research is led by Dr. Matheus Paixao and his team at UECE, Brazil. We plan to apply state-of-the-art machine learning and software engineering techniques to develop a novel approach for software maintenance.
Team: Chaiyong Ragkhitwetsagul, Prof. Peter Haddawy (Mentor)
Funding: Research Grant for New Scholar (RGNS 2021) - Ministry of Higher Education Science Research and Innovation (MHESRI)
This research project aims to propose a novel approach to improving software using code similarity techniques, and to create an automated tool that detects similar code and gives relevant recommendations.
Team: Chaiyong Ragkhitwetsagul, Jens Krinke, Morakot Choetkiertikul, Thanwadee Sunetnanta, Federica Sarro, ProGaming, Trinity Roots, Zwiz.AI, CourseSquare
This Thai-UK collaboration aims to improve the quality of software development processes and products in Thailand's software industry by transferring knowledge of state-of-the-art automated software engineering (ASE) tools and techniques from UK experts.
Team: C. Ragkhitwetsagul, J. Krinke, M. Paixao, G. Bianco, R. Oliveto
Journal-First Presentation at ESEC/FSE 2019
Our clone detection found online clone pairs between 72,365 Java code snippets on Stack Overflow and 111 open source projects. We found 100 of them (66%) to be outdated and potentially harmful for reuse.
Team: C. Ragkhitwetsagul, J. Krinke, B. Marnette
🏆 Won the People's Choice Award at IWSC 2018!
We introduce a new code clone detection technique based on image similarity. The technique captures the visual perception of code as seen by humans in an IDE by applying syntax highlighting and image conversion to raw source code text.
Team: C. Ragkhitwetsagul, J. Krinke, D. Clark
Empirical Software Engineering Journal
We evaluate 30 code similarity detection techniques and tools using five experimental scenarios for Java source code. Our study strongly validates the use of compilation/decompilation as a normalisation technique.
Team: C. Ragkhitwetsagul, J. Krinke
🏆 Won the People's Choice Award at IWSC 2017!
We study the effects of compilation and decompilation on code clone detection in Java. By combining clones found before and after decompilation, one can achieve higher recall without losing precision.
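The idea of combining the two result sets can be illustrated with a minimal sketch. It assumes clone pairs are represented as unordered pairs of file names and that a ground truth is available; the file names and the data below are hypothetical and are not taken from the study.

```python
def pair(a, b):
    """Represent a clone pair as an unordered pair of file names."""
    return frozenset((a, b))


def precision_recall(reported, truth):
    """Return (precision, recall) of reported clone pairs against a ground truth."""
    true_positives = len(reported & truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical ground truth and detector outputs (illustration only).
truth = {
    pair("A.java", "B.java"),
    pair("C.java", "D.java"),
    pair("E.java", "F.java"),
}
clones_original = {pair("A.java", "B.java")}
clones_decompiled = {pair("C.java", "D.java"), pair("E.java", "F.java")}

# Combining is a set union of the pairs found before and after decompilation.
combined = clones_original | clones_decompiled

before = precision_recall(clones_original, truth)
after = precision_recall(combined, truth)
```

In this toy data the union raises recall while precision stays the same, because neither run reports a false positive. On real data the effect depends on the detector and the corpus.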
Team: C. Ragkhitwetsagul
Accepted at ICSME 2016 Doctoral Symposium!
We propose a scalable search system specifically designed for source code. Our proposed code search framework is based on information retrieval, tokenisation, code normalisation, and variable-length grams.
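A minimal sketch of how tokenisation, normalisation, and n-grams can fit together is shown below. The token pattern, the small keyword set, the placeholder names `ID` and `NUM`, and the Jaccard comparison are all assumptions made for illustration; they are not the framework's actual design, and the gram length is simply passed in as a parameter.

```python
import re

# Assumed token pattern: identifiers/keywords, integer literals, or any single symbol.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Assumed, deliberately tiny keyword set.
KEYWORDS = {"int", "return", "if", "else", "for", "while"}


def tokenise(code):
    """Split source text into a flat list of tokens."""
    return TOKEN_PATTERN.findall(code)


def normalise(tokens):
    """Replace identifiers and numbers with placeholders; keep keywords and symbols."""
    normalised = []
    for token in tokens:
        if token in KEYWORDS:
            normalised.append(token)
        elif token[0].isalpha() or token[0] == "_":
            normalised.append("ID")
        elif token.isdigit():
            normalised.append("NUM")
        else:
            normalised.append(token)
    return normalised


def ngrams(tokens, n):
    """Return all contiguous token n-grams of length n."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(grams_a, grams_b):
    """Jaccard similarity between two collections of n-grams."""
    set_a, set_b = set(grams_a), set(grams_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


query = ngrams(normalise(tokenise("int x = a + 1;")), 3)
candidate = ngrams(normalise(tokenise("int y = b + 2;")), 3)
similarity = jaccard(query, candidate)
```

After normalisation the two statements produce identical grams, so renamed variables and changed literals no longer prevent a match.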
Team: C. Ragkhitwetsagul, J. Krinke, D. Clark
Accepted at SCAM 2016!
We compare 30 similarity detection techniques and tools against pervasive code modifications. This broad, thorough study is the largest in existence and potentially an invaluable guide for future users.
Team: C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake
Accepted at SSBSE 2016!
We replicate a study of a genetic-algorithm based framework that optimises parameters for clone agreement (EvaClone). We observe that the optimised parameters outperform the tools' default parameters by 19.91% to 66.43%.
Team: V. Jarukitpipat, W. Wanprasert, K. Chhun, C. Ragkhitwetsagul, R. G. Kula, T. Sunetnanta, M. Choetkiertikul
Achilles helps programmers understand and analyze the vulnerability risks of npm packages through dependency graph visualization and an analysis report. The tool can detect vulnerabilities that currently exist in several of the most-starred open source projects on GitHub.
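To illustrate why a dependency graph matters for this kind of analysis, the sketch below walks a graph breadth-first and records the shortest dependency path from a project to each vulnerable package it reaches. This is not Achilles' implementation; the function name, the package names, and the graph are hypothetical.

```python
from collections import deque


def vulnerable_paths(graph, root, vulnerable):
    """Map each reachable vulnerable package to its shortest dependency path from root."""
    paths = {}
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        package = path[-1]
        if package in vulnerable:
            paths[package] = path
        for dependency in graph.get(package, []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(path + [dependency])
    return paths


# Hypothetical dependency graph: package name -> direct dependencies.
graph = {
    "app": ["web-lib", "log-lib"],
    "web-lib": ["parser-lib"],
    "log-lib": [],
    "parser-lib": [],
}
report = vulnerable_paths(graph, "app", vulnerable={"parser-lib"})
```

Here `parser-lib` is not a direct dependency of `app`, yet the traversal still reports it, together with the chain of packages that pulls it in.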
Team: P. Poolthong, P. Sirilertworakul, K. Wonwien, C. Ragkhitwetsagul
JARVAN is an automated checker for reused code snippets from Stack Overflow in Java software projects. It helps analyze code clones from Stack Overflow within GitHub repositories and can detect software license violations caused by reusing code from Stack Overflow.
Team: P. Phan-udom, N. Wattanakul, T. Sakulniwat, C. Ragkhitwetsagul, T. Sunetnanta, M. Choetkiertikul, R. G. Kula
🏆 Won the Best Tool Demo Award at ICSME 2020!
Teddy is an automated tool that helps check Pythonic idiom usage. It offers a prevention mode with Just-In-Time analysis and a detection mode with historical analysis to run thorough scans of idiomatic and non-idiomatic code.
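As an illustration of what a check for non-idiomatic code can look like, the sketch below uses Python's standard `ast` module to flag `for i in range(len(...))` loops, which are commonly rewritten with `enumerate`. The choice of idiom and the function name are assumptions for this example and do not describe how Teddy detects idioms.

```python
import ast


def find_range_len_loops(source):
    """Return line numbers of `for ... in range(len(...))` loops in the given source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.For):
            continue
        call = node.iter
        # Match a call to range(...) with exactly one argument.
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id == "range"
                and len(call.args) == 1):
            argument = call.args[0]
            # The single argument must itself be a call to len(...).
            if (isinstance(argument, ast.Call)
                    and isinstance(argument.func, ast.Name)
                    and argument.func.id == "len"):
                hits.append(node.lineno)
    return sorted(hits)


sample = (
    "items = ['a', 'b']\n"
    "for i in range(len(items)):\n"      # non-idiomatic
    "    print(i, items[i])\n"
    "for i, item in enumerate(items):\n"  # idiomatic
    "    print(i, item)\n"
)
flagged_lines = find_range_len_loops(sample)
```

Only the first loop in `sample` is flagged. The `enumerate` version on line 4 is left alone.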
Team: C. Ragkhitwetsagul, J. Krinke
Journal-First Presentation at ICSE 2020
Siamese offers the highest mean average precision of 95% and 99% on two clone benchmarks compared to seven state-of-the-art clone detection tools. It can return cloned code snippets within 8 seconds for a code corpus of 365 million lines of code.
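For readers unfamiliar with the metric, mean average precision can be computed as in the sketch below. It follows one common definition, in which each query's average precision is divided by the number of relevant items; the queries and rankings are hypothetical and unrelated to the reported benchmark results.

```python
def average_precision(ranked, relevant):
    """Average of precision values at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """Mean of average precision over (ranked results, relevant set) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Hypothetical ranked results for two queries.
queries = [
    (["a", "x", "b"], {"a", "b"}),  # relevant items at ranks 1 and 3
    (["x", "c"], {"c"}),            # relevant item at rank 2
]
map_score = mean_average_precision(queries)
```

A ranking that places every true clone ahead of every non-clone scores 1.0 for that query, so the metric rewards tools that return correct results near the top of the list.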
Team: C. Ragkhitwetsagul
Project from master's degree
Honeydew is an intelligent agent that uses machine-learning algorithms to extract all required information from a meeting email and create suggestions for the user. The agent can dramatically reduce the user's workload of extracting information from emails and entering it into calendar applications.
Team: C. Ragkhitwetsagul
Project from master's degree
FoxBeacon is a web bug detector designed as an extension for Mozilla Firefox. It reads every incoming web page and tries to find hidden web bugs, classifying an image as a web bug when it matches pre-defined rules. It also includes the compact policy component of the Platform for Privacy Preferences (P3P).
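A rule-based check of this kind can be sketched as follows. The two rules used here (an image of at most 1x1 pixels, served from a host other than the page's own) are assumptions chosen for illustration and are not FoxBeacon's actual rule set; the host names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the attributes of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))


def is_tiny(image):
    """Assumed rule 1: declared width and height are both 0 or 1."""
    return image.get("width") in ("0", "1") and image.get("height") in ("0", "1")


def is_third_party(image, page_host):
    """Assumed rule 2: the image is served from a different host than the page."""
    host = urlparse(image.get("src") or "").hostname
    return host is not None and host != page_host


def find_web_bugs(html, page_host):
    """Return the src of every image that matches both assumed rules."""
    collector = ImageCollector()
    collector.feed(html)
    return [img["src"] for img in collector.images
            if is_tiny(img) and is_third_party(img, page_host)]


page = (
    '<img src="http://tracker.example/pixel.gif" width="1" height="1">'
    '<img src="http://shop.example/logo.png" width="120" height="40">'
    '<img src="/local.gif" width="1" height="1">'
)
suspects = find_web_bugs(page, "shop.example")
```

Only the tracker pixel is reported: the logo is too large, and the relative `/local.gif` has no third-party host.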
Team: F. Shaikh and C. Ragkhitwetsagul
Project from master's degree
We evaluate an approach in which Genetic Algorithms search for the optimal combination of similarity functions for a given type of input data record, casting the record linkage problem as a classification problem.
Team: F. Shaikh, C. Ragkhitwetsagul, A.R. Pathan
Project from master's degree
We introduce the design of the Hercules File System (HFS), a distributed file system with a scalable MDS cluster and a scalable, fault-tolerant DS cluster. The file system allows servers to be added dynamically while the system is running, without disrupting normal operations.