Learn from yesterday, live for today, hope for tomorrow. The important thing is not to stop questioning – Albert Einstein

Potential Topics for Students

I'm looking for students (undergrad, master's, or PhD) who are interested in software engineering research and applications on the following topics (or any topic of your interest). Please feel free to contact me directly.


Topics for Students


For 2022, I am looking for undergraduate students who are interested in:

   Scalable Code Clone Search Using Code Vector Representation
This project aims to leverage the infrastructure of Siamese to make code clone search using code vector representation (a vector created from a given code snippet) scalable to large code bases such as Stack Overflow or GitHub.
This project is supported by Research Grant for New Scholar (RGNS 2021) - Ministry of Higher Education, Science, Research and Innovation (MHESRI).
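To make the idea of searching by code vector concrete, here is a minimal sketch. It is not how Siamese works: the hashed bag-of-tokens vector, the `DIM` size, and the function names `code_vector`, `cosine`, and `search` are all assumptions chosen for illustration, and a real project would likely use a learned representation instead. The linear scan in `search` is exactly the part that does not scale, which is the problem the project targets.

```python
import math
import re
import zlib

DIM = 64  # hypothetical vector size, chosen only for illustration


def code_vector(snippet: str) -> list[float]:
    """Toy stand-in for a learned code vector: hashed bag of tokens, L2-normalised."""
    vec = [0.0] * DIM
    for token in re.findall(r"[A-Za-z_]\w*|\d+|\S", snippet):
        # crc32 is deterministic across runs, unlike the built-in hash() for strings
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two already-normalised vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def search(query: str, corpus: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Brute-force top-k search: compares the query against every snippet."""
    q = code_vector(query)
    scored = [(cosine(q, code_vector(s)), s) for s in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

Calling `search("int add(int a, int b) { return a + b; }", corpus)` returns the `k` snippets in `corpus` whose vectors are closest to the query. Every query costs one comparison per snippet in the corpus, so a code base the size of Stack Overflow or GitHub would need some form of index rather than this scan.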


Other Interesting Topics


   Automated recommendation of adapting code to Pythonic idioms
This is follow-up work to the Teddy project. We aim not only to detect non-Pythonic code snippets, but also to recommend how to transform non-idiomatic code into its Pythonic equivalent.
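As a small illustration of the detect-then-recommend idea, the sketch below flags one well-known non-idiomatic pattern, `for i in range(len(xs))`, using Python's standard `ast` module. This is not the Teddy project's technique. The function name `find_range_len_loops` and the two sample snippets are hypothetical, and a real recommender would need to cover many more idioms and generate the rewritten code.

```python
import ast


def find_range_len_loops(source: str) -> list[int]:
    """Return the line numbers of `for ... in range(len(...))` loops."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.For):
            continue
        it = node.iter
        # Match a call to range(...) with exactly one argument
        if (isinstance(it, ast.Call) and isinstance(it.func, ast.Name)
                and it.func.id == "range" and len(it.args) == 1):
            arg = it.args[0]
            # ... whose single argument is itself a call to len(...)
            if (isinstance(arg, ast.Call) and isinstance(arg.func, ast.Name)
                    and arg.func.id == "len"):
                hits.append(node.lineno)
    return hits


# A non-idiomatic loop and the Pythonic form a tool might recommend instead
NON_IDIOMATIC = """\
for i in range(len(names)):
    print(i, names[i])
"""

IDIOMATIC = """\
for i, name in enumerate(names):
    print(i, name)
"""
```

Running `find_range_len_loops(NON_IDIOMATIC)` reports the loop on line 1, while the `enumerate` version is not flagged. The recommendation step, turning the first form into the second automatically, is the open part of the project.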

   Are developers aware of clones when they make code changes?
This project aims to leverage the rich information in the code review process and automatic code clone detection to assess developers' day-to-day awareness of code cloning. Code clone detection will be performed before and after each code review patch is submitted. With the detected clones, the code changes from the patch, and the natural language text in the review, we can ask several interesting questions by manually analysing the results: (1) How often are developers aware of clones, i.e., do they discuss them in code review, when clones are introduced into the system? (2) When they discuss clones, what happens to those clones after the discussion? (3) What are the common intents when developers introduce clones into the software? (4) What is the developers' perception of code clones? Should clones always be removed, or is it enough to know they are there? This work offers the possibility of collaborating with CREST researchers.

   Code clone detection using coding style
Based on the naturalness of software, a programmer's coding style may be an indicator of code copying and pasting. This project aims to investigate the application of coding style detection to the code clone detection problem. The work involves finding a technique to identify the coding style of a given software project, and then checking each newly submitted patch to see whether its code conforms to the project's coding style.
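One very rough way to picture the two steps, profiling a project's style and then checking a patch against it, is sketched below. Everything here is an assumption made for illustration: the two features (camelCase versus snake_case naming, tab versus space indentation), the `threshold` value, and the names `style_profile`, `style_distance`, and `conforms`. A real study would need far richer style features and a principled way to set the threshold.

```python
import re


def style_profile(code: str) -> dict[str, float]:
    """Summarise two simple style features as ratios between 0 and 1."""
    idents = re.findall(r"\b[a-z][A-Za-z0-9_]*\b", code)
    camel = sum(1 for name in idents if re.search(r"[a-z][A-Z]", name))
    snake = sum(1 for name in idents if "_" in name)
    # Only indented lines say anything about tabs versus spaces
    indented = [line for line in code.splitlines() if line[:1] in (" ", "\t")]
    tabs = sum(1 for line in indented if line[0] == "\t")
    return {
        "camel_ratio": camel / (camel + snake) if camel + snake else 0.0,
        "tab_ratio": tabs / len(indented) if indented else 0.0,
    }


def style_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Largest per-feature difference between two style profiles."""
    return max(abs(a[key] - b[key]) for key in a)


def conforms(project_code: str, patch_code: str, threshold: float = 0.5) -> bool:
    """True if the patch's style profile is close to the project's profile."""
    distance = style_distance(style_profile(project_code), style_profile(patch_code))
    return distance <= threshold
```

Under this sketch, a patch written with camelCase names and tab indentation submitted to a snake_case, space-indented project would fail `conforms`, which could then be treated as a hint, not proof, that the code was copied from elsewhere.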

   Automated software license identification using deep learning techniques
This project focuses on training a deep learning model to predict a software license from its license text. The benefits of the project include the analysis of software licenses on GitHub and the automatic detection of software license conflicts.

   Virtual reality-based software visualisation
I am interested in exploring an application of the latest VR technology to visualise software projects and help programmers understand their software better.

   Automated testing for 3D games
Testing frameworks for games are still underexplored. We aim to ease the tedious task of game testing by developing automated testing techniques specifically for 3D games. One possibility is to adopt search-based testing techniques, which have been successfully applied to traditional software, for games.