How to Refactor this Code? An Exploratory Study on Developer-ChatGPT Refactoring Conversations

AI-generated keywords: ChatGPT, code refactoring, developer interactions, model performance, learning settings

AI-generated Key Points

  • Study focuses on interactions between developers and ChatGPT in code refactoring context
  • Developers often provide code fragments and textual descriptions for refactoring needs
  • ChatGPT offers informative suggestions but has limited understanding of broader codebase context
  • Challenges can arise from missing dependencies, codependencies, compiler errors, and test failures
  • Quality of prompts influences ChatGPT's performance; high-quality prompts yield effective responses
  • Model performs well for non-urgent issues like code style improvements or adding comments
  • Issues arise when ChatGPT misunderstands or gets confused by reported code fragments or incomplete inputs
  • Different learning settings observed: zero-shot learning vs. few-shot learning during refactoring conversations
  • Manual inspection of GitHub repositories reveals 43 refactoring documentation patterns in developer-ChatGPT conversations
  • Findings contribute to understanding dynamics between developers and AI models in code refactoring, aiming to enhance model capabilities and software engineering best practices

Authors: Eman Abdullah AlOmar, Anushkrishna Venkatakrishnan, Mohamed Wiem Mkaouer, Christian D. Newman, Ali Ouni

License: CC ZERO 1.0

Abstract: Large Language Models (LLMs), like ChatGPT, have gained widespread popularity and usage in various software engineering tasks, including refactoring, testing, code review, and program comprehension. Despite recent studies delving into refactoring documentation in commit messages, issues, and code review, little is known about how developers articulate their refactoring needs when interacting with ChatGPT. In this paper, our goal is to explore conversations between developers and ChatGPT related to refactoring to better understand how developers identify areas for improvement in code and how ChatGPT addresses developers' needs. Our approach relies on text mining refactoring-related conversations from 17,913 ChatGPT prompts and responses, and investigating developers' explicit refactoring intention. Our results reveal that (1) developer-ChatGPT conversations commonly involve generic and specific terms/phrases; (2) developers often make generic refactoring requests, while ChatGPT typically includes the refactoring intention; and (3) various learning settings when prompting ChatGPT in the context of refactoring. We envision that our findings contribute to a broader understanding of the collaboration between developers and AI models, in the context of code refactoring, with implications for model improvement, tool development, and best practices in software engineering.

Submitted to arXiv on 08 Feb. 2024

Results of the summarizing process for the arXiv paper: 2402.06013v1

In this study, we examine the interactions between developers and ChatGPT in the context of code refactoring. Our goal is to gain insights into how developers articulate their refactoring needs and how ChatGPT addresses those needs. Through our analysis of 17,913 ChatGPT prompts and responses, we have discovered that developers often copy and paste code fragments that require refactoring, along with textual descriptions of their desired changes. While ChatGPT provides informative refactoring suggestions, its understanding of the broader context of the codebase is limited. This can lead to issues such as missing dependencies and codependencies. Additionally, developers may find that ChatGPT-provided code changes cause compiler errors and test failures.

Our research also highlights the importance of high-quality prompts in eliciting effective responses from ChatGPT. We have found that the model's performance is closely tied to the quality of its training data. For non-urgent issues like code style improvements or adding comments, ChatGPT offers insightful suggestions. However, complexities arise when ChatGPT misunderstands or becomes confused by reported code fragments or incomplete inputs. By identifying common patterns and challenges in these conversations between developers and AI models like ChatGPT, researchers and developers can work towards enhancing such models' capabilities to better meet developer needs.

Furthermore, our exploration has uncovered variability in learning settings during refactoring conversations between developers and ChatGPT. Some developers rely on zero-shot learning, using the model's generative ability to propose fixes for unseen data based on prior training. Others opt for few-shot learning by providing code fragments and checking for design antipatterns.
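The difference between the two learning settings can be sketched as a difference in how the prompt is assembled. The snippet below is only an illustration under the common reading of these terms: a zero-shot prompt carries just the request and the code, while a few-shot prompt prepends worked before/after examples. The function name `build_refactoring_prompt`, its parameters, and the prompt layout are hypothetical and are not taken from the paper or from any ChatGPT API.

```python
from typing import Optional, Sequence, Tuple


def build_refactoring_prompt(
    code: str,
    instruction: str,
    examples: Optional[Sequence[Tuple[str, str]]] = None,
) -> str:
    """Assemble a prompt string (hypothetical layout, for illustration only).

    With no examples the result is a zero-shot prompt: instruction plus code.
    With (before, after) pairs the result is a few-shot prompt: the worked
    examples come first, followed by the same instruction and code.
    """
    parts = []
    for i, (before, after) in enumerate(examples or [], start=1):
        parts.append(
            f"Example {i} (before):\n{before}\n"
            f"Example {i} (after):\n{after}"
        )
    parts.append(f"{instruction}\n{code}")
    return "\n\n".join(parts)


snippet = "def f(a, b):\n    return a + b"
request = "Refactor this function to use descriptive names:"

# Zero-shot: the model sees only the request and the code fragment.
zero_shot = build_refactoring_prompt(snippet, request)

# Few-shot: one worked before/after pair is shown ahead of the request.
few_shot = build_refactoring_prompt(
    snippet,
    request,
    examples=[("def g(x):\n    return x * 2", "def double(value):\n    return value * 2")],
)
```

Either string would then be sent to the model as the user message; how that is done depends on the client in use and is left out here.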
Through manual inspection of GitHub repositories to identify refactoring documentation patterns represented as keywords or phrases in developer-ChatGPT conversations, we have compiled a list of 43 such patterns.

Overall, our findings shed light on the dynamics between developers and AI models like ChatGPT in the realm of code refactoring. By understanding these interactions more deeply, we can pave the way for improving model performance, tool development, and best practices in software engineering.
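Keyword-based identification of refactoring conversations, as described above, could look roughly like the following. This is a minimal sketch, not the authors' pipeline: the pattern list is a small set of illustrative phrases chosen for this example (the summary does not enumerate the 43 patterns), and the function name `count_pattern_hits` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Sequence

# Illustrative phrases only; these are NOT the 43 patterns from the study.
ILLUSTRATIVE_PATTERNS = ("refactor", "clean up", "simplify", "extract method", "rename")


def count_pattern_hits(
    prompts: Iterable[str],
    patterns: Sequence[str] = ILLUSTRATIVE_PATTERNS,
) -> Counter:
    """Count how many prompts mention each pattern at least once.

    Matching is case-insensitive and anchored at a word start, so the
    pattern "refactor" also matches "refactoring" and "Refactored".
    """
    compiled = {
        p: re.compile(r"\b" + re.escape(p), re.IGNORECASE) for p in patterns
    }
    hits = Counter()
    for text in prompts:
        for pattern, regex in compiled.items():
            if regex.search(text):
                hits[pattern] += 1
    return hits


sample_prompts = [
    "Please refactor this class",
    "Can you rename foo and simplify the loop?",
    "Refactoring needed: extract method here",
    "What is the weather today?",
]
hits = count_pattern_hits(sample_prompts)
```

Counting prompts rather than raw occurrences keeps one long conversation from dominating the tally; a real study would also need manual review of matches, since a keyword hit alone does not confirm refactoring intent.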
Created on 18 Mar. 2024
