skip to main content
10.1109/ICSE43902.2021.00077acmconferencesArticle/Chapter ViewAbstractPublication PagesicseConference Proceedingsconference-collections
research-article

Can Program Synthesis be Used to Learn Merge Conflict Resolutions?: An Empirical Analysis

Published: 05 November 2021 Publication History

Abstract

Forking structure is widespread in the open-source repositories and that causes a significant number of merge conflicts. In this paper, we study the problem of textual merge conflicts from the perspective of Microsoft Edge, a large, highly collaborative fork of the main Chromium branch with significant merge conflicts. Broadly, this study is divided into two sections. First, we empirically evaluate textual merge conflicts in Microsoft Edge and classify them based on the type of files, location of conflicts in a file, and the size of conflicts. We found that ~28% of the merge conflicts are 1-2 line changes, and many resolutions have frequent patterns. Second, driven by these findings, we explore Program Synthesis (for the first time) to learn patterns and resolve structural merge conflicts. We propose a novel domain-specific language (DSL) that captures many of the repetitive merge conflict resolution patterns and learn resolution strategies as programs in this DSL from example resolutions. We found that the learned strategies can resolve 11.4% of the conflicts (~41% of 1-2 line changes) that arise in the C++ files with 93.2% accuracy.

References

[1]
G. Ghiotto, L. Murta, M. Barros, and A. van der Hoek, "On the nature of merge conflicts: A study of 2,731 open source Java projects hosted by GitHub," IEEE Transactions on Software Engineering, vol. 46, no. 8, pp. 892--915, 2020.
[2]
R. P. Buse and W. R. Weimer, "Automatically documenting program changes," in Proceedings of the IEEE/ACM international conference on Automated software engineering, 2010, pp. 33--42.
[3]
S. Rastkar and G. C. Murphy, "Why did this code change?" in 2013 35th International Conference on Software Engineering (ICSE). IEEE, 2013, pp. 1193--1196.
[4]
L. F. Cortés-Coy, M. Linares-Vásquez, J. Aponte, and D. Poshyvanyk, "On automatically generating commit messages via summarization of source code changes," in 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation. IEEE, 2014, pp. 275--284.
[5]
B. Fluri, M. Wuersch, M. Plnzger, and H. Gall, "Change distilling: Tree differencing for fine-grained source code change extraction," IEEE Transactions on software engineering, vol. 33, no. 11, pp. 725--743, 2007.
[6]
S. Jiang, A. Armaly, and C. McMillan, "Automatically generating commit messages from diffs using neural machine translation," in 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2017, pp. 135--146.
[7]
M. Owhadi-Kareshk, S. Nadi, and J. Rubin, "Predicting merge conflicts in collaborative software development," in 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE, 2019, pp. 1--11.
[8]
S. Apel, J. Liebig, B. Brandl, C. Lengauer, and C. Kästner, "Semistructured merge: Rethinking merge in revision control systems," 2011, pp. 190--200.
[9]
O. Leßenich, S. Apel, and C. Lengauer, "Balancing precision and performance in structured merge," Automated Software Engineering, vol. 22, no. 3, pp. 367--397, 2015.
[10]
J. Matsumoto, Y. Higo, and S. Kusumoto, "Beyond gumtree: a hybrid approach to generate edit scripts," in 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). IEEE, 2019, pp. 550--554.
[11]
S. Raghavan, R. Rohana, D. Leon, A. Podgurski, and V. Augustine, "Dex: A semantic-graph differencing tool for studying changes in large code bases," in 20th IEEE International Conference on Software Maintenance, 2004. Proceedings. IEEE, 2004, pp. 188--197.
[12]
D. Reuling, U. Kelter, J. Bürdek, and M. Lochau, "Automated n-way program merging for facilitating family-based analyses of variant-rich software," ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 28, no. 3, pp. 1--59, 2019.
[13]
M. Sousa, I. Dillig, and S. K. Lahiri, "Verified three-way program merge," Proceedings of the ACM on Programming Languages, vol. 2, no. OOPSLA, pp. 1--29, 2018.
[14]
O. Polozov and S. Gulwani, "Flashmeta: a framework for inductive program synthesis," in Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, 2015, pp. 107--126.
[15]
A. Miltner, S. Gulwani, V. Le, A. Leung, A. Radhakrishna, G. Soares, A. Tiwari, and A. Udupa, "On the fly synthesis of edit suggestions," Proceedings of the ACM on Programming Languages, vol. 3, no. OOPSLA, pp. 1--29, 2019.
[16]
GitHub, "GitHub Fork Projects in C++," https://tinyurl.com/yxopll6f, 2020.
[17]
J.-R. Falleri, F. Morandat, X. Blanc, M. Martinez, and M. Monperrus, "Fine-grained and accurate source code differencing," in Proceedings of the 29th ACM/IEEE international conference on Automated software engineering, 2014, pp. 313--324.
[18]
M. PROSE, "Microsoft Program Synthesis using Examples SDK," https://github.com/Microsoft/prose, 2020.
[19]
S. Gulwani, "Automating string processing in spreadsheets using input-output examples," in Proceedings of the 38th annual ACM SIGPLANSIGACT symposium on Principles of programming languages, 2011, pp. 317--330.
[20]
V. Le and S. Gulwani, "Flashextract: A framework for data extraction by examples," in Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, ser. PLDI '14. New York, NY, USA: Association for Computing Machinery, 2014, p. 542--553. [Online]. Available: https://doi.org/10.1145/2594291.2594333
[21]
R. Rolim, G. Soares, L. D'Antoni, O. Polozov, S. Gulwani, R. Gheyi, R. Suzuki, and B. Hartmann, "Learning syntactic program transformations from examples," in 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). IEEE, 2017, pp. 404--415.
[22]
S. Horwitz, J. Prins, and T. Reps, "Integrating noninterfering versions of programs," ACM Trans. Program. Lang. Syst., vol. 11, no. 3, pp. 345--387, 1989.
[23]
C. Sung, S. K. Lahiri, M. Kaufman, P. Choudhury, and C. Wang, "Towards understanding and fixing upstream merge induced conflicts in divergent forks: An industrial case study," ICSE Software Engineering in Practice (ICSE SEIP), 2020.
[24]
J. Bader, A. Scott, M. Pradel, and S. Chandra, "Getafix: Learning to fix bugs automatically," Proceedings of the ACM on Programming Languages, vol. 3, no. OOPSLA, pp. 1--27, 2019.
[25]
V. Le and S. Gulwani, "Flashextract: a framework for data extraction by examples," in Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014, pp. 542--553.
[26]
N. Meng, M. Kim, and K. S. McKinley, "Systematic editing: generating program transformations from an example," ACM SIGPLAN Notices, vol. 46, no. 6, pp. 329--342, 2011.
[27]
J. Andersen and J. L. Lawall, "Generic patch inference," Automated software engineering, vol. 17, no. 2, pp. 119--148, 2010.
[28]
J. Andersen, A. C. Nguyen, D. Lo, J. L. Lawall, and S.-C. Khoo, "Semantic patch inference," in 2012 Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering. IEEE, 2012, pp. 382--385.
[29]
N. Meng, M. Kim, and K. S. McKinley, "Lase: Locating and applying systematic edits by learning from examples," in Proceedings of the 2013 International Conference on Software Engineering, ser. ICSE '13. IEEE Press, 2013, p. 502--511.
[30]
V. Basili, F. Shull, and F. Lanubile, "Building knowledge through families of experiments," in IEEE Transactions on Software Engineering, 1999, pp. 456--473.

Cited By

View all
  • (2024)Understanding the Impact of Branch Edit Features for the Automatic Prediction of Merge Conflict ResolutionsProceedings of the 32nd IEEE/ACM International Conference on Program Comprehension10.1145/3643916.3644433(149-160)Online publication date: 15-Apr-2024
  • (2024)ConflictBenchJournal of Systems and Software10.1016/j.jss.2024.112084214:COnline publication date: 1-Aug-2024
  • (2023)Programming by Example Made EasyACM Transactions on Software Engineering and Methodology10.1145/360718533:1(1-36)Online publication date: 7-Jul-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ICSE '21: Proceedings of the 43rd International Conference on Software Engineering
May 2021
1768 pages
ISBN:9781450390859

Sponsors

Publisher

IEEE Press

Publication History

Published: 05 November 2021

Check for updates

Author Tags

  1. automated fixing
  2. merge conflict
  3. program synthesis

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICSE '21
Sponsor:

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Upcoming Conference

ICSE 2025

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)3
Reflects downloads up to 04 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Understanding the Impact of Branch Edit Features for the Automatic Prediction of Merge Conflict ResolutionsProceedings of the 32nd IEEE/ACM International Conference on Program Comprehension10.1145/3643916.3644433(149-160)Online publication date: 15-Apr-2024
  • (2024)ConflictBenchJournal of Systems and Software10.1016/j.jss.2024.112084214:COnline publication date: 1-Aug-2024
  • (2023)Programming by Example Made EasyACM Transactions on Software Engineering and Methodology10.1145/360718533:1(1-36)Online publication date: 7-Jul-2023
  • (2023)A Characterization Study of Merge Conflicts in Java ProjectsACM Transactions on Software Engineering and Methodology10.1145/354694432:2(1-28)Online publication date: 31-Mar-2023
  • (2023)Automatic prediction of developers’ resolutions for software merge conflictsJournal of Systems and Software10.1016/j.jss.2023.111836206:COnline publication date: 1-Dec-2023
  • (2022)Synthesizing code quality rules from examplesProceedings of the ACM on Programming Languages10.1145/35633506:OOPSLA2(1757-1787)Online publication date: 31-Oct-2022
  • (2022)Towards Merge Conflict Resolution by Combining Existing Lines of CodeProceedings of the XXXVI Brazilian Symposium on Software Engineering10.1145/3555228.3555229(425-434)Online publication date: 5-Oct-2022
  • (2022)Detecting Build Conflicts in Software Merge for Java Programs via Static AnalysisProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3556950(1-13)Online publication date: 10-Oct-2022
  • (2022)AutoTSG: learning and synthesis for incident troubleshootingProceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3540250.3558958(1477-1488)Online publication date: 7-Nov-2022
  • (2022)Program merge conflict resolution via neural transformersProceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3540250.3549163(822-833)Online publication date: 7-Nov-2022

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media