DOI: 10.1109/ESEM.2017.12
research-article

An empirical examination of the relationship between code smells and merge conflicts

Published: 09 November 2017

Abstract

Background: Merge conflicts are a common occurrence in software development. Researchers have shown the negative impact of conflicts on the resulting code quality and the development workflow. Thus far, no one has investigated the effect of bad design (code smells) on merge conflicts.

Aims: We posit that entities that exhibit certain types of code smells are more likely to be involved in a merge conflict. We also postulate that code elements that are both "smelly" and involved in a merge conflict are associated with other undesirable effects, such as being more likely to be buggy.

Method: We mined 143 repositories from GitHub and recreated 6,979 merge conflicts to obtain metrics about code changes and conflicts. We categorized conflicts as semantic or non-semantic, based on whether the changes affected the Abstract Syntax Tree. For each conflicting change, we calculated the number of code smells and the number of future bug fixes associated with the affected lines of code.

Results: We found that smelly entities are three times more likely to be involved in merge conflicts. Method-level code smells (Blob Operation and Internal Duplication) are highly correlated with semantic conflicts. We also found that code that is smelly and experiences merge conflicts is more likely to be buggy.

Conclusion: Bad code design not only impacts maintainability; it also impacts the day-to-day operations of a project, such as merging contributions, and negatively impacts the quality of the resulting code. Our findings indicate that research is needed to identify better ways to support merge conflict resolution and so minimize its effect on code quality.
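The semantic versus non-semantic distinction in the Method can be illustrated with a minimal sketch. It assumes Python source and the standard-library `ast` module purely for illustration; the abstract does not state the study's tooling or target language, and the names `ast_fingerprint` and `is_semantic_change` are hypothetical.

```python
import ast


def ast_fingerprint(source: str) -> str:
    """Return a structural dump of the code (hypothetical helper).

    ast.dump omits line and column attributes by default, so comments,
    blank lines, and spacing do not influence the result.
    """
    return ast.dump(ast.parse(source))


def is_semantic_change(before: str, after: str) -> bool:
    """True when an edit alters the syntax tree, False when it only
    touches layout or comments (assumed reading of 'semantic')."""
    return ast_fingerprint(before) != ast_fingerprint(after)


base = "def total(xs):\n    return sum(xs)\n"
# Only a comment and some spacing differ from `base`.
reformatted = "def total(xs):\n    # add everything up\n    return sum( xs )\n"
# The returned expression differs from `base`.
changed = "def total(xs):\n    return sum(xs) + 1\n"

print(is_semantic_change(base, reformatted))
print(is_semantic_change(base, changed))
```

Under this assumed definition, a conflict whose two sides differ only in comments or whitespace would count as non-semantic, while any edit that changes the tree would count as semantic.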


    Published In

    ESEM '17: Proceedings of the 11th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
    November 2017
    481 pages
    ISBN:9781509040391

    Publisher

    IEEE Press

    Publication History

    Published: 09 November 2017

    Author Tags

    1. code smell
    2. empirical analysis
    3. machine learning
    4. merge conflict

    Qualifiers

    • Research-article

    Conference

    ESEM '17
    Acceptance Rates

    Overall Acceptance Rate 130 of 594 submissions, 22%

    Article Metrics

    • Downloads (Last 12 months)4
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 05 Jan 2025

    Cited By

    • (2024) Understanding the Impact of Branch Edit Features for the Automatic Prediction of Merge Conflict Resolutions. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, pp. 149-160. DOI: 10.1145/3643916.3644433. Online publication date: 15-Apr-2024
    • (2024) ConflictBench. Journal of Systems and Software, 214:C. DOI: 10.1016/j.jss.2024.112084. Online publication date: 1-Aug-2024
    • (2023) Analysis of the Technical Debt of Software Projects Based on Merge Code Comments. Proceedings of the 17th Brazilian Symposium on Software Components, Architectures, and Reuse, pp. 21-30. DOI: 10.1145/3622748.3622751. Online publication date: 25-Sep-2023
    • (2023) An empirical study of the relationship between refactorings and merge conflicts in Javascript code. Proceedings of the XXXVII Brazilian Symposium on Software Engineering, pp. 89-98. DOI: 10.1145/3613372.3613402. Online publication date: 25-Sep-2023
    • (2023) A Characterization Study of Merge Conflicts in Java Projects. ACM Transactions on Software Engineering and Methodology, 32:2, pp. 1-28. DOI: 10.1145/3546944. Online publication date: 31-Mar-2023
    • (2023) Automatic prediction of developers’ resolutions for software merge conflicts. Journal of Systems and Software, 206:C. DOI: 10.1016/j.jss.2023.111836. Online publication date: 1-Dec-2023
    • (2023) Code smell prioritization in object-oriented software systems. Journal of Software: Evolution and Process, 35:12. DOI: 10.1002/smr.2536. Online publication date: 29-Jan-2023
    • (2022) The Private Life of Merge Conflicts. Proceedings of the XXXVI Brazilian Symposium on Software Engineering, pp. 353-362. DOI: 10.1145/3555228.3555240. Online publication date: 5-Oct-2022
    • (2022) Detecting Build Conflicts in Software Merge for Java Programs via Static Analysis. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, pp. 1-13. DOI: 10.1145/3551349.3556950. Online publication date: 10-Oct-2022
    • (2022) ConE: A Concurrent Edit Detection Tool for Large-scale Software Development. ACM Transactions on Software Engineering and Methodology, 31:2, pp. 1-26. DOI: 10.1145/3478019. Online publication date: 30-Apr-2022
