Live Code Smell Detection of Data Clumps in an Integrated Development
Environment
Nils Baumgartner, Firas Adleh and Elke Pulvermüller
Research Group Software Engineering, Institute of Computer Science, Department of Mathematics and Computer Science,
University of Osnabrück, Osnabrueck, Germany
{nils.baumgartner, fiadleh, elke.pulvermueller}@uni-osnabrueck.de
Keywords: Code Smell, Data Clumps, Refactoring, Integrated Development Environment
Abstract: Code smells in software systems create maintenance and extension challenges for developers. While many
tools detect code smells, few provide refactoring suggestions. Some of the tools support live detection in an
integrated development environment. We present a tool for the live detection of data clumps in Java with
generated suggestions and semi-automatic refactoring. To achieve this, our research examines projects and
their associated abstract syntax trees and analyzes the types of variables. Thereby, we aim to detect data clumps, a
type of code smell, and generate suggestions to counteract them. We implemented our approach to live data
clumps detection as an IntelliJ integrated development environment application plugin. The live detection
achieved a median of less than 0.5 s for the ArgoUML software project, which we analyzed as an example.
From over 1500 investigated files, our approach detected 125 files with data clumps, while CBSD (Code
Bad Smell Detector) detected 97. For both approaches, 92 of the files found were
the same. We combined the manual steps for refactoring, resulting in a semi-automatic elimination of data
clumps.
1 INTRODUCTION
Expenses for the continuing development and mainte-
nance of software projects are not negligible aspects
and must be considered during the planning of such
projects. Maintaining software may account for be-
tween 40 % and 75 % of the total costs, according to
(Brown et al., 1998). For software development, these
costs include the time for employees to learn a new li-
brary or the structure of an existing project. However,
a means of lowering the cost of maintenance may
already exist during the setup of a software project
via flexible and future-oriented structures and a clean
code. One such way to maintain software is refactor-
ing, which improves the software quality and, thus,
its maintainability without affecting its recognizable
behavior (Becker et al., 1999). The exact criteria to
identify clean, good source code are varied and
not sharply defined. In contrast, recognizing
bad source code is easier and is often associated with
the terms “anti-pattern” and “code smell”, the latter
coined by Kent Beck and adopted by Martin Fowler
(Becker et al., 1999). Code smells are places in the
implementation that point to potential errors or main-
tenance problems.
A well-known type of code smell is data clumps.
According to (Lacerda et al., 2020), data clumps
are among the top 10 code smells and are the sec-
ond most common in the web domain (Delchev and
Harun, 2015). Thus, they present a problem in soft-
ware projects that should not be underestimated. Data
clumps are a group of variables that appear together
in different areas in a source code and that point to a
possible new data structure. Due to the distribution
of data clumps across a software project, detection is
difficult for a developer.
Integrated development environments (IDEs)
provide useful tools for a developer
and facilitate the work on software. Sup-
port for the automatic, timely and early detection
of data clumps within an IDE is an important issue
for software development and the associated develop-
ment costs. Already, (Simon et al., 2001), (Gronback,
2003), (Salehie et al., 2006), and (Habra and Lopez
Martin, 2006) have employed metrics to detect code
smells. In addition to detection, an equally impor-
tant point is refactoring the data clumps, for which
manual implementation can be time-consuming and
monotonous. A useful effort, therefore, would be an
automatic or semi-automatic tool for this.
In this study, we approach a means of detection
and support for eliminating data clumps integrated in
an IDE to ease the developer’s work. Furthermore,
we demonstrate that the detection of data clumps can
be sufficiently fast so that live code smell detection
is possible. Finally, we evaluate how our approach
performs against comparable tools for detection.
The remainder of this paper is organized as fol-
lows. Section 2 provides an overview and background
of the method presented in this study. Section 3 dis-
cusses related approaches, and Section 4 describes
our proposed method for detecting and refactoring
data clumps. In Section 5, we present our evaluation
and results, which are discussed in Section 6, along
with the challenges and limitations of our approach.
Finally, Section 7 presents closing remarks and out-
lines recommendations for future work.
2 BACKGROUND
This section provides the background to our approach.
To develop a form of live code smell detection, we
first focus on the term code smell in Section 2.1. Then
in Section 2.2, we provide a definition of data clumps,
followed by the procedure for refactoring them in
Section 2.3. Subsequently, the abstract syntax tree
(AST), a representation of a source code, is explained
in Section 2.4, followed in Section 2.5 by a definition
of a program structure interface (PSI) built on top of
the AST.
2.1 Code Smell
A code smell is not necessarily a bug but is often an
indicator of a deeper problem (Fowler, 2019). Struc-
tures that must be improved can be identified by
code smells. Various domains can have specific code
smells and, therefore, different impacts on the areas
of architecture, database, design and implementation
(Sharma and Spinellis, 2017). Poor design as a symp-
tom of code smells poses a potential risk for future
bugs and loss of code structure. Additionally, it has
a negative impact on code in terms of understandabil-
ity, testability, extensibility and reusability (Fowler,
2019). As a result, code smells can have a nega-
tive impact on maintainability. A smell can enter a
project in various ways, such as inattention, knowl-
edge gaps, a change in requirements, chosen tech-
nologies and frameworks, work processes, organiza-
tional structure, team culture or poor resource plan-
ning (Sharma and Spinellis, 2017). Each of these
factors can influence a project and increase its costs.
To avoid a reduction in quality, refactoring may be
applied to a software project (Becker et al., 1999).
Fowler designates no fixed time for the refactoring but
states that it may be included in the workflow.
Different sources (Zhang et al., 2008) and
(Fowler, 2019) provide various definitions and quan-
tities of code smells, which result from the subjec-
tive definition of a particular code smell (Mäntylä and
Lassenius, 2006). Therefore, a complete list of code
smells is not available. A list of the originally defined
(Fowler, 2019) and most frequently mentioned code
smells can be found in (Lacerda et al., 2020), which
proposes measures to counteract them. Data clumps
are on this list.
2.2 Data Clumps
One manifestation of code smells is data clumps,
which are mentioned in Fowler’s list (Fowler, 2019)
along with countermeasures. Fowler defines them as
data items that tend to be like children: They en-
joy hanging around together (Fowler, 2019). These
groups of data items can be found in various places
such as class fields and method parameters.
For a more precise definition, (Zhang et al., 2008)
and (Hall et al., 2014) may be consulted, wherein ex-
perts were asked for their explanation of data clumps.
Their resulting definition is divided into two in-
stances: fields and parameters.
2.2.1 Fields Instance
According to (Zhang et al., 2008) and (Hall et al., 2014), a fields instance of data clumps is present if:
- More than three data fields are shared in two or more classes.
- The data fields have the same signatures, consisting of names, types and visibility.
- Instance fields are not necessarily found in the same order and can be distributed over an instance.
An example of data clumps in this instance is pre-
sented in Listing 1, which contains two classes that
both share the same fields: foo, bar and foobar. These
fields may be extracted into another class to eliminate
the code smell.
Listing 1: Fields Instance Example of Data Clumps.

public class MyClass {
    private int foo;
    private int bar;
    private int foobar;
    public void method() { }
}

public class MyOtherClass {
    private int bar;
    private int foo;
    public void method(int c) { }
    private int foobar;
}
Although the definition requires that the names
must be identical, in our understanding, semantic
equality rather than exact equivalence is more rele-
vant.
2.2.2 Parameters Instance
According to (Zhang et al., 2008) and (Hall et al., 2014), a parameters instance of data clumps is present if:
- More than three input parameters are shared in two or more method declarations.
- The input parameters have the same signatures, consisting of names, types and visibility.
- Method parameters are not necessarily found in the same order.
- The same inheritance hierarchy and method signature are not present in these methods.
An example of data clumps in this instance is
presented in Listing 2, which contains two classes
that both employ a method in which the parameters
foo, bar and foobar share the same signature. These
parameters may be replaced with a class containing
these parameters to eliminate the code smell.
Listing 2: Parameters Instance Example of Data Clumps.

public class MyClass {
    public void method(
        String s, int foo,
        int bar, int foobar
    ) { }
}

public class MyOtherClass {
    public void method(
        int bar, int x,
        int foo, int foobar
    ) { }
}
2.3 Data Clumps Refactoring
In this section, the procedure to counteract data
clumps described by Fowler in (Fowler, 2019) is ex-
plained briefly. In the first step, Fowler suggests iden-
tifying a data clump. Thereupon, by means of Extract
Class (Fowler, 2019), for example, the data fields are
to be extracted into an object. With Extract Class, a
class with too many fields and methods may be sep-
arated into two classes, which improves the under-
standing of both of them. Then, the input param-
eters of methods must be checked and replaced, if
necessary, such as using Introduce Parameter Object
or Preserve Whole Object (Fowler, 2019). Preserve
Whole Object reduces the size of a parameter list by
passing the whole object to a method instead of only
the necessary parameters. Introduce Parameter Ob-
ject shortens the size of a parameter list by grouping
parameters that always appear together in method sig-
natures into an object. Fowler argues that the benefit
of Introduce Parameter Object or Preserve Whole Object
is directly apparent through the shortened parameter
lists or simplified method calls.
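As an illustration, the following is a minimal, hedged sketch of Introduce Parameter Object in Java, using the ask/greet example that reappears in Fig. 3 and Fig. 4; the names GreeterBefore, GreeterAfter and the stub Person are illustrative and not taken from the paper's implementation.

class Person { /* extracted fields with getters and setters, cf. Section 4.3 */ }

class GreeterBefore {
    // Before: the same three parameters travel together through every signature.
    public void ask(String firstname, String lastname, int age) { /* ... */ }
    public void greet(String firstname, String lastname, int age) { /* ... */ }
}

class GreeterAfter {
    // After: the clump is grouped into one parameter object.
    public void ask(Person person) { /* ... */ }
    public void greet(Person person) { /* ... */ }
}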
Fowler addresses the problems and code smells
resulting from the refactoring of data clumps. A re-
sulting code smell from this refactoring is the data class;
Fowler therefore advises against producing,
for example, a mere record structure. The purpose of a data
class is only to store data without further functional-
ity. For refactoring data clumps, it is not problematic
to use only some fields of the new objects, provided
that at least two fields have been replaced with the
new object. After refactoring, it is important to look
for the feature envy type of code smell, which exists
when the purpose of two methods in different classes
is only to communicate with each other. The class
newly created during the refactoring of data clumps
may be enriched with meaningful behavior structures.
This, in turn, helps to avoid many code duplications
and speeds up future development (Fowler, 2019).
To the best of our knowledge, there are only two
different refactoring types for data clumps. The first
involves manual steps for Extract Class, Introduce
Parameter Object and Preserve Whole Object. The
second type involves machine learning, which is not
within the scope of this study.
2.4 Abstract Syntax Tree
For the analysis of source code files, a representation
of the source code in a suitable data structure is helpful.
An AST is a representation of the source code in
the form of a tree. Here, the individual components
of the code, such as expressions, operators, literals and
variables, are assigned to groups and appended to the
root node. By traversing the tree, each
place in a program can be addressed.
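As a library-agnostic illustration of building and traversing an AST (the paper's own tool relies on the PSI described next), the following hedged sketch uses the open-source JavaParser library to parse a class and visit its field-declaration nodes:

import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.FieldDeclaration;

public class AstDemo {
    public static void main(String[] args) {
        // Parse source code into an AST rooted at a CompilationUnit.
        CompilationUnit cu = StaticJavaParser.parse(
                "public class MyClass { private int foo; private int bar; }");
        // Traverse the tree and print the name of every field-declaration node.
        cu.findAll(FieldDeclaration.class).forEach(field ->
                System.out.println(field.getVariable(0).getNameAsString()));
    }
}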
2.5 Program Structure Interface
As a layer on top of the AST, a PSI can be used. As
part of the IntelliJ Platform, it parses files and creates
a syntactic and semantic code model. With the PSI,
code inside the IDE can be highlighted. As a result,
the highlighted code may present the developer with
a description of the problem.
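For illustration, the following hedged sketch shows how a PSI element can be queried on the IntelliJ Platform API; printFields is an illustrative helper, not part of the plugin, and it lists exactly the kind of name and type information that data clump detection compares:

import com.intellij.psi.PsiClass;
import com.intellij.psi.PsiField;

public final class PsiDemo {
    // Print the name and type of every field of a class in the PSI code model.
    static void printFields(PsiClass psiClass) {
        for (PsiField field : psiClass.getFields()) {
            System.out.println(field.getName() + " : "
                    + field.getType().getPresentableText());
        }
    }
}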
3 RELATED WORK
Several proposals have recently examined the detec-
tion and correction of code smells. Many studies and
investigations (dos Santos Neto et al., 2015), (Khrishe
and Alshayeb, 2016), (Mehta et al., 2018), (Palomba
et al., 2018) and (Guggulothu and Abdul Moiz, 2019)
have been performed to identify the optimal sequence
of refactoring steps to remove a code smell. The order
of detecting and refactoring code smells may have an
impact on the resulting software quality. According to
(dos Santos Neto et al., 2015), the literature on code
smells may be categorized into four groups: code
smell detection, code smell correction, code quality
evaluation and preservation of observable behavior.
The primary search for related work is directed at the
first two groups because of the focus of our approach.
In the code smell removal experiments in (Ar-
celli Fontana et al., 2015), no automatic approach is
suggested or named for refactoring data clumps. In-
stead, the recommended refactoring steps align with
those in (Fowler, 2019). To perform these steps,
the following tools are specified in (Arcelli Fontana
et al., 2015): Eclipse (Eclipse Foundation, 2022),
IntelliJ integrated development environment applica-
tion (IDEA) (JetBrains, 2022), and RefactorIT (Aqris
Software, 2016), although the last one does not sup-
port the Extract Class feature.
According to the review in (Lacerda et al., 2020)
of eight scientific databases and the 40 identified
secondary studies between 1992 and 2018, there is
currently no tool for automatic refactoring for data
clumps. To the best of our knowledge, a tool for live
code smell detection with refactoring options for data
clumps remains an innovation of this study, even from
2018 until 2022. Thus, most tools for code smells fo-
cus on detection or visualization, and only a few offer
refactoring suggestions at all. According to (Lacerda
et al., 2020), there are still code smells from Fowler’s
list for which tools with refactoring suggestions do
not yet exist. Below, we present an overview of the
frequently cited tools for code smell detection from
(Lacerda et al., 2020), (Felix and Vinod, 2016), (Pes-
soa et al., 2012) and (Fernandes et al., 2016) extended
by our own experiments with the following tools.
cASpER (De Stefano et al., 2020) is an IntelliJ
IDEA plugin that aims to assist developers in the
identification and refactoring of code smells ex-
cluding data clumps. The plugin provides visual
and semi-automated support for detecting and re-
moving four different types of code smells.
CBSD (Code Bad Smell Detector) (Hall et al.,
2013), is a standalone tool that can examine Java
source-code files for five types of code smells.
These code smells may be found in Fowler’s list
(Fowler, 2019), but some have not been studied
thoroughly (Lacerda et al., 2020). CBSD uses an
AST approach to discover data clumps and other
code smells. The results can be viewed in detail
using an extensible markup language (XML) ex-
port or a graphical user interface (GUI).
CCFinder (CCFinder, 2008) is a standalone code
clone detection tool. It is token-based and uses a
suffix tree matching algorithm. This tool supports
various programming languages such as Java, C,
C++ and others.
JDeodorant (Mazinanian et al., 2016) is a plu-
gin for Eclipse that detects 5 code smells exclud-
ing data clumps in Java source code and provides
refactoring suggestions. This is one of the few de-
tection tools that supports refactoring suggestions.
PMD (PMD, 2023) is a tool for static analysis of
Java source code. This tool scans the source code
and looks for possible problems including bugs,
dead code, improvable code, simplified expres-
sions and duplicate code. This tool finds unnec-
essary variables, methods, statements and loops.
PMD can be integrated with a variety of other
tools such as Eclipse, IntelliJ IDEA and many oth-
ers.
RefactorIT (Aqris Software, 2016) is a plugin
for Eclipse and NetBeans development environ-
ments. It offers the ability to detect and refactor
code smells such as data clumps. RefactorIT has a
limitation in comparison to other refactoring tools
as it does not support the Extract Class refactoring
technique, as reported in (Arcelli Fontana et al.,
2015).
Stench Blossom (Murphy-Hill and Black, 2010)
uses visual elements to provide developers with a
quick and comprehensive overview of a variety of
code smells in the source code. This tool is avail-
able for Eclipse as a plugin and provides different
views for visualizing eight code smells. Feedback
to the developer is provided via a series of bars,
in the form of petals, at the edge of the IDE ed-
itor, and the size of the petals is used for assess-
ment and relevance to the developer. The plugin
does not offer refactoring suggestions. According to
(Felix and Vinod, 2016), it is able to detect data
clumps.
Further tools exist whose detailed description is beyond the
scope of this paper. These include NosePrints (Parnin
et al., 2008), Borland Together (Micro Focus, 2023),
inCode (Intooitus srl, 2013), inFusion (Intooitus srl,
2012) and FindBugs (Rutar et al., 2004). Some
are outdated, some have low download numbers and
others can no longer be found.
In the approaches examined (Stench Blossom and
CBSD), the focus is only on the detection of data
clumps. In this research, we take it a step further and
demonstrate the possibility of live detection of the
data clumps code smell, as well as support developers with
recommendations for semi-automatic refactoring.
4 APPROACH
The live code smell detection of data clumps is pro-
posed as a plugin for IntelliJ to provide support to
the development of a software project by means of
live, or at least fast, feedback for the developer. Our
plugin, called Live Code Smell Detection (LCSD), is
Java-based.
In the following passages, the general structure of
our approach is discussed, along with the associated
configuration options. This is followed by a more de-
tailed description of our approach, which consists of
three phases, illustrated in Fig. 1: detecting, report-
ing and refactoring.
Figure 1: Phases of our approach (Phase 1: Detecting, Phase 2: Reporting, Phase 3: Refactoring).
In short, our tool works as follows. After an initial
phase to load the plugin, the first phase, detecting,
passes the project or files to be examined from the
IDE to the tool. After the developer makes changes to
the project, the changed files are examined and com-
pared with other classes, methods and fields that are
of interest for the detection of data clumps. There
is also the possibility of extending the tool to detect
other code smells.
In the second phase, reporting, this information
is presented to the user via the IDE’s interface so
that they can perceive and react to the detection. In
this phase, information about the positions and occur-
rences of possible data clumps is prepared and pre-
sented. If a developer decides to refactor and provides
the necessary input (name for the new class), refactor-
ing is started.
In the third phase, refactoring, the refactoring
proposal accepted by the developer is applied. The
provided name of the new class is used, and the refac-
toring steps suggested by Fowler (Fowler, 2019) are
applied using the IDE’s interfaces.
To realize these three phases, our tool needs to
adapt to the interfaces provided by the IDE for which
the plugin is built. For our approach, we
first adapted these interfaces for the IntelliJ IDE.
Therefore, in the following description, we generally
refer to the IntelliJ application programming in-
terface (API), which may be replaced by interfaces
from other IDEs.
The simplified unified modeling language (UML)
diagram of our approach with the dependency on the
IDE is illustrated in Fig. 2. The class diagram has
been simplified for clarity. At the core of our ap-
proach is the class Inspection, which is responsible
for starting the process of detection. In addition, this
class has the task of presenting the detected feedback
to the user.
Figure 2: Class diagram of the general LCSD structure (notation UML 2.5). The diagram shows the Inspection class associated with Visitor (0..*) and Fix (0..*) classes, the components CacheManager, Utilities, RefactoringService and RefactoringDialog, and the IntelliJ Platform API types LocalInspectionTool, LocalQuickFix and JavaElementVisitor.
At the initial phase to load the plugin, when
opening a project, the CacheManager collects and
prepares information about the entire project. The
CacheManager allows quick access to relevant infor-
mation, such as the list of PSI representations of all
classes. After a change to the source code or after a
scan has been performed, each Visitor component is
presented with the affected code. A Visitor is respon-
sible for scanning the AST of the source code to re-
veal a particular code smell. The AST is provided by
the IntelliJ IDEA. A Fix class is responsible for per-
forming an action to eliminate a specific code smell.
As both the Visitor and Fix classes are highly dependent
on the Inspection class, both are illustrated
as a composition. To extend our approach
for a specific code smell, a Visitor and a Fix must be
implemented. Consequently, multiple Visitor and Fix
classes may be illustrated with the one-to-many rela-
tionship to the Inspection class.
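The following is a minimal, hedged sketch of this wiring on the public IntelliJ Platform API; DataClumpInspection, DataClumpFix and the simple parameter-count check are illustrative stand-ins for the actual implementation, whose detection logic is described in Section 4.1.

import com.intellij.codeInspection.AbstractBaseJavaLocalInspectionTool;
import com.intellij.codeInspection.LocalQuickFix;
import com.intellij.codeInspection.ProblemDescriptor;
import com.intellij.codeInspection.ProblemsHolder;
import com.intellij.openapi.project.Project;
import com.intellij.psi.JavaElementVisitor;
import com.intellij.psi.PsiElementVisitor;
import com.intellij.psi.PsiMethod;
import org.jetbrains.annotations.NotNull;

public class DataClumpInspection extends AbstractBaseJavaLocalInspectionTool {
    @Override
    public @NotNull PsiElementVisitor buildVisitor(@NotNull ProblemsHolder holder,
                                                   boolean isOnTheFly) {
        // The Visitor scans the PSI/AST for a particular code smell.
        return new JavaElementVisitor() {
            @Override
            public void visitMethod(@NotNull PsiMethod method) {
                // Stand-in check only; the real detection compares the
                // signature against all other methods in the project.
                if (method.getParameterList().getParametersCount() > 3) {
                    holder.registerProblem(method, "Possible data clump",
                            new DataClumpFix());
                }
            }
        };
    }

    // The Fix performs the action that eliminates the reported smell.
    static class DataClumpFix implements LocalQuickFix {
        @Override
        public @NotNull String getFamilyName() {
            return "Extract data clump into a new class";
        }

        @Override
        public void applyFix(@NotNull Project project,
                             @NotNull ProblemDescriptor descriptor) {
            // Apply Fowler's refactoring steps (Section 4.3) via the PSI.
        }
    }
}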
After a user selects the feedback displayed by
the Inspection component, they are guided through a
refactoring dialog using the RefactoringDialog com-
ponent. More details about the data clumps found are
then presented, such as names and types of the du-
plicated variables with the affected methods, classes,
and files. In addition, the user can enter a name for the
new class. If the user approves the refactoring pro-
posal, the related Fix component receives the relevant
information and is responsible for manipulating the
AST. There is one Fix component for each of the two
data clump expressions, parameters and fields. The
Fix component uses the data clump refactoring ac-
tions from the RefactoringService component and the
generally supporting methods from the Utilities com-
ponent, such as extracting variables, creating getter
and setter methods or counting common parameters.
4.1 Detection
A correct detection of data clumps depends on the ex-
act definition. For this purpose, Section 2.2 provides
definitions of data clumps. Due to the subjectivity
in detecting data clumps, users may have a different
definition. Therefore, a configuration of parameters
for the definition of data clumps would be useful. To
support various interpretations of data clumps, we in-
cluded a way to configure these parameters and other
settings in our approach. In our tool, the configuration
can be made using the IDE interface. By default, the
configuration is set according to the improved defini-
tion of data clumps (a sketch of such a settings object follows the list):
- More than three data fields are shared in two or more classes.
- More than three parameters are shared in two or more methods.
- Whether the hierarchy for method parameters should be considered.
- Whether the hierarchy for data fields should be considered.
- The severity level of data clumps (i.e., whether they should be considered as information, warning or error).
- Data clumps are detected with a frequency of two repetitions.
- More than two groups of variables for data fields.
- More than two groups of variables for method parameters.
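A hedged sketch of the defaults above as a simple settings holder; the field names are illustrative and not the plugin's actual configuration keys:

public class DataClumpSettings {
    // Exclusive lower bounds, mirroring the list above ("more than three ...").
    public int sharedFieldsThreshold = 3;
    public int sharedParametersThreshold = 3;
    public int variableGroupsThreshold = 2;      // "more than two groups"
    public boolean considerParameterHierarchy = false;
    public boolean considerFieldHierarchy = false;
    public int repetitions = 2;                  // detected at two repetitions
    public Severity severity = Severity.WARNING;

    public enum Severity { INFORMATION, WARNING, ERROR }
}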
In our approach, an AST is used for the actual de-
tection of code smells and, therefore, data clumps.
This AST is already provided by the IntelliJ PSI at
this point. For this, a Visitor component visits the PSI
code element that represents the source code part to
be examined. The Visitor uses methods provided by
the Utilities component. If a method signature is ex-
amined, it is compared with all other methods in the
project in terms of data clumps. However, if a field
in a class is examined instead, the entire class and its
associated fields are compared to all other classes and
their fields in the project with respect to data clumps.
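To illustrate this comparison, the following hedged, IDE-independent sketch counts the parameters two methods share by name and type, regardless of order; Param and sharedParameters are illustrative names, not the tool's actual implementation:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ClumpCheck {
    // A parameter matches if both name and type are equal
    // (records provide value-based equals/hashCode).
    record Param(String name, String type) {}

    // Order-insensitive count of parameters present in both signatures.
    static long sharedParameters(List<Param> a, List<Param> b) {
        Set<Param> other = new HashSet<>(b);
        return a.stream().filter(other::contains).count();
    }

    public static void main(String[] args) {
        // The two method signatures from Listing 2.
        List<Param> m1 = List.of(new Param("s", "String"), new Param("foo", "int"),
                new Param("bar", "int"), new Param("foobar", "int"));
        List<Param> m2 = List.of(new Param("bar", "int"), new Param("x", "int"),
                new Param("foo", "int"), new Param("foobar", "int"));
        System.out.println(sharedParameters(m1, m2)); // 3: foo, bar and foobar
    }
}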
Two types of scans can be distinguished: a live scan
and a full scan.
4.1.1 Live Scan
Rapid feedback to the user about issues and errors
may be helpful and it may be implemented via live
feedback within the IDE. Information about potential
errors is immediately displayed to the user, allowing
this direct feedback to leverage the potential of the
testing effect (Kühl et al., 2019). Live scanning sup-
port is provided using the IDE interface, and continu-
ous scanning leads to an increased load on the system
being executed. Therefore, the live scan begins only
when a new source code file is opened or when the user
finishes writing code elements that might include data
clumps, such as method signatures and class fields.
4.1.2 Full Scan
While a live scan examines only the current file and
the directly associated components in a project, it
might be of interest to identify all data clumps within
the entire project. Therefore, our approach allows the
option of a full analysis, which takes longer than the
live scan due to the larger number of files to exam-
ine. There are various reasons to have a project com-
pletely analyzed, such as for a performance compari-
son with other tools and their findings, or when an ex-
isting project should be improved. When a user com-
pletes a full scan, feedback is provided within the IDE
via reporting.
4.2 Reporting
An important issue is not to disturb the developer un-
necessarily, and a user should not be overwhelmed
with information (Murphy-Hill and Black, 2010).
Thus, the references to information, unless explic-
itly desired and requested, should be conveyed to the
user in a subtle manner. Furthermore, a user is more
familiar with managing known workflows. There-
fore, in our approach, IntelliJ’s workflow has been
followed to report issues regarding data clumps. The
code smells detected by our approach are presented
in the identical manner in which dead code or dupli-
cate code is identified by IntelliJ. Our goal is that the
user can employ the tool to detect and remove data
clumps in a familiar way. The affected code parts are
highlighted in the editor according to the designated
severity level.
Furthermore, it is possible for a user to view prob-
lems found in the project as a list within IntelliJ. This
information about the detected issues or warnings is
displayed in the inspection results window, which also
lists other problems such as dead code or spelling
errors. In this window, the occurrences of the data
clumps are highlighted in more detail.
Figure 3 depicts an example of reporting for a class
with data clumps in two methods. These methods (ask
and greet) share the three parameters firstname,
lastname and age. These parameters form a data
clump, which was discovered live by our approach
and is displayed to the user. Fig. 3 presents the hint in
the lower area after the user has hovered their mouse
over the problematic location. The user may then be-
gin the refactoring process by selecting the Extract
method.
4.3 Refactoring
After the tool has detected a code smell or data clump,
the user can act on this report. The tool offers guidance
once the decision has been made to fix the problem.
The user is informed about the issue in detail using the
RefactoringDialog component and may enter a name
for a new class and agree to the refactoring. The ac-
tual refactoring is based on the refactoring steps rec-
ommended by Fowler (Fowler, 2019): Extract Class,
Introduce Parameter Object and Preserve Whole Ob-
ject. First, the affected parameters or fields are ex-
tracted into a new class with private visibility. While
a record class would be sufficient for this step, Fowler
explicitly recommends creating a class. For this new
class, a user may identify functions in other parts of
the project that could be moved into the class.
The extraction requires the new class name ob-
tained from the user via the dialog field. For the
extracted fields or parameters, new getter and set-
ter methods are included in the created class. This
provides access to the private fields.
In the second step, all affected parameters or fields in the project are refactored to use an instance of the class created in the previous step:
- All signatures and bodies of the affected methods in parameter instances are modified to use the new class.
- In field instances, the affected fields are replaced with an instance of the extracted class.
The user can revoke all modifications during the
refactoring in one step, as can be done for any other
action in IntelliJ IDEA.
If this refactoring is performed repeatedly, it may
result in another problem: the creation of duplicate
classes. To prevent this, our approach searches for
suitable classes that are characterized by the fact that
they have fields matching the parameter or field to be
refactored. The user can then decide whether to create
a new class or to use the found class. Such an option
may be helpful for new developers who are involved
in the project.
Fig. 4 depicts an example of refactoring of a data
clump. The starting point was the code from Fig. 3.
During the refactoring dialog, the user provided the
information that the new file is called “Person”. Fig. 4
presents two files: The file on the left corresponds to
the modified original file in which the data clumps
have been replaced by the newly introduced class,
while the file on the right presents the automati-
cally introduced new class with the user-defined name
“Person”. This class was automatically created with
a constructor, private fields and associated getter and
setter methods.
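For illustration, a hedged sketch of the kind of class this step generates for the example of Fig. 4, with a constructor, private fields and the associated getter and setter methods:

public class Person {
    private String firstname;
    private String lastname;
    private int age;

    public Person(String firstname, String lastname, int age) {
        this.firstname = firstname;
        this.lastname = lastname;
        this.age = age;
    }

    public String getFirstname() { return firstname; }
    public void setFirstname(String firstname) { this.firstname = firstname; }
    public String getLastname() { return lastname; }
    public void setLastname(String lastname) { this.lastname = lastname; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}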
4.4 Extensibility
Some code smell types have common features and are
highly similar, such as long method and data clumps.
Our system architecture aims to further integrate code
smell refactorings. To do so, we moved general func-
tions to the Utilities component so that extensions of
the tool may access them. Furthermore, extensions
may use the CacheManager, in which the PSI repre-
sentations of all classes in a project are maintained
with the information about the super classes and inter-
faces. We have successfully extended our presented
approach for testing purposes: The extension is for
detecting and refactoring global data. We used the
definition and suggestions in (Fowler, 2019) to coun-
teract global data. The methods for generating getter
and setter functions in the refactoring of data clumps
can be reused. Further verification of the accuracy,
and measurement of the time required for use as a live
scan, remain to be done.
Figure 3: Example of live detected data clumps (before executing the refactoring step).
Figure 4: Result after executing the refactoring step for the example of Fig. 3.
5 EVALUATION
This section discusses detection accuracy. In addi-
tion to accuracy, speed is important in live code smell
detection. All evaluations and tests were performed
on the same computer with an Intel Core i7-6700HQ
CPU and with 16 GB RAM, running a 64-bit version
of Windows 10.
5.1 Accuracy
To assess the accuracy of our approach, we compared
our results with those from (Hall et al., 2014). Stench
Blossom does not support a full scan of entire projects
with a text output, so we have not considered this tool
when checking accuracy. In (Hall et al., 2014), the
CBSD tool was used to search for code smells in three
open-source projects, one of which is ArgoUML (ver-
sion 0.26 Beta). The results for ArgoUML were then
manually reviewed by two people in the study (Fer-
enc et al., 2020) and later were published in the Uni-
fied Bug Data Set (Ferenc et al., 2019), in which the
number of occurrences of data clumps in the corre-
sponding source code files was listed. For the com-
parison between our approach and the data clumps
found by CBSD, the definition of data clumps used
by CBSD has been applied, which can be summarized
as follows: There must be at least three duplicates of
parameters or fields on two different classes, and du-
plicates cannot occur between classes with hierarchi-
cal dependency. During this comparison, we noticed
a challenge regarding the method for counting data
clumps.
This challenge is illustrated in the following ex-
ample: A method shares parameter duplicates in com-
mon with two other methods in different classes.
Thus, the question arises, are all occurrences of the
same duplicated fields and parameters counted as:
1. one data clump?
2. individual data clumps?
3. individual data clumps, where the original is not
counted?
These different counting rules may distort the re-
sults. For the comparison, we only consider whether
a data clump was detected in a file.
In the results from (Ferenc et al., 2020) for Ar-
goUML (version 0.26 Beta), 97 files containing data
clumps were detected using CBSD after we removed
non-existing file entries. Our approach found 125
files with data clumps. However, only 92 files were
the same. We analyzed the 5 files with data clumps
our approach did not find. For 2 files, based on man-
ual inspection, we are confident that our approach did
not miss any data clumps. For the remaining 3 files,
the decision is not clear after manual inspection. We
examined the 33 additional files in which we found
data clumps. Among them, 2 files contain enum-like
classes, for which it is unclear whether
these are data clumps. The manual examination of the
remaining 31 files confirmed that the detections were
data clumps. It is worth mentioning that there may
be additional data clumps that are neither detected by
our tool nor listed in the dataset.
Furthermore, our tests revealed several differences
in the detection of data clumps between Stench Blos-
som, CBSD and our approach. We have created
seven files for testing, each containing different data
clumps. The seven tests were for: data clumps in
simple parameters between two classes, data clumps
in simple class fields between two classes, polymor-
phism data clumps where a class extends another,
data clumps in an anonymous class, data clumps
in interfaces, data clumps in inner classes and data
clumps between two methods in the same class. In the
following, we describe the tests in which the tools
were not successful.
Stench Blossom could not detect the following:
data clumps in simple class fields, data clumps in
anonymous classes and data clumps in interfaces.
CBSD could not detect the following: polymor-
phism data clumps, data clumps in an anonymous
class, data clumps in interfaces, data clumps in in-
ner classes and data clumps between two methods in
the same class.
We tested our tool in all those scenarios. Our tool
was able to detect all of these special data clump cases.
5.2 Speed
For plugins that are meant to support live code smell
detection, the required time is essential. Time delays
could otherwise negatively affect the workflow of a
developer. This is in contrast to the analyses of ex-
isting projects in which live detection may be consid-
ered less relevant. For the measurements, we com-
pared our approach with Stench Blossom and CBSD.
Because our approach supports both scenarios,
live detection as well as a full scan, the
evaluation of the speed was separated into two parts.
The results of the live detection are discussed first,
followed by the measurement results of full scans.
5.2.1 Live Detection
According to (Miller, 1968) and (Nielsen, 1993), re-
action times up to 0.1 seconds are considered instan-
taneous. Response times up to 1 second are defined
as the limit at which a user’s thought processes are
not interrupted, although a delay may be perceived.
Reaction times of 10 seconds are the limit at which a
user’s attention persists. Therefore, for live detection
of code smells, we aimed for a maximum of 1 second.
Below, we first compared our approach with
Stench Blossom. The CBSD tool was ignored here,
as it supports only a complete scan of a project. For
the comparison with Stench Blossom, we have modi-
fied the source code so that only the detection of data
clumps was activated, and a timer was included to
measure the required time.
The open-source project ArgoUML (version 0.26
Beta) was used as the basis for the time measurement
against Stench Blossom. From more than 1500 source
code files, the 20 largest were used for the evaluation.
We assumed that larger classes require more time than
smaller ones. For this time measurement, it should be
noted that the initial time of about 5 seconds to open
the project in the IDE is not considered. To obtain
the results, we repeated the measurements 10 times.
Fig. 5 depicts the results of the time measurement as
a boxplot. In this, the X-axis illustrates the LCSD
(our approach) and Stench Blossom tools, while the
Y-axis displays in seconds the time measured to open
a source code file and analyze it for data clumps.
From the figure, it can be seen that Stench Blossom
produced only small deviations in the required time.
In contrast, our approach revealed larger variations,
reaching values of 3 seconds at the upper outliers, which
were caused by the largest files. However, based on
the lower median of our approach, it can be concluded
that less time was required for the data clump analy-
sis for most files. For more than 50 % of all files, the
time needed to scan with our approach was less than
1 second.
Furthermore, we conducted tests to assess the fea-
sibility of live scan file analysis and evaluated the re-
quired time. To gauge its practicality, we analyzed
five open-source projects of varying sizes. The small-
est project, Flyway 8.1, had approximately 26 KLOC
(thousands of lines of code), and the largest project,
Flowable 6.7.2, contained approximately 680 KLOC.
In this analysis, we selected the 20 largest files from
each of the projects and measured the time needed for
the analysis of data clumps. The results are depicted
in Fig. 6 as a boxplot. The figure presents the five
projects along the X-axis and plots the required time
for analyzing a file for data clumps on the Y-axis. The
median remained below 1 second in all projects. In
four of five projects, for more than 50 % of all files,
the required time to scan with our approach was less
than 1 second. It remains open to investigate which
files were responsible for the outliers and what triggered
this increased duration in the analysis.

Figure 5: Time required to scan for data clumps per file from the 20 largest files in ArgoUML.

Figure 6: Time required to scan for data clumps per file from the 20 largest files in each project.
5.2.2 Full Scan
For the investigation of the time for full scan function-
ality, we compared our approach with the CBSD tool.
We did not consider the Stench Blossom tool here,
since it does not support full scans. For the compari-
son, we modified CBSD so that only data clumps are
analyzed, and we included a timer to measure the time
required.
For the time measurement, we considered the
open-source project ArgoUML (version 0.26 Beta),
for which over 1500 Java files were examined com-
pletely for data clumps. To obtain the results, we
repeated the measurements 10 times. For our ap-
proach, we added the initial time needed for build-
ing the cache in each case. The results are presented
in Fig. 7 as a boxplot. The X-axis displays the tools
LCSD (our approach) and CBSD, while the Y-axis
indicates the time needed for each tool in seconds.
The time required for our approach ranged from 29
to 39 seconds, with a median of 32.5 seconds, while
the time needed for CBSD was between 763 and 802
seconds, with a median of 789 seconds. In all repeti-
tions, our approach required fewer than 40 seconds to
scan all files in ArgoUML for data clumps. Thus, our
approach was at least 15 times faster than CBSD for a
full scan.
Figure 7: Time required to scan all ArgoUML project files
for data clumps.
Furthermore, we extended the investigation of the
time required for a full scan to the same five open-
source projects described in Section 5.2.1. How-
ever, we did not succeed in performing the measure-
ments with CBSD, because, for all measurements in
the projects, CBSD stopped and issued errors. Thus,
the following results have limited comparability with
CBSD and are applicable only to our approach, the
timing results for which can be seen in Fig. 8 as a
boxplot. The X-axis depicts the various open-source
projects, while the Y-axis displays the measured time
of a full scan for the respective project. The anal-
ysis reveals that the median time for the full scan
of the project Flowable 6.7.2 was 706 seconds, and
for the other projects, it was less than 100 seconds.
We suspect that the longer analysis time required for
the Flowable 6.7.2 project, compared to the other
projects, is due to the number of lines of code. Flow-
able 6.7.2 has approximately 680 KLOC, whereas the
second-largest project, Apache RocketMQ 4.9.1, con-
tains about 100 KLOC.
Figure 8: Time required to scan GitHub projects for data
clumps.
6 DISCUSSION
This section discusses several limitations to the valid-
ity of our results and approach. In addition, we further
address the increased usability for developers.
Two groups of users can benefit from the use of
such a data clumps detection tool: inexperienced de-
velopers and experienced developers. The first group,
inexperienced developers, may lack knowledge of
best practices for writing clean code, but with the help
of a data clumps detection tool, they can learn and im-
prove their coding skills by identifying areas for im-
provement. As a disadvantage, this group of develop-
ers faces the challenge of choosing a suitable name for
the new class. The second group, experienced devel-
opers, may still find value in using the tool as a quick
check to confirm their code meets best practices, or to
discover edge cases they may have missed. In either
case, the use of a data clumps detection tool can help
both inexperienced and experienced developers.
However, despite the perceived limitations for ex-
perienced developers, our findings highlight a crucial
gap in the current research. Specifically, we found
limited data on data clumps; apart from the labeled
data set from (Ferenc et al., 2020), we found hardly
any other data. As a result, the significance of our
results is limited with respect to other data. Ac-
cordingly, we see a need for a unified data set with
manually evaluated data clumps, where again
the problem of subjective judgment arises. Similarly,
in addition to the limited possibility of comparative
data to measure accuracy, we found barely any data
for the time needed to analyze data clumps, which
is essential for the validity of live code smell detec-
tion. (Arcelli Fontana et al., 2015) reported that the
varying definitions of data clumps pose an additional
challenge in comparing the tool with others. One way
to counteract this issue is that an appropriate tool may
support parameters for the various definitions of data
clumps.
The significance regarding the accuracy of our ap-
proach is limited due to the comparison with the other
tools. We achieved a match of over 90 %, which raises
the question of what the differences are, although the
identical definition for data clumps was used as in
CBSD. The 33 additional files detected with our
approach were examined manually and found to con-
tain 31 data clumps according to the definition. Con-
sequently, the precise effect of different implementa-
tions for the detection of data clumps remains to be
determined.
Fig. 5 indicates that Stench Blossom could be
fast enough for live detection of data clumps as it took
less than 1 second in our measurements. The speed in-
crease seen with our approach for live detection could
be due to the use of the (maybe faster) API of IntelliJ
instead of Eclipse.
In our results for timing, we note the possibility
of the live code smell detection of data clumps by a
tool. While in some projects and files the required
response time was greater than the time defined by
(Miller, 1968) and (Nielsen, 1993), at which a de-
veloper’s flow of thoughts is interrupted, the median
was below this threshold for all projects examined,
which, to our knowledge, are not extremely complex
edge cases of classes. All of our measurements were
performed in an isolated test, whereas in reality, it
may well be that other programs or plugins on the
developer’s computer may negatively affect the per-
formance of our approach.
Furthermore, according to (Vidal et al., 2014), the
order in which data clumps are removed is relevant.
It may be helpful to first remove other code smells,
which may fix the issue of data clumps. In this re-
spect, our approach cannot provide any guidance on
the relevance of the detected data clumps and the in-
teraction of code smells.
Another challenge for the detection of data
clumps is the identification of identical variables and
parameters. In our approach, we assume that these
have the same names, ignoring semantic identity be-
yond naming, based on the data clump definitions (cf.
Section 2.2). For example, a parameter xVal may have a con-
nection to a variable named xPos, which might be
identical in a semantic way. Therefore, a data flow
analysis would be needed. Given that the serialVer-
sionUID field has been detected in data clumps, it is
advisable to warn the user that automatic refactoring
has its limits.
Finally, despite the subtle way of presenting issues
in IntelliJ’s IDE, we face the question of the extent to
which this type of reporting contributes to distraction.
In this regard, there is a need for further research into
the representation of data clumps. Related to this,
there is still the possibility in our approach to make
the developer more aware of the data clumps found
and the potential problems involved.
7 CONCLUSION
With the approach proposed in this paper, we were
able to demonstrate that live detection of data clumps
is quite feasible in terms of response time. The mea-
surements recorded median times for the analysis of
data clumps below 1 second. Thus, we were able
to confirm our hypothesis regarding the sufficiently
fast detection of data clumps. In addition to the im-
plementation of live detection, we successfully inte-
grated the full scan functionality, achieving a signifi-
cant increase in speed for the ArgoUML project files
compared to CBSD, Stench Blossom, and other tools.
Regarding accuracy, our approach achieved a sat-
isfactory rate of 90 % in detecting data clumps com-
pared to CBSD. Moreover, 10 % of the data clumps
detected through our approach were not
identified by Stench Blossom or CBSD. Objective
or standardized definitions of data clumps and tools
may facilitate comparability of parameterization in
this regard, along with clarifying how code smells are
counted. Furthermore, a data set for comparing tim-
ing and accuracy, which are manually verified, would
be useful. The comparison of the detection of data
clumps using the files we created with existing data
clumps could be extended. Additional manually cre-
ated test cases could be considered from (Ferenc et al.,
2020). The differences between our approach and
CBSD come from the particular test cases. Since the
accuracy of detection of our approach is 90 % com-
pared to CBSD, the task of investigating where the
differences in detection arise remains.
To the best of our knowledge, implementing live
detection of data clumps is novel to this study. Our in-
novative approach supports both new and experienced
developers in the creation of a project through live de-
tection and direct refactoring suggestions.
Despite these beneficial features and significant
potential in this form of support, we would like to im-
prove our approach to semantic detection of related
parameters and fields. We are planning an exten-
sion or development for the programming language
JavaScript. As for future goals, we aim to provide
better support for inexperienced programmers, who
could increase their knowledge through examples and
solid explanations. Furthermore, there is a need to
consider how experienced programmers could be sup-
ported even further. We can imagine approaches
aligned with continuous integration or continuous de-
livery with refactoring suggestions here.
REFERENCES
Aqris Software (2016). RefactorIT. https://sourceforge.net/projects/refactorit/. Accessed: Feb. 04, 2023.
Arcelli Fontana, F., Mangiacavalli, M., Pochiero, D., and
Zanoni, M. (2015). On Experimenting Refactoring
Tools to Remove Code Smells. pages 7:1–7:8.
Becker, P., Fowler, M., Beck, K., Brant, J., Opdyke, W., and
Roberts, D. (1999). Refactoring - Improving the De-
sign of Existing Code. Addison-Wesley Professional,
Boston.
Brown, W. H., Malveau, R. C., McCormick, H. W., and
Mowbray, T. J. (1998). AntiPatterns: Refactoring
Software, Architectures, and Projects in Crisis.
CCFinder (2008). AIST CCFinderX is a Code-Clone De-
tector. http://www.ccfinder.net/ccfinderxos.html. Ac-
cessed: Mar. 27, 2022.
De Stefano, M., Gambardella, M. S., Pecorelli, F., Palomba,
F., and De Lucia, A. (2020). cASpER: A Plug-in for
Automated Code Smell Detection and Refactoring. In
Proceedings of the International Conference on Ad-
vanced Visual Interfaces, AVI ’20, New York, NY,
USA. Association for Computing Machinery.
Delchev, M. and Harun, M. F. (2015). Investigation of
Code Smells in Different Software Domains. Full-
scale Software Engineering, page 31.
dos Santos Neto, B. F., Ribeiro, M., Da Silva, V. T.,
Braga, C., De Lucena, C. J. P., and de Barros Costa,
E. (2015). AutoRefactoring: A platform to build
refactoring agents. Expert systems with applications,
42(3):1652–1664.
Eclipse Foundation (2022). Eclipse. https://www.eclipse.
org/. Accessed: Feb. 04, 2023.
Felix, S. B. and Vinod, V. (2016). A Study on Code Smell
Detection with Refactoring Tools in Object Oriented
Languages. International journal of business, 5:38–
40.
Ferenc, R., Toth, Z., Ladányi, G., Siket, I., and Gyimóthy, T. (2019). Unified Bug Dataset (1.2). http://www.inf.u-szeged.hu/ferenc/papers/UnifiedBugDataSet/. Accessed: Feb. 04, 2023.
Ferenc, R., Toth, Z., Ladányi, G., Siket, I., and Gyimóthy, T. (2020). A public unified bug dataset for Java and its assessment regarding metrics and bug prediction. Software Quality Journal, 28.
Fernandes, E., Oliveira, J., Vale, G., Paiva, T., and
Figueiredo, E. (2016). A Review-Based Comparative
Study of Bad Smell Detection Tools. In Proceedings
of the 20th International Conference on Evaluation
and Assessment in Software Engineering, EASE ’16,
New York, NY, USA. Association for Computing Ma-
chinery.
Fowler, M. (2019). Refactoring - Improving the Design of
Existing Code. Addison-Wesley, Amsterdam.
Gronback, R. C. (2003). Software Remodeling: Improving
Design and Implementation Quality.
Guggulothu, T. and Abdul Moiz, S. (2019). An Approach
to Suggest Code Smell Order for Refactoring, pages
250–260.
Habra, N. and Lopez Martin, M.-A. (2006). On the use
of Measurement in Software Restructuring Research.
In Duchien, L., D’Hondt, M., and Mens, T., editors,
Proceedings of the International ERCIM Workshop on
Software Evolution (2006), pages 81–89. Publication
editors : Laurence Duchien, Maja D’Hondt and Tom
Mens.
Hall, T., Zhang, M., Bowes, D., and Sun, Y. (2013). Code
Bad Smell Detector. https://sourceforge.net/projects/
cbsdetector/. Accessed: Feb. 04, 2023.
Hall, T., Zhang, M., Bowes, D., and Sun, Y. (2014). Some
Code Smells Have a Significant but Small Effect on
Faults. ACM Trans. Softw. Eng. Methodol., 23(4).
Intooitus srl (2012). inFusion Hydrogen. https:
//marketplace.eclipse.org/content/infusion-hydrogen.
Accessed: Feb. 04, 2023.
Intooitus srl (2013). inCode Helium. https://marketplace.
eclipse.org/content/incode-helium. Accessed: Feb.
04, 2023.
JetBrains (2022). List of Java Inspections. https://www.
jetbrains.com/help/idea/list-of-java-inspections.html.
Accessed: Feb. 04, 2023.
Khrishe, Y. and Alshayeb, M. (2016). An empirical study
on the effect of the order of applying software refac-
toring. In 2016 7th International Conference on Com-
puter Science and Information Technology (CSIT),
pages 1–4.
Kühl, S. J., Schneider, A., Kestler, H. A., Toberer, M., Kühl, M., and Fischer, M. R. (2019). Investigating the self-study phase of an inverted biochemistry classroom – collaborative dyadic learning makes the difference. BMC Medical Education, 19(1):64.
Lacerda, G., Petrillo, F., Pimenta, M., and Guéhéneuc, Y. G. (2020). Code smells and refactoring: A tertiary systematic review of challenges and observations. Journal of Systems and Software, 167:110610.
Mäntylä, M. V. and Lassenius, C. (2006). Subjective Evaluation of Software Evolvability Using Code Smells: An Empirical Study. Empirical Softw. Engg., 11(3):395–431.
Mazinanian, D., Tsantalis, N., Stein, R., and Valenta, Z.
(2016). JDeodorant: Clone Refactoring. In 2016
IEEE/ACM 38th International Conference on Soft-
ware Engineering Companion (ICSE-C), pages 613–
616.
Mehta, Y., Singh, P., and Sureka, A. (2018). Analyzing
Code Smell Removal Sequences for Enhanced Soft-
ware Maintainability. In 2018 Conference on Informa-
tion and Communication Technology (CICT), pages
1–6.
Micro Focus (2023). Together: Visual Modeling Soft-
ware. https://www.microfocus.com/en-us/products/
together. Accessed: Feb. 04, 2023.
Miller, R. B. (1968). Response Time in Man-Computer
Conversational Transactions. In Proceedings of the
December 9-11, 1968, Fall Joint Computer Confer-
ence, Part I, AFIPS ’68 (Fall, part I), page 267–277,
New York, NY, USA. Association for Computing Ma-
chinery.
Murphy-Hill, E. and Black, A. P. (2010). An Interactive
Ambient Visualization for Code Smells. In Proceed-
ings of the 5th International Symposium on Software
Visualization, SOFTVIS ’10, page 5–14, New York,
NY, USA. Association for Computing Machinery.
Nielsen, J. (1993). Chapter 5 – Usability Heuristics.
Palomba, F., Bavota, G., Di Penta, M., Fasano, F., Oliveto,
R., and De Lucia, A. (2018). A large-scale empirical
study on the lifecycle of code smell co-occurrences.
Information and Software Technology, 99:1–10.
Parnin, C., Görg, C., and Nnadi, O. (2008). A catalogue of lightweight visualizations to support code smell inspection. pages 77–86.
Pessoa, T., Brito e Abreu, F., Monteiro, M., and Bryton, S.
(2012). An Eclipse Plugin to Support Code Smells
Detection.
PMD (2023). PMD - An extensible cross-language static
code analyzer. https://pmd.github.io/. Accessed: Feb.
04, 2023.
Rutar, N., Almazan, C. B., and Foster, J. S. (2004). A com-
parison of bug finding tools for Java. 15th Interna-
tional Symposium on Software Reliability Engineer-
ing, pages 245–256.
Salehie, M., Li, S., and Tahvildari, L. (2006). A
Metric-Based Heuristic Framework to Detect Object-
Oriented Design Flaws. volume 2006, pages 159–
168.
Sharma, T. and Spinellis, D. (2017). A Survey on Software
Smells. Journal of Systems and Software, 138.
Simon, F., Steinbruckner, F., and Lewerentz, C. (2001).
Metrics Based Refactoring.
Vidal, S., Marcos, C., and Diaz-Pace, A. (2014). An ap-
proach to prioritize code smells for refactoring. Auto-
mated Software Engineering, 23.
Zhang, M., Baddoo, N., Wernick, P., and Hall, T. (2008).
Improving the Precision of Fowler’s Definitions of
Bad Smells. pages 161–166.