NEGWeb


Suresh Thummalapenta and Tao Xie. NEGWeb: Detecting Neglected Conditions via Mining Programming Rules from Open Source Code. North Carolina State University, Department of Computer Science, Technical Report TR-2007-24, September 16, 2007. Download: [PDF][BibTex]

ABSTRACT

Neglected conditions, also referred to as missing paths, are known to be an important class of software defects. Revealing neglected conditions around individual API calls in an application requires knowledge of the programming rules that must be obeyed while reusing those APIs. To mine those implicit programming rules and hence to detect neglected conditions, we develop a novel framework, called NEGWeb, that substantially expands the mining scope to billions of lines of open source code available on the web by leveraging a code search engine. We evaluated NEGWeb to detect violations of mined rules in local code bases or open source code bases. In our evaluation, we show that NEGWeb finds three real defects in Java code reported in the literature and also finds three previously unknown defects in a large-scale open source project called Columba (91,508 lines of Java code) that reuses 541 classes and 2225 methods. We also report a high percentage of real rules among the top 25 reported patterns mined for APIs provided by five popular open source applications.
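To make the notion of a neglected condition concrete, the following minimal Java sketch contrasts a call that skips a required check with one that performs it. It is an illustration only and is not taken from the technical report or from NEGWeb's mined patterns: the rule shown (check `hasNext()` before calling `next()` on a `java.util.Iterator`) is a standard Java API contract chosen as an assumed example, and the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical demo class; not part of NEGWeb.
public class NeglectedConditionDemo {

    // Neglected condition: next() is called without first checking hasNext().
    // On an empty list this throws NoSuchElementException.
    public static String firstUnchecked(List<String> items) {
        Iterator<String> it = items.iterator();
        return it.next();
    }

    // Rule obeyed: the hasNext() check guards the call to next().
    public static String firstChecked(List<String> items, String fallback) {
        Iterator<String> it = items.iterator();
        if (it.hasNext()) {
            return it.next();
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<String> empty = Collections.<String>emptyList();
        System.out.println(firstChecked(empty, "none"));
        try {
            firstUnchecked(empty);
        } catch (NoSuchElementException e) {
            System.out.println("neglected condition triggered");
        }
    }
}
```

A rule of this kind is implicit in the sense that the compiler does not enforce it; the missing check only surfaces as a failure on the path where the condition does not hold.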

DATA FORMATS

Mined Patterns Sheet Format

Detected Violations Sheet Format

EVALUATIONS

This section presents the evaluation results as Excel sheets. Each sheet can contain both inspected and uninspected results. Results that were not inspected are shown in gray.

Evaluation 1: Open Source Applications

We inspected the top 25 mined condition patterns and classified the patterns into three categories: Rules, Usage Patterns, and False Positives.

Format: Mined Patterns
Subjects:

Evaluation 2: Defects in open source world

For each subject, we selected from its top 25 mined condition patterns the mined rule that detected the largest number of violations, and we inspected the violations of that rule.
We classified the detected violations into five categories: Defect, Code Smell, Wrapper, Hint, and False Positive.

Format: Detected violations

Subjects:

Evaluation 3: Case Study: Columba

We inspected the violations detected by the top 25 mined patterns. The top 25 patterns detected 70 violations. We classified the detected violations into five categories: Defect, Code Smell, Wrapper, Hint, and False Positive.

Format: Detected violations

Data: Columbaviolations



Please contact the authors for any additional information.