Mining Software Engineering Data Bibliography
-- How are data mining techniques used in software engineering? --
Association rule and frequent pattern based methods
- Association Rules
- Frequent Itemset Mining
- Frequent Sequence Mining
- Frequent Graph Mining
- Misc.
-
Data Mining Library Reuse Patterns in User-Selected Applications. Amir
Michail. ASE 1999. [PDF]
-
Data Mining Library Reuse Patterns using Generalized Association Rules. Amir
Michail. ICSE 2000. [PDF]
-
Interaction-Pattern Mining: Extracting Usage Scenarios from
Run-time Behavior Traces, M. El-Ramly, E. Stroulia, P. Sorenson. KDD
2002. [PDF]
[CELLEST] [Mohammad El-Ramly] [Stroulia]
-
Mining Version Histories to Guide Software Changes. Thomas Zimmermann, Peter
Weißgerber, Stephan Diehl, and Andreas
Zeller. ICSE 2004. [PDF][eRose
tool implementation]
-
Predicting Source Code Changes by Mining Change History.
Annie Ying, Gail Murphy, Raymond Ng, and Mark Chu-Carroll. TSE 2004. [PDF]
-
Aspect Mining Using Event Traces. Silvia Breu and Jens Krinke. ASE 2004. [PDF]
-
Mining Control Patterns from Java Program Corpora. Deng-Jyi Chen, Chung-Chien
Hwang, Shih-Kun Huang, and David T. K. Chen. JISE 2004. [PDF]
-
Mining System-User Interaction Logs for Interaction Patterns. Mohammad El-Ramly
and Eleni Stroulia. MSR 2004. [PDF]
-
CP-Miner: A Tool for Finding Copy-paste and Related Bugs in
Operating System Code. Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou.
OSDI 2004. [PDF]
-
PR-Miner: Automatically Extracting Implicit Programming
Rules and Detecting Violations in Large Software Code. Zhenmin Li and
Yuanyuan Zhou. ESEC/FSE 2005. [PDF]
-
DynaMine: Finding Common Error Patterns by Mining Software Revision Histories.
Benjamin Livshits and Thomas Zimmermann.
ESEC/FSE 2005. [PDF]
-
Mining Temporal Specifications for Error Detection. Westley Weimer and George
Necula. TACAS 2005. [PDF]
-
Locating Matching Method Calls by Mining Revision History Data. Benjamin
Livshits and Thomas Zimmermann. BUG 2005. [PDF]
-
Mining Change and Version Management Histories to Evaluate an Analysis Tool,
Danhua Shao, Sarfraz Khurshid and Dewayne E. Perry. 2005. [PDF]
-
When Do Changes Induce Fixes? Jacek Sliwerski, Thomas
Zimmermann, and Andreas Zeller. MSR 2005. [PDF]
[slides]
-
Mining Behavior Graphs for Backtrace
of Noncrashing Bugs. Chao Liu, Xifeng Yan, Hwanjo Yu, Jiawei Han, and Philip S. Yu.
SDM 2005. [PDF]
-
Data Mining and Cross-checking of Execution Traces. Tristan Denmat, Mireille Ducasse and Olivier Ridoux. ASE
2005. [PDF]
-
Applying Webmining Techniques to Execution Traces to Support the Program
Comprehension Process. Andy Zaidman, Toon
Calders, Serge Demeyer,
and Jan Paredaens.
CSMR 2005. [PDF]
-
Software Defect Association Mining and Defect Correction Effort Prediction. Qinbao
Song, Martin Shepperd, Michelle Cartwright, and Carolyn Mair. TSE 2006. [PDF]
- MAPO: Mining API Usages from Open Source Repositories. Tao
Xie and Jian Pei. MSR 2006. [PDF]
-
Mining Control Flow Abnormality for Logic Error Isolation. Chao Liu, Xifeng
Yan, and Jiawei Han. SDM 2006 [PDF]
-
XSnippet: Mining for Sample Code. Naiyana Tansalarak and Kajal T. Claypool.
OOPSLA 2006 [PDF]
- Predicting Faults from Cached History. Sunghun Kim, Thomas Zimmermann, E. James Whitehead Jr., and Andreas Zeller. ICSE 2007. [PDF]
- Path-Sensitive Inference of Function Precedence Protocols. Murali
Krishna Ramanathan, Ananth Grama, Suresh Jagannathan. ICSE 2007. [PDF]
- Static Specification Inference Using Predicate Mining. Murali Krishna Ramanathan, Ananth Grama, Suresh Jagannathan. PLDI 2007 [PDF]
- Finding What's Not There: A New Approach to Revealing Neglected
Conditions in Software. Ray-Yaung Chang, Andy Podgurski, and Jiong
Yang. ISSTA 2007 [PDF]
- Mining API Patterns as Partial Orders from Source Code: From Usage
Scenarios to Specifications, Mithun Acharya, Tao Xie, Jian Pei, Jun Xu.
ESEC/FSE 2007. [PDF]
- Detecting Object Usage Anomalies. Andrzej Wasylkowski, Andreas Zeller, and Christian Lindig. ESEC/FSE 2007. [PDF]
- Efficient Mining of Iterative Patterns for Software Specification Discovery. David Lo, Siau-Cheng Khoo and Chao Liu. KDD 2007. [PDF]
- Mining Specifications of Malicious Behavior.
Mihai Christodorescu, Somesh Jha, and Christopher Kruegel. ESEC/FSE 2007. [PDF]
- PARSEWeb: A Programmer Assistant for Reusing Open Source Code on the Web. Suresh Thummalapenta and Tao Xie. ASE 2007. [PDF]
- NEGWeb: Static Defect Detection via Searching Billions of Lines
of Open Source Code. Suresh Thummalapenta and Tao Xie. NCSU CSC
2007. [PDF]
Classification-based methods
- Automated Support for Classifying Software Failure Reports.
Andy Podgurski, David Leon, Patrick Francis, Wes Masri, Melinda Minch,
Jiayang Sun, and Bin Wang. ICSE 2003. [PDF]
-
Automatic Categorization Algorithm for Evolvable Software
Archive. Shinji Kawaguchi, Pankaj K. Garg,
Makoto Matsushita,
and Katsuro Inoue. IWPSE 2003. [PDF]
-
Mining the Maintenance History of a Legacy Software System. Jelber
Sayyad-Shirabad, Timothy Lethbridge, Stan Matwin. ICSM 2003 [PDF]
-
Tree-Based Methods for Classifying Software Failures. Patrick
Francis, David Leon, Melinda Minch, Andy Podgurski. ISSRE 2004. [PDF]
-
MUDABlue: An Automatic Categorization System for Open Source
Repositories. Shinji Kawaguchi, Pankaj K. Garg, Makoto Matsushita, and
Katsuro Inoue. APSEC 2004. [PDF]
-
Automatic Bug Triage Using Text Classification. Davor Cubranic and Gail Murphy. SEKE 2004. [PDF]
-
Active Learning for Automatic Classification of Software
Behavior. James Bowring, James Rehg, and Mary Jean Harrold. ISSTA 2004. [PDF]
-
Improving the Classification of Software Behaviors using
Ensembles of Control-Flow and Data-Flow Classifiers. James Bowring, Mary Jean
Harrold, and James Rehg. GIT-CERCS-05-10. [PDF]
-
Data Mining Approaches to Software Fault Diagnosis. Bose, R.P.J.C.;
Srinivasan, S.H. RIDE-SDMA 2005. [PDF]
-
Helping Users Avoid Bugs in GUI Applications. Amir Michail and Tao Xie.
ICSE 2005. [PDF]
-
Coping with an Open Bug Repository. John Anvik, Lyndon Hiew, and Gail C. Murphy. eTX 2005. [PDF] [BugTriage project]
-
Timna: A Framework for Automatically Combining Aspect Mining Analyses. David
Shepherd, Jeffrey Palm, Lori Pollock and Mark Chu-Carroll. ASE 2005. [PDF]
-
Who Should Fix This Bug? John Anvik, Lyndon Hiew, and Gail C. Murphy. ICSE 2006. [PDF] [BugTriage project]
-
Inferring Access-Control Policy Properties via Machine
Learning, Evan Martin and Tao Xie. POLICY 2006. [PDF]
-
GPLAG: Detection of Software Plagiarism by Program Dependence Graph
Analysis. Chao Liu, Chen Chen, Jiawei Han, and Philip Yu. KDD 2006. [PDF]
- How Bayesians Debug. Chao Liu, Zeng Lian and Jiawei Han. ICDM 2006 [PDF]
- Cost-Sensitive Decision Tree Learning for Forensic Classification.
Jason V. Davis, Jungwoo Ha, Hany E. Ramadan, Christopher J. Rossbach,
and Emmett Witchel. ECML 2006. [PDF]
- Improved Error Reporting for Software that Uses Black-Box
Components. Jungwoo Ha, Christopher J. Rossbach, Jason V. Davis,
Indrajit Roy, Hany E. Ramadan, Don E. Porter, David L. Chen and Emmett
Witchel. PLDI 2007 [PDF]
- Which Warnings Should I Fix First? Sunghun Kim and Michael D. Ernst. ESEC/FSE 2007. [PDF]
Clustering-based methods
-
Finding Failures by Cluster Analysis of Execution Profiles.
William Dickinson and David Leon and Andy Podgurski. ICSE 2001. [PDF]
-
Pursuing Failure: the Distribution of Program Failures in a Profile Space. William Dickinson, David Leon,
and Andy Podgurski. FSE 2001.
[PDF]
-
Detecting AAA Vulnerabilities by Mining Execution Profiles. Zhan Xu, David
Leon, Andy Podgurski, and Vincenzo
Liberatore.
IEEE S&P 2004. [PDF]
- Failure Proximity: A Fault Localization-Based Approach. Chao Liu and Jiawei Han. FSE 2006 [PDF]
Automaton/grammar learning methods
-
Discovering Models of Software Processes from Event-Based Data. Jonathan E. Cook and Alexander L. Wolf. TOSEM 1998 [PDF]
-
Encoding Program Executions. Steven P. Reiss
and Manos Renieris.
ICSE 2001 [PDF]
- Automatic Extraction of Object-Oriented Component Interfaces.
John Whaley, Michael C. Martin, and Monica S. Lam. ISSTA 2002 [PDF]
-
Mining Specifications. Glenn Ammons, Rastislav Bodik, and
James R. Larus. POPL 2002 [PDF]
-
Debugging Temporal Specifications with Concept Analysis.
Glenn Ammons, David Mandelin, Rastislav Bodik, James Larus. PLDI 2003. [PDF]
-
A New Approach to Data Mining for Software Design. Walid Taha, Scott Crosby,
and Kedar Swadi. CSITeA 2004. [PDF]
-
Inference of Component Protocols by the kBehavior Algorithm.
Leonardo Mariani and Mauro Pezzè. 2004 Tech Report [PDF]
[kBehavior tool
implementation]
-
Behavior Capture and Test: Automated Analysis of
Component Integration, Leonardo Mariani and Mauro Pezzè. ICECCS 2005. [PDF]
-
Synthesis of Interface Specifications for Java Classes. Rajeev Alur, Pavol
Cerny, Gunjan Gupta, P. Madhusudan, Wonhong Nam, and Anshuman Srivastava.
POPL 2005. [PDF]
-
Permissive Interfaces. Thomas A. Henzinger, Ranjit Jhala, and Rupak Majumdar.
ESEC/FSE 2005. [PDF]
- QUARK: Empirical Assessment of Automaton-based Specification Miners. David Lo and Siau-Cheng Khoo. WCRE 2006. [PDF]
- SMArTIC: Towards Building an Accurate, Robust and Scalable Specification Miner. David Lo and Siau-Cheng Khoo. FSE 2006 [PDF]
- Automated Inference of Pointcuts in Aspect-Oriented Refactoring. Prasanth
Anbalagan and
Tao Xie. ICSE 2007 [PDF]
- Automatic Inference of Structural Changes for Matching
Across Program Versions. Miryung Kim, David Notkin, and Dan Grossman.
ICSE 2007. [PDF]
Searching/matching
-
CVSSearch: Searching through Source Code using CVS Comments. Annie
Chen, Eric Chou, Joshua Wong, Andrew Y. Yao, Qing Zhang, Shao Zhang, and
Amir Michail. ICSM 2001. [PDF]
-
Mining Jungloids: Helping to Navigate the API Jungle. David
Mandelin, Lin Xu, Rastislav Bodik, and Doug Kimelman. PLDI 2005. [PDF]
[Prospector
tool web interface]
-
An Empirical Study of Code Clone Genealogies.
Miryung Kim, Vibha Sazawal, David Notkin, and Gail C. Murphy. ESEC/FSE 2005. [PDF]
- If Your Bug Database Could Talk. Adrian Schröter, Thomas Zimmermann, Rahul
Premraj, and Andreas Zeller.
Technical Report, Saarland University, June 2006. [PDF][Eclipse Bug Data]
- Mica: A Web-Search Tool for Finding API Components and Examples. Jeffrey Stylos and Brad A. Myers, VL/HCC 2006. [PDF] [Web]
- Recommending Random Walks, Zachary M Saul, Vladimir Filkov, Premkumar Devanbu, Christian Bird. ESEC/FSE 2007 [PDF] [FRAN implementation]
- Assieme: Finding
and Leveraging Implicit References in a Web Search Interface for
Programmers. Raphael Hoffmann, James Fogarty, and Daniel S. Weld. UIST 2007. [PDF]
- Searching the Library and Asking the Peers: Learning to Use Java
APIs on Demand. Yunwen Ye, Yasuhiro Yamamoto, Kumiyo Nakakoji,
Yoshiyuki Nishinaka, and Mitsuhiro Asada. PPPJ 2007. [PDF]
Concept analysis
-
The Concept of Dynamic Analysis. Tom Ball. ESEC/FSE 1999. [PDF]
-
Aspect Mining through the Formal Concept Analysis of Execution Traces. Paolo
Tonella, Mariano Ceccato. WCRE 2004. [PDF]
-
Mining Eclipse for Cross-Cutting Concerns. Silvia Breu and Thomas Zimmermann
and Christian Lindig. MSR 2006. [PDF]
- Mining Security-sensitive Operations in Legacy Code using Concept
Analysis. Vinod Ganapathy, David King, Trent Jaeger and Somesh Jha.
ICSE 2007. [PDF]
- Mining Patterns and Violations Using Concept Analysis. Christian Lindig. [PDF]
Template-based analysis
- Dynamically Discovering Likely Program Invariants to Support
Program Evolution. Michael D. Ernst, Jake Cockrell, William G.
Griswold, and David Notkin. TSE 2001. [PDF]
[Daikon tool implementation]
[Publications using Daikon]
-
Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems
Code. Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin
Chelf. SOSP 2001. [PDF]
-
Tracking Down Software Bugs Using Automatic Anomaly
Detection, Sudheendra Hangal
and Monica S. Lam. ICSE 2002. [PDF]
[DIDUCE tool implementation]
-
Discovering Algebraic Specifications from Java Classes. Johannes Henkel and
Amer Diwan. ECOOP 2003. [PDF]
-
Dynamically Inferring Temporal Properties. Jinlin Yang and
David Evans. PASTE 2004. [PDF]
[Terracotta tool implementation]
-
Automatically Inferring Temporal Properties for Program Evolution. Jinlin
Yang and David Evans. ISSRE 2004. [PDF]
-
Automatically Identifying Special and Common Unit Tests for
Object-Oriented Programs. Tao Xie and David Notkin. ISSRE 2005. [PDF]
-
Perracotta: Mining Temporal API Rules from Imperfect Traces. Jinlin Yang,
David Evans, Deepali Bhardwaj, Thirumalesh Bhat, and Manuvir Das. ICSE
2006. [PDF]
- Inference and Enforcement of Data Structure
Consistency Specifications. Brian Demsky, Michael D. Ernst, Philip J.
Guo, Stephen McCamant, Jeff H. Perkins, and Martin Rinard. ISSTA 2006. [PDF]
- Static Error Detection Using Semantic Inconsistency Inference. Isil Dillig, Thomas Dillig and Alex Aiken. PLDI 2007. [PDF]
Abstraction-based analysis
-
Bug Isolation via Remote Program Sampling. Ben
Liblit, Alex Aiken, Alice X. Zheng, and Michael I. Jordan. PLDI
2003. [PDF]
-
Automatic Extraction of Object-Oriented Observer Abstractions from Unit-Test
Executions. Tao Xie and David Notkin. ICFEM 2004. [PDF]
-
Automatic Extraction of Sliced Object State Machines for Component
Interfaces. Tao Xie and David Notkin. SAVCBS 2004. [PDF]
-
Automatic Extraction of Abstract-Object-State Machines Based on Branch
Coverage. Hai Yuan and Tao Xie. RETR 2005. [PDF]
-
Scalable Statistical Bug Isolation. Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan.
PLDI
2005. [PDF]
- SOBER: Statistical Model-based Bug Localization. Chao Liu,
Xifeng Yan, Long Fei, Jiawei Han, and Samuel P. Midkiff. ESEC/FSE 2005. [PDF]
-
Mining Object Behavior with ADABU. Valentin Dallmeier and Christian Lindig
and Andrzej Wasylkowski and Andreas Zeller. WODA 2006. [PDF]
-
Automatic Extraction of Abstract-Object-State Machines from Unit-Test
Executions. Tao Xie, Evan Martin, and Hai Yuan. ICSE 2006 Demo. [PDF]
- Statistical Debugging Using Compound Boolean Predicates. Piramanayagam
Arumuga Nainar, Ting Chen, Jake Rosin, and Ben Liblit. ISSTA 2007 [PDF]
- Static Specification Mining Using Automata-Based Abstractions. Sharon
Shoham, Eran Yahav, Stephen Fink and Marco Pistoia. ISSTA 2007 [PDF]
Text mining
- A Linguistic Analysis of How People Describe Software Problems in
Bug Reports. Andrew Ko, Brad Myers, Duen Horng Chau. VL/HCC 2006 [PDF]
- Mining Email Social Networks. Christian Bird, Alex Gourley, Prem Devanbu, Michael Gertz, and Anand Swaminathan. MSR 2006 [PDF]
- Examining the Evolution of Code Comments in PostgreSQL. Zhen Ming Jiang and Ahmed E. Hassan. MSR 2006 [PDF]
- Detection of Duplicate Defect Reports Using Natural Language
Processing. Per Runeson, Magnus Alexandersson, and Oskar Nyholm. ICSE
2007. [PDF]
- /* iComment: Bugs or Bad Comments? */. Lin Tan, Ding Yuan, Gopal Krishna and Yuanyuan Zhou. SOSP 2007. [PDF]
- What Can
OSS Mailing Lists Tell Us? A Preliminary Psychometric Text Analysis of
the Apache Developer Mailing List. Peter C. Rigby and Ahmed E.
Hassan. MSR 2007 [PDF]
- Mining
the Lexicon Used by Programmers during Software Evolution. G. Antoniol
and Y. Gael and E. Merlo and Paolo Tonella. ICSM 2007 [PDF]
- Mining Concepts from Code with Probabilistic Topic Models. Erik
Linstead, Paul Rigor, Sushil Bajracharya, Cristina Lopes, and Pierre
Baldi. ASE 2007 [PDF]
- An
Approach to Detecting Duplicate Bug Reports using Natural Language and
Execution Information. Xiaoyin Wang, Lu Zhang, Tao Xie, John Anvik,
and Jiasu Sun. ICSE 2008. [PDF]
Misc
-
Recovering System Specific Rules from Software Repositories. Chadd Williams,
and Jeffrey K. Hollingsworth. MSR 2005. [PDF]
-
Automatic Mining of Source Code Repositories to Improve Bug Finding Techniques.
Chadd C. Williams and Jeffrey K.
Hollingsworth. TSE 2005. [PDF]
-
Lightweight Defect Localization for Java. Valentin Dallmeier, Christian
Lindig, and Andreas Zeller. ECOOP 2005. [PDF]
[Ample tool implementation]
-
Detecting Failure-Related Anomalies in
Method Call Sequences. Valentin Dallmeier. Thesis 2005. [PDF]
-
Extending Dynamic Aspect Mining with Static Information.
Silvia Breu. SCAM 2005. [PDF]
- Mining Aspects in Requirements. Américo
Sampaio, Neil Loughran, Awais Rashid and Paul Rayson. Workshop on
Early Aspects 2005. [PDF]
- EA-Miner: A Tool for Automating Aspect-Oriented Requirements
Identification. Americo Sampaio, Ruzanna Chitchyan, Awais Rashid, and Paul
Rayson. ASE 2005 [PDF]
-
Understanding Software Application Interfaces via String Analysis. Evan
Martin and Tao Xie. ICSE 2006 ER. [PDF]
-
Mining Metrics to Predict Component Failures, Nachiappan
Nagappan, Thomas Ball, Andreas Zeller, ICSE 2006 [PDF]
- How Long Did It Take to Fix Bugs? Sunghun Kim, E. James
Whitehead, Jr. MSR 2006 [PDF]
- Context-Sensitive Domain-Independent Algorithm Composition and Selection. Troy A. Johnson and Rudolf Eigenmann. PLDI 2006 [PDF]
- Automatic Identification of
Bug-Introducing Changes. Sunghun Kim, Thomas Zimmermann, Kai Pan, E. James
Whitehead, Jr. ASE 2006 [PDF]
-
Using FogBUGZ to Get Crash Reports
From Users - Automatically! Joel Spolsky. [HTML]