Bibliography on Mining Software Engineering Data
An ICSE 2009 Tutorial (Tuesday May 19 morning) on Mining Software Engineering Data
Tutorial Slides (PPT, 4.0MB)
Tutorial Notes (6 slides per page, PDF, 2.22MB)

Software engineering data (such as code bases, execution traces, historical code changes, mailing lists, and bug databases) contains a wealth of information about a project's status, progress, and evolution. Using well-established data mining techniques, practitioners and researchers can explore the potential of this valuable data in order to better manage their projects and to produce higher quality software systems that are delivered on time and on budget. This tutorial presents the latest research in mining Software Engineering (SE) data, discusses challenges associated with mining SE data, highlights SE data mining success stories, and outlines future research directions. Attendees will acquire the knowledge and skills needed to perform research or conduct practice in the field and to integrate data mining techniques in their own research or practice. More information about the tutorial can be found at https://sites.google.com/site/asergrp/dmse.

An ICSE 2008 Tutorial on Mining Software Engineering Data
Tutorial Slides (PPT, 4.0MB)
Tutorial Notes (6 slides per page, PDF, 2.22MB)

Software engineering data (such as code bases, execution traces, historical code changes, mailing lists, and bug databases) contains a wealth of information about a project's status, progress, and evolution. Using well-established data mining techniques, practitioners and researchers can explore the potential of this valuable data in order to better manage their projects and to produce higher quality software systems that are delivered on time and on budget. This tutorial presents the latest research in mining Software Engineering (SE) data, discusses challenges associated with mining SE data, highlights SE data mining success stories, and outlines future research directions. Attendees will acquire the knowledge and skills needed to perform research or conduct practice in the field and to integrate data mining techniques in their own research or practice. More information about the tutorial can be found at https://sites.google.com/site/asergrp/dmse.

Invited talks at West Virginia U., HKUST, CUHK, U. Calgary, Motorola Labs, Accenture Labs
Improving Software Productivity and Quality via Mining Program Source Code
Tao Xie
North Carolina State University
Talk Slides (PPT, 1.7MB)

Since the late 1990s, various data mining techniques have been applied to analyze
software engineering data and have achieved many notable successes.
This talk will first present recent research at North Carolina State
University on mining program source code, including mining API usage
patterns for software reuse and API properties for static defect
detection. The research exploits a model checker to generate static
traces for mining without requiring system tests or runtime execution.
The research also exploits a code search engine to expand the scope of
mining to billions of lines of open source code. The related research
papers can be found at http://www.csc.ncsu.edu/faculty/xie/research.htm#minestatic
and more general information on mining software engineering data can be
found in the tutorial slides presented at KDD 2006, ICSE 2007, and ICDM
2007, as well as in a comprehensive bibliography: https://sites.google.com/site/asergrp/dmse.
An ICDM 2007 Tutorial on Mining for Software Reliability
Software is ubiquitous in our daily
life. It brings great convenience, but also a persistent headache over software
reliability: software is never bug-free, and software bugs keep
incurring monetary loss or even catastrophes. In the pursuit of better
reliability, software engineering researchers have found that huge amounts of
data in various forms can be collected from software systems, and these
data, when properly analyzed, can help improve software reliability.
Unfortunately, the huge volume of complex data renders simple analysis
techniques inadequate; consequently, researchers have been resorting
to data mining for more effective analysis. In the past few years, we
have witnessed many studies on mining for software reliability reported
in data mining as well as software engineering forums. These studies
either develop new or apply existing data mining techniques to tackle
reliability problems from different angles. In order to keep data
mining researchers abreast of the latest developments in this growing
research area, we propose this tutorial on mining for software
reliability. In this tutorial, we will present a comprehensive overview
of this area, examine representative studies, and lay out challenges to
data mining researchers. In particular, every effort will be made to help
data mining researchers appreciate the challenges and impact of
software reliability and to encourage them to contribute.
An ICSE 2007 Tutorial on Mining Software Engineering Data
Some tutorial slides are adapted from KDD 06 tutorial slides co-prepared by Jian Pei from Simon Fraser University, Canada.
Tutorial Slides (PDF, 2.28MB) (PPT, 4.40MB)
Tutorial Notes (6 slides per page, PDF, 1.72MB)

Software engineering data (such as code bases, execution traces, historical code changes, mailing lists, and bug databases) contains a wealth of information about a project’s status, progress, and evolution. Using well-established data mining techniques, practitioners and researchers can explore the potential of this valuable data in order to better manage their projects and to produce higher quality software systems that are delivered on time and on budget. This tutorial presents the latest research in mining Software Engineering (SE) data, discusses challenges associated with mining SE data, highlights SE data mining success stories, and outlines future research directions. Attendees will acquire the knowledge and skills needed to perform research or conduct practice in the field and to integrate data mining techniques in their own research or practice.

A KDD 2006 Tutorial
Tutorial Slides (PDF, 1.70MB) (PPT, 3.46MB)

Since the late 1990s, various data mining techniques have been applied to analyze software engineering data and have achieved many notable successes. The substantial experience, developments, and lessons of data mining for software engineering pose interesting challenges and opportunities for new research and development. In this tutorial, we shall present a survey of the research problems, the latest progress, the challenges, and the potential of data mining practice in software engineering. The tutorial will focus on the inherent challenges of mining software engineering data, offer a shortcut to the current research and development frontier, and illustrate a few case studies. The tutorial will answer questions such as what software engineering tasks can be helped by data mining, what kinds of software engineering data are available for mining, and how data mining techniques can be used in software engineering. The tutors, Drs. Tao Xie and Jian Pei, are active and prolific researchers in software engineering and data mining, respectively. The tutorial website is at: http://ase.csc.ncsu.edu/dmse/

Tutorials on Mining Software Engineering Data
Target Audience: both practitioners and researchers from the software engineering/development or data mining community.
If you are interested in inviting any of us to give this tutorial at your company, research
lab, or university, please contact Tao Xie!
Venues of Tutorial Presentations:
You may also be interested in Tao Xie's presentations on Improving Automation in Developer Testing.
Tao Xie's Research on Mining Software Engineering Data