Outlier Analysis


Author: Charu C. Aggarwal

Publisher: Springer

ISBN: 3319475789

Category: Computers

Page: 466

View: 9118

This book provides comprehensive coverage of the field of outlier analysis from a computer science point of view. It integrates methods from data mining, machine learning, and statistics within the computational framework and therefore appeals to multiple communities. The chapters of this book can be organized into three categories: Basic algorithms: Chapters 1 through 7 discuss the fundamental algorithms for outlier analysis, including probabilistic and statistical methods, linear methods, proximity-based methods, high-dimensional (subspace) methods, ensemble methods, and supervised methods. Domain-specific methods: Chapters 8 through 12 discuss outlier detection algorithms for various domains of data, such as text, categorical data, time-series data, discrete sequence data, spatial data, and network data. Applications: Chapter 13 is devoted to various applications of outlier analysis. Some guidance is also provided for the practitioner. The second edition of this book is more detailed and is written to appeal to both researchers and practitioners. Significant new material has been added on topics such as kernel methods, one-class support-vector machines, matrix factorization, neural networks, outlier ensembles, time-series methods, and subspace methods. It is written as a textbook and can be used for classroom teaching.

Outlier Detection: Techniques and Applications

A Data Mining Perspective


Author: N. N. R. Ranga Suri, Narasimha Murty M, G. Athithan

Publisher: Springer

ISBN: 3030051277


Page: 227

View: 3602

This book, drawing on recent literature, highlights several methodologies for the detection of outliers and explains how to apply them to solve several interesting real-life problems. The detection of objects that deviate from the norm in a data set is an essential task in data mining due to its significance in many contemporary applications. More specifically, the detection of fraud in e-commerce transactions and discovering anomalies in network data have become prominent tasks, given recent developments in the field of information and communication technologies and security. Accordingly, the book sheds light on specific state-of-the-art algorithmic approaches such as the community-based analysis of networks and characterization of temporal outliers present in dynamic networks. It offers a valuable resource for young researchers working in data mining, helping them understand the technical depth of the outlier detection problem and devise innovative solutions to address related challenges.

Outlier Detection for Temporal Data


Author: Manish Gupta, Jing Gao, Charu Aggarwal, Jiawei Han

Publisher: Morgan & Claypool Publishers

ISBN: 162705376X

Category: Computers

Page: 129

View: 4015

Outlier (or anomaly) detection is a very broad field that has been studied in the context of a large number of research areas, such as statistics, data mining, sensor networks, environmental science, distributed systems, and spatio-temporal mining. Initial research in outlier detection focused on time series-based outliers (in statistics). Since then, outlier detection has been studied on a large variety of data types including high-dimensional data, uncertain data, stream data, network data, time series data, spatial data, and spatio-temporal data. While there have been many tutorials and surveys for general outlier detection, we focus on outlier detection for temporal data in this book. A large number of applications generate temporal datasets. For example, in our everyday life, various kinds of records (credit, personnel, financial, judicial, medical, and so on) are all temporal. This stresses the need for an organized and detailed study of outliers with respect to such temporal data. In the past decade, there has been a lot of research on various forms of temporal data, including consecutive data snapshots, series of data snapshots, and data streams. Besides the initial work on time series, researchers have focused on rich forms of data including multiple data streams, spatio-temporal data, network data, and community distribution data. Techniques for temporal outlier detection differ substantially from those for general outlier detection. In this book, we present an organized picture of both recent and past research in temporal outlier detection. We start with the basics and then bring the reader up to speed on the main ideas in state-of-the-art outlier detection techniques. We motivate the importance of temporal outlier detection and briefly describe the challenges beyond those of general outlier detection. Then, we present a taxonomy of proposed techniques for temporal outlier detection. Such techniques broadly include statistical techniques (like AR models, Markov models, histograms, neural networks), distance- and density-based approaches, grouping-based approaches (clustering, community detection), network-based approaches, and spatio-temporal outlier detection approaches. We summarize by presenting a wide collection of applications where temporal outlier detection techniques have been applied to discover interesting outliers.

Outlier Ensembles

An Introduction


Author: Charu C. Aggarwal, Saket Sathe

Publisher: Springer

ISBN: 3319547658

Category: Computers

Page: 276

View: 6633

This book discusses a variety of methods for outlier ensembles and organizes them by the specific principles with which accuracy improvements are achieved. In addition, it covers the techniques with which such methods can be made more effective. A formal classification of these methods is provided, and the circumstances in which they work well are examined. The authors cover how outlier ensembles relate (both theoretically and practically) to the ensemble techniques commonly used for other data mining problems like classification. The similarities and (subtle) differences in the ensemble techniques for the classification and outlier detection problems are explored. These subtle differences do impact the design of ensemble algorithms for the latter problem. This book can be used for courses in data mining and related curricula. Many illustrative examples and exercises are provided in order to facilitate classroom teaching. Familiarity with the outlier detection problem and with the generic problem of ensemble analysis in classification is assumed. This is because many of the ensemble methods discussed in this book are adaptations of their counterparts in the classification domain. Some techniques explained in this book, such as wagging, randomized feature weighting, and geometric subsampling, provide new insights that are not available elsewhere. Also included is an analysis of the performance of various types of base detectors and their relative effectiveness. The book is valuable for researchers and practitioners seeking to leverage ensemble methods for optimal algorithmic design.

Robust Regression and Outlier Detection


Author: Peter J. Rousseeuw, Annick M. Leroy

Publisher: John Wiley & Sons

ISBN: 9780471488552

Category: Mathematics

Page: 329

View: 430

WILEY-INTERSCIENCE PAPERBACK SERIES The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. "The writing style is clear and informal, and much of the discussion is oriented to application. In short, the book is a keeper." –Mathematical Geology "I would highly recommend the addition of this book to the libraries of both students and professionals. It is a useful textbook for the graduate student, because it emphasizes both the philosophy and practice of robustness in regression settings, and it provides excellent examples of precise, logical proofs of theorems. . . . Even for those who are familiar with robustness, the book will be a good reference because it consolidates the research in high-breakdown affine equivariant estimators and includes an extensive bibliography in robust regression, outlier diagnostics, and related methods. The aim of this book, the authors tell us, is ‘to make robust regression available for everyday statistical practice.’ Rousseeuw and Leroy have included all of the necessary ingredients to make this happen." –Journal of the American Statistical Association

Damage Assessment of Structures VII


Author: Luigi Garibaldi, Cecilia Surace, Karen M. Holford, Wiesław M. Ostachowicz

Publisher: Trans Tech Publications Ltd

ISBN: 303813130X

Category: Technology & Engineering

Page: 800

View: 3702

Proceedings of the 7th International Conference on Damage Assessment of Structures (DAMAS 2007), Torino, Italy, 25th to 27th June 2007

Data Mining

The Textbook


Author: Charu C. Aggarwal

Publisher: Springer

ISBN: 3319141422

Category: Computers

Page: 734

View: 2702

This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues. It goes beyond the traditional focus on data mining problems to introduce advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. Until now, no single book has addressed all these topics in a comprehensive and integrated way. The chapters of this book fall into one of three categories: Fundamental chapters: Data mining has four main problems, which correspond to clustering, classification, association pattern mining, and outlier analysis. These chapters comprehensively discuss a wide variety of methods for these problems. Domain chapters: These chapters discuss the specific methods used for different domains of data such as text data, time-series data, sequence data, graph data, and spatial data. Application chapters: These chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor. Appropriate for both introductory and advanced data mining courses, Data Mining: The Textbook balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners (including those with a limited mathematical background). Numerous illustrations, examples, and exercises are included, with an emphasis on semantically interpretable examples. Praise for Data Mining: The Textbook - "As I read through this book, I have already decided to use it in my classes. This is a book written by an outstanding researcher who has made fundamental contributions to data mining, in a way that is both accessible and up to date. The book is complete with theory and practical use cases. It's a must-have for students and professors alike!" -- Qiang Yang, Chair of Computer Science and Engineering at Hong Kong University of Science and Technology "This is the most amazing and comprehensive text book on data mining. It covers not only the fundamental problems, such as clustering, classification, outliers and frequent patterns, and different data types, including text, time series, sequences, spatial data and graphs, but also various applications, such as recommenders, Web, social network and privacy. It is a great book for graduate students and researchers as well as practitioners." -- Philip S. Yu, UIC Distinguished Professor and Wexler Chair in Information Technology at University of Illinois at Chicago

Empirical Direction in Design and Analysis


Author: Norman H. Anderson

Publisher: Psychology Press

ISBN: 1135643385

Category: Psychology

Page: 880

View: 8822

The goal of Norman H. Anderson's new book is to help students develop skills of scientific inference. To accomplish this he organized the book around the "Experimental Pyramid"--six levels that represent a hierarchy of considerations in empirical investigation--conceptual framework, phenomena, behavior, measurement, design, and statistical inference. To facilitate conceptual and empirical understanding, Anderson de-emphasizes computational formulas and null hypothesis testing. Other features include: *emphasis on visual inspection as a basic skill in experimental analysis to help students develop an intuitive appreciation of data patterns; *exercises that emphasize development of conceptual and empirical application of methods of design and analysis and de-emphasize formulas and calculations; and *heavier emphasis on confidence intervals than significance tests. The book is intended for use in graduate-level experimental design/research methods or statistics courses in psychology, education, and other applied social sciences, as well as a professional resource for active researchers. The first 12 chapters present the core concepts graduate students must understand. The next nine chapters serve as a reference handbook by focusing on specialized topics with a minimum of technicalities.

Advances in Knowledge Discovery and Data Mining

10th Pacific-Asia Conference, PAKDD 2006, Singapore, April 9-12, 2006, Proceedings


Author: Wee Keong Ng, Masaru Kitsuregawa, Jianzhong Li

Publisher: Springer Science & Business Media

ISBN: 3540332065

Category: Computers

Page: 879

View: 8681

The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference in the area of data mining and knowledge discovery. This year marks the tenth anniversary of the successful annual series of PAKDD conferences held in the Asia Pacific region. It was with pleasure that we hosted PAKDD 2006 in Singapore again, since the inaugural PAKDD conference was held in Singapore in 1997. PAKDD 2006 continues its tradition of providing an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all aspects of KDD data mining, including data cleaning, data warehousing, data mining techniques, knowledge visualization, and data mining applications. This year, we received 501 paper submissions from 38 countries and regions in Asia, Australasia, North America and Europe, of which we accepted 67 (13.4%) papers as regular papers and 33 (6.6%) papers as short papers. The distribution of the accepted papers was as follows: USA (17%), China (16%), Taiwan (10%), Australia (10%), Japan (7%), Korea (7%), Germany (6%), Canada (5%), Hong Kong (3%), Singapore (3%), New Zealand (3%), France (3%), UK (2%), and the rest from various countries in the Asia Pacific region.

Conducting Meta-Analysis Using SAS


Author: Winfred Arthur, Jr., Winston Bennett, Allen I. Huffcutt

Publisher: Psychology Press

ISBN: 1135643466

Category: Psychology

Page: 208

View: 7316

Conducting Meta-Analysis Using SAS reviews the meta-analysis statistical procedure and shows the reader how to conduct one using SAS. It presents and illustrates the use of the PROC MEANS procedure in SAS to perform the data computations called for by the two most commonly used meta-analytic procedures, the Hunter & Schmidt and Glassian approaches. This book serves as both an operational guide and user's manual by describing and explaining the meta-analysis procedures and then presenting the appropriate SAS program code for computing the pertinent statistics. The practical, step-by-step instructions quickly prepare the reader to conduct a meta-analysis. Sample programs available on the Web further aid the reader in understanding the material. Intended for researchers, students, instructors, and practitioners interested in conducting a meta-analysis, the presentation of both formulas and their associated SAS program code keeps the reader and user in touch with technical aspects of the meta-analysis process. The book is also appropriate for advanced courses in meta-analysis in psychology, education, management, and other applied social and health sciences departments.