This book provides a comprehensive tutorial on similarity operators. The authors systematically survey the set of similarity operators, primarily focusing on their semantics, while also touching upon mechanisms for processing them effectively.
The book starts off by providing introductory material on similarity search systems, highlighting the central role of similarity operators in such systems. This is followed by a systematic, categorized overview of the variety of similarity operators that have been proposed in the literature over the last two decades, including advanced operators such as RkNN, Reverse k-Ranks, Skyline k-Groups and K-N-Match. Since indexing is a core technology in the practical implementation of similarity operators, various indexing mechanisms are summarized. Finally, current research challenges are outlined to help interested readers identify potential directions for future investigations. In summary, this book offers a comprehensive overview of the field of similarity search operators, allowing readers to understand the area as it stands today and providing them with the background needed to understand recent novel approaches.
Deepak P is a researcher in the Information Management Group at IBM Research - India, Bangalore. He has been working in the area of similarity search since 2008, co-chaired the 2011 EDBT Workshop on New Trends in Similarity Search and presented a tutorial on similarity search operators at the WISE 2014 conference. His current research interests include similarity search, spatio-temporal data analytics, graph mining, information retrieval and machine learning. He has authored over 20 papers in reputed conferences and has filed several patent applications with the USPTO including four issued patents. He is a senior member of the ACM and IEEE.

Prasad M Deshpande is a Senior Technical Staff Member at IBM Research - India and Manager of the Watson Foundations - Platforms and Infrastructure group. His areas of expertise are in data management, specifically data integration, OLAP, data mining and text analytics. His current focus is in the areas of data discovery and curation for big data platforms, data integration and machine data analytics. He has more than 40 publications in reputed conferences and journals and 14 patents issued. He has served on the Program Committee of many conferences and has been the Industry Chair for COMAD 2009 and COMAD 2013, PC Co-Chair for COMAD 2011, ACM Compute 2010, 2011 EDBT Workshop on New Trends in Similarity Search and 2014 KDD Workshop on Big Data Discovery and Curation. He is an ACM Distinguished Scientist and member of the IBM Academy of Technology.
Table of Contents
1 Introduction.- 2 Fundamentals of Similarity Search.- 3 Common Similarity Search Operators.- 4 Categorizing Operators.- 5 Advanced Operators for Similarity Search.- 6 Indexing for Similarity Search Operators.- 7 The Road Ahead.