Data management systems enable various influential applications, from high-performance online services (e.g., social networks like Twitter and Facebook or financial markets) to big data analytics (e.g., scientific exploration, sensor networks, business intelligence). As a result, data management systems have been one of the main drivers for innovations in the database and computer architecture communities for several decades. Recent hardware trends require software to take advantage of the abundant parallelism available in modern and future hardware. The traditional design of data management systems, however, faces inherent scalability problems due to its tightly coupled components. In addition, it cannot exploit the full capability of the aggressive micro-architectural features of modern processors. As a result, today's most commonly used server types remain largely underutilized, leading to a huge waste of hardware resources and energy.
In this book, we shed light on the challenges that arise when running a DBMS on modern multicore hardware. We divide the material into two dimensions of scalability: implicit/vertical and explicit/horizontal.
The first part of the book focuses on the vertical dimension: it describes the instruction- and data-level parallelism opportunities within a core, from both the hardware and the software side. In addition, it examines the sources of under-utilization in a modern processor and presents insights and hardware/software techniques to better exploit the microarchitectural resources of a processor by improving cache locality at the right level of the memory hierarchy.
The second part focuses on the horizontal dimension, i.e., scalability bottlenecks of database applications at the level of multicore and multisocket multicore architectures. It first presents a systematic way of eliminating such bottlenecks in online transaction processing workloads, which is based on minimizing unbounded communication, and shows several techniques that minimize bottlenecks in major components of database management systems. Then, it demonstrates the data and work sharing opportunities for analytical workloads, and reviews advanced scheduling mechanisms that are aware of nonuniform memory accesses and alleviate bandwidth saturation.
About the Authors
Anastasia Ailamaki is a Professor of Computer and Communication Sciences at the Ecole Polytechnique Federale de Lausanne (EPFL) in Switzerland. Her research interests are in data-intensive systems and applications, and in particular (a) in strengthening the interaction between the database software and emerging hardware and I/O devices, and (b) in automating data management to support computationally-demanding, data-intensive scientific applications. She has received an ERC Consolidator Award (2013), a Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon (2007), a European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), eight best paper awards in database, storage, and computer architecture conferences (2001-2012), and an NSF CAREER award (2002). She received her Ph.D. in Computer Science from the University of Wisconsin-Madison in 2000. She is an ACM fellow and the vice chair of the ACM SIGMOD community, as well as a senior member of the IEEE. She has served as a CRA-W mentor and is a member of the Expert Network of the World Economic Forum.
Erietta Liarou is currently a co-founder of a data analytics startup. She received her Ph.D. in Computer Science from the University of Amsterdam in 2013. In her thesis she worked on the first column-store stream processing system, MonetDB/DataCell, which leverages analytical systems technology for scalable stream processing. Her research interests include database architectures, transaction processing on modern hardware, data-stream processing, and distributed query processing. In the past she has been with the Data-Intensive Applications and Systems Laboratory (DIAS) at EPFL, the Dutch National Research Center for Mathematics and Computer Science (CWI) in Amsterdam, The Netherlands, the Intelligence Systems Laboratory at the Technical University of Crete, Greece, and the System S group at the IBM T.J. Watson Research Center, Hawthorne, NY, USA. In 2011, she received the Best Paper Award in Challenges and Visions at the Very Large Database Conference.
Pinar Tozun is a research staff member at IBM Almaden Research Center. Before joining IBM, she received her Ph.D. from EPFL. Her research focuses on HTAP engines, performance characterization of database workloads, and scalability and efficiency of data management systems on modern hardware. She received a Jim Gray Doctoral Dissertation Award Honorable Mention in 2016. During her Ph.D., she also spent a summer at Oracle Labs (Redwood Shores, CA) as an intern. Before starting her Ph.D., she received her BSc degree from the Computer Engineering department of Koc University in 2009.
Danica Porobic is a Principal Member of Technical Staff at Oracle, working on database in-memory technologies. She received her Ph.D. from EPFL, where she focused on designing scalable transaction processing systems for non-uniform hardware. She graduated at the top of her class with MSc and BSc degrees in Informatics from the University of Novi Sad and has worked at Oracle Labs and Microsoft SQL Server.
Iraklis Psaroudakis is a Senior Member of Technical Staff at Oracle Labs. His research interests include improving the performance of analytical workloads, parallel programming, and OS/runtime-system interaction. Prior to Oracle, he completed his Ph.D. at the Data-Intensive Applications and Systems (DIAS) Laboratory of the Ecole Polytechnique Federale de Lausanne (EPFL), focusing on scaling up highly concurrent analytical database workloads on multi-socket multi-core servers through (a) sharing data and work across concurrent queries, and (b) adaptive NUMA-aware data placement and task scheduling. During his Ph.D., he cooperated with the SAP HANA database team. Before starting his Ph.D., he completed his studies in Electrical & Computer Engineering at the National Technical University of Athens (NTUA).
Table of Contents
Introduction.- Exploiting Resources of a Processor Core.- Minimizing Memory Stalls.- Scaling-up OLTP.- Scaling-up OLAP Workloads.- Outlook.- Summary.- Bibliography.- Authors' Biographies.