DATA SCIENCE WITH SEMANTIC TECHNOLOGIES
This book serves as a guide to applications of data science with semantic technologies for the upcoming generation, making it a resource for scholars, researchers, professionals, and practitioners in this field.
Creating intelligence in data science requires semantic technologies, which allow a machine-readable representation of data. This intelligence uniquely identifies data, connects it with common business terms, and enables users to communicate with data. Rather than only structuring the data, semantic technologies help users understand its meaning through the concepts of semantics, ontology, OWL, linked data, and knowledge graphs. These technologies help organizations understand all their stored data, add value to it, and gain insights that were not available before. Because data is the most important asset of any organization, applying semantic technologies in data science is essential to meeting organizational needs.
Data Science with Semantic Technologies provides a roadmap for deploying semantic technologies in the field of data science. It also highlights how data science enables users to create intelligence through these technologies by exploring the opportunities and addressing the challenges, both current and future. In addition, the book answers questions such as: Can semantic technologies facilitate data science? Which types of data science problems can semantic technologies tackle? How can data scientists benefit from these technologies? What is knowledge data science? How does knowledge data science relate to other domains? What is the role of semantic technologies in data science? What is the current progress and future of data science with semantic technologies? Which types of problems require the immediate attention of researchers?
Audience
Researchers in the fields of data science, semantic technologies, artificial intelligence, big data, and other related domains, as well as industry professionals, software engineers/scientists, and project managers who are developing software for data science. Students across the globe will gain both basic and advanced knowledge of the current state and potential future of data science.
Archana Patel, PhD, is a faculty member in the Department of Software Engineering, School of Computing and Information Technology, Binh Duong Province, Vietnam. She completed her postdoctoral research at the Freie Universität Berlin, Berlin, Germany. Dr. Patel is an author or co-author of more than 30 publications in refereed journals and conference proceedings. She has received the Best Paper award three times at international conferences. Her research interests are ontological engineering, semantic web, big data, expert systems, and knowledge warehouse.
Narayan C. Debnath, PhD, is the Founding Dean of the School of Computing and Information Technology at Eastern International University, Vietnam, where he also serves as the Head of the Department of Software Engineering. Dr. Debnath has been the Director of the International Society for Computers and their Applications (ISCA), USA, since 2014. Formerly, he served as a Full Professor of Computer Science at Winona State University, Minnesota, USA, for 28 years.
Bharat Bhusan, PhD, is an assistant professor in the Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, India. In the last three years, he has published more than 80 research papers in international conferences and SCI-indexed journals and has edited 11 books.
Table of Contents
Preface
1 A Brief Introduction and Importance of Data Science (Karthika N., Sheela J. and Janet B.)
1.1 What is Data Science? What Does a Data Scientist Do?
1.2 Why Data Science is in Demand?
1.3 History of Data Science
1.4 How Does Data Science Differ from Business Intelligence?
1.5 Data Science Life Cycle
1.6 Data Science Components
1.7 Why Data Science is Important
1.8 Current Challenges
1.8.1 Coordination, Collaboration, and Communication
1.8.2 Building Data Analytics Teams
1.8.3 Stakeholders vs Analytics
1.8.4 Driving with Data
1.9 Tools Used for Data Science
1.10 Benefits and Applications of Data Science
1.11 Conclusion
References
2 Exploration of Tools for Data Science (Qasem Abu Al-Haija)
2.1 Introduction
2.2 Top Ten Tools for Data Science
2.3 Python for Data Science
2.3.1 Python Datatypes
2.3.2 Helpful Rules for Python Programming
2.3.3 Jupyter Notebook for IPython
2.3.4 Your First Python Program
2.4 R Language for Data Science
2.4.1 R Datatypes
2.4.2 Your First R Program
2.5 SQL for Data Science
2.6 Microsoft Excel for Data Science
2.6.1 Detection of Outliers in Data Sets Using Microsoft Excel
2.6.2 Regression Analysis in Excel Using Microsoft Excel
2.7 D3.JS for Data Science
2.8 Other Important Tools for Data Science
2.8.1 Apache Spark Ecosystem
2.8.2 MongoDB Data Store System
2.8.3 MATLAB Computing System
2.8.4 Neo4j for Graphical Database
2.8.5 VMWare Platform for Virtualization
2.9 Conclusion
References
3 Data Modeling as Emerging Problems of Data Science (Mahyuddin K. M. Nasution and Marischa Elveny)
3.1 Introduction
3.2 Data
3.2.1 Unstructured Data
3.2.2 Semistructured Data
3.2.3 Structured Data
3.2.4 Hybrid (Un/Semi)-Structured Data
3.2.5 Big Data
3.3 Data Model Design
3.4 Data Modeling
3.4.1 Records-Based Data Model
3.4.2 Non-Record-Based Data Model
3.5 Polyglot Persistence Environment
References
4 Data Management as Emerging Problems of Data Science (Mahyuddin K. M. Nasution and Rahmad Syah)
4.1 Introduction
4.2 Perspective and Context
4.2.1 Life Cycle
4.2.2 Use
4.3 Data Distribution
4.4 CAP Theorem
4.5 Polyglot Persistence
References
5 Role of Data Science in Healthcare (Anidha Arulanandham, A. Suresh and Senthil Kumar R.)
5.1 Predictive Modeling: Disease Diagnosis and Prognosis