Data Engineering and Data Science
Concepts and Applications
Editors: Kumar, Kukatlapalli Pradeep; Niranjanamurthy, M.; Murthy, Hari; Pillai, Vinay Jha; Unal, Aynur
- Hardcover
Written and edited by one of the most prolific and well-known experts in the field and his team, this exciting new volume is the "one-stop shop" for the concepts and applications of data science and engineering for data scientists across many industries. The field of data science is incredibly broad, encompassing everything from cleaning data to deploying predictive models. However, it is rare for any single data scientist to be working across the spectrum day to day. Data scientists usually focus on a few areas and are complemented by a team of other scientists and analysts. Data engineering…
- Product details
- Publisher: Wiley
- Pages: 464
- Publication date: 26 September 2023
- Language: English
- Dimensions: 231mm x 159mm x 25mm
- Weight: 794g
- ISBN-13: 9781119841876
- ISBN-10: 1119841879
- Item no.: 63124947
1 Quality Assurance in Data Science: Need, Challenges and Focus
Jasmine K.S., Ajay D. K. and Aditya Raj
1.1 Introduction
1.2 Testing and Quality Assurance
1.3 Product Quality and Test Efforts
1.4 Data Masking in Data Model and Associated Risks
1.5 Prediction in Data Science
1.6 Role of Metrics in Evaluation
1.7 Quantity of Data in Quality Assurance
1.8 Identifying the Right Data Sources
1.9 Conclusion
2 Design and Implementation of Social Media Mining -- Knowledge Discovery Methods for Effective Digital Marketing Strategies
Prashant Bhat and Pradnya Malaganve
2.1 Introduction
2.2 Literature Review
2.3 Novel Framework for Social Media Data Mining and Knowledge Discovery
2.4 Classification for Comparison Analysis
2.5 Clustering Methodology to Provide Digital Marketing Strategies
2.6 Experimental Results
2.7 Conclusion
3 A Study on Big Data Engineering Using Cloud Data Warehouse
Manjunath T. N., Pushpa S. K., Ravindra S. Hegadi and Ananya Hathwar K. S.
3.1 Introduction
3.2 Comparison Study of Different Cloud Data Warehouses
3.3 Snowflake Cloud Data Warehouse
3.4 Google BigQuery Cloud Data Warehouse
3.5 Microsoft Azure Synapse Cloud Data Warehouse
3.6 Informatica Intelligent Cloud Services (IICS)
3.7 Conclusion
4 Data Mining with Cluster Analysis Through Partitioning Approach of Huge Transaction Data
Sampath Kini K. and Karthik Pai B.H.
4.1 Introduction
4.2 Methodology Used in Proposed Cluster Analysis System
4.3 Literature Survey on Existing Systems
4.4 Conclusion
5 Application of Data Science in Macromodeling of Nonlinear Dynamical Systems
Nagaraj S., Seshachalam D. and Jayalatha G.
5.1 Introduction
5.2 Nonlinear Autonomous Dynamical System
5.3 Nonlinear System - MOR
5.4 Data Science Life Cycle
5.5 Artificial Neural Network in Modeling
5.6 Neuron Spiking Model Using FitzHugh-Nagumo (F-N) System
5.7 Ring Oscillator Model
5.8 Nonlinear VLSI Interconnect Model Using Telegraph Equation
5.9 Macromodel Using Machine Learning
5.10 MOR of Dynamical Systems Using POD-ANN
5.11 Numerical Results
5.12 Conclusion
6 Comparative Analysis of Various Ensemble Approaches for Web Page Classification
J. Dutta, Yong Woon Kim and Dalia Dominic
6.1 Introduction
6.2 Literature Survey
6.3 Material and Methods
6.4 Ensemble Classifiers
6.5 Results
6.6 Conclusion
7 Feature Engineering and Selection Approach Over Malicious Image
P.M. Kavitha and B. Muruganantham
7.1 Introduction
7.2 Feature Engineering Techniques
7.3 Malicious Feature Engineering
7.4 Image Processing Technique
7.5 Image Processing Techniques for Analysis on Malicious Images
7.6 Conclusion
8 Cubic-Regression and Likelihood Based Boosting GAM to Model Drug Sensitivity for Glioblastoma
Satyawant Kumar, Vinai George Biju, Ho-Kyoung Lee and Blessy Baby Mathew
8.1 Introduction
8.2 Literature Survey
8.3 Materials and Methods
8.4 Evaluations, Results and Discussions
9 Unobtrusive Engagement Detection through Semantic Pose Estimation and Lightweight ResNet for an Online Class Environment
Michael Moses Thiruthuvanathan, Balachandran Krishnan and Madhavi Rangaswamy
9.1 Introduction
9.2 Related Work
9.3 Proposed Methodology
9.4 Experimentation
9.5 Results and Discussions
10 Building Rule Base for Decision Making -- A Fuzzy-Rough Approach
Sabu M. K., Neeraj Krishna M. S. and Reshmi R.
10.1 Introduction
10.2 Literature Review
10.3 Discretization of the Dataset Using Fuzzy Set Theory
10.4 Description of the Dataset
10.5 Process Involved in Proposed Work
10.6 Experiment
10.7 Evaluation Result
10.8 Discussion
11 An Effective Machine Learning Approach to Model Healthcare Data
Shaila H. Koppad, S. Anupama Kumar and Mohan Kumar
11.1 Introduction
11.2 Types of Data in Healthcare
11.3 Big Data in Healthcare
11.4 Different V's of Big Data
11.5 About COPD
11.6 Methodology Implemented
12 Recommendation Engine for Retail Domain Using Machine Learning Techniques
Chandrashekhara K. T., Gireesh Babu C. N. and Thungamani M.
12.1 Introduction
12.2 Proposed System
12.3 Results
12.3.1 ARIMA Forecasting
12.4 Conclusion
13 Mining Heterogeneous Lung Cancer from Computer Tomography (CT) Scan with the Confusion Matrix
Denny Dominic and Krishnan Balachandran
13.1 Introduction
13.2 Literature Review
13.3 Methodology
13.4 Result
13.5 Conclusion and Future Scope
References
14 ML Algorithms and Their Approach on COVID-19 Data Analysis
Kambaluru Ashok, Penumalli Anvesh Reddy and Kukatlapalli Pradeep Kumar
14.1 Introduction
14.2 DataSet
14.3 Types of Machine Learning Algorithms
14.4 Conclusion
15 Analysis and Design for the Early Stage Detection of Lung Diseases Using Machine Learning Algorithms
Sindhu Madhuri, Mahesh T. R., Vivek V., Shashikala H. K. and C. Saravanan
15.1 Introduction
15.2 Machine Learning Algorithms
15.3 Evaluation Metrics and Comparative Results for Early Detection of Lung Diseases
15.4 Conclusion
16 Estimation of Cancer Risk through Artificial Neural Network
K. Aditya Shastry, Sanjay H. A., Balaji N. and Karthik Pai B. H.
16.1 Introduction
16.2 Case Studies Related to Cancer Risk Estimation Using ANN
16.3 Datasets Used in Cancer Risk Estimation
16.4 Discussion
16.5 Future Scope
16.6 Conclusion
17 Applications and Advancements in Data Science and Analytics
T. Mamatha, A. Balaram, B. Rama Subba Reddy, C. Shoba Bindu and M. Niranjanamurthy
17.1 Data Science and Analytics in Software Testing
17.2 Applications of Data Science and Analytics
17.3 Selenium Testing Tool in Data Science
17.4 Challenges and Advancements in Data Science
17.5 Data Science and Analytics Tools
17.6 Conclusion
References
About the Editors
Index