Rakesh M. Verma (University of Houston, Texas, USA), David J. Marchette (Naval Surface Warfare Center, Dahlgren, Virgini
Cybersecurity Analytics
- Paperback
This book organizes in one place the mathematics, probability, statistics and machine learning information that is required for a practitioner of cybersecurity analytics, as well as the basics of cybersecurity needed for a practitioner.
Other customers were also interested in
- Lily Wang (Iowa State University, Ames, USA): Data Science for Infectious Disease Data Analytics, 97,99 €
- Ulrich Matter (Assistant Professor of Economics at Uni of St. Galle: Big Data Analytics, 87,99 €
- Jun Wu: The Beauty of Mathematics in Computer Science, 45,99 €
- Benoit Liquet: Mathematical Engineering of Deep Learning, 86,99 €
- Max Kuhn: Feature Engineering and Selection, 97,99 €
- Thierry Worch (Netherlands FrieslandCampina): Data Science for Sensory and Consumer Scientists, 109,99 €
- Rohan Alexander: Telling Stories with Data, 103,99 €
Note: This item can only be delivered to a German delivery address.
Product details
- Chapman & Hall/CRC Data Science Series
- Publisher: Taylor & Francis Ltd
- Number of pages: 340
- Publication date: 29 August 2022
- Language: English
- Dimensions: 177 mm x 253 mm x 24 mm
- Weight: 670 g
- ISBN-13: 9781032401003
- ISBN-10: 1032401001
- Item no.: 67321633
- Manufacturer identification
- Person responsible for product safety
- Europaallee 1
- 36244 Bad Hersfeld
- gpsr@libri.de
Rakesh Verma is a professor of computer science at the University of Houston where he is leading a research group that applies reasoning and data science to cybersecurity challenges. He teaches a course on security analytics that includes some of the material here. Since 2015, he has been co-organizing and editing the proceedings of the ACM International Workshop on Security and Privacy Analytics. He is an editor of Frontiers of Big Data in the Cybersecurity Area, an ACM Distinguished Speaker (2011-2018), and the winner of two Best Paper Awards. He received the Lifetime Mentoring Award from the University of Houston and he is a Fulbright Senior Specialist in Computer Science.
David Marchette is a principal scientist at the Naval Surface Warfare Center, Dahlgren Division where he is responsible for leading basic and applied research projects in computational statistics, graph theory, network analysis, pattern recognition, computer intrusion detection, and text analysis. He is a fellow of the American Statistical Association (ASA) and the American Association for the Advancement of Science (AAAS) and an elected member of the International Statistical Institute (ISI).
Preface
1 Introduction
2 What is Data Analytics? 2.1 Data Ingestion 2.2 Data Processing and Cleaning 2.3 Visualization and Exploratory Analysis 2.3.1 Scatterplots 2.4 Pattern Recognition 2.4.1 Classification 2.4.2 Clustering 2.5 Feature extraction 2.5.1 Feature Selection 2.5.2 Random Projections 2.6 Modeling 2.6.1 Model Specification 2.6.2 Model Selection and Fitting 2.7 Evaluation 2.8 Strengths and Limitations 2.8.1 The Curse of Dimensionality
3 Security: Basics and Security Analytics 3.1 Basics of Security 3.1.1 Know Thy Enemy - Attackers and Their Motivations 3.1.2 Security Goals 3.2 Mechanisms for Ensuring Security Goals 3.2.1 Confidentiality 3.2.2 Integrity 3.2.3 Availability 3.2.4 Authentication 3.2.5 Access Control 3.2.6 Accountability 3.2.7 Non-repudiation 3.3 Threats, Attacks and Impacts 3.3.1 Passwords 3.3.2 Malware 3.3.3 Spam, Phishing and its Variants 3.3.4 Intrusions 3.3.5 Internet Surfing 3.3.6 System Maintenance and Firewalls 3.3.7 Other Vulnerabilities 3.3.8 Protecting Against Attacks 3.4 Applications of Data Science to Security Challenges 3.4.1 Cybersecurity Datasets 3.4.2 Data Science Applications 3.4.3 Passwords 3.4.4 Malware 3.4.5 Intrusions 3.4.6 Spam/Phishing 3.4.7 Credit Card Fraud/Financial Fraud 3.4.8 Opinion Spam 3.4.9 Denial of Service 3.5 Security Analytics and Why Do We Need It
4 Statistics 4.1 Probability Density Estimation 4.2 Models 4.2.1 Poisson 4.2.2 Uniform 4.2.3 Normal 4.3 Parameter Estimation 4.3.1 The Bias-Variance Trade-Off 4.4 The Law of Large Numbers and the Central Limit Theorem 4.5 Confidence Intervals 4.6 Hypothesis Testing 4.7 Bayesian Statistics 4.8 Regression 4.8.1 Logistic Regression 4.9 Regularization 4.10 Principal Components 4.11 Multidimensional Scaling 4.12 Procrustes 4.13 Nonparametric Statistics 4.14 Time Series
5 Data Mining - Unsupervised Learning 5.1 Data Collection 5.2 Types of Data and Operations 5.2.1 Properties of Datasets 5.3 Data Exploration and Preprocessing 5.3.1 Data Exploration 5.3.2 Data Preprocessing/Wrangling 5.4 Data Representation 5.5 Association Rule Mining 5.5.1 Variations on the Apriori Algorithm 5.6 Clustering 5.6.1 Partitional Clustering 5.6.2 Choosing K 5.6.3 Variations on K-means Algorithm 5.6.4 Hierarchical Clustering 5.6.5 Other Clustering Algorithms 5.6.6 Measuring the Clustering Quality 5.6.7 Clustering Miscellany: Clusterability, Robustness, Incremental, 5.7 Manifold Discovery 5.7.1 Spectral Embedding 5.8 Anomaly Detection 5.8.1 Statistical Methods 5.8.2 Distance-based Outlier Detection 5.8.3 kNN based approach 5.8.4 Density-based Outlier Detection 5.8.5 Clustering-based Outlier Detection 5.8.6 One-class learning based Outliers 5.9 Security Applications and Adaptations 5.9.1 Data Mining for Intrusion Detection 5.9.2 Malware Detection 5.9.3 Stepping-stone Detection 5.9.4 Malware Clustering 5.9.5 Directed Anomaly Scoring for Spear Phishing Detection 5.10 Concluding Remarks and Further Reading
6 Machine Learning - Supervised Learning 6.1 Fundamentals of Supervised Learning 6.2 The Bayes Classifier 6.2.1 Naïve Bayes 6.3 Nearest Neighbors Classifiers 6.4 Linear Classifiers 6.5 Decision Trees and Random Forests 6.5.1 Random Forest 6.6 Support Vector Machines 6.7 Semi-Supervised Classification 6.8 Neural Networks and Deep Learning 6.8.1 Perceptron 6.8.2 Neural Networks 6.8.3 Deep Networks 6.9 Topological Data Analysis 6.10 Ensemble Learning 6.10.1 Majority 6.10.2 Adaboost 6.11 One-class Learning 6.12 Online Learning 6.13 Adversarial Machine Learning 6.13.1 Adversarial Examples 6.13.2 Adversarial Training 6.13.3 Adversarial Generation 6.13.4 Beyond Continuous Data 6.14 Evaluation of Machine Learning 6.14.1 Cost-sensitive Evaluation 6.14.2 New Metrics for Unbalanced Datasets 6.15 Security Applications and Adaptations 6.15.1 Intrusion Detection 6.15.2 Malware Detection 6.15.3 Spam and Phishing Detection 6.16 For Further Reading
7 Text Mining 7.1 Tokenization 7.2 Preprocessing 7.3 Bag-Of-Words 7.4 Vector space model 7.4.1 Weighting 7.5 Latent Semantic Indexing 7.6 Embedding 7.7 Topic Models: Latent Dirichlet Allocation 7.8 Sentiment Analysis
8 Natural Language Processing 8.1 Challenges of NLP 8.2 Basics of Language Study and NLP Techniques 8.3 Text Preprocessing 8.4 Feature Engineering on Text Data 8.4.1 Morphological, Word and Phrasal Features 8.4.2 Clausal and Sentence Level Features 8.4.3 Statistical Features 8.5 Corpus-based Analysis 8.6 Advanced NLP Tasks 8.6.1 Part of Speech Tagging 8.6.2 Word sense Disambiguation 8.6.3 Language Modeling 8.6.4 Topic Modeling 8.7 Sequence to Sequence Tasks 8.8 Knowledge Bases and Frameworks 8.9 Natural Language Generation 8.10 Issues with Pipelining 8.11 Security Applications of NLP 8.11.1 Password Checking 8.11.2 Email Spam Detection 8.11.3 Phishing Email Detection 8.11.4 Malware Detection 8.11.5 Attack Generation
9 Big Data Techniques and Security 9.1 Key terms 9.2 Ingesting the Data 9.3 Persistent Storage 9.4 Computing and Analyzing 9.5 Techniques for Handling Big Data 9.6 Visualizing 9.7 Streaming Data 9.8 Big Data Security 9.8.1 Implications of Big Data Characteristics on Security and Privacy 9.8.2 Mechanisms for Big Data Security Goals
A Linear Algebra Basics A.1 Vectors A.2 Matrices A.2.1 Eigenvectors and Eigenvalues A.2.2 The Singular Value Decomposition
B Graphs B.1 Graph Invariants B.2 The Laplacian
C Probability C.1 Probability C.1.1 Conditional Probability and Bayes' Rule C.1.2 Base Rate Fallacy C.1.3 Expected Values and Moments C.1.4 Distribution Functions and Densities C.2 Models C.2.1 Bernoulli and Binomial C.2.2 Multinomial C.2.3 Uniform
Bibliography
Author Index
Index