Data Science Handbook
A Practical Approach
Prakash, Kolla Bhanu
John Wiley & Sons Inc
11/2022
480
Hardcover
English
9781119857334
Description not available.
Acknowledgment xi
Preface xiii
1 Data Munging Basics
1 Introduction 1
1.1 Filtering and Selecting Data 6
1.2 Treating Missing Values 11
1.3 Removing Duplicates 14
1.4 Concatenating and Transforming Data 16
1.5 Grouping and Data Aggregation 20
References 20
2 Data Visualization 23
2.1 Creating Standard Plots (Line, Bar, Pie) 26
2.2 Defining Elements of a Plot 30
2.3 Plot Formatting 33
2.4 Creating Labels and Annotations 38
2.5 Creating Visualizations from Time Series Data 42
2.6 Constructing Histograms, Box Plots, and Scatter Plots 44
References 54
3 Basic Math and Statistics 57
3.1 Linear Algebra 57
3.2 Calculus 58
3.2.1 Differential Calculus 58
3.2.2 Integral Calculus 58
3.3 Inferential Statistics 60
3.3.1 Central Limit Theorem 60
3.3.2 Hypothesis Testing 60
3.3.3 ANOVA 60
3.3.4 Qualitative Data Analysis 60
3.4 Using NumPy to Perform Arithmetic Operations on Data 61
3.5 Generating Summary Statistics Using Pandas and Scipy 64
3.6 Summarizing Categorical Data Using Pandas 68
3.7 Starting with Parametric Methods in Pandas and Scipy 84
3.8 Delving Into Non-Parametric Methods Using Pandas and Scipy 87
3.9 Transforming Dataset Distributions 91
References 94
4 Introduction to Machine Learning 97
4.1 Introduction to Machine Learning 97
4.2 Types of Machine Learning Algorithms 101
4.3 Explanatory Factor Analysis 114
4.4 Principal Component Analysis (PCA) 115
References 121
5 Outlier Analysis 123
5.1 Extreme Value Analysis Using Univariate Methods 123
5.2 Multivariate Analysis for Outlier Detection 125
5.3 DBSCAN Clustering to Identify Outliers 127
References 133
6 Cluster Analysis 135
6.1 K-Means Algorithm 135
6.2 Hierarchical Methods 141
6.3 Instance-Based Learning w/ k-Nearest Neighbor 149
References 156
7 Network Analysis with NetworkX 157
7.1 Working with Graph Objects 159
7.2 Simulating a Social Network (i.e., Directed Network Analysis) 163
7.3 Analyzing a Social Network 169
References 171
8 Basic Algorithmic Learning 173
8.1 Linear Regression 173
8.2 Logistic Regression 183
8.3 Naive Bayes Classifiers 189
References 195
9 Web-Based Data Visualizations with Plotly 197
9.1 Collaborative Analytics 197
9.2 Basic Charts 208
9.3 Statistical Charts 212
9.4 Plotly Maps 216
References 219
10 Web Scraping with Beautiful Soup 221
10.1 The BeautifulSoup Object 224
10.2 Exploring NavigableString Objects 228
10.3 Data Parsing 230
10.4 Web Scraping 233
10.5 Ensemble Models with Random Forests 235
References 254
Data Science Projects 257
11 Covid19 Detection and Prediction 259
Bibliography 275
12 Leaf Disease Detection 277
Bibliography 283
13 Brain Tumor Detection with Data Science 285
Bibliography 295
14 Color Detection with Python 297
Bibliography 300
15 Detecting Parkinson's Disease 301
Bibliography 302
16 Sentiment Analysis 303
Bibliography 306
17 Road Lane Line Detection 307
Bibliography 315
18 Fake News Detection 317
Bibliography 318
19 Speech Emotion Recognition 319
Bibliography 322
20 Gender and Age Detection with Data Science 323
Bibliography 339
21 Diabetic Retinopathy 341
Bibliography 350
22 Driver Drowsiness Detection in Python 351
Bibliography 356
23 Chatbot Using Python 357
Bibliography 363
24 Handwritten Digit Recognition Project 365
Bibliography 368
25 Image Caption Generator Project in Python 369
Bibliography 379
26 Credit Card Fraud Detection Project 381
Bibliography 391
27 Movie Recommendation System 393
Bibliography 411
28 Customer Segmentation 413
Bibliography 431
29 Breast Cancer Classification 433
Bibliography 443
30 Traffic Signs Recognition 445
Bibliography 453
This title belongs to the subject(s) listed below. To see other titles, click the desired subject.
Business Intelligence; Data Engineering; Decision Science; Artificial Intelligence; Machine Learning; Supervised Learning; Classification; Cross validation; Clustering; Deep Learning; Linear Regression; A/B Testing; Hypothesis Testing; Statistical Power; Standard Error; Exploratory Data Analysis (EDA); Data Visualization; R programming; Python; SAS; SPSS; TABLEAU; VBA; Structured Query Language; ETL; GitHub; Data Models; Data Warehouse; AWS; Big data; Cloud computing; Data Analytics; Data Exploration; Data Management; Data wrangling; Data processing; Data security; Data governance; Data Manipulation; Data Migration; DevOps; HADOOP; Multivariate; Calculus & Linear Algebra; MATLAB