Data Science with Python

Course Overview

Data Science with Python introduces participants to the field of data science and to the Python programming language. Topics include data manipulation, data visualization, statistical analysis, and machine learning. The course suits people who want to learn Python for data analysis, as well as those who already know Python and want to learn more about data science. It may include lectures, hands-on exercises, and projects to help participants build their skills and knowledge.

At the end of the training, participants will be able to:

  • Explain the key concepts, tools, and techniques used in data science, and the basics of the Python programming language.
  • Use Python for data analysis and manipulation, including importing, cleaning, and manipulating data.
  • Describe common statistical analysis and machine learning techniques and apply them using Python.

Pre-requisite

Some Programming Experience

Duration

3 days

Course Outline

  1. What is analytics and data science?
  2. Common terms in analytics
  3. Different Sectors Using Data Science
  4. Purpose and Components of Python
  1. What is Python?
  2. Features of Python
  3. Why Python?
  4. Interpreter and types
  5. Applications of Python
  6. “Hello World” program
  7. Variables
  8. Variable data types
  9. Example programs with each type
  10. Operators
  11. Types of operators
  12. Basic programs
  13. Operator overloading
  14. Define control statements
  15. Types of control statements
  16. Why are looping statements used?
  17. Types of looping statements
  18. Range function
  19. Functions
  20. Types of functions
  21. Global and local variables
  22. Modules
  23. Types of modules and use
  24. What are files?
  25. Types of files
  26. File access modes
  27. Handling I/O
  28. OOP concepts
  29. Collections
  30. The collections module and its types
  31. Types of errors
  32. Exception handling
  33. Concept of Packages/Libraries – Important packages (NumPy, pandas, Matplotlib)
  1. What is NumPy?
  2. What is an ndarray?
  3. Data types in NumPy
  4. Mathematical Functions
  5. Array manipulation
  6. NumPy array visualization
  7. Broadcasting
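
As an illustration of the broadcasting topic that closes the NumPy items above, here is a minimal sketch. The array values and variable names are made up for this example and are not taken from the course material.

```python
import numpy as np

# Hypothetical measurements: 3 samples (rows) by 2 features (columns).
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# The column means have shape (2,). Subtracting them from a (3, 2) array
# broadcasts the means across every row, with no explicit loop.
col_means = data.mean(axis=0)
centered = data - col_means

# A (3, 1) column multiplied by a (2,) row broadcasts to a (3, 2) result.
row_scale = np.array([[1.0], [2.0], [3.0]])
scaled = row_scale * np.array([1.0, 0.5])

print(centered)
print(scaled.shape)
```

Broadcasting works here because NumPy compares shapes from the trailing dimension backwards and stretches any dimension of size 1 (or a missing dimension) to match.
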
  1. What is pandas?
  2. Concepts of pandas
  3. Why and how pandas is used for data manipulation
  4. Cleansing Data with Python
  5. Data Manipulation
  6. Data manipulation tools
  7. Python Built-in Functions (Text, numeric, date, utility functions)
  8. Python User Defined Functions
  9. Stripping out extraneous information
  10. Normalizing data
  1. Creating graphs (bar, pie, line, histogram, box plot, scatter, density, etc.)
  2. Drawing conclusions or predictions from data analytics
  3. Communicating data analytics results
  1. Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  2. Connecting to a database
  3. Viewing Data objects – subsetting, methods
  4. Exporting Data to various formats
  1. Basic Statistics – Measures of Central Tendency and Variance
  2. Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  3. Inferential Statistics – Sampling – Concept of Hypothesis Testing
  4. Statistical Methods – Z/t-tests (one-sample, independent, paired), ANOVA, Correlations, and Chi-square
  5. Introduction to exploratory data analysis
  6. Descriptive statistics, Frequency Tables
  7. Univariate Analysis (Distribution of data & Graphical Analysis)
  1. The concept of a model in analytics and how it is used
  2. Common terminology used
  3. Popular modelling algorithms
  4. Types of Business problems – Mapping of Techniques
  5. Different Phases of Predictive Modelling
  6. EDA for exploring the data and identifying any problems with the data
  7. Identify missing data
  8. Identify outliers
  9. Visualize the data trends and patterns
  1. What is regression?
  2. Applications of regression
  3. Types of regression
  4. Fitting the regression line
  5. Simple linear regression
  6. Simple linear regression in Python
  7. Polynomial regression
  8. Polynomial regression in Python
  9. Gradient Descent
  10. Cost function
  11. Regularization
  12. Ridge and lasso regression
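
To make the regression items above more concrete, the following sketch fits a simple linear regression line by batch gradient descent on a mean squared error cost function. It uses only the standard library. The function names, the toy data, the learning rate, and the step count are assumptions chosen for this illustration, not the course's own code.

```python
def mse_cost(xs, ys, slope, intercept):
    """Mean squared error of the line y = slope * x + intercept."""
    n = len(xs)
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n


def fit_line_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Fit slope and intercept by batch gradient descent on the MSE cost."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point under the current line.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to each parameter.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Step against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data lying exactly on y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line_gradient_descent(xs, ys)
print(round(slope, 3), round(intercept, 3))
print(mse_cost(xs, ys, slope, intercept))
```

The learning rate is small enough for this data that the updates converge; on differently scaled data it would likely need tuning, which is one reason features are often normalized first.
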
  1. How is classification used?
  2. Applications of classification
  3. Logistic Regression, Sigmoid function
  4. Decision tree
  5. K-Nearest Neighbors (K-NN)
  6. SVM
  7. Naive Bayes
  8. Confusion Matrix
  9. Precision, Recall
  10. F1-score
  11. ROC, AUC
  12. n-fold cross-validation
  13. Measuring classifier performance
  14. Factors affecting classifier performance
  15. Overfitting
  16. Ensemble Learning
  17. Bagging and Boosting
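
The classifier evaluation items above (confusion matrix, precision, recall, F1-score) can be sketched in a few lines of plain Python. The labels below are invented for illustration, and the helper names are hypothetical; in practice a library such as scikit-learn would typically be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up ground truth and predictions for eight samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn)
print(precision, recall, f1)
```
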
  1. Unsupervised learning: examples and applications
  2. Clustering
  3. Hierarchical Clustering in Python, Agglomerative and Divisive techniques
  4. Measuring the distance between two clusters
  5. k-means algorithm
  6. Limitations of k-means clustering
  7. SSE and Distortion measurements
  8. Demo: Agglomerative Hierarchical clustering
  1. Time Series Forecasting
  2. Introduction – Applications
  3. Time Series Components (Trend, Seasonality, Cyclicity, and Level) and Decomposition
  4. Classification of Techniques (pattern-based vs. pattern-less)
  5. Basic Techniques – Averages, Smoothing, etc.
  6. Advanced Techniques – AR Models, ARIMA, etc
  7. Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
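
The forecasting accuracy measures named in the last item above can be written out directly. This sketch assumes MAD means the mean absolute deviation of the forecast errors and that MAPE is reported as a percentage; the function names and the sample numbers are invented for illustration.

```python
def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared error of the forecast."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(actual)


# Made-up actual values and forecasts for three periods.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print(mad(actual, forecast))
print(mse(actual, forecast))
print(mape(actual, forecast))
```

MAD and MSE are in the units of the series (or their square), while MAPE is scale-free, which makes it easier to compare across series but unusable when actual values are at or near zero.
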
  1. Web Scraping and Parsing
  2. Understanding and Searching the Tree
  3. Navigating options
  4. Modifying the Tree
  5. Parsing and Printing the Document
