How-to Guides & Instructions
- DevOps Guides Overview
- Plotting Overview
- Statistics Overview
- Data Wrangling Overview
- Feature Engineering Overview
- Regression Overview
- Classification Overview
Cheat sheets & syntax reference
- Jupyter notebooks
- VS Code (Windows)
- VS Code (macOS)
- Git
- NumPy
- Pandas
- NumPy: A core library for efficient numerical computations and multi-dimensional array operations in Python.
- Pandas: Provides high-level data structures (DataFrame, Series) and powerful tools for data manipulation and analysis.
- Matplotlib: A versatile plotting library for creating static, animated, and interactive visualizations in Python.
- Seaborn: A statistical data visualization library built on Matplotlib that provides attractive themes and higher-level plotting functions.
- SciPy: A collection of scientific computing tools built on NumPy for optimization, integration, signal processing, and more.
- Statsmodels: Offers classes and functions for estimating statistical models, conducting hypothesis tests, and performing data exploration.
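As a rough illustration of how the first two libraries in this list fit together, here is a minimal sketch. The temperature values, city labels, and variable names are made up for the example and are not taken from any course material.

```python
import numpy as np
import pandas as pd

# NumPy: vectorized arithmetic on a 2-D array (hypothetical readings in Celsius)
temps_c = np.array([[18.0, 21.5], [25.0, 30.5]])
temps_f = temps_c * 9 / 5 + 32  # applied element-wise, no explicit Python loop

# Pandas: labeled, tabular data built on top of NumPy arrays
df = pd.DataFrame({
    "city": ["A", "A", "B", "B"],   # hypothetical labels
    "temp_c": temps_c.ravel(),      # flatten the 2x2 array into one column
})

# Group rows by label and aggregate each group
mean_by_city = df.groupby("city")["temp_c"].mean()

print(temps_f)
print(mean_by_city)
```

Matplotlib, Seaborn, SciPy, and Statsmodels typically take arrays or DataFrames like these as input.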
Other interesting links
- Further topics in data wrangling/data analysis
- For an interesting alternative to Pandas, see Polars
- For N-dimensional, labeled arrays, see Xarray
- For parallel, distributed dataframes, see PySpark and Dask
- For GPU-accelerated data analysis, see CuPy and RAPIDS
- For data pipeline workflow management, see Luigi or Airflow
- Data visualization
Incremental capstone slides
Unit 2: Applied Data Science with Python
- Incremental capstone 1: import and clean data
Unit 3: Machine Learning
- Incremental capstone 5: Florida Bike Rentals