Resources

TensorFlow notes (work-in-progress)

TensorFlow setup options
TensorFlow development environment GitHub repository: gperdrizet/tensorflow-GPU
George’s TensorFlow 2.16 container on Docker Hub: gperdrizet/tensorflow-gpu
NVIDIA’s TensorFlow Release 24.06 container

How-to Guides & Instructions

Cheat sheets & syntax reference

Data science library information

NumPy: A core library for efficient numerical computations and multi-dimensional array operations in Python.
Pandas: Provides high-level data structures (DataFrame, Series) and powerful tools for data manipulation and analysis.
Matplotlib: A versatile plotting library for creating static, animated, and interactive visualizations in Python.
Seaborn: A statistical data visualization library built on Matplotlib that provides attractive themes and higher-level plotting functions.
SciPy: A collection of scientific computing tools built on NumPy for optimization, integration, signal processing, and more.
Statsmodels: Offers classes and functions for estimating statistical models, conducting hypothesis tests, and performing data exploration.

Other interesting links

Further topics in data wrangling/data analysis
- For an interesting alternative to Pandas, see Polars
- For N-dimensional, labeled arrays, see Xarray
- For parallel, distributed dataframes, see PySpark and Dask
- For GPU-accelerated data analysis, see CuPy and RAPIDS
- For data pipeline workflow management, see Luigi or Airflow
Data visualization
- The Python Graph Gallery: lots of different plot examples
- Data to Viz: example plots by data type
- PyWaffle: ‘waffle’-style proportion plots in Python
- squarify: square treemap layouts in Python

Incremental capstone slides

Unit 2: Applied Data Science with Python

Incremental capstone 1: import and clean data

Unit 3: Machine Learning

Incremental capstone 5: Florida Bike Rentals