-
Classification overview
The classification overview document is up on the resources page. Check it out for distilled best practices and core tools and techniques for training classification models.
-
Lesson 20 activity
The lesson 20 activity notebook has been uploaded - this time around we are participating in a Kaggle playground competition, attempting to classify patients as diabetic or non-diabetic. See the full competition details here: Playground Series - Season 5, Episode 12: Diabetes Prediction Challenge.
-
Git LFS for large files
The GitHub repo that hosts this site has been updated to use Git LFS for large file storage. This lets us host datasets larger than GitHub’s file size limit of 100 MB. If you use this site only to download resources, this change should not affect you. If you have a fork of the repo and/or want to contribute to it, you will need to enable LFS and re-sync. If you are using the devcontainer configuration, this should be handled for you by re-building the container after syncing your fork.
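If you plan to contribute data files, a rough way to spot files that would exceed the size limit is sketched below. This is only an illustration, not part of the repo: the function name find_large_files is hypothetical, and the limit is assumed to be 100 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Assumption: treat the 100 MB limit as 100 * 1024 * 1024 bytes.
SIZE_LIMIT_BYTES = 100 * 1024 * 1024


def find_large_files(root, limit=SIZE_LIMIT_BYTES):
    """Return sorted (relative_path, size_in_bytes) pairs for files larger than limit.

    Hypothetical helper for illustration; anything inside a .git directory is skipped.
    """
    root = Path(root)
    large = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                large.append((relative.as_posix(), size))
    return sorted(large)


if __name__ == "__main__":
    # Scan the current directory and report candidates for LFS tracking.
    for name, size in find_large_files("."):
        print(f"{name}: {size / 1024 ** 2:.1f} MiB")
```

Any file this reports is a likely candidate for LFS tracking rather than a regular commit.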
-
Lesson 19: activity solution
The solution for the lesson 19: regression activity is up on the Jupyter notebooks page.
-
Lesson 19: updated demo
The in-class demo notebook for lesson 19: supervised learning regression has been updated. Have a good long weekend, see you on Monday!
-
Lesson 19: regression materials
Just added the in-class demo notebook and an activity for tonight’s session on supervised learning and regression models.
-
Lesson 15 demo update
Made some updates to the Lesson 15: data wrangling demo notebook. Switched the feature used for the outlier analysis section to MedInc rather than MedHouseVal - I think it gives more interesting and informative results. You can easily re-run the notebook with any feature you want to try out. Just update the missing_feature and/or outlier_feature variables in the setup section at the start of the notebook. Also switched to using Scikit-learn’s KBinsDiscretizer rather than pd.cut() for outlier binning. KBinsDiscretizer uses quantile binning by default - this is what we want, not equal-width bins.
-
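Lesson 15 demo update: binning sketch
To illustrate the binning point from the Lesson 15 demo update, here is a minimal sketch comparing pd.cut() (equal-width bins) with KBinsDiscretizer using the quantile strategy. It is not the code from the demo notebook: the data is a synthetic right-skewed sample standing in for a feature such as MedInc, and the variable names and bin count are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Assumption: synthetic right-skewed data, not the real dataset from the demo.
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=1.0, sigma=0.6, size=1000), name="feature")

n_bins = 5  # illustrative choice

# pd.cut with an integer bin count splits the value range into equal-width bins.
equal_width = pd.cut(values, bins=n_bins, labels=False)

# KBinsDiscretizer with strategy="quantile" places bin edges at quantiles,
# so each bin holds roughly the same number of samples.
discretizer = KBinsDiscretizer(n_bins=n_bins, encode="ordinal", strategy="quantile")
quantile = discretizer.fit_transform(values.to_frame()).ravel().astype(int)

# Count how many samples land in each bin under each scheme.
equal_width_counts = np.bincount(equal_width, minlength=n_bins)
quantile_counts = np.bincount(quantile, minlength=n_bins)

print("equal-width bin counts:", equal_width_counts)
print("quantile bin counts:   ", quantile_counts)
```

With skewed data, the equal-width bins pile most samples into the lowest bins and leave the top bins nearly empty, while the quantile bins stay balanced. Depending on your Scikit-learn version, KBinsDiscretizer may emit a warning about changing defaults; it does not affect the comparison.
-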
Lesson 15 demo
Cleaned up and posted the Lesson 15: Data Wrangling demo notebook.
-
Data Wrangling Overview
Just added a data wrangling overview page to the resources tab. It contains best-practice suggestions and references for common techniques and methods for cleaning and reshaping data, dealing with missing data and handling outliers.
-
Lessons 12, 13 & 14 activity solutions
Updated solution notebooks for Lessons 12, 13 and 14 are up on the Jupyter notebooks page.
-
New statistics overview page
Added a comprehensive Statistics Overview for quick reference on descriptive statistics, probability distributions and statistical testing.
-
Lesson 11 activity solution
Lesson 11 activity solution notebook is up. Note: we will continue to work on this activity tonight. The plots currently in the solution notebook fulfill the requirements, but in many cases we could do a lot better!
-
Incremental capstone solution
Andrew’s incremental capstone solution is up on the Jupyter notebooks page. I also added the incomplete lesson 11 demo notebook for your reference. It will be updated next week after we finish lesson 11.
-
Lesson 10: activity solution now available
Just added the solution notebook for the lesson 10 Pandas activity. The lesson 10 demo notebook has also been cleaned up and updated.
-
Extra Pandas practice
Andrew added some extra Pandas practice notebooks. If you are looking for some more Pandas problems to try out, check them out on the Jupyter notebooks page.
-
Pandas activity
Added the Pandas activity notebook to the Jupyter notebooks page. Also included some links to Pandas resources on the resources page.
-
Extra NumPy practice problems
Added some extra NumPy practice problems from Andrew to the Jupyter notebooks page. The Lesson 9 demo has also been updated. Note: to run the image demonstration you need to install the matplotlib library via pip.
-
Lesson 9 activity
Lesson 9: NumPy activity notebook is up on the Jupyter notebooks page. I also added some links to more information about NumPy to the resources page.
-
Lesson 6 demo & activity solution
The demo notebook and activity solution from lesson 6 are up. Take a look on the Jupyter notebooks page.
-
New lesson 5 & 6 materials
Posted the in-class demo notebook and the activity solution from lesson 5. The lesson 6 activity is also available on the Jupyter notebooks page. See you in class!
-
Lesson 5 Activity Update
Hi all, removed the extra/erroneous test cases in problem 2 of the lesson 5 activity. The link on the Jupyter notebooks page will now give you the updated notebook. Mea culpa!
-
Lesson 5 Activity
The lesson 5 activity notebook, covering loops and conditionals, is up on the Jupyter notebooks page. We will spend the last hour or so of class working on it in breakout rooms tonight.
-
Lesson 4 Activity Solution
Just added the lesson 4 activity solution to the Jupyter notebooks page. We will take a look together at the start of class on Wednesday.
-
Resources & Notebooks Pages
Added a page for Jupyter notebooks and a page for resource links (see the upper left of this page).
-
DevOps how-tos for new students
Just posted some how-to guides for getting set up with Python & Jupyter notebooks, either via Vocareum, another cloud service or on your local machine. See the DevOps overview page.