Update markdown instructions for Penguin classification exercise (1) #66
base: main
Changes from 3 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,11 +23,11 @@ | |
"### Task 1: look at the data\n", | ||
"In the following code block, we import the ``load_penguins`` function from the ``palmerpenguins`` package.\n", | ||
"\n", | ||
"- Call this function, which returns a single object, and assign it to the variable ``data``.\n", | ||
" - Print ``data`` and recognise that ``load_penguins`` has returned a ``pandas.DataFrame``.\n", | ||
"- Consider which features it might make sense to use in order to classify the species of the penguins.\n", | ||
" - You can print the column titles using ``pd.DataFrame.keys()``\n", | ||
" - You can also obtain useful information using ``pd.DataFrame.Series.describe()``" | ||
"- Call this function, which returns a single object in the form of a ``pandas.DataFrame``, and assign it to the variable ``data``.\n", | ||
" - Print ``data`` and recognise that ``load_penguins`` has returned the dataframe.\n", | ||
"- Analyse which features it might make sense to use in order to classify the species of the penguins.\n", | ||
" - You can print the column names using ``pd.DataFrame.keys()``\n", | ||
" - You can also obtain useful statistical information on the dataset using ``pd.DataFrame.Series.describe()``" | ||
] | ||
}, | ||
{ | ||
|
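A minimal sketch of the Task 1 steps in the hunk above, for reference while reviewing. It assumes only that `pandas` is installed. The rows are made-up placeholder values standing in for what `load_penguins` returns, so the printed statistics mean nothing beyond illustrating the calls.

```python
import pandas as pd

# Made-up stand-in rows (assumption): the exercise itself would instead use
# `from palmerpenguins import load_penguins; data = load_penguins()`.
data = pd.DataFrame(
    {
        "species": ["Adelie", "Gentoo", "Chinstrap", "Adelie"],
        "bill_length_mm": [39.1, 47.5, 49.0, 38.2],
        "body_mass_g": [3750, 5200, 3800, 3600],
    }
)

# keys() on a DataFrame returns its column index, i.e. the column names.
column_names = list(data.keys())
print(column_names)

# describe() summarises the numeric columns by default (count, mean, std, ...).
summary = data.describe()
print(summary)
```

Note that `describe()` is called on the dataframe instance; by default it skips non-numeric columns such as `species`.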
@@ -108,23 +108,25 @@ | |
"source": [ | ||
"### Task 2: creating a ``torch.utils.data.Dataset``\n", | ||
"\n", | ||
"All PyTorch dataset objects are subclasses of the ``torch.utils.data.Dataset`` class. To make a custom dataset, create a class which inherits from the ``Dataset`` class, implement some methods (the Python magic (or dunder) methods ``__len__`` and ``__getitem__``) and supply some data.\n", | ||
"To use PyTorch's functionality, we need to make the dataset compatible with PyTorch. We do this using PyTorch's dataset class, ``torch.utils.data.Dataset``.\n", | ||
Review comment: PyTorch
Review comment: I still think we should keep the bit on
Followed by: Sorry, I realise this is fine. It's just been restructured. |
||
"\n", | ||
"Spoiler alert: we've done this for you already in ``src/ml_workshop/_penguins.py``.\n", | ||
"To make a custom dataset, create a new class which inherits from the ``Dataset`` class, implement some methods (Python magic (or dunder) methods such as ``__len__`` and ``__getitem__``) and supply data.\n", | ||
"\n", | ||
"- Open the file ``src/ml_workshop/_penguins.py``.\n", | ||
"Spoiler alert: we've done this for you already in ``worked-solutions/01_penguin_classification_solutions.ipynb``.\n", | ||
"\n", | ||
"- Open the above mentioned file.\n", | ||
"- Let's examine, and discuss, each of the methods together.\n", | ||
" - ``__len__``\n", | ||
" - What does the ``__len__`` method do?\n", | ||
" - The ``__len__`` method is a so-called \"magic method\", which tells python to do if the ``len`` function is called on the object containing it.\n", | ||
" - The ``__len__`` method is a so-called \"magic method\" in Python that defines what happens when the ``len`` function is called on an object.\n", | ||
" - ``__getitem__``\n", | ||
" - What does the ``__getitem__`` method do?\n", | ||
" - The ``__getitem__`` method is another magic method which tells Python what to do if we try to index the object containing it (i.e. ``my_object[idx]``).\n", | ||
"- Review and discuss the class arguments.\n", | ||
" - ``input_keys``— A sequence of strings telling the data set which objects to return as inputs to the model.\n", | ||
" - ``target_keys``— Same as ``input_keys`` but specifying the targets.\n", | ||
" - ``input_keys``— A sequence of strings telling the data set which objects to return as inputs to the model. These are basically the input column names.\n", | ||
" - ``target_keys``— Same as ``input_keys`` but specifying the target columns.\n", | ||
" - ``train``— A boolean variable determining if the dataset returns the training or validation split (``True`` for training).\n", | ||
" - ``x_tfms``— A ``Compose`` object with functions which will convert the raw input to a tensor. This argument is _optional_.\n", | ||
" - ``x_tfms``— A ``Compose`` object with functions which will convert the raw input to a tensor. This argument is _optional_. Remember that PyTorch deals with tensors only.\n", | ||
Review comment: how about: Recall that PyTorch deals with |
||
" - ``y_tfms``— A ``Compose`` object with functions which will convert the raw target to a tensor. This argument is _optional_." | ||
] | ||
}, | ||
|
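To make the ``__len__`` / ``__getitem__`` discussion in the hunk above concrete, here is a minimal sketch in plain Python. `ToyPenguinDataset` is a hypothetical stand-in, not the workshop's class: it deliberately avoids importing `torch`, omits the `train` and transform arguments, and uses made-up rows. A real dataset would inherit from ``torch.utils.data.Dataset`` and return tensors.

```python
class ToyPenguinDataset:
    """Hypothetical illustration only; not the workshop's implementation."""

    def __init__(self, rows, input_keys, target_keys):
        self.rows = rows                      # list of dicts, one per penguin
        self.input_keys = list(input_keys)    # column names used as model inputs
        self.target_keys = list(target_keys)  # column names used as targets

    def __len__(self):
        # Called by len(dataset): the number of items in the dataset.
        return len(self.rows)

    def __getitem__(self, idx):
        # Called by dataset[idx]: return the (inputs, targets) pair for one item.
        row = self.rows[idx]
        inputs = [row[key] for key in self.input_keys]
        targets = [row[key] for key in self.target_keys]
        return inputs, targets


# Made-up rows (assumption) to exercise the two magic methods.
rows = [
    {"bill_length_mm": 39.1, "body_mass_g": 3750, "species": "Adelie"},
    {"bill_length_mm": 47.5, "body_mass_g": 5200, "species": "Gentoo"},
]
dataset = ToyPenguinDataset(rows, ["bill_length_mm", "body_mass_g"], ["species"])

print(len(dataset))  # uses __len__
print(dataset[1])    # uses __getitem__
```

The same two methods are what a data loader relies on to know how many items exist and to fetch each one by index.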
@@ -900,7 +902,7 @@ | |
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.4" | ||
"version": "3.12.4" | ||
} | ||
}, | ||
"nbformat": 4, | ||
|
Review comment: statistical
Review comment: I think 'Consider' is probably fine here, but either is fine.