Fine-tune a Vision Transformer Model with a custom biomedical dataset #85

Merged: 5 commits into huggingface:main on May 6, 2024

Conversation

emre570
Contributor

@emre570 emre570 commented Apr 25, 2024

Hello folks @merveenoyan @stevhliu,

I'm planning to make a notebook about fine-tuning a ViT model on a custom biomedical dataset.
I have code ready to use, written for my graduation project.
I used HF Datasets for data handling, and HF Transformers with the Trainer for fine-tuning.
(Optional) Metrics aren't shown during training at the moment; I can add a custom callback function for that.

If that's okay, I will push my code and then start editing it for beginners. I look forward to your opinions and suggestions.
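
Editor's sketch: besides a custom callback, one common way to surface metrics during training is the Trainer's `compute_metrics` argument, which is called on the evaluation predictions whenever evaluation runs. The function below is a minimal illustration under stated assumptions: it expects one logit per class for each sample, and the toy logits and labels are made up. It is not the notebook's actual code.

```python
def compute_metrics(eval_pred):
    """Return accuracy from a (logits, labels) pair, as passed at evaluation time."""
    logits, labels = eval_pred
    # Predicted class = index of the largest logit in each row.
    predictions = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Toy check with two classes (made-up values): three of four predictions match.
toy_logits = [[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.1, 0.4]]
toy_labels = [0, 1, 1, 1]
print(compute_metrics((toy_logits, toy_labels)))  # {'accuracy': 0.75}
```

A function like this would be passed as `Trainer(..., compute_metrics=compute_metrics)`; the returned dictionary keys become the metric names reported in the evaluation logs.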

@stevhliu
Member

Yes, looking forward to reviewing your notebook! 🤗

Added and edited the notebook
@emre570
Contributor Author

emre570 commented Apr 26, 2024

Hello @stevhliu, I made the first commit.

I edited and organized all sections, and I'm waiting for your opinions and reviews.

@@ -0,0 +1,767 @@
{
Member

@stevhliu stevhliu Apr 26, 2024

In "Dataset Info", it'd be nice to briefly explain what the images are of so users have more context about what they're training the model to do.


@@ -0,0 +1,767 @@
{
Member

@stevhliu stevhliu Apr 26, 2024

Is this dataset available on the Hub? I think it'd be easier for users to follow along if they could also download the dataset or if you provided some more information/details about how a user can create their own dataset with their images.
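
Editor's sketch of the "create their own dataset" option: a frequent convention is one subfolder per class, with the folder name serving as the label. The helper below is hypothetical (the function name, the suffix list, and the `benign`/`malignant` folder names are assumptions for illustration) and only indexes files with the standard library; it does not load pixels.

```python
from pathlib import Path

# Assumed set of image extensions; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def index_image_folder(root):
    """Map each image under root/<class_name>/ to an integer label."""
    root = Path(root)
    # Class names come from the subfolder names, sorted for stable label ids.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    label_of = {name: idx for idx, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((str(path), label_of[name]))
    return class_names, samples


# Example layout (hypothetical): data/benign/*.png and data/malignant/*.png
# class_names, samples = index_image_folder("data")
```

With 🤗 Datasets, the same layout can usually be read directly with `load_dataset("imagefolder", data_dir=...)`, which infers the label column from the subfolder names.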


@@ -0,0 +1,767 @@
{
Member

@stevhliu stevhliu Apr 26, 2024

Maybe say something like the following to avoid confusion with the next sentence where you say we can see the features again.

"We can the image is a PIL.Image with a label associated with it."


@stevhliu
Member

Make sure to add your notebook to the toctree!

@emre570
Contributor Author

emre570 commented Apr 27, 2024

Hello @stevhliu, I made some changes.

I added some images from the dataset to the "Dataset Info" section, but I have some questions.

The user can find similar datasets from Kaggle, and I can also upload the dataset to Hub. What should I do?
In toctree, which section should I put the notebook?
Last question, you said "We can the image is a PIL.Image with a label associated with it.". Sorry I didn't understand this. Where should I put it?

@stevhliu
Member

The user can find similar datasets from Kaggle, and I can also upload the dataset to Hub. What should I do?

I think it'd be easiest to upload the dataset to the Hub so users can follow along with your notebook without putting in the extra work of finding a similar dataset from Kaggle if they don't want to.

In toctree, which section should I put the notebook?

I think you can create a new "Computer vision" section.

Last question, you said "We can the image is a PIL.Image with a label associated with it.".

Sorry for the typo. You can put that text before you call train_ds[0]. So in other words:

We can see the image is a PIL.Image with a label associated with it.

train_ds[0]

- Added notebook to toctree
- Put an image about dataset info
- Pushed the dataset to Hub
@emre570
Contributor Author

emre570 commented Apr 29, 2024

Hello @stevhliu, I made the changes you asked for.

  • Added an image to "Dataset Info" so users can see sample images from the dataset.
  • Edited the sections you corrected.
  • Added the notebook to the toctree.
  • Pushed the dataset to the Hub; it will be made public when the notebook is released.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

review-notebook-app bot commented Apr 30, 2024

merveenoyan commented on 2024-04-30T13:32:17Z
----------------------------------------------------------------

maybe you could give a link to the base model :)


emre570 commented on 2024-04-30T17:52:20Z
----------------------------------------------------------------

I already did; it should direct you to the model's HF page.

review-notebook-app bot commented Apr 30, 2024

merveenoyan commented on 2024-04-30T13:32:18Z
----------------------------------------------------------------

nit: let's snake case the variable names for consistency with the rest of the recipe
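
Editor's sketch of what the rename amounts to: snake_case uses lowercase words joined by underscores, as in `train_ds`. The helper and the sample names below are hypothetical, since the notebook's original variable names are not shown in this thread.

```python
import re


def to_snake_case(name):
    """Convert a camelCase or PascalCase identifier to snake_case."""
    # Insert "_" at each boundary between a lowercase letter or digit and an
    # uppercase letter, then lowercase the whole identifier.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


print(to_snake_case("trainDs"))    # train_ds
print(to_snake_case("ModelName"))  # model_name
```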


review-notebook-app bot commented Apr 30, 2024

merveenoyan commented on 2024-04-30T13:32:19Z
----------------------------------------------------------------

maybe call push_to_hub explicitly as well


review-notebook-app bot commented Apr 30, 2024

merveenoyan commented on 2024-04-30T13:32:20Z
----------------------------------------------------------------

nit: scikit-learn


review-notebook-app bot commented Apr 30, 2024

merveenoyan commented on 2024-04-30T13:32:21Z
----------------------------------------------------------------

this is nice, but we could also add a classification score, because in this case we care a lot about recall (we don't want to miss malignant cases that look benign)
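
Editor's sketch of why recall matters here: on an imbalanced set, accuracy can look good while half of the malignant cases are missed. The labels below are made up for illustration (0 = benign, 1 = malignant is an assumed encoding), and the helper is a plain-Python stand-in rather than the notebook's code.

```python
def recall_per_class(y_true, y_pred):
    """Recall for each class: correctly predicted samples / all samples truly in it."""
    recalls = {}
    for cls in sorted(set(y_true)):
        # Predictions made for the samples whose true class is `cls`.
        preds_for_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
        recalls[cls] = sum(1 for p in preds_for_cls if p == cls) / len(preds_for_cls)
    return recalls


# Made-up labels: 8 benign (0) and 2 malignant (1); one malignant case is missed.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                          # 0.9
print(recall_per_class(y_true, y_pred))  # {0: 1.0, 1: 0.5}
```

In practice, scikit-learn's `classification_report(y_true, y_pred)` reports precision, recall, and F1 per class in one call.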


Collaborator

@merveenoyan merveenoyan left a comment

mostly nits, thanks a lot! we can merge afterwards IMO, very well made

@emre570
Contributor Author

emre570 commented Apr 30, 2024

Hello @merveenoyan, I made the changes you asked for. The notebook had issues, and your request for a recall score saved nearly everything 😁.

Some cells had errors that could have ruined all the work. I rebuilt the notebook from scratch and fixed all the code. It is fully working now and ready to use.

review-notebook-app bot commented May 2, 2024

stevhliu commented on 2024-05-02T17:22:27Z
----------------------------------------------------------------

The link to the image doesn't work. Can you upload it to https://huggingface.co/datasets/huggingface/cookbook-images and then link from there?


@stevhliu
Member

stevhliu commented May 2, 2024

One more comment, then we can merge! 🤗

- Opened PR and added the image.
@emre570
Contributor Author

emre570 commented May 2, 2024

Hey @stevhliu, I think I did it: I opened a PR and uploaded the image, so it should work now.

@merveenoyan
Collaborator

@emre570 I just merged your PR to the dataset repository

Collaborator

@merveenoyan merveenoyan left a comment

thanks a lot @emre570, once @stevhliu approves we can merge!

@emre570
Contributor Author

emre570 commented May 2, 2024

Thanks folks, it was a pleasure ❤️

Member

@stevhliu stevhliu left a comment

LGTM, thanks for the contribution! 🤗

@stevhliu stevhliu merged commit 3c0cff3 into huggingface:main May 6, 2024
1 check passed
@emre570
Contributor Author

emre570 commented May 6, 2024

HUGE thanks folks, again, it was a pleasure ❤️
