Might-be-trivial issue to you: the repo size is growing #3797

xied75 · 2017-02-15T13:33:22Z

xied75
Feb 15, 2017

It used to take a flash to clone the repo. Now the repo has grown to a whopping 257MB, with only 21720 git objects in the pack to download.

Why "whopping"? For comparison:
Git itself has 286253 git objects, with a size of 112MB
.NET Corefx has 749047 git objects, with a size of 181MB.
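
To put those numbers side by side, here is a rough sketch that computes the average download size per git object for the three repos. It assumes "MB" means 1024 * 1024 bytes, which the figures above do not specify, so treat the results as ballpark values only.

```python
# Rough per-object size comparison using the figures quoted above.
# Assumption: "MB" is taken as 1024 * 1024 bytes; the post does not say.

REPOS = {
    "UWPCommunityToolkit": (21720, 257),
    "git": (286253, 112),
    ".NET Corefx": (749047, 181),
}


def kb_per_object(objects: int, size_mb: float) -> float:
    """Average pack size per object, in KB."""
    return size_mb * 1024 / objects


if __name__ == "__main__":
    for name, (objects, size_mb) in REPOS.items():
        print(f"{name}: {kb_per_object(objects, size_mb):.1f} KB per object")
```

On these figures the toolkit averages roughly 12 KB per object, against well under 1 KB for the other two, which is what you would expect when a handful of objects are large binaries.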

With master checked out, the folder sizes look like this:

82M     ./docs
464K    ./githubresources
924K    ./Microsoft.Toolkit.Uwp
796K    ./Microsoft.Toolkit.Uwp.Notifications.NETStandard
664K    ./Microsoft.Toolkit.Uwp.Notifications.Portable
1.7M    ./Microsoft.Toolkit.Uwp.Notifications.UWP
2.2M    ./Microsoft.Toolkit.Uwp.Notifications.WinRT
75M     ./Microsoft.Toolkit.Uwp.SampleApp
1.5M    ./Microsoft.Toolkit.Uwp.Services
852K    ./Microsoft.Toolkit.Uwp.UI
880K    ./Microsoft.Toolkit.Uwp.UI.Animations
2.9M    ./Microsoft.Toolkit.Uwp.UI.Controls
2.0M    ./Notifications
43M     ./UnitTests
976K    ./UnitTests.Notifications.Portable
16M     ./UnitTests.Notifications.UWP

Going further down, we find that many of the culprits are images, e.g. under UWPCommunityToolkit/docs/resources/images we have this list:

82K adaptive.GIF
141K AddNugetServices.png
266K Animations-Blur.gif
34K Animations-Fade.gif
967K Animations-FadeHeader.gif
132K Animations-Light.gif
93K Animations-Offset.gif
285K Animations-Rotate.gif
177K Animations-Scale.gif
163K choosetoolboxitems.png
5.5M Controls-AdaptiveGridView.gif
625K Controls-BladeView.gif
117K Controls-DropShadowPanel.png
502K Controls-Expander.gif
12K Controls-GridSplitter.png
1.4M Controls-HamburgerMenu.gif
2.8K Controls-HeaderedTextBlock.png
1.9M Controls-ImageEx.gif
6.7M Controls-MarkdownTextBlock.gif
3.8M Controls-MasterDetailsView.gif
125K Controls-PullToRefreshListView.gif
289K Controls-RadialGauge.gif
24K Controls-RangeSelector.gif
1.7M Controls-RotatorTile.gif
230K Controls-ScrollHeader.gif
231K Controls-SlidableListItem.gif
5.8K Controls-TextBoxMask.png
5.8K Controls-TextBoxRegex.png
9.2K Controls-WrapPanel.png
2.3M hamburgermenu.gif
8.5K head.GIF
496K herotile.png
132K imageex.GIF
8.0M LoadingXamlControl.gif
36K ManageNugetPackages.png
183K Notifications-LiveTile.gif
174K Notifications-PopToast.gif
18K Notifications-WeatherLiveTileAndToast.png
51K NugetPackages.png
27M ParallaxService.gif
23K radial.GIF
6.1K range.GIF
18M ReorderGrid.gif
101K sampleapp.png
45K sampleapp-small.png
104K slideable.GIF
63K SurfaceDialTextboxAnim.gif
16K SurfaeDial.jpg
118K TileControl.gif
4.5K toolboxfinal.png
43K Toolkit_Responsive_Behavior_v01_img-MD-SM.png
49K Toolkit_Responsive_Behavior_v01_img-XL.png
25K Toolkit_Responsive_Behavior_v01_img-XS.png
12K weatherlivetilentoastNotification.GIF
60K WinSDKFBInstall.png

Seriously guys, a single gif at 27M?
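
To see how much of the total the worst offenders account for, a listing like the one above can be ranked with a few lines of script. This is only a sketch: the six entries are copied from the listing, and the K/M suffixes are assumed to be powers of 1024, as `du -h` and `ls -lh` use.

```python
# Rank entries of a size listing and total the largest ones.
# Assumption: K/M/G suffixes are powers of 1024 (du -h / ls -lh style).

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(text: str) -> float:
    """Convert '27M', '8.0M' or '967K' to a byte count."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def rank(listing: str):
    """Return (bytes, name) pairs sorted from largest to smallest."""
    rows = []
    for line in listing.strip().splitlines():
        size, name = line.split(None, 1)
        rows.append((to_bytes(size), name.strip()))
    return sorted(rows, reverse=True)


# A few entries copied from the listing above.
LISTING = """
5.5M Controls-AdaptiveGridView.gif
6.7M Controls-MarkdownTextBlock.gif
8.0M LoadingXamlControl.gif
27M ParallaxService.gif
18M ReorderGrid.gif
967K Animations-FadeHeader.gif
"""

if __name__ == "__main__":
    top5 = rank(LISTING)[:5]
    for size, name in top5:
        print(f"{size / UNITS['M']:5.1f}M  {name}")
    print(f"total: {sum(s for s, _ in top5) / UNITS['M']:.1f}M")
```

The five largest gifs add up to about 65M of the 82M in ./docs.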

No intention to offend anyone. But git was designed to version text data, not binary images. And once you commit and push, the file stays in the history almost forever; it is very hard (although possible) to remove an object from history. In our case, we can't even escape with the git clone --depth=1 trick that you can use if you dare to clone the Linux kernel (object count approaching 6 million, download size around 2GB), since the most recent commit will still drag in all these images.
(edit: I have now tried cloning our toolkit with depth 1: we download 1276 objects, with a download size of 105MB)

Having a huge repo is bad. Think about how fast CI/CD can run: Jenkins/AppVeyor can only start building after the clone completes. Then there is the wasted Internet bandwidth, plus kittens.

Sorry for being ultra-sensitive about this. I was testing the built-in git clone function of my Store app using the toolkit repo as a target, and found that it's extremely slow...

Odonno · 2017-02-15T13:50:52Z

Odonno
Feb 15, 2017

I agree. The repository should be huge. Until everyone can use GVFS, I think we can reduce the size of images/gif.

@xied75 What about the SampleApp folder? There is an abnormal size too.

bkaankose · 2017-02-15T13:52:18Z

bkaankose
Feb 15, 2017

I agree on this. Also, I wasn't able to compile the solution when I cloned the repo from scratch. I don't remember the reason, but some things prevented the solution from building. I'm not sure if they're still there, though; I need to check again.

xied75 · 2017-02-15T14:06:21Z

xied75
Feb 15, 2017
Author

@Odonno you mean it should be huge? :) Regarding the SampleApp, look under UWPCommunityToolkit/Microsoft.Toolkit.Uwp.SampleApp/Assets/Photos

deltakosh · 2017-02-15T14:33:26Z

deltakosh
Feb 15, 2017

Interesting feedback. What do you recommend? If we push an update to reduce the gif sizes, the originals will still be in the history.

xied75 · 2017-02-15T14:35:03Z

xied75
Feb 15, 2017
Author

@deltakosh that would fix the --depth=1 case and speed up CI/CD. Regarding the big files in history, ............. I can try to find a solution if we want to solve that too.

deltakosh · 2017-02-15T14:36:37Z

deltakosh
Feb 15, 2017

I would like both:)
As a first step I agree we can at least reduce image size, if someone wants to volunteer;)

xied75 · 2017-02-15T14:38:12Z

xied75
Feb 15, 2017
Author

We should probably also set a RULE regarding image size.
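
Such a rule could be enforced mechanically, for example by a check that runs in CI or as a pre-commit hook. The sketch below is only an illustration: the 1 MB limit, the extension list and the names `oversized_images` and `check` are assumptions, not anything the project has agreed on.

```python
import os

# Hypothetical policy values, chosen only for illustration.
MAX_BYTES = 1024 * 1024
IMAGE_EXTS = {".gif", ".png", ".jpg", ".jpeg"}


def oversized_images(root: str, max_bytes: int = MAX_BYTES):
    """Return sorted (path, size) pairs for images under root above max_bytes."""
    hits = []
    for folder, dirs, files in os.walk(root):
        # Do not descend into git's own object store.
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            if os.path.splitext(name)[1].lower() not in IMAGE_EXTS:
                continue
            path = os.path.join(folder, name)
            size = os.path.getsize(path)
            if size > max_bytes:
                hits.append((path, size))
    return sorted(hits)


def check(root: str = ".", max_bytes: int = MAX_BYTES) -> int:
    """Print every violation and return a process exit code (0 = clean)."""
    hits = oversized_images(root, max_bytes)
    for path, size in hits:
        print(f"{path}: {size} bytes exceeds the {max_bytes} byte limit")
    return 1 if hits else 0
```

A CI step could call `sys.exit(check("."))` so that the build fails when someone adds an image above the limit. Note that this only guards new commits; it does nothing about files already in history.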

deltakosh · 2017-02-15T16:23:58Z

deltakosh
Feb 15, 2017

Do you want to try reducing picture size?

skendrot · 2017-02-15T16:31:18Z

skendrot
Feb 15, 2017
Collaborator

I'm guessing items like the 27 meg gif will need to be re-recorded, not just compressed

hermitdave · 2017-02-16T10:13:24Z

hermitdave
Feb 16, 2017

@skendrot I think the gifs need resizing. I will push that on a separate branch.

hermitdave · 2017-02-16T10:18:48Z

hermitdave
Feb 16, 2017

@deltakosh I think instead of certain animated gifs we might be better off with screen capture video as wmv

hermitdave · 2017-02-16T10:19:16Z

hermitdave
Feb 16, 2017

Both the parallax and animate grid videos are way too large.

lucaasrojas · 2018-06-25T15:35:14Z

lucaasrojas
Jun 25, 2018

Hey there! What's the current status of this issue?

michael-hawker · 2020-03-27T17:15:37Z

michael-hawker
Mar 27, 2020
Maintainer

@azchohfi @HerrickSpencer any thoughts on how we optimize our git history? Pulling down the repo is like 436MB now! However, 419MB of that is the .git folder...

azchohfi · 2020-03-27T17:22:30Z

azchohfi
Mar 27, 2020
Maintainer

I don't think there is a way to do this without losing history: it would mean force-pushing the removal of the big files, or of all commits on master. That would also be a pain for anyone who has already cloned or forked the repo, or who has a PR open.

michael-hawker · 2020-03-27T23:23:30Z

michael-hawker
Mar 27, 2020
Maintainer

@azchohfi would a good time to do something like that be when we swap over to WinUI 3, as we'd be modifying so many files anyway?

azchohfi · 2020-03-27T23:39:26Z

azchohfi
Mar 27, 2020
Maintainer

We could test it and see how much it would save (probably a lot).

HerrickSpencer · 2020-06-10T21:30:25Z

HerrickSpencer
Jun 10, 2020
Collaborator

michael-hawker wrote: "@azchohfi @HerrickSpencer any thoughts on how we optimize our git history? Pulling down the repo is like 436MB now! However, 419MB of that is the .git folder..."

Interesting. I will investigate this option. I'm hopeful there are some steps we can take. I'll suggest a few for us to discuss.

HerrickSpencer · 2020-06-15T20:10:21Z

HerrickSpencer
Jun 15, 2020
Collaborator

Relaying internal conversation:
I've done some investigation on options to reduce our repo size.

As far as I see it we don't have too many good ones that don't involve making a divergent master branch.

The main folder causing the bulk is .git (445MB), likely due to the large history we have.
None of the other folders is nearly as big. The sample app is the next largest at <7MB.

Atlassian gives some good options. Some involve making a branch that shares a base with the first commit and squashes all commits after that. I did this locally, cloned from that local branch, and got a .git folder of the same size.

So I'm not sure this would work unless we actually orphan the master branch, and relegate a copy of the history to an entirely different repo.

Since all options I have discovered still cause a divergent branch, all current consumers of the repo would need to force pull from the repo to get the smaller size. This could cause issues for people, so I suggest a doc/notification to explain how to do this.

One other interesting option is setting up a sparse checkout in our current repo, to exclude folders that aren't expressly needed to work as a developer. I suspect that there won't be many of these, and that excluding them won't reduce the size by much.

Conclusion: if a 445MB history is causing issues in the community, we can set up an alternate repo that shares a base commit. This way we can pull changes between the repos. This will add a lot of complexity, but would likely work.

Suggestion: do a 'hard history reset' every so often (after some number of commits, or years) that pushes all changes older than a recent release into a secondary history repo, and resets the history on the current repo to a shared commit (the release commit).

I will also look through the history for deletions of very large files (images?) ... there is an option to GC these out of the history as well. I can look into this option.
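
One possible way to find those very large files in history is to ask git for every object reachable from any ref and sort the blobs by size. The sketch below assumes a local clone and uses `git rev-list --objects --all` piped into `git cat-file --batch-check`; the function names are made up for illustration. It reports uncompressed blob sizes, not their size inside the pack, and it only identifies candidates; it does not rewrite anything.

```python
import subprocess


def parse_batch_check(output: str):
    """Parse '<type> <sha> <size> <path>' lines into sorted (size, path) blobs."""
    blobs = []
    for line in output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, tags and malformed lines
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), path))
    return sorted(blobs, reverse=True)


def largest_blobs(repo: str, top: int = 10):
    """Return the `top` largest blobs reachable from any ref in `repo`."""
    # List every reachable object as '<sha> <path>'.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Ask git for each object's type and size; %(rest) echoes the path back.
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(sizes)[:top]
```

Calling `largest_blobs(".")` inside a clone would list the biggest blobs, including ones whose files were later deleted or replaced, which is exactly the part a working-tree listing cannot show.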

Per conversation with @michael-hawker, the realistic option is to do the hard reset when we make a huge refactor of the code base, such as a move to WinUI 3, and then start with a new history of 7.0 + the merge of the WinUI3 branch. The rest of the history can then be archived to its own repo, and linked with a base commit.

We'll consider this at next large release.

xied75 · 2020-06-18T15:38:30Z

xied75
Jun 18, 2020
Author

I vote for leaving the current repo as is and starting a brand new one.

Kyaa-dost · 2021-03-01T20:46:07Z

Kyaa-dost
Mar 1, 2021

Moving this to Discussions, as that will be our new platform for all the older discussions that still require further input and clarity.
