Repository Size Limit #3658

Open
jonasfranz opened this issue Mar 11, 2018 · 49 comments
Labels
type/enhancement: An improvement of existing functionality
type/proposal: The new feature has not been accepted yet but needs to be discussed first.

Comments

@jonasfranz
Member

jonasfranz commented Mar 11, 2018

It would be nice if it were possible to set a global or per-user repository size limit. This option would be useful for providers who offer Gitea to the public with limited disk space, as mentioned in #1029.

Approaches for restricting repo size

Solution 1

  • The user gets a notification if a repository exceeds the size limit
  • The admin gets a notification if a repository exceeds the size limit (optional)
  • The repository gets deleted (after an ultimatum) if the user does not reduce its size

Solution 2

  • The size of every push gets checked (I don't know if that is possible or realistic) and the push is blocked if it is too large

@sbrl

sbrl commented Mar 11, 2018

Suggestion: Have a warning limit too. That way users are warned when they are, say, 80% full - so they've got some time to do something about it.

@kolaente
Member

I'd prefer the second solution + warning. It doesn't seem practical to completely delete a user's repo if it is past the size limit.

@jonasfranz
Member Author

jonasfranz commented Mar 12, 2018

@kolaente Do you have an idea how to calculate the size of a push before it is finished?

@sapk
Member

sapk commented Mar 12, 2018

@JonasFranzDEV This seems possible with a pre-receive hook that checks the size (via git cat-file) of the objects in each pushed commit. For examples, see https://github.com/github/platform-samples/tree/master/pre-receive-hooks and https://stackoverflow.com/questions/40697663/show-commit-size-in-git-log
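
A minimal sketch of the hook idea above, written in Python and not taken from any Gitea code. It assumes the hook receives the usual "old new ref" lines on stdin, lists the objects a push would add with git rev-list --objects, and totals their uncompressed sizes with git cat-file --batch-check. The function names and the limit_bytes parameter are hypothetical.

```python
import subprocess
import sys


def new_object_ids(new: str) -> list[str]:
    """List objects reachable from `new` that no existing ref already reaches."""
    if set(new) == {"0"}:  # an all-zero id means the ref is being deleted
        return []
    out = subprocess.run(
        ["git", "rev-list", "--objects", new, "--not", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Each output line is "<id>" or "<id> <path>"; keep only the id.
    return [line.split()[0] for line in out.splitlines() if line]


def sum_sizes(batch_check_output: str) -> int:
    """Add up sizes printed by `git cat-file --batch-check=%(objectsize)`."""
    return sum(int(line) for line in batch_check_output.split())


def objects_size(ids: list[str]) -> int:
    """Total uncompressed size in bytes of the given objects."""
    if not ids:
        return 0
    out = subprocess.run(
        ["git", "cat-file", "--batch-check=%(objectsize)"],
        input="\n".join(ids) + "\n",
        check=True, capture_output=True, text=True,
    ).stdout
    return sum_sizes(out)


def run_hook(lines, limit_bytes: int) -> int:
    """Return the hook's exit status: 0 accepts the push, 1 rejects it."""
    ids = set()
    for line in lines:
        if not line.strip():
            continue
        _old, new, _ref = line.split()
        ids.update(new_object_ids(new))  # the set avoids double counting
    total = objects_size(sorted(ids))
    if total > limit_bytes:
        print(f"push rejected: adds {total} bytes, limit is {limit_bytes}",
              file=sys.stderr)
        return 1
    return 0

# In an actual hook file (assumed 100 MiB limit):
#   sys.exit(run_hook(sys.stdin, limit_bytes=100 * 1024 * 1024))
```

The sizes here are uncompressed object sizes, so they overestimate what the push adds on disk once objects are packed.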

@jonasfranz
Member Author

@sapk Thank you for the hint. It seems that size, err := git.GetRepoSize(repoPath) will return the size of the repository including the newest push if it is called at pre-receive.
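
For illustration only, a directory walk like the one below measures whatever is on disk at the moment it runs, which is the kind of measurement the observation above relies on. This is a Python sketch under that assumption, not the implementation of git.GetRepoSize; the name repo_size is hypothetical.

```python
import os


def repo_size(repo_path: str) -> int:
    """Total size in bytes of all regular files under repo_path."""
    total = 0
    for root, _dirs, files in os.walk(repo_path):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):  # do not follow symlinks
                total += os.path.getsize(path)
    return total
```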

How should the size limit work?

  1. Global size limit via the config file for all repositories
  2. Custom size limit for repositories and a default value from the config file
    a. Who can change the limit of a repository, and how? I propose the admin, but I do not know whether the admin panel or the repository settings are the right place for this.
  3. Size limit per user via the config file, with the option to change the size limit via the admin panel, as is already possible for the number of repositories (user administration).

Other proposals are welcome too!

@techknowlogick
Member

We could have a combination of the above, a global size limit to make sure that the disk doesn't run out of space, and a per user limit (in the case of some git hosts they say each user gets 1GB or something).

@jonasfranz
Member Author

@techknowlogick Should a push (no matter which user) be restricted/denied, if the global size limit is exceeded?

@lafriks
Member

lafriks commented Mar 16, 2018

I don't think the 1st one is needed, but it would be great to have a combination of the 2nd and 3rd: defaults in the config and custom values stored in the repo and user tables, with 0 meaning that the default limit applies and -1 meaning that there is no limit.
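
The 0 / -1 convention described above could be resolved roughly as follows. This is a sketch with hypothetical names; it additionally assumes that the config default is either a positive byte count or -1 for unlimited.

```python
from typing import Optional

USE_DEFAULT = 0   # stored value meaning "apply the default from the config"
NO_LIMIT = -1     # stored value meaning "no limit"


def effective_limit(custom: int, default: int) -> Optional[int]:
    """Return the limit in bytes, or None when no limit applies."""
    value = default if custom == USE_DEFAULT else custom
    return None if value == NO_LIMIT else value


def within_limit(size: int, limit: Optional[int]) -> bool:
    """True when size does not exceed the limit (None means unlimited)."""
    return limit is None or size <= limit
```

For example, effective_limit(0, 1000) falls back to the default of 1000, while effective_limit(-1, 1000) yields None and so never blocks a push.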

@lafriks
Member

lafriks commented Mar 16, 2018

Oh, one more thing: should LFS be counted? If so, then that should probably be done separately.

@jonasfranz
Member Author

@lafriks

Who can change the limit of a repository, and how? I propose the admin, but I do not know whether the admin panel or the repository settings are the right place for this.

@lafriks
Member

lafriks commented Mar 16, 2018

@JonasFranzDEV The user limit would go in user editing in the admin panel. For repositories, there is already an admin-specific option in the repo settings.

@sapk
Member

sapk commented Mar 16, 2018

Also, what happens when a user tries to push to an org? Or do we consider an org as a distinct user?

@lafriks
Member

lafriks commented Mar 16, 2018

I would say that an org should be treated as a separate user.

@techknowlogick
Member

I don't think we should rely on just 2 & 3. If you don't want to limit the number of users but you want to make sure you don't run out of space, then a global limit would be needed (e.g. try.gitea.io runs out of space, and if a user maxes out their space you don't want to encourage them to create an additional account). This is especially likely with an open Gitea instance.

Regarding what we should do when the limit is reached, there are two options that I see:

  1. show warnings in the git push prompt, and perhaps raise the warning in the system messages of the admin panel
  2. reject push

or have two limits: soft (1), and hard (2)
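
A sketch of the soft/hard split, assuming the check runs in a pre-receive hook, where text written to stderr is shown to the pusher and a non-zero exit status rejects the push. The names Verdict and check_size are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"          # below the soft limit: accept silently
    WARN = "warn"      # soft limit reached: accept, but warn the pusher
    REJECT = "reject"  # hard limit exceeded: refuse the push


def check_size(size: int, soft_limit: int, hard_limit: int) -> Verdict:
    """Classify a repository size against a soft and a hard limit."""
    if size > hard_limit:
        return Verdict.REJECT
    if size >= soft_limit:
        return Verdict.WARN
    return Verdict.OK
```

The soft limit could also be derived from the hard one, for instance soft_limit = int(0.8 * hard_limit) to match the 80% warning suggested earlier in the thread.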

@lunny lunny added the type/proposal The new feature has not been accepted yet but needs to be discussed first. label Mar 17, 2018
@kolaente
Member

We should think of a way to prevent the following:

A user creates a repo, which means he is now admin of said repo. As a repo admin he'd be able to change the repo size limit and bypass the global limit set by the server admin...

I guess the simplest solution would be to only allow changing the size limit setting if the user is also a server admin.

@lafriks
Member

lafriks commented Mar 17, 2018

@kolaente there is already a section in the repo settings that is available only to server admins

@jonasfranz
Member Author

@techknowlogick I do not agree with that because I think that this would deny every push to gitea at all if the global limit is exceeded. So we need a way to restrict the size per user because this would not restrict users having only small repos from pushing.

@lafriks Would this make it possible to get unlimited storage by creating organizations? My ideas:

a) Bind the limit of an organization to the owner of an organization
b) Accept @lafriks proposal and limit the count of organizations per user
c) Bind the limit of an organization to the members of the organization (might be complicated for users and developers due to the relationship between members and maximum size). Example: an organization uses 10GB of storage, every user has 5GB of storage, and the org has 4 members. Every member could then only use 2.5GB for their personal account, because the other 2.5GB of their quota is used by the organization (2.5GB * 4 = 10GB).
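
The arithmetic of option c) can be written out as below. The sketch assumes the organization's usage is split equally among its members, as in the example; the function names are hypothetical.

```python
def member_share(org_usage_gb: float, member_count: int) -> float:
    """Part of the org's usage charged to each member (equal split)."""
    if member_count <= 0:
        raise ValueError("an organization needs at least one member")
    return org_usage_gb / member_count


def remaining_personal_quota(user_limit_gb: float, org_usage_gb: float,
                             member_count: int) -> float:
    """Quota left for a member's personal repositories under option c)."""
    return user_limit_gb - member_share(org_usage_gb, member_count)
```

With the numbers from the example, remaining_personal_quota(5.0, 10.0, 4) gives 2.5.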

@kolaente
Member

@JonasFranzDEV
c) You mean when pushing to a repo inside that org or in general (aka when creating normal repos under the normal account)?

I'd go with @lafriks here, I think it would save us a lot of headache.

Another thing: we should treat migrations like normal repos (in terms of limits), right? This would mean updating a migration should fail if a user has exceeded his limit. And we could check if a user has enough space left when creating the migration instead of doing that later on.

@lunny
Member

lunny commented Mar 17, 2018

I think there are two different concepts here. One is a repository size limit, the other is a user upload size limit. A repository's size is the size of its folder on disk. A user's upload size is the total size of all the commits that user has uploaded.

@jonasfranz
Member Author

@lunny it's quite easy to get the size of a repository after a push but I think that it would be harder to get the size of a commit itself. So I would propose to use repository size. The idea was to limit the size of a repository by a user limit.
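
One way to read "limit the size of a repository by a user limit" is to sum the sizes of all repositories of an owner and compare the total with that owner's limit. A sketch under that assumption, with hypothetical names and repository sizes keyed by (owner, repo name):

```python
RepoSizes = dict[tuple[str, str], int]  # (owner, repo name) -> size in bytes


def owner_usage(repo_sizes: RepoSizes, owner: str) -> int:
    """Total size of all repositories belonging to owner."""
    return sum(size for (o, _name), size in repo_sizes.items() if o == owner)


def push_exceeds_limit(repo_sizes: RepoSizes, owner: str, repo: str,
                       size_after_push: int, user_limit: int) -> bool:
    """True if the owner's total would exceed user_limit after the push."""
    projected = dict(repo_sizes)
    projected[(owner, repo)] = size_after_push  # size measured after the push
    return owner_usage(projected, owner) > user_limit
```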

@stevegt stevegt mentioned this issue May 5, 2018
20 tasks
@poVoq

poVoq commented Jul 14, 2018

Any intermediate solution until this is implemented? Is there maybe an option to limit Gitea's overall disk use (without getting into complex drive partitions etc.)?

This is IMHO the main feature that is holding back Gitea's use for any kind of semi-public service :(

@ghost

ghost commented Jul 14, 2018

@poVoq as an interim solution consider suggesting users run the BFG repo cleaner to remove large files (especially video or large binaries) from their project history.

TBH disk space is cheap these days. I understand the word "cheap" is subjective but a little ingenuity can go a long way. Let us not be held back by our tools but by our imaginations.

@yatsyk

yatsyk commented Jul 10, 2019

Is it true that any user with write permission to a repository could disrupt the service for everyone by pushing a large amount of data to their personal repository?

@lunny lunny added the type/enhancement An improvement of existing functionality label Jul 11, 2019
@sbrl

sbrl commented Jul 15, 2019

In theory yes, @yatsyk - as far as I understand.

Unless you're running a public instance (i.e. allowing anyone to sign up and create an account) though, it's unlikely to happen - and especially not on purpose.

@yatsyk

yatsyk commented Jul 16, 2019

@sbrl why do you think that it’s unlikely to happen?

@sbrl

sbrl commented Jul 16, 2019

@yatsyk If you're not running a public instance but an instance for you and some friends / co-workers, I would assume that you'd trust them enough not to intentionally try and break the server. If it does happen by accident, as others have said, the BFG repo cleaner can help sort out the mess.

Of course, you will have unique requirements for your particular use-case.

@yatsyk

yatsyk commented Jul 16, 2019

@sbrl we should validate data in any service, whether it is public or not. A co-worker's computer could be hacked, and we should not compromise other users.

@sapk sapk mentioned this issue Aug 12, 2019
5 tasks
@alexanderadam

@sapk what will happen with mirrors? Will they be handled exactly the same?

@sapk
Member

sapk commented Sep 30, 2019

@alexanderadam I think my PR is ignoring this case. I will need to check whether the pre-receive hook is triggered for mirrors.

@mewalig

mewalig commented Oct 5, 2021

We are implementing this ourselves as we cannot wait. If anyone would like to collaborate, pls lmk

@spirobel

spirobel commented Oct 6, 2021

> We are implementing this ourselves as we cannot wait. If anyone would like to collaborate, pls lmk

Did you see this PR #7833 ?

@mewalig

mewalig commented Nov 18, 2021

Thanks for pointing out, we did see and review that PR. However, since the PR code appears to never have been finished and also was made against a different version than what we are running, we felt we had to choose between "invest an uncertain amount of time/resources to determine whether that code would be worth attempting to reuse, with the best-case outcome still requiring further work to integrate into our customized-- and yet not latest-- gitea codebase" vs "invest a predictable amount of time/resources to implement this relatively small feature ourselves"-- and we went with the latter.

I would certainly have preferred to collaborate and spend those several-thousand bucks on something else, and furthermore I would have been happy to share the work we have done on this feature, if only that could be done in a modular fashion that did not require a lot of additional work and cost for us. Unfortunately, from our perspective, those last two conditions are not met (I think this speaks to the enormous value, that is currently being lost, of having plug-in support as mentioned in #16195). Perhaps that is the fault of our own poor engineering decisions, but whatever the reason may be, our experience, real or perceived, is that while we would prefer to collaborate (both as a recipient of and donor of code for common features such as this) and we have budget to contribute, we are finding it difficult to do so in a way that makes economic sense.

Given now how increasingly far apart our gitea code bases are, I am increasingly of the view that the plug-in framework is a necessary precondition to enabling collaboration on other features.

@christaus

I know Gitea is made to be self-hosted, but many people don't have the knowledge to do that, so why not share a Gitea instance? For that we need quota limits, both for the number of users and for the volume of data they can use on disk. That would be a nice feature.

@ptman
Contributor

ptman commented Mar 30, 2022

@cGIfl300 there's codeberg and gitea.com

@christaus

This is not what I mean. I mean that anyone could open an instance and allow a limited number of accounts. Using a central system is not what I like about self-hosting.

@DmitryFrolovTri
Contributor

DmitryFrolovTri commented Nov 8, 2022

@mewalig hi. Were you able to do the PR? I also have this feature in a semi-developed state and, as the developer has left, I am looking for a person who could continue. It even works, but I think it needs a lot of fine-tuning to make it PR-ready.

@mewalig

mewalig commented Jan 20, 2023

Hi @DmitryFrolovTri we didn't create a PR because the version of gitea we are working from is now far behind the current version, and when we looked at the PR diff it just didn't make sense. I am still of the view that in the absence of the core maintainers agreeing to incorporate this into the core codebase (assuming the contributed code is of reasonable quality and does what it's supposed to do), it's unlikely to be useful to anyone to attempt to contribute this feature until and unless there is some sort of plug-in mechanism

@DmitryFrolovTri
Contributor

DmitryFrolovTri commented Jan 25, 2023

Well funny enough I also have a full PR for the old version that was done a while ago, it is very far behind.
@mewalig if you have the PR to look at it would be of help as well

@DmitryFrolovTri
Contributor

Sure. Please advise if any ideas that we are trying to implement here are contradicting

@DmitryFrolovTri
Contributor

DmitryFrolovTri commented Jan 30, 2023 via email

@DmitryFrolovTri
Contributor

We are moving ahead with this, so hopefully we can have it done soon: PR #21820

@melroy89
Contributor

> We are moving ahead with this so hopefully we can have it done soon PR #21820

That would be awesome for my publicly running server! Git LFS could come later. In my case, repo size limits even without Git LFS would already be very welcome.

@Cyberes

Cyberes commented Mar 20, 2024

Just chiming in with my support for this issue. I had a bot register on my Gitea instance and proceed to migrate multi-gig repositories from numerous git servers around the internet.
