Add config flag to boards for detecting dupes #45

Open
karai17 opened this issue Nov 30, 2016 · 4 comments

@karai17
Owner

karai17 commented Nov 30, 2016

In some cases, it might be preferred to allow boards to have duplicate content.

@raidho36

Actually allowing duplicate content is pointless; it just wastes hard drive space. When enabled, it should substitute the duplicate file with a reference to the already existing file. The catch is that you then can't manage files by nuking everything that belongs to a sunk thread: you'd have to find another thread that references the duplicate file and attach it there.
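The cleanup concern above is commonly handled with reference counting. The following is a minimal sketch, not this project's implementation: `DedupStore`, `attach`, and `release` are hypothetical names, and an in-memory dict stands in for the filesystem. Each post that uses a file takes a reference, and the file is only deleted once the last referencing post is gone.

```python
import hashlib


class DedupStore:
    """In-memory stand-in for a file store that keeps one copy per unique content."""

    def __init__(self):
        self.blobs = {}      # content hash -> file bytes
        self.refcount = {}   # content hash -> number of posts referencing it

    def attach(self, data: bytes) -> str:
        """Record one more reference to `data`, storing it only if it is new."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def release(self, digest: str) -> bool:
        """Drop one reference; delete the file once nothing references it.

        Returns True if the underlying file was actually deleted.
        """
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.blobs[digest]
            return True
        return False


store = DedupStore()
a = store.attach(b"same image bytes")   # posted in thread 1
b = store.attach(b"same image bytes")   # posted again in thread 2
```

With this shape, pruning a sunk thread calls `release` for each of its files, and a file that is still referenced from another thread survives without having to be reattached by hand.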

@karai17
Owner Author

karai17 commented Nov 30, 2016

HDD space isn't really that big of an issue tbh. The small number of dupes probably wouldn't amount to more than a few hundred MB per board, even on very active boards, and some of the very active boards like /b/ usually turn archiving off, so once a thread falls off of page 10 or 20 or whatever, it's all freed up.

I think it'd be easier to just allow dupe files than to try and micromanage kilobytes.

@karai17 karai17 changed the title Add config flag for detecting dupes Add config flag to boards for detecting dupes Nov 30, 2016
@karai17 karai17 added this to the 1.3.0 milestone Nov 30, 2016
@shakesoda

I'd strongly prefer just using the hash check to point to an existing file
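A hash check of this kind might look roughly like the sketch below. It assumes SHA-256 over the file contents and a plain dict as the hash index; `file_digest`, `resolve_upload`, and the `detect_dupes` flag are hypothetical names for illustration and are not taken from the project.

```python
import hashlib
import os
import tempfile


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """SHA-256 of a file, read in chunks so large uploads are not loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def resolve_upload(upload_path: str, index: dict, detect_dupes: bool = True) -> str:
    """Return the path a post should reference for this upload.

    `index` maps content hash -> stored path.
    `detect_dupes` is a hypothetical per-board flag.
    """
    if not detect_dupes:
        return upload_path           # board allows duplicates: keep the new copy
    digest = file_digest(upload_path)
    existing = index.get(digest)
    if existing is not None:
        os.remove(upload_path)       # discard the redundant copy
        return existing              # point the post at the existing file
    index[digest] = upload_path
    return upload_path


# Usage: two uploads with identical bytes end up pointing at one stored file.
workdir = tempfile.mkdtemp()
paths = []
for name in ("first.webm", "second.webm"):
    p = os.path.join(workdir, name)
    with open(p, "wb") as f:
        f.write(b"identical bytes")
    paths.append(p)

index = {}
kept_a = resolve_upload(paths[0], index)
kept_b = resolve_upload(paths[1], index)
```

Here the second upload is never rejected; the post simply references the file that is already stored, so the poster sees no duplicate "error".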

@raidho36

raidho36 commented Nov 30, 2016

You'd only think it's "kilobytes" if your board only allows small files, has few threads, and is in general not popular. Something as simple as a webm thread can eat up several gigabytes of hard drive space. Additionally, having duplicate files reside in the filesystem prevents them from sharing one file cache entry, i.e. both files have to be cached, taking up server memory and displacing other files from the cache, which adds hard drive load to look up the files that are now missing from the cache.

Speaking of webm threads and the like, those usually contain a large number of various files and may be the prime cause of duplicate file "errors", so it's not helpful to refuse a post containing a duplicate file. Again, if you're facing a flood of identical files, that's a moderation problem, not a software one. It is incredibly naive to think that people won't bypass the check if they want to post something; they will, defeating the whole purpose of such a "feature" and getting annoyed in the process, so again, it's not helpful.
