In the built documentation, replace duplicate files by symlinks #25111
Comments
comment:1
I actually do this in sage-on-gentoo, but I do it at the packaging level rather than the building level.
comment:2
Here is some Python code which works for me. Is this the sort of thing you use?

from filecmp import dircmp
import os
import shutil


def directories_equal(left, right, ignore=None):
    """
    True if and only if the directories ``left`` and ``right`` have
    the same contents, file by file. Ignore any files listed in
    ``ignore``.
    """
    dcmp = dircmp(left, right, ignore=ignore)
    return (not dcmp.left_only and not dcmp.right_only
            and not dcmp.common_funny and not dcmp.funny_files
            and not dcmp.diff_files and
            all(directories_equal(os.path.join(left, a),
                                  os.path.join(right, a), ignore=ignore)
                for a in dcmp.common_dirs))


def replace_duplicates_with_symlinks(source, target):
    """
    INPUTS:

    - ``source``, ``target``: directories.

    If the two directories are identical, replace ``target`` with a
    symlink pointing to ``source``. Otherwise, for each file in
    ``target``, if a copy of it exists in ``source``, replace the copy
    in ``target`` with a symlink pointing to ``source``.
    """
    if directories_equal(source, target, ignore=['pdf.png']):
        if not os.path.islink(target):
            shutil.rmtree(target)
            os.symlink(source, target)
    else:
        # compare file by file, doing the replacement
        dcmp = dircmp(source, target)
        for d in dcmp.common_dirs:
            replace_duplicates_with_symlinks(os.path.join(source, d),
                                             os.path.join(target, d))
        # Use same_files rather than common_files: a file that merely
        # shares its name with one in source but differs in content
        # must be left alone.
        for f in dcmp.same_files:
            os.remove(os.path.join(target, f))
            os.symlink(os.path.join(source, f),
                       os.path.join(target, f))


def replace_with_master_directory(top_dir):
    """
    top_dir: top of html doc directory (so typically
    top_dir = local/share/doc/sage/html)
    """
    master = os.path.join(top_dir, 'en', '_static')
    for lang in os.listdir(top_dir):
        lang_dir = os.path.join(top_dir, lang)
        if not os.path.isdir(lang_dir):
            # skip stray files at the top level
            continue
        for d in os.listdir(lang_dir):
            target = os.path.join(lang_dir, d, '_static')
            if (os.path.isdir(target)
                    and not os.path.islink(target)
                    and not os.path.samefile(master, target)):
                replace_duplicates_with_symlinks(master, target)
comment:3
This saves me almost 400 MB, by the way. ("This" =
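To get a rough idea of the possible saving before replacing anything, one could total up the redundant copies first. The following is only a sketch and not part of the script above: the function name duplicate_savings is made up for illustration, and it treats files as duplicates purely by SHA-256 of their contents, so hard-linked files are also counted as if they were separate copies.

```python
import hashlib
import os
from collections import defaultdict


def duplicate_savings(top_dir):
    """
    Return the number of bytes that could be reclaimed if every group of
    byte-identical regular files under ``top_dir`` were reduced to a
    single copy. Symlinks are skipped.
    """
    sizes_by_digest = defaultdict(list)
    for root, _dirs, files in os.walk(top_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, 'rb') as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            sizes_by_digest[digest].append(os.path.getsize(path))
    # Every copy beyond the first one in a group is redundant.
    return sum(sum(sizes[1:]) for sizes in sizes_by_digest.values())
```

Calling duplicate_savings on the top of the html doc directory gives an upper bound on what symlinking can recover, since the script above only replaces files that also sit at the same relative path as in the master directory.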
comment:4
No, I don't use Python code, because I do it within the packaging script in bash
because I have the mathjax fonts by default and they are copied in all
The last touch is replacing most of the mathjax stuff by a symlink in the master _static folder
comment:5
See possibly related discussion at #25089.
comment:6
I'm confused by this ticket, because it already does that, per #25089...
comment:7
I see the difference--it does already do this within the
IMO treating the entire tree of Sage docs as such a hierarchy with shared static resources would be the best approach.
comment:8
Some parts of the documentation tree have slightly different
comment:9
Is it worth pursuing this? It could be part of the docbuild process, or it could be done only when you use
I also don't know what to do about Windows/cygwin and symbolic links.
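One possible way to cope with platforms where symbolic links are unavailable or need extra privileges is to try the symlink and fall back to a plain copy. This is a sketch under that assumption only, not something the ticket settled on; link_or_copy is a hypothetical helper name.

```python
import os
import shutil


def link_or_copy(source, target):
    """
    Make ``target`` a symlink to the directory ``source``. If the
    platform or the user's privileges do not allow creating symlinks,
    copy the tree instead. ``target`` must not exist yet.

    Returns 'symlink' or 'copy' to say which one happened.
    """
    try:
        os.symlink(source, target, target_is_directory=True)
        return 'symlink'
    except (OSError, NotImplementedError):
        # No symlink support (or no permission): keep a real copy.
        shutil.copytree(source, target)
        return 'copy'
```

The fallback gives up the disk saving on such systems but leaves the documentation tree usable either way.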
comment:10
Sphinx has a configuration option html_static_path which might do what we want. I'll look into it.
Edit: or maybe not: the documentation says that the files "are copied to the output’s _static directory after the theme’s static files". We don't want files copied, we want a single
comment:11
Couldn't a different
comment:12
I don't think that
comment:14
Too bad that I missed this ticket earlier. This should have been an 8.3 blocker.
comment:15
As a workaround to #25089, for the Windows build I run a script that deletes all the duplicate
comment:16
Replying to @embray:
I do exactly that in sage-on-gentoo as well. It would be great to know how to tell Sphinx to create symlinks instead of copying.
comment:18
Replying to @jdemeyer:
By the way, that 20GB figure is not really how much disk space is being taken up. There is a bug (I have no idea where this comes from) that puts a "mathjax" directory in each
If I delete all those nonsense
and
so I think it's not all as bad as it seems.
comment:20
I see now your sage-devel post where you reported the same.
comment:21
It seems to be more subtle: sometimes the symlinks are correctly generated and sometimes not.
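To see which directories ended up as symlinks and which as real copies, something like the following could be run over the built documentation. It is only a sketch: classify_static_dirs is a made-up name, and the default directory name '_static' is taken from the discussion above.

```python
import os


def classify_static_dirs(top_dir, name='_static'):
    """
    Walk ``top_dir`` and sort every directory entry called ``name``
    into symlinks and real directories.

    Returns ``(symlinks, real_dirs)`` as two sorted lists of paths.
    """
    symlinks, real_dirs = [], []
    # os.walk lists symlinks to directories among the subdirectories
    # but, by default, does not descend into them.
    for root, dirs, _files in os.walk(top_dir):
        if name in dirs:
            path = os.path.join(root, name)
            if os.path.islink(path):
                symlinks.append(path)
            else:
                real_dirs.append(path)
    return sorted(symlinks), sorted(real_dirs)
```

Comparing the two lists between builds would show whether the inconsistency is tied to particular parts of the documentation tree.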
comment:22
Replying to @embray:
By the way, after using the script in comment:2, I get
There are lots of symlinks, though.
comment:23
I'm getting really confused here. Initially I thought that the problem was the
The problem seems to be the few copies of
comment:24
Replying to @embray:
I'm not sure why you think that it's a hard link. On my system (and probably most Unix-like systems), creating a hardlink to a directory is not even allowed. See https://askubuntu.com/questions/210741/why-are-hard-links-not-allowed-for-directories/525129
comment:25
I created a new ticket #26152 specifically for the mathjax symlink issue.
comment:26
Replying to @jdemeyer:
To clarify: the directory is not a hard link, but the files under it are. There were deeply nested directories (I didn't confirm how deep), each containing what I presume were hard links to the files, since deleting them did not actually free much disk space.
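A quick way to check whether hard links explain the gap between the reported size and the space actually used is to count each inode only once. The sketch below assumes a filesystem where st_dev and st_ino identify a file (as on typical Unix-like systems); disk_usage is a hypothetical name.

```python
import os
import stat


def disk_usage(top_dir):
    """
    Return ``(apparent, actual)`` sizes in bytes for the regular files
    under ``top_dir``. ``apparent`` adds up every path; ``actual``
    counts each underlying file once, however many hard links it has.
    """
    apparent = actual = 0
    seen = set()
    for root, _dirs, files in os.walk(top_dir):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and special files
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                actual += st.st_size
    return apparent, actual
```

If the two numbers differ a lot, most of the apparent size comes from hard links, which would match the observation that deleting the nested directories freed little space.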
comment:27
Replying to @jdemeyer:
And as noted above, the
comment:28
In a recently built copy of the Sage documentation, there seem to be 27 copies of MathJax installed in various
This problem does not exist now. Roughly, the total size of all
Of 18M, mathjax takes 17M. Hence after #36098, the total size would be reduced to something much less than 100M.
Until Sage 8.2, the _static directories in the generated HTML documentation of the reference manual were symlinks to a single master _static directory. Now all the files are copied, leading to a huge explosion in size of the built documentation (from 1.8GB in Sage 8.2 to about 20GB in Sage 8.3).
CC: @timokau
Component: documentation
Issue created by migration from https://trac.sagemath.org/ticket/25111