-
-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PR: Update all screenshots to high-res versions from Spyder 3.2.8 and Win10, and add new ones #22
Conversation
@CAM-Gerlach, I think we should use Git LFS in this repo to handle images, given the sheer amount of them you want to commit in this PR. So my advice is to close this PR, remove/recreate your local clone (so you can't inadvertently add these images in another PR) and start again with Git LFS and a PR that only adds the updated images required by our current docs. After that, please open new PRs, each with a single (no matter if short or long) change to our docs and the new images required by it. Does that sound like a good plan? |
TL;DR: You might know more than me, but from the information I was able to gather, for this application, Git LFS seems like a problem in search of a solution. I know you said I shouldn't use humor, but obligatory xkcd is obligatory.
I probably just am not as experienced with git LFS as well as you are, but I'm really not sure I understand the reason, use case or benefits for doing this. From what I've read and heard about it, git LFS (Large File Storage) seems to be designed to handle versioning very large individual files, up to multiple gigabtyes, to work around repo size limitations, allow better versioning support for such files, and avoid inefficiency with git operations such large individual blobs, not really large numbers of small files. However, the images here are only 5-100 KB in size each (averaging ~30 KB each, within the same order of magnitude as the docs themselves), and even all 200 of them are only 7 MB, and aren't likely to change anytime soon until at least Spyder 4 due to the feature freeze, nor have too much added to it since I've already covered pretty much everything I can think of. Its also not like they make the repo unweildly large; the main Spyder repo is around 10x the size of the images at 70 MB, and even our fairly minimal website is 30 MB. Furthermore, it adds extra complexity to the system and time/work spent needed to setup and maintain it, additional things that can go wrong, and added requirements/confusion/sources of error for new contributors; plus they've been a number of complaints with Git LFS; the free plan is quite limited and apparently it can even break some repos, and it currently has 215 open issues. On the flipside, I'm not sure I understand any of the actual substantial benefits for our use case, given our files are not at all large, don't take up that much space in aggregate, and shouldn't change very frequently at all (and when they do, it would be a complete rewrite). On top of all of that, it will suck up even more contributor time to redo this PR, and push back the formal docs release even further especially with the lengthy review cycle. Maybe you could explain more what you see as the benefits that supercede these downsides?
I'm a little confused as to the point of this step too, sorry—if they're still hanging around on my local repo, then they'll still be on my fork that I reclone the repo from, unless I'm silly enough to just leave them untracked in the working dir for some reason. What am I missing?
If the other images aren't in the repo and instead siloed away somewhere on my disk or on Google Drive, anyone else but me who wants to add to, enhance or fix issues with the docs won't have access to them, resulting in either duplicated effort to take likely less consistent/high quality screenshots, or other users' contribution lacking them altogether and having to manually go through me or a different bespoke/siloed off source to get them, when they could easily just be stored right in the repo that needs them for anyone to use and reference. Furthermore, from a conceptual point of view, they were all created from the exact time and from the same snapshot in time of Spyder (and the OS, dependencies, etc), so committing them together at the same time makes that heritage clear and allows us to keep easier track of how much they've (and their successors, if we update them as needed) have aged right within git, rather than putting them in piecemeal at all different times, dates, and even years. This also allows us to establish the full organization scheme up front, rather than making people guess at how we'd want things organized.
If you're referring to only updating the currently inserted images in this PR, I actually went back and forth about whether or not to add a few additional ones directly suggested by the current text (and the full window screenshots as discussed) in this PR, or save it for either the next one doing the baseline copyediting, correctness and consistency pass, or the appropriate following one(s) adding basic low-hanging fruit content to specific parts of the docs. Other than what was directly necessary for replacing the screenshots properly and to address #14 (i.e. removing the non-IPython console doc, as the relevant feature no longer existed for me to screenshot it, and very basic changes to the layout to actually fit the new screenshots), I kept myself to very strictly not touching anything else despite strong temptation to do so, and almost held off adding any at all to observe that, but ended up adding a few obvious ones as commits to the very end (so they could be rolled back easily, if not wanted in this PR) as they were much more within scope of this one than any immediately following ones, and would make this a little less of a blocker to followup PRs if it had come to that (and it very well might have, if I hadn't happened to have a bevy of final projects and exams during the week and a half I've had to wait). However, if you want me to roll back adding the few images I did that didn't replace existing ones (and modify #14 to not include that aspect, creating a new issue just for the new ones) including the overview alternate layout ones we discussed previously and the couple that were directly suggested by the existing text (i.e. the last two commits before the most recent one), I can do so very easily, as I planned for that possibility. If so, be aware that at this rate of getting things approved and merged lately (around a week or more per PR), it could be weeks till they get merged, since they'll have to wait behind the baseline copyediting/correctness/consistency pass before I can even really start them, since that is necessarily a file-crosscutting one (the changes being tightly linked within a conceptual and practical scope, and a single issue but not necessarily file-wise) and is a hard requirement for the proper public release of our docs as otherwise we're basically promoting heavily outdated and inaccurate information. From there, once that gets merged, things should mostly be file by file as needed going forward, of which adding the images (and expanding the bullet points) of the overview will be first up, followed by adding the extra images elsewhere when addressing those pages in turn. |
The thing is this: git-lfs is not only good to handle large files, but binary files in general, so that they don't become part of git history. That way, fetching, cloning and pushing to this repo will take much less time, so it'll be easier to work with it.
The problem is you could add (by mistake) this PR to git history and that would make git-lfs useless.
What I said is that you should add new images only with an associated PR that requires them. Else, they could end up never been used.
I meant that, for an initial PR, you should only replace our current (and outdated) images. Then, when new text content is added to our docs, you could add more images (as the new content requires them). |
Right, but doesn't this only really matters given such small files if we version and update them a lot, since they only increase repo size if they are modified? Otherwise, they don't grow substantially larger than anything else, and don't need to get passed back and forth often. Even assuming we put in all 200 now and update every single one of them every two years (fairly unlikely given the amount of work and lack of need for all of them to be updated every time), after ten years that's only 35 MB total, approximately the size of the
I did some tests to investigate (origin is my fork with the screenshots branch; upstream is the main repo without it). I'm on an emergency backup internet connection running at 3G speeds due to my main one being out, degraded (by near an order of magnitude) at the moment due to a major technical issue, so I only have about 7 Mb/s down and 1 Mb/s up at the moment; far slower than the average American much less the rest of the world (in fact, the upload speed, the most critical for this analysis, is slower than both the average mobile and fixed upload speeds for every single one of the 135 countries listed in the speedtest global index) ; an average user testing under typical conditions would experience much less impact, especially with pushes which depend on upstream bandwidth. Initial push to origin on a new branch based off this branch with screenshots: 8s Followup push on ^ with screenshots; 6s Git fetch origin on spyder-docs (no changes): 2s Git clone on origin: 10s Cloning was the most consistent difference, but was only a few seconds and only has to be done essentially once per contributor and in any case would actually be slower with git lfs, since it would need to hit the separate server for each file and then download the same file anyway; fetches, relying on downlink, saw essentially no impact and would be similarly slower if any LFS files changed, while pushes varied from 1-5s faster in this essentially worst case without any overhead from actual LFS. Furthermore, with regard to the tiny fixed time cost (cloning and initial push), figuring out, initializing (and possibly installing) git LFS for each contributor would be virtually certain to overwhelm it; furthermore, the files still need to be downloaded and uploaded on and off the LFS server when they are first copied or if they change, and pushes (and fetches, if necessary) need to check with the GitHub LFS server for each file, further increasing latency especially for larger numbers of relatively small files. Therefore, even in the absolute best case for Git LFS, making the following best case assumptions (to the point of being unrealistic):
Under those incredibly favorable assumptions, git LFS would take around two years to barely break even on time; if even a few of them are substantially less favorable, which many of them could potentially be by an order of magnitude, we'll both be dead before it does even assuming we update the files regularly. Furthermore, Git LFS (particularly on Github) has the following significant downsides, aside from any time-related issues mentioned above:
Furthermore, more of a near term problem but it will certainly push back our initial docs release, possibly by multiple weeks if the present latency for getting PRs approved continues, particularly considering updating the screenshots was one of the two "highest priority" items you assigned me a few weeks ago.
Well, we could still
Okay, thanks for clarifying; just to be clear there is no new text content whatsoever added in this PR (excepting a small still-relevant existing chunk moved from the removed non-IPython console doc). However, more to the point, if we don't include the full set of screenshots, all the downsides mentioned above result, and additionally any actual benefits for Git LFS shrink basically to nothing at least for now, while with Git LFS there is essentially is not even the small extra push latency downside to keeping them there for contributors to use when working on the relevant section. Again, I'm not sure I understand the substantial tangible benefits of siloing them away somewhere else other than a few extra seconds a few times per day at most, relative to the numerous downsides mentioned above. |
I don't have much time for these lengthy discussions. So my final proposal is this: let's not use Git LFS, close this PR and start a new PR that only replaces our current images. Then open new PRs to add more images, when new content requires them. Is that ok for you @CAM-Gerlach? |
The thing is it would be easier to review and merge PRs that add new images, if we see them in the context of new content being added. |
Okay, I guess that's acceptable for now. That still leaves unresolved the issue of how to make the rest of the comprehensive, standardized screenshot set easily accessible to collaborators other than myself, so they don't duplicate effort for a less consistent result (or fail to include screenshots entirely when existing assets would help illustrate a concept), and I'll need to document the process of finding, selecting, and adding them, but we can finishing hashing that out after we get the initial release out the door—I don't want to hold things up any longer. My suggestion here, if we don't want them in the repo, is to recreate the directory hierarchy and upload the screenshots in a subdirectory of our official Google Drive folder that @goanpeca created that we could give view/download access to collaborators interested in working on the docs. They'd then need to preview them on there, and download and copy them over into the/a corresponding subdirectory of |
There's no problem with having them here. In fact, I think it's better. I was just worried about that making it difficult to work with the repo. But that doesn't seem to be the case. |
Okay, cool beans. So, with that being the case then, do we want to:
Let me know which one you'd like—they're arranged in order of preference, but I'll go right ahead with whichever you think best. Just say the word—err, the number. Thanks! |
Using this PR will mess with git history (due to the large addition and removal of files), so please create a new PR instead. |
A new PR from scratch, I meant. |
So which option? Option 2. in a new branch and a new PR based off this branch with the extra files interactive |
To be honest I'm not still sure what we really gain from adding the rest of the files all in a separate PR, other than extra work rebasing and reorganizing everything, a bit of conceptual seperation and the possibility to defer merging of that for a bit longer... but if that's how you want it, I can do it that way. |
Yes, that's it.
A couple of things:
|
Sorry for the continued confusion, but I thing there was a miscommunication somewhere, since both of the valid concerns you list under
are only directly addressed with Option 3, since Option 2 (as noted in both my previous posts) would then add all the rest of the images to the repo in a separate followup PR (as mentioned previously, Option 3 is basically just your "final proposal" above). I was going to implement the latter, but then you said, in reply to my suggestion of where to put the full set of screenshots for others to use since we weren't putting them in the repo,
...which is what 1. and 2. propose, whereas 3. (your previous proposal) would essentially require the full set of screenshots stored somewhere externally and then manually copied into the correct hierarchy if another collaborator wants to use them, with the disadvantages that entails as mentioned in my previous post that you were responding to.
True. But won't you see them either way, and in context, when you're reviewing the changes that actually add them?
Indeed, and it could be some work tracking down which are used or not.
This is actually the main reason behind adding to the repo, so they're readily accessible to contributors other than myself. Otherwise, we have to set up a bespoke system to store them (like the Google Drive solution I suggested) and individually invite interested collaborators, which is more of a burden and increases the barrier to entry for new potential contributors. So, at least how I see it, it basically comes down to a cost-benefit between maintaining a separate shared google drive folder with the full set of screenshots (and any more that aren't used right away), vs. just including them in the repo; the former keeps the repo itself somewhat faster and cleaner (at least until most of them are added), and makes reviewing potentially a little easier, at the cost of some more time and tedium for collaborators both up front and on an ongoing basis, as well as being another potential hurdle for new contributors. |
@CAM-Gerlach, I really can't keep discussing about this, sorry. Please don't take this the wrong way, it's just that I see things very simple but it seems I'm unable to convey to you what I want :-) So I'll do a PR for you, committing on your behalf, and I'll submit it as soon as I can (a couple of days at most). |
I can just go ahead with what you previously said as your "final proposal" in this comment, as that part was indeed clear and simple and it seems that's still what you want as a first step. We can worry about where to host the rest of the images after, and it seems that's where the confusion/misunderstanding came in. |
Thanks a lot for understanding! I think our purpose for an initial public release is to iterate as quickly as we can until we have something decent to show to the world (like we did with the homepage). After that, we can discuss about the fine details of future additions. |
That's the plan! |
Pull Request Checklist
Description of Changes
Adds over 200 new, consistent, high-resolution screenshots of modern Spyder (3.2.8) running on a clean Win 10 test machine, replaces all the existing old screenshots and adds a few new ones where appropriate and called for by the existing text, minor layout alterations to the text and theme to accommodate the new images. Also, removes the obsolete legacy console docs (to remove the need to update the now non-existant image) and update links to point to the IPython Console, merging a short paragraph from the former into the latter.
Hopefully this can get reviewed promptly, as I can really get started on whipping the existing text/content into some semblance of shape to address #13 and #15 (updating/correcting out of date/innacurate content, copyediting, adding low-hanging fruit, etc) until this gets merged. I tried to make it as straightforward and non-controversial as possible, just direct replacements of the existing screenshots and a few very judicious new ones as previously discussed, and essentially no real changes to the actual body text itself, so hopefully that can happen quickly. Thanks.
Issue(s) Resolved
Fixes #14