readthedocs documentation no longer building #1472

Closed
ltalirz opened this issue May 3, 2018 · 15 comments · Fixed by #1612

Comments

@ltalirz
Member

ltalirz commented May 3, 2018

Perhaps someone is already working on this, but just so that we don't replicate work -
I'm currently getting flooded with emails pointing out that the readthedocs builds are failing because of timeouts (both for the latest and the workflows branch).
Two suggestions:

  • One straightforward idea to save time would be to disable the json build, which I guessed was completely unnecessary. However, json is used for search.
  • readthedocs mention that they can increase the build time for specific projects. Perhaps one just needs to ask?

On top of this, one should still look at the build itself and see whether the long build time makes sense and could be reduced.

P.S. Currently the readthedocs build notifications go to @giovannipizzi @DropD @szoupanos and me.
Since @sphuber is doing a lot of merges lately, should I add you as well?
Alternatively, we could define someone who is responsible for the readthedocs builds and reduce the number of people who get notified about this.

@sphuber
Contributor

sphuber commented May 3, 2018

I have been taking a look and the problems seem to be twofold:

  • We were hitting OOM
  • We were hitting timeout

The OOM was probably introduced when we switched the scipy requirement to v1.0.1. It used to be scipy<1.0.0 because v1.0.0 was broken. Now that v1.0.1 is out, the requirement is no longer needed, and after removing it we no longer hit OOM.

However, we are still hitting the timeout. According to this thread, this is a general problem with the RTD servers, and they say to try again later.

@ltalirz
Member Author

ltalirz commented May 3, 2018

Thanks for the update!
Regarding the issue you mentioned: it seems this was a single day at the beginning of March, when they had problems on their servers. I don't see any new open issues regarding build time in their issue tracker, so I'm not sure this problem will simply go away (in particular, since the v0.12.0 documentation build passed at the same time).

@sphuber
Contributor

sphuber commented May 3, 2018

I had misread Mar 2 as May 2 and was a bit hasty with my conclusion. Probably wishful thinking is to blame :) In that case I am not sure what causes the increase in build time. Maybe we were already close to the limit and just a little extra pushed us over it.

@ltalirz
Member Author

ltalirz commented May 3, 2018

FYI the last build of develop still errored due to memory problems, not build time
https://readthedocs.org/projects/aiida-core/builds/7133686/

@sphuber
Contributor

sphuber commented May 3, 2018

That's because the removal of scipy as a requirement has not been merged yet. The PR is still open. But I tried it on the workflows branch and removing scipy there fixed the OOM

@ltalirz
Member Author

ltalirz commented May 3, 2018

But the same is true for the last workflows build...
https://readthedocs.org/projects/aiida-core/builds/7134005/
Anyhow, good to fix this, just approved the PR.

@ltalirz
Member Author

ltalirz commented May 3, 2018

Hm... maybe you were lucky?
Seems to persist after merge https://readthedocs.org/projects/aiida-core/builds/7134834/

Also: Does anybody happen to know why there are always two builds triggered after a new commit?
https://readthedocs.org/projects/aiida-core/builds/

@giovannipizzi
Member

Check also #1524 for a graph of memory consumption and additional notes on what readthedocs is working on

@giovannipizzi
Member

Here is some more detailed memory profiling.
I ran on Linux with sphinx-build -b singlehtml -d build/doctrees source build/html and created the plots by running the following in a different terminal (connect with ssh -Y if working over ssh): psrecord $(pgrep sphinx-build) --interval 1 --duration 120 --plot plot1.png
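
To repeat the measurement for both builders, the two command lines quoted above can be generated from a small helper. This is only a sketch: the function names and the example PID are hypothetical, and the only flags used are the ones already shown in this comment.

```python
import shlex


def sphinx_build_cmd(builder, source="source", out="build/html",
                     doctrees="build/doctrees"):
    """Return the sphinx-build argument list for a builder such as
    'html' or 'singlehtml' (paths are the ones quoted in this thread)."""
    return ["sphinx-build", "-b", builder, "-d", doctrees, source, out]


def psrecord_cmd(pid, plot_file, interval=1, duration=120):
    """Return the psrecord argument list that samples process `pid`
    and writes a plot to `plot_file`."""
    return ["psrecord", str(pid),
            "--interval", str(interval),
            "--duration", str(duration),
            "--plot", plot_file]


if __name__ == "__main__":
    # Print the commands instead of running them, so that the build and
    # the monitor can be started in separate terminals as described above.
    for builder in ("html", "singlehtml"):
        print(shlex.join(sphinx_build_cmd(builder)))
    # 12345 is a placeholder PID; in practice use the output of
    # `pgrep sphinx-build`.
    print(shlex.join(psrecord_cmd(12345, "plot1.png")))
```

Printing the commands rather than executing them keeps the sketch independent of whether sphinx-build and psrecord are installed.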

Memory usage for the html mode

(sphinx-build -b html -d build/doctrees source build/html)
[plot: plot-html]

Memory usage for the singlehtml mode

(sphinx-build -b singlehtml -d build/doctrees source build/html)
[plot: plot-singlehtml]

So singlehtml is taking much more memory

Memory usage for the singlehtml mode, when rebuilding using the doctrees cache

[plot: plot-singlehtml-remake]

So even when remaking, it uses the same memory - I'll use this for the next tests, as it is faster

Memory usage for the singlehtml mode, when not building the apidoc

[plot: plot-singlehtml-remake-noapidoc]

Memory usage for the singlehtml mode, when building the apidoc but not the other doctrees

[plot: plot-singlehtml-remake-onlyapidoc]

Baseline when I remove everything from the folder and leave only an empty index.rst

[plot: plot-singlehtml-remake-nodirs]

My understanding is that the APIdocs take a significant amount of memory, more than double that of the rest of the docs. Probably we were already close to the limit and now we hit the limit. @ltalirz any idea why?
On the other hand, the baseline is also already pretty big, and we might try to see if there is a way to reduce it. I'm not sure whether it's something that we import or something else.
I will try to do a few more tests, but any suggestions are welcome.

@giovannipizzi
Member

Actually, I have to correct what I was saying before, because I tried again to run a fresh build, and then rebuilding, and the memory at the first step was

# Elapsed time   CPU (%)     Real (MB)   Virtual (MB)
      41.063       99.800      360.047     1208.812

While in the second run more like

# Elapsed time   CPU (%)     Real (MB)   Virtual (MB)
       1.001       98.900      194.516      313.750

(these numbers fluctuate a bit, with the virtual memory in the range 250-320MB).

Note also the huge Virtual memory (w.r.t. the real memory, in the previous plots!)

These would correspond to runs where I left all files in place and just emptied the index.rst (apparently, then, sphinx still reads and imports all rst files it finds in the folder).
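
The two log excerpts above can be compared with a few lines of Python. This is a minimal sketch that assumes the log keeps the four-column layout shown here (elapsed time, CPU, real memory, virtual memory); the function and variable names are made up for illustration.

```python
def parse_psrecord_log(text):
    """Parse psrecord-style log text into a list of sample dicts.

    Lines starting with '#' are treated as headers and skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        elapsed, cpu, real_mb, virtual_mb = (float(x) for x in line.split())
        samples.append({
            "elapsed": elapsed,
            "cpu": cpu,
            "real_mb": real_mb,
            "virtual_mb": virtual_mb,
        })
    return samples


# The two single-sample excerpts quoted in this comment.
FIRST_RUN = """\
# Elapsed time   CPU (%)     Real (MB)   Virtual (MB)
      41.063       99.800      360.047     1208.812
"""
SECOND_RUN = """\
# Elapsed time   CPU (%)     Real (MB)   Virtual (MB)
       1.001       98.900      194.516      313.750
"""

first = parse_psrecord_log(FIRST_RUN)[0]
second = parse_psrecord_log(SECOND_RUN)[0]

# Ratio of virtual to real memory for each sample.
first_ratio = first["virtual_mb"] / first["real_mb"]
second_ratio = second["virtual_mb"] / second["real_mb"]

print(f"fresh build: virtual/real = {first_ratio:.2f}")
print(f"rebuild:     virtual/real = {second_ratio:.2f}")
```

Each excerpt holds a single sample, so this only compares two points in time and not the full memory profile of either run.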

@ltalirz
Member Author

ltalirz commented May 24, 2018

"My understanding is that the APIdocs take a significant amount of memory, more than double that of the rest of the docs. Probably we were already close to the limit and now we hit the limit. @ltalirz any idea why?"

I don't quite follow the maths - according to your tests, singlehtml without apidoc is ~560MB and singlehtml with apidoc is ~880MB.
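
As a quick check of that arithmetic, taking the two approximate figures above at face value (they are read off the plots, so the result is only indicative):

```python
# Approximate peak memory of the singlehtml build, as quoted above.
without_apidoc_mb = 560
with_apidoc_mb = 880

# Extra memory attributable to the API docs.
apidoc_extra_mb = with_apidoc_mb - without_apidoc_mb

# Extra memory relative to the rest of the docs.
ratio = apidoc_extra_mb / without_apidoc_mb

print(f"apidoc adds about {apidoc_extra_mb} MB, "
      f"i.e. {ratio:.0%} of the rest of the docs")
```

On these numbers the API docs add roughly 320MB, which is well below "more than double" the rest of the docs.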

Anyhow, I have read in the rtd github issues that the resource usage of the singlehtml build depends significantly on the length of the documentation (see related issue here).
Obviously, the API doc is very long.

@ltalirz
Member Author

ltalirz commented May 24, 2018

In the issue above, they just asked and readthedocs increased the memory limit for the project. We could try the same.

@giovannipizzi
Member

A few comments, made after emptying the documentation but keeping the files around and monitoring the memory. With the automatic apidoc removed, I managed to track the increase down to a few files that raise the memory a lot:

  • in caching.rst, we use the ipython sphinx extension. This bumps up the VMemory from 200 to 800-900MB.
  • the two files orm_overview.rst and transport.rst: this time I think the problem is in the automodule directive. Indeed, just the following is sufficient to bump up the VMemory to 800-900MB:
ORM documentation: generic aiida.orm
====================================

.. toctree::
   :maxdepth: 3

Some generic methods of the module aiida.orm.utils


.. automodule:: aiida.orm
   :members:
   :noindex:
   :special-members: __init__

Unfortunately it becomes a bit too much work to track which specific import in there might require so much memory... Also, we have e.g. database connections, and these might require some memory (also because we use an in-memory SQLite DB, even if this should not be the main problem: my tests above were done with PostgreSQL).

@giovannipizzi
Member

In readthedocs/readthedocs.org#3220 they say that there is now a way to remove the unnecessary compilation! I'm going to prepare a PR for this

giovannipizzi added a commit to giovannipizzi/aiida-core that referenced this issue May 30, 2018
At the moment we just configure the formats,
in particular disabling the htmlzip format, that was
causing a lot of memory usage and, as a consequence,
the builds to fail, see aiidateam#1472

This should close aiidateam#1472 if it works as expected

@giovannipizzi
Member

Seems to have worked, even if we need to wait a few builds to be more sure!

There were three builds triggered at the same time

  • 1 failed due to timeout
  • 1 passed (but built the single html)
  • 1 passed (but did not build the single html as we wanted)

So hard to say if it really worked... We can reopen this if we see it didn't work.

giovannipizzi added a commit to giovannipizzi/aiida-core that referenced this issue Jun 6, 2018
But this time for the release 0.12.1 branch
sphuber pushed a commit that referenced this issue Jun 6, 2018
But this time for the release 0.12.1 branch
sphuber pushed a commit that referenced this issue Jun 18, 2018
But this time for the release 0.12.1 branch