Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load contents of source files only when needed #559

Merged
merged 4 commits into from
Jan 17, 2020

Conversation

matthijskooijman
Copy link
Collaborator

When compiling a big sketch, I noticed that arduino-builder took up a lot of RAM (over 1G, sometimes it even went over 1.7G and crashed with an out of memory error).

Investigating, it turns out that this is because arduino-builder and arduino-cli load all source file contents into memory when loading the sketch. In practice, only the .ino file contents are actually used, so the rest is just dead weight. For sketches that have a lot of extra baggage (we have libraries, a core and some host-system tools in the sketch directory), this can take up a lot of unneeded space.

This PR modifies sketch.Item to remove the Source string, which is then no longer filled at sketch load time. Instead, the GetSourceStr() method loads the contents on demand (which is really only when merging the sketch files).

This PR moves around some error handling (since I/O operations have moved), but I've managed to split most of the trivial code movements into separate commits.

To reproduce this problem, you can simply create an empty sketch and add some baggage (https://github.com/vancegroup/arduino-boost adds some 300MB of source files and is good for testing):

matthijs@grubby:~$ mkdir sketch
matthijs@grubby:~$ cd sketch/
matthijs@grubby:~/sketch$ echo "void setup() { } void loop() { }" > sketch.ino
matthijs@grubby:~/sketch$ git clone https://github.com/vancegroup/arduino-boost.git                                                                           
[ ... snip output ... ]
matthijs@grubby:~/sketch$ /usr/bin/time -f '%MkB' arduino-cli compile -b arduino:avr:uno .                                                                    
Sketch uses 444 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.
379120kB

This uses time to show that the maximum resident memory size is nearly 400MB.

With the patch applied, this is reduced to 70MB:

matthijs@grubby:~/sketch$ /usr/bin/time -f '%MkB' arduino-cli compile -b arduino:avr:uno .                                                                    
Sketch uses 444 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.                                           
71176kB

This still seems rather much. When running arduino-builder on the same sketch, the memory usage drops from 375MB to 37MB, so it seems that arduino-builder is a bit more efficient even (and on our much bigger production sketch things improved from 1.1G to just 27M, which is unexpectedly even less). But I guess this is something for another time.

Previously, these two arguments were wrapped together in a sketch.Item,
but since all callers build such an item during the call, there is no
compelling reason to do it like this. This commit splits the Item
parameter into a separate path and contents, which prepares for removing
the file contents from sketch.Item later.
This adds an error return value, which is currently always nil. This
prepares for making changes that require returning errors.
This adds an error return value to this method, which is currently
always nil. This prepares for actually returning errors later.
Previously, the full contents of *all* sketch files would be loaded into
memory. This includes all source and header files inside the sketch
directory, even when they will not even be compiled (e.g.
subdirectories other than src). In practice, only the .ino file contents
will actually be used, so these are now read on demand.

Note that when copying the sketch into the build directory, the contents
of all these sketch files *is* used, but that code (`writeIfDifferent()`
in `arduino/builder/sketch.go`) already did not use the preloaded data
but read the file contents when copying.

For small sketches, this does not make much of a difference, but bigger
sketches, especially when they include libraries, core definitions,
tools, examples, documentation, etc. the memory usage can quite explode,
for no good reason.
@rsora
Copy link
Contributor

rsora commented Jan 17, 2020

Thanks @matthijskooijman !
This is, with no doubts, high quality contributing 😄

@rsora rsora merged commit 24503d5 into arduino:master Jan 17, 2020
@matthijskooijman
Copy link
Collaborator Author

Thanks for merging and the compliment :-)

cmaglie added a commit to arduino/arduino-builder that referenced this pull request Jan 23, 2020
Fix library priority selection
- arduino/arduino-cli#565

Load contents of source files only when needed
- arduino/arduino-cli#559

Allow naming sketches like "RCS" and "CVS"
- arduino/arduino-cli#537
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants