Integration CI Testing Failing on Multiple Branches #356
Comments
Fwiw, the creation of a test file isn't necessary to run local integration tests. But I can see that the document lacks details either way. Adding some of that now.
I get that it is not. I would imagine users with less experience with the project would have similar experiences.
Findings
Fundamental failure was caused by the following:
During the CI tests, these
It's possible that (async) GC and
As an example, kopia may have 1 goroutine trying to upload items in the
Increasing the number of goroutines kopia uses may help, but cannot guarantee a deadlock will not occur. This is because there can be an arbitrary number of folders in a backup.
Synchronous deadlock explanation: SerializeMessages() loaded messages according to the following algorithm:
DataCollection channels used a buffer limit of 1000 entries. If any folder exceeded 1000 messages, the channel buffer would fill and, with nothing draining it yet, block the function from loading any further messages or folders, effectively locking the entire system.
Addressing #360 will fix this issue. In addition to that fix, we should also reduce the scope of the integration test (back up a specific folder only, restore as COPY in a restore- folder). The larger-scoped, long-running tests should not run in CI.
Explanation of the strikethrough on "Document/Adjust long-running processes...": The major changes were #360, which dealt with the deadlock, and a large refactor along the lines of #361 that was implemented in stages. Both of those code changes are present in main as of today. The final cause of the tests timing out is to be addressed in PR #479.
Improving the CI documentation is an ongoing process. Steps have been taken to improve the initial setup of developer environments. While that work is not complete, the main objective of this issue was to address the cascading failures experienced several days ago. Those CI failures have either been addressed or have a separate issue in the repository. This issue is to be closed.
Receiving Timeout Errors during one of the operation tests:
Adjust the CI integration local testing documentation in Notion so that a developer is able to do the following: assumeRole.sh and be able to run all the CI_Tests locally
Document/Adjust long-running processes of CI that are timing out
folder to email-folder #343