[DO NOT MERGE] trigger full rebuild #11502

Closed
wants to merge 14 commits into from

cenit
Contributor

@cenit cenit commented May 21, 2020

Describe the pull request
Due to the seemingly endless number of regressions in #11130, I fear that something slipped in and broke many ports. CMake 3.17?
Here I just want to test in CI what happens when a full rebuild is triggered...

@cenit cenit changed the title [DONT MERGE] trigger full rebuild [vcpkg] issues with #11130 May 21, 2020
@JackBoosY JackBoosY changed the title [vcpkg] issues with #11130 [DO NOT MERGE] issues with #11130 May 22, 2020
@BillyONeal
Member

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@cenit cenit marked this pull request as ready for review May 22, 2020 10:49
@cenit cenit changed the title [DO NOT MERGE] issues with #11130 [DO NOT MERGE] trigger full rebuild May 22, 2020
@cenit
Contributor Author

cenit commented May 22, 2020

All the failures on osx and Linux are not due to msys2 problems...
These failures explain many of those experienced in #11130

@BillyONeal
Member

All the failures on osx and Linux are not due to msys2 problems...

Yes, but the previous thing that kept causing this PR to hang the builders forever was

@BillyONeal
Member

The boost ARM ones should be fixed by #11545

@JackBoosY
Contributor

JackBoosY commented May 25, 2020

  • seal:arm64-windows
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal\batchencoder.cpp(235): error C2039: 'index_type': is not a member of 'gsl::span<const unsigned __int64,18446744073709551615>'
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal/intarray.h(265): note: see declaration of 'gsl::span<const unsigned __int64,18446744073709551615>'
  • quickfix:arm64-windows
C:\agent\_work\1\s\buildtrees\quickfix\src\v1.15.1-5086b0da20\src\C++\test\DictionaryTestCase.cpp(30): error C2871: 'FIX': a namespace with this name does not exist
  • libaiff:arm64-windows needs remove tombstone
  • boost-exception:arm64-windows
  • boost-regex:arm64-windows
  • boost-stacktrace:arm64-windows
  • boost-system:arm64-windows
  • boost-container:arm64-windows
  • boost-atomic:arm64-windows
  • boost-signals:arm64-windows
  • boost-atomic:arm-uwp
  • geographiclib:arm-uwp needs remove tombstone
  • boost-exception:arm-uwp
  • boost-regex:arm-uwp
  • boost-nowide:arm-uwp
  • boost-system:arm-uwp
  • boost-container:arm-uwp
  • boost-signals:arm-uwp
  • bond:x64-linux
CMake Error at cmake/FindStack.cmake:16 (MESSAGE):
  Stack was not found.
  • seal:x64-linux
/mnt/_work/1/s/buildtrees/seal/src/9754f4b2c2-e171b3c220/native/src/seal/batchencoder.cpp:452:51: error: ‘index_type’ in ‘class gsl::span<long unsigned int, 18446744073709551615>’ does not name a type
         using index_type = decltype(destination)::index_type;
                                                   ^~~~~~~~~~
  • opencc:x64-linux
Error copying file "/mnt/_work/1/s/buildtrees/opencc/x64-linux-dbg/src/libopencc.a" to "/mnt/_work/1/s/buildtrees/opencc/x64-linux-dbg/src/tools".
  • wtl:x64-linux needs remove tombstone
  • seal:x64-osx
/Volumes/data/work/1/s/buildtrees/seal/src/9754f4b2c2-e171b3c220/native/src/seal/batchencoder.cpp:235:53: error: no type named 'index_type' in 'gsl::span<const unsigned long long, 18446744073709551615>'
        using index_type = decltype(values_matrix)::index_type;
                           ~~~~~~~~~~~~~~~~~~~~~~~~~^
  • sigslot:x64-osx
CMake Error at ports/sigslot/portfile.cmake:15 (file):
  file INSTALL cannot find
  "/Volumes/data/work/1/s/buildtrees/sigslot/sigslot/sigslot.h": No such file
  or directory.
  • xmsh:x64-osx cannot download source.
  • nettle:x64-osx
Undefined symbols for architecture x86_64:
  "_EVP_bf_ecb", referenced from:
      _openssl_bf128_set_encrypt_key in nettle-openssl.o
      _openssl_bf128_set_decrypt_key in nettle-openssl.o
  "_EVP_cast5_ecb", referenced from:
      _openssl_cast128_set_encrypt_key in nettle-openssl.o
      _openssl_cast128_set_decrypt_key in nettle-openssl.o
ld: symbol(s) not found for architecture x86_64
  • geographiclib:x64-windows needs remove tombstone
  • gmp:x64-windows
2>CppUTest.lib(UtestPlatform.cpp.obj) : error LNK2019: unresolved external symbol __imp_timeGetTime referenced in function "long __cdecl VisualCppTimeInMillis(void)" 
  • pdal:x64-windows
CMake Error at C:/agent/_work/2/s/installed/x64-windows/share/postgresql/vcpkg-cmake-wrapper.cmake:13 (set_property):
  set_property could not find TARGET PostgreSQL::PostgreSQL.  Perhaps it has
  not yet been created.
  • icu:x64-windows
  • seal:x64-windows
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal\batchencoder.cpp(235): error C2039: 'index_type': is not a member of 'gsl::span<const unsigned __int64,18446744073709551615>'
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal/intarray.h(265): note: see declaration of 'gsl::span<const unsigned __int64,18446744073709551615>'
  • seal:x64-windows-static
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal\batchencoder.cpp(235): error C2039: 'index_type': is not a member of 'gsl::span<const unsigned __int64,18446744073709551615>'
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal/intarray.h(265): note: see declaration of 'gsl::span<const unsigned __int64,18446744073709551615>'
  • libfabric:x64-windows-static
1>CppUTest.lib(UtestPlatform.cpp.obj) : error LNK2019: unresolved external symbol __imp_timeGetTime referenced in function "long __cdecl VisualCppTimeInMillis(void)" 
  • pdal:x64-windows-static
CMake Error at C:/agent/_work/2/s/installed/x64-windows-static/share/postgresql/vcpkg-cmake-wrapper.cmake:13 (set_property):
  set_property could not find TARGET PostgreSQL::PostgreSQL.  Perhaps it has
  not yet been created.
  • ois:x64-windows-static
CMake Error at CMakeLists.txt:22 (configure_file):
  configure_file Problem configuring file
  • zxing-cpp:x64-windows-static
libiconv.lib(iconv.c.obj) : error LNK2019: unresolved external symbol locale_charset referenced in function iconv_canonicalize
  • icu:x64-windows-static
  • seal:x86-windows
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal\batchencoder.cpp(235): error C2039: 'index_type': is not a member of 'gsl::span<const unsigned __int64,4294967295>'
C:\agent\_work\2\s\buildtrees\seal\src\9754f4b2c2-e171b3c220\native\src\seal/intarray.h(265): note: see declaration of 'gsl::span<const unsigned __int64,4294967295>'
  • ccfits:x86-windows needs remove tombstone
  • gsoap:x86-windows
2>CppUTest.lib(UtestPlatform.cpp.obj) : error LNK2001: unresolved external symbol __imp__fputs 
2>CppUTest.lib(UtestPlatform.cpp.obj) : error LNK2001: unresolved external symbol __imp___localtime64_s 
2>CppUTest.lib(UtestPlatform.cpp.obj) : error LNK2001: unresolved external symbol __imp__timeGetTime@0 
  • pdal:x86-windows
CMake Error at C:/agent/_work/2/s/installed/x86-windows/share/postgresql/vcpkg-cmake-wrapper.cmake:13 (set_property):
  set_property could not find TARGET PostgreSQL::PostgreSQL.  Perhaps it has
  not yet been created.
  • gmp:x86-windows
2>CppUTest.lib(UtestPlatform.cpp.obj) : error LNK2019: unresolved external symbol __imp__timeGetTime@0 referenced in function "long __cdecl VisualCppTimeInMillis(void)"
  • icu:x86-windows

@JackBoosY
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@PhoebeHui
Contributor

The sigslot:x64-osx failure doesn't repro locally, and I didn't see this issue in the latest CI testing status.

@JackBoosY
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

BillyONeal and others added 4 commits June 11, 2020 17:58
…oft#11839)

* [vcpkg] Remove do-nothing Set-Content from Windows azure-pipelines.yml.
* [vcpkg] Fix OSX CI by ensuring the downloads directory exists in advance, and extract common command line parameters with powershell splatting.
* [tensorflow-cc] Prevent hang building tensorflow-cc asking to configure iOS.
* Skip ignition-msgs5:x64-osx
* [Arrow] Update to 0.17.1

* Remove arrow:x64-linux=fail from ci.baseline.txt.
Add explicit tool dependencies on Flex and Bison for Linux and OSX.

* Revert arrow dependency on Flex/Bison, it's Thrift that needs them and its portfile is already fine.

* Use vcpkg_fail_port_install(ON_ARCH x86 arm arm64) instead of custom check.
Remove thrift:x64-osx=fail from ci.baseline.txt (we know arrow depends on it, and arrow:x64-osx has been shown to work in 3rd party project).

* Disable using pkg-config files to locate dependencies in arrow

This is incompatible with vcpkg as these files refer to paths in the
packages directory rather than the installed directory, so this only
works if the packages haven't been cleaned.

* Mark thrift:x64-osx as still failing until a proper solution for Bison can be found.

* Update ports/arrow/portfile.cmake

Co-authored-by: Adam Reeve <[email protected]>
Co-authored-by: NancyLi1013 <[email protected]>
@cenit cenit force-pushed the dev/cenit/cmake317_verify branch from 22909dd to 76d45d5 Compare June 11, 2020 15:58
@cenit
Contributor Author

cenit commented Jun 11, 2020

All green!

I still see a regression on msix on x86-windows.
Also, I will trigger one more final full rebuild once I see a green checkmark, just to be sure...

@JackBoosY
Contributor

@cenit That pipeline test case has been deprecated.

@cenit
Contributor Author

cenit commented Jun 12, 2020

@JackBoosY as you can see, we are still far from a green checkmark on the baseline... unfortunately

@cenit That pipeline test case has been deprecated.

?? this PR is perfectly in sync with master, only triggering a full port-tree rebuild

@JackBoosY
Contributor

@cenit It looks like more and more regressions are appearing, and I'm fixing them.

@BillyONeal
Member

@cenit What are you actually trying to accomplish with this? All this is doing is showing that when you run 1300 separate build systems, the probability of at least one of them having flaky behavior approaches 1. And in so doing, it is burning a ton of compute time and wasting cache storage space. We have the binary caching system precisely because that is necessary to ever make any progress with a system like this.
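
The "approaches 1" claim can be made concrete with a small calculation. This is a minimal sketch, assuming each of the roughly 1300 ports fails independently and with the same per-build flakiness rate; the rates used here are hypothetical, not measured from vcpkg CI.

```python
def prob_any_failure(n_ports: int, p_flaky: float) -> float:
    """Probability that at least one of n independent builds fails.

    Assumes every build fails spuriously with the same probability
    p_flaky and that failures are independent of each other.
    """
    return 1.0 - (1.0 - p_flaky) ** n_ports


if __name__ == "__main__":
    # Hypothetical per-port flakiness rates, for illustration only.
    for p in (0.001, 0.005, 0.01):
        print(f"p={p}: P(at least one failure) = {prob_any_failure(1300, p):.4f}")
```

Under these assumptions, even a 0.1% chance of a spurious failure per port gives roughly a 73% chance that a full rebuild shows at least one red port.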

@BillyONeal
Member

In particular, almost all of the recent ones are failures to download sources from a third-party server, which is absolutely not a bug in vcpkg.

@cenit
Contributor Author

cenit commented Jun 12, 2020

I triggered this in the first place because I had tons of regressions in my opencv PR, none of them due to my modifications (which, on the other hand, were extensive and required careful checks).
I was just trying to have a clean baseline, then of course the mission of this PR was over. It seems that having a clean baseline is not so mandatory here, so I will stop with it, understood.
(caching original source artifacts when they do not change should not be a problem, while I disagree that having probability 1 of having a port broken in the official port-tree using an official triplet can be ok, but this is just IMHO...)

@BillyONeal
Member

I was just trying to have a clean baseline, then of course the mission of this PR was over. It seems that having a clean baseline is not so mandatory here, so I will stop with it, understood.

I'm not saying that necessarily. This PR highlights that we need some process in place where we monitor for regressions caused by overall tool changes; we should probably be running full rebuilds at least once per week and more likely once per day, and publishing that information for others to consume here. It's just that running it as a PR elevates a normal process into a 'fire drill', because a PR implies that someone is blocked :)

(caching original source artifacts when they do not change should not be a problem, while I disagree that having probability 1 of having a port broken in the official port-tree using an official triplet can be ok, but this is just IMHO...)

It doesn't matter how official the triplet is, the sources are hosted by each third party associated with a given package. For example if SourceForge chooses a bad mirror for a given run you get a whole bunch of failures, none of which are really port bugs.

@cenit
Contributor Author

cenit commented Jun 12, 2020

It doesn't matter how official the triplet is, the sources are hosted by each third party associated with a given package. For example if SourceForge chooses a bad mirror for a given run you get a whole bunch of failures, none of which are really port bugs.

That's what I meant by "caching mechanism for original sources".

Ok thanks for clarifying your points. Of course an official mechanism to keep the tree in shape would be terrific and of course much better than a PR destined to be closed in the end.

@BillyONeal
Member

I'm going to discuss changing our CI builds to never run with binary caching enabled, only PR builds, at our next discussion meeting. (I want to make sure the team is OK with the resulting lower quality of service of PR validation since it's more work for the build fleet given that we have to stay under a core count quota, and the resulting increased expense of keeping the VMs alive longer, rather than making a change like that unilaterally)

@cenit
Contributor Author

cenit commented Jun 18, 2020

I'd say that a reasonable compromise would be to expand hash coverage, as to say what declares a binary cache compatible or not. I mean, we should not use just one "outsider" cmake script (target fix is all but important during port build), but all "fundamental" cmake scripts that would change port file behaviour.
Also, we should include compiler full version, cmake version, ninja version.
This might be useful to not let slip in breaking changes, while at the same time preserve a minimum cache mechanism to avoid really unnecessary rebuilds
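
The idea of "expanding hash coverage" can be sketched as a cache key computed over every input that could change a port's build output. This is only an illustration of the proposal, not vcpkg's actual hashing: the function `cache_key`, its parameters, and the file and tool names in the usage note are all hypothetical.

```python
import hashlib
from typing import Mapping


def cache_key(
    triplet: str,
    port_files: Mapping[str, bytes],
    helper_scripts: Mapping[str, bytes],
    tool_versions: Mapping[str, str],
) -> str:
    """Combine every build input into one digest; any change yields a new key."""
    entries = [("triplet", triplet.encode())]
    entries += [(f"port:{name}", data) for name, data in port_files.items()]
    entries += [(f"script:{name}", data) for name, data in helper_scripts.items()]
    entries += [(f"tool:{name}", ver.encode()) for name, ver in tool_versions.items()]

    digest = hashlib.sha256()
    # Sort so the key does not depend on dictionary insertion order.
    for label, data in sorted(entries):
        # Hash each entry's content separately so label/content boundaries stay unambiguous.
        content_hash = hashlib.sha256(data).hexdigest().encode()
        digest.update(label.encode() + b"\0" + content_hash + b"\n")
    return digest.hexdigest()
```

With a key like this, bumping the CMake version or editing any shared helper script changes the digest, so a previously cached binary is no longer treated as compatible, while untouched ports keep their cache hits.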

@BillyONeal
Member

BillyONeal commented Jun 18, 2020

I'd say that a reasonable compromise would be to expand hash coverage, as to say what declares a binary cache compatible or not. I mean, we should not use just one "outsider" cmake script (target fix is all but important during port build), but all "fundamental" cmake scripts that would change port file behaviour.

We already do things like that.

Also, we should include compiler full version, cmake version, ninja version.

Compiler is being worked on for that to enable the binary caching stuff. CMake and ninja versions are already included.

The issues this PR has found have been:

  • Servers with sources being down
  • Servers being up but making changes/redirects to their links that cause our SHA512 check to fail
  • Ports that themselves have unstable / racy build scripts (e.g. nettle)
  • Ports that are incompatible with each other or improperly inspecting the output directory (e.g. osg depends on boost-asio but will find a dependency in the asio package but needs boost headers, so if asio is installed the port fails to build -- note that the port declares a dependency on boost-asio)
  • Merge conflicts that also put normal CI on the floor anyway

and I don't believe there's anything we can do to the caching system to truly address such problems. We can mitigate the first 2 in CI by caching the downloaded sources. For the third, we can't do much other than skip such flaky ports; we can't really be in the business of fixing everyone's build race in ~1300 projects :). We could theoretically mitigate the fourth by doing increasingly exhaustive rebuild schemes (e.g. build each port with everything installed except its dependencies, build each port with only its dependencies and remove unrelated ports after each build, etc.) but that gets expensive in terms of compute time fast.
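
The mitigation for the first two items (servers being down, or links changing so the SHA512 check fails) can be sketched as a content-addressed download cache. This is a minimal sketch under stated assumptions, not how vcpkg CI is implemented: `fetch_with_cache`, the cache layout, and the injected `download` callable are hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def fetch_with_cache(
    url: str,
    expected_sha512: str,
    cache_dir: Path,
    download: Callable[[str], bytes],
) -> bytes:
    """Return the source archive, preferring a verified local copy.

    The cache file is named after the expected SHA512, so a hit never
    needs the upstream server and a changed upstream file cannot
    silently replace a known-good archive.
    """
    cached = cache_dir / expected_sha512
    if cached.is_file():
        data = cached.read_bytes()
        if hashlib.sha512(data).hexdigest() == expected_sha512:
            return data  # Cache hit: no network access required.

    # Cache miss (or corrupted entry): fall back to the third-party server.
    data = download(url)
    actual = hashlib.sha512(data).hexdigest()
    if actual != expected_sha512:
        raise ValueError(f"SHA512 mismatch for {url}: got {actual}")

    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return data
```

Once an archive has been fetched and verified, later rebuilds keep working even if the upstream server is down or starts redirecting to different content; only ports whose expected hash changes need to reach the network again.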

@BillyONeal
Member

@cenit Does #12082 resolve the concern you were trying to raise with this PR (and thus this one should be closed)?

@cenit
Contributor Author

cenit commented Jun 26, 2020

@BillyONeal it’s definitely a bold step in the right direction! We can close this one, sure. The main goal here was to highlight the problem, and it has been addressed both with port fixes and infrastructure fixes.
The work might not be over, but the PR's goal has been reached.

@cenit cenit closed this Jun 26, 2020
@cenit cenit deleted the dev/cenit/cmake317_verify branch July 28, 2020 06:22
6 participants