[CI][Python][Windows] AppVeyor job failed without error message #39884

Closed
kou opened this issue Feb 1, 2024 · 17 comments

@kou
Member

kou commented Feb 1, 2024

Describe the bug, including details regarding any error messages, version, and platform.

https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/49065846

...
[117/126] Building CXX object CMakeFiles\_s3fs.dir\_s3fs.cpp.obj
[118/126] Linking CXX shared module _s3fs.cp310-win_amd64.pyd
[119/126] cmd.exe /c
[120/126] Building CXX object CMakeFiles\_substrait.dir\_substrait.cpp.obj
[121/126] Linking CXX shared module _substrait.cp310-win_amd6

GH-39652 may be related.

Component(s)

Continuous Integration, Python

@kou
Member Author

kou commented Feb 1, 2024

Do we still need AppVeyor?

@kou
Member Author

kou commented Feb 1, 2024

https://lists.apache.org/thread/rcqkx19gpfjlkyvrhxfsrckktrdrzm2p
It seems that we can remove AppVeyor jobs.

@kou
Member Author

kou commented Feb 1, 2024

#13903 (comment)
It seems that we kept AppVeyor because GitHub Actions was slow.
GitHub Actions isn't slow now. Can we remove AppVeyor jobs?

@danepitkin
Member

It looks like the build timed out. I don't see any error messages in the logs. Can we increase the allowed build time in the short term?

@jorisvandenbossche
Member

It does time out (after 1h30), but that in itself is a bit suspicious: the last working builds completed in about 32 to 36 minutes.

@kou
Member Author

kou commented Feb 2, 2024

The popup on the last message, [121/126] Linking CXX shared module _substrait.cp310-win_amd6, showed 00:15:44.
It seems that a link process got stuck, but we can't debug it...
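
As a rough illustration of the diagnosis above, here is a minimal sketch that scans a build log for `[N/M]` progress lines like the ones in the excerpt and reports the last one seen before the job went silent. The helper name `last_reported_step` and the `log_tail` sample are hypothetical and only for illustration; the sketch assumes the log has been saved as plain text.

```python
import re

# Matches progress lines such as "[121/126] Linking CXX shared module ..."
PROGRESS_RE = re.compile(r"^\[(\d+)/(\d+)\]\s+(.*)$")


def last_reported_step(log_text):
    """Return (index, total, description) of the last progress line, or None."""
    last = None
    for line in log_text.splitlines():
        match = PROGRESS_RE.match(line.strip())
        if match:
            last = (int(match.group(1)), int(match.group(2)), match.group(3))
    return last


# Hypothetical sample: the tail of the log excerpt quoted in this issue.
log_tail = r"""
[120/126] Building CXX object CMakeFiles\_substrait.dir\_substrait.cpp.obj
[121/126] Linking CXX shared module _substrait.cp310-win_amd6
"""

step = last_reported_step(log_tail)
if step is not None:
    index, total, description = step
    print(f"Last reported step: {index}/{total}: {description}")
    print(f"Steps never reported: {total - index}")
```

This only tells us where the output stopped, not why the process hung, so it narrows down the suspect step without replacing a real debugging session.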

@pitrou
Member

pitrou commented Feb 2, 2024

We can transfer the job to GHA. We should at least keep a CI job with Python on Windows.

@kou
Member Author

kou commented Feb 2, 2024

OK. Let's do it.

@felipecrv
Contributor

A PR of mine failed the other day due to an ASAN violation in main [1] caught by an AppVeyor CI build that didn't time out. Does anyone know by heart which builds have ASAN enabled and can confirm AppVeyor is not the only one building with ASAN enabled? cc @assignUser

[1] 419d7df (the violation was in main and didn't relate to the code in my PR).

@assignUser
Member

Looking at the nightlies it seems we only have TSAN and valgrind builds (and an R specific sanitizer build). We should move the appveyor build to gha. We have the capacity and even M1s now.

@felipecrv
Contributor

> Looking at the nightlies it seems we only have TSAN and valgrind builds (and an R specific sanitizer build). We should move the appveyor build to gha. We have the capacity and even M1s now.

So it's possible that ASAN being enabled on AppVeyor is another factor in the slowdown. Again: the reason I believe ASAN is enabled there is that I saw this violation on an AppVeyor build.

@assignUser
Member

I checked the appveyor build and that doesn't have the asan flag set explicitly. But one of the cpp.yml matrix jobs does set it.

@pitrou
Member

pitrou commented Feb 13, 2024

> Does anyone know by heart which builds have ASAN enabled

It's easy to find out in the C++ CMake build logs. AppVeyor has ASAN off:
https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/49167126?fullLog=true#L1211

-- Checks options:
-- 
--   ARROW_TEST_MEMCHECK=OFF [default=OFF]
--       Run the test suite using valgrind --tool=memcheck
--   ARROW_USE_ASAN=OFF [default=OFF]
--       Enable Address Sanitizer checks
--   ARROW_USE_TSAN=OFF [default=OFF]
--       Enable Thread Sanitizer checks
--   ARROW_USE_UBSAN=OFF [default=OFF]
--       Enable Undefined Behavior sanitizer checks
-- 

@pitrou
Member

pitrou commented Feb 13, 2024

> Again: the reason I believe ASAN is enabled there is seeing this violation on an AppVeyor build.

Can you please give a link?

@felipecrv
Contributor

> > Again: the reason I believe ASAN is enabled there is seeing this violation on an AppVeyor build.
>
> Can you please give a link?

I can't find it anymore. It must have been "AMD64 Ubuntu 22.04 C++ ASAN UBSAN", and I am misremembering it.

@pitrou
Member

pitrou commented Feb 26, 2024

The AppVeyor timeout is (temporarily?) fixed by #40225

@pitrou
Member

pitrou commented Feb 26, 2024

Closing now that #40225 was merged, though a longer term fix will be required (see #40166).

@pitrou pitrou closed this as completed Feb 26, 2024