Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timing problems post Merge pull request #3081 from SunderB/patch/2022-05-09-cmake-copy-bin… #3129

Closed
rbnpi opened this issue Jul 11, 2022 · 40 comments · Fixed by #3132
Closed

Comments

@rbnpi
Copy link
Contributor

rbnpi commented Jul 11, 2022

Both on latest Mac and RPi builds.
Try this program

use_bpm 60
#link

live_loop :test do
  puts current_bpm
  play scale(:c,:major).tick
  sleep 1
end

Runs OK as shown. Then uncomment link command and run again. Thread death Timing exception
Discovered by running example :bach. and then changing use_bpm 60 to link in that program.
Version 4 release works OK with this

@samaaron
Copy link
Collaborator

Hmm, that’s odd. If a fix is not immediately obvious I suggest reverting this commit for 4.0.1 and then looking into things in detail after that.

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

EDIT WRONG(Actually think Pi is OK) Pi has the same behaviour. Further investigation on Mac shows that midi i/o is not being detected. Also link externally is not working. Tau log shows.

      Powered by Erlang 25             
                                        
   Number of detected peers:           
   0 
                                        
   Current tempo:          
   60 
+--------------------------------------+



14:43:39.274 [info] 
+--------------------------------------+
    This is the Sonic Pi MIDI Server    
       Powered by Erlang 25             
                                        
   Detected MIDI input ports:           
   [] 
                                        
   Detected MIDI output ports:          
   [] 
+--------------------------------------+



14:43:39.275 [info] 
+--------------------------------------+
    This is the Sonic Pi OSC Server     
             == Tau ==                  
       Powered by Erlang 25             
                                        
        Incoming OSC on port 4560         
  OSC cue forwarding to {127, 0, 0, 1}              
                     on port 29444         
+--------------------------------------+



14:43:39.275 [info] 
+--------------------------------------+
    This is the Sonic Pi API Server     
       Powered by Erlang 25             
                                        
       API listening on port 29440         
+--------------------------------------+



14:43:39.277 [info] Running TauWeb.Endpoint with cowboy 2.9.0 at 127.0.0.1:29442 (http)
14:43:39.278 [info] Access TauWeb.Endpoint at http://localhost:29442
14:43:40.218 [info] API /send-pid-to-daemon -> sending pid to Daemon...
14:43:42.593 [error] Cue Server got unexpected message: {:midi_on, 1}
14:43:42.593 [info] Enabling cue forwarding 
14:43:43.326 [info] Enabling cue forwarding 

Still trying to ascertain which commit is causing problem.

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

Midi I/O devices are there, as earlier versions working. Is Tau [error] Cue Server unexpected message: {midi_on,1) relevant?

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

On my Mac I built 4.0 from scratch, then dropped the beam folder from that into a build of 4.0.1, replacing the beam folder there, as I don't think anything had changed for this, apart from the revamping of the build scripts. It then all worked. I suspect something may have gone amiss in the large revamp of the build process in recent merged sonderb commit. Going to try the same thing on my RPi build of 4.0.1

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

Same "solution" on RPi. I dropped in the beam folder from a working 4.0 build into a non working build of 4.0.1 It then worked fine. So it looks like there is an issue with the Tau (or possibly sp-midi or sp-link builds) following the changes between 4.0 and 4.0.1

@lilyinstarlight
Copy link
Contributor

What happens if you only put libsp_midi.so and libsp_link.so from the working beam/priv folder into the non-working beam folder?

@lilyinstarlight
Copy link
Contributor

lilyinstarlight commented Jul 11, 2022

Also, do you have any ideas @SunderB? I did a local revert of #3081 and it did fix the issue for me as well (but I can't investigate any more until later today)

@samaaron
Copy link
Collaborator

Apologies, I’m still travelling. Will look at this tonight and if the solution isn’t obvious (or already fixed) I’ll revert this for 4.0.1 and then we can have a bit longer to really understand what’s going on here.

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

I think there's several things going on. First the sp_link and sp_midi libs are named with dylib extension when built and installed by 4.0.1 version. I think they should be renamed to .so Secondly, the linux clean script uses the line
rm -rf server/beam/tau/priv/*.{so,dylib,dll} to remove old libs, but it doesn't work with the {so,dylib,dll}
deleting just with *.so did. This means that the first problem would be masked in rebuilds after a clean as the original libs were still left installed.
The actual libs appear to be identical for 4.0 and 4.0.1 builds as you would expect.
I copied the new libs to a 4.0 beam folder and they worked (when renamed to .so) However I think the remainder of teh folder for 4.0.1 doesn't copying a 4.0 beam folder into a 4.0.1 build does wkr, with the sp libararies built by either process.

have to go out now, but may look further later.

@lilyinstarlight
Copy link
Contributor

lilyinstarlight commented Jul 11, 2022

@rbnpi The libs correctly end in .so for me. I figured out what's going on though, and it's related to them not getting copied into the _build folder in the beam distribution

See #3081 (review)

Thank you for doing your testing! Noticing that the beam folder was what gets messed up definitely pointed me in the right direction, and I noticed the issue when doing a diff -r between a good and bad beam folder

@samaaron
Copy link
Collaborator

I think the issue with the .so extension is that macOS typically uses .dylib but the BEAM weirdly requires them to be .so see:

cp "$f" "${SCRIPT_DIR}"/server/beam/tau/priv/$(basename "$f" .dylib).so

@samaaron
Copy link
Collaborator

Unless anyone sees that there's a super quick and reliable fix for this - I think it's probably sensible to roll back for 4.0.1 then look at it again. Any thoughts?

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

@samaaron I’ll have a look into this and see if I can see anything obviously wrong

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

Yes the name problems was on my Mac. linux/rpi is OK here.

@samaaron
Copy link
Collaborator

samaaron commented Jul 11, 2022

@samaaron I’ll have a look into this and see if I can see anything obviously wrong

As far as I can see from the discussion there are currently two issues:

  1. BEAM NIFs aren't copied before mix release is called
  2. BEAM NIFs don't end up with a .so extension on macOS

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

Just tried running pi-pre-tau-prod-release.rb a second time AFTER the complete build had finished. That sorted it I think, and fits in with Lily's diagnosis.

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

Would it work if mix release was called at the end of *-build.sh scripts after sp_link and sp_midi were built?

@lilyinstarlight
Copy link
Contributor

lilyinstarlight commented Jul 11, 2022

Yes the name problems was on my Mac. linux/rpi is OK here.

Ah I misunderstood and remember that macOS .so oddity now for NIFs :)

Unless anyone sees that there's a super quick and reliable fix for this - I think it's probably sensible to roll back for 4.0.1 then look at it again. Any thoughts?

The "super quick" fix would be to make linux-pre-tau-prod-release.sh a post-build item rather than a pre-build item. But reverting and then trying again after discussing so as not to hold up 4.0.1 is probably best. I'll let @SunderB answer their thoughts though

@lilyinstarlight
Copy link
Contributor

lilyinstarlight commented Jul 11, 2022

Would it work if mix release was called at the end of *-build.sh scripts after sp_link and sp_midi were built?

Yep that seems to be the only thing missing afaict that is causing this issue

edit: and the stuff @rbnpi mentioned in the below comment need to be addressed

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

Also renaming on mac, and check the deletion of sp_link and sp_midi on clean scripts. I don't think the delete command used worked.

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

@rbnpi can you modify the last line of app/external/sp_link/CMakeLists.txt and app/external/sp_midi/CMakeLists.txt to look something like this:

if(APPLE)
  install(TARGETS libsp_link.so LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX})
else()
  install(TARGETS libsp_link LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX})
endif()

(and swap out 'link' for 'midi' for sp_midi)

Does this solve the macOS .so issue?

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

Wait that won't work libsp_link is the target name - not the output file name

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

Don't dynamic libs have a .dylib extension on macOS? If so, shouldn't there be a way to work with that? Edit: Oops I didn't see the talk about .dylib above

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

@samaaron @rbnpi Try this instead: (Edit: I added the PROGRAMS bit which I missed)

if(APPLE)
  install(PROGRAMS ${CMAKE_CURRENT_BINARY_DIR}/libsp_link.dylib DESTINATION ${CMAKE_INSTALL_PREFIX} RENAME libsp_link.so)
else()
  install(TARGETS libsp_link LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX})
endif()

@lilyinstarlight
Copy link
Contributor

Don't dynamic libs have a .dylib extension on macOS? If so, shouldn't there be a way to work with that?

Erlang's load_nif/2 assumes .so extension for all "unix" systems, including macOS (though .dylib would probably make more sense on macOS): https://www.erlang.org/doc/man/erlang.html#load_nif-2

@samaaron, should Tau be exiting if the nifs fail to load? Tau doesn't seem to be quitting on that failure, just throwing tons of timing errors for any run started in the editor

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

@lilyinstarlight I see, thanks :) And yeah you'd think there would be a critical error if it couldn't load a NIF, unless Erlang was built to always keep running no matter what.

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

Great - now when I'm building Sonic Pi, I'm getting random errors interrupting the build after some targets

gmake: *** [Makefile:103: all] Error 2

@samaaron
Copy link
Collaborator

@samaaron, should Tau be exiting if the nifs fail to load? Tau doesn't seem to be quitting on that failure, just throwing tons of timing errors for any run started in the editor

Good question. Yes and no. It should totally be possible to boot Tau without support for either MIDI or Link (there is already some work to creating flags for this which the CI currently uses). However, it should also probably send some sort of error message back to the daemon when the NIFs aren't discovered. I guess I've been working on the assumption that they will be there as I package them for Windows and macOS - but that obviously doesn't help to make this kind of situation clear.

@samaaron
Copy link
Collaborator

@SunderB would you be ok if we rolled this back for v4.0.1 which I hope to get out tomorrow and then we can start looking into getting this properly working after that?

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

Currently rebuilding from scratch to see if that fixes it - maybe the vcpkg packages have changed a bit?

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

@SunderB would you be ok if we rolled this back for v4.0.1 which I hope to get out tomorrow and then we can start looking into getting this properly working after that?

Good idea 👍🏼 Sorry for not testing this as thoroughly as I maybe should have!

@lilyinstarlight
Copy link
Contributor

unless Erlang was built to always keep running no matter what

I mean, it was built for writing highly available and scalable real-time communication systems, if I remember correctly :)

Good idea 👍🏼 Sorry for not testing this as thoroughly as I maybe should have!

Thank you for continuing to work on it though! It should help considerably in both build system robustness and for downstream Linux packagers, as we move more things to CMake and vcpkg

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

I don't wanna celebrate too early, but I think I may have it working :)
Edit: changing the tempo works!

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

Should I make a new patch branch and PR for the fixes or put them on the existing one?

@lilyinstarlight
Copy link
Contributor

Good question. Yes and no. It should totally be possible to boot Tau without support for either MIDI or Link (there is already some work to creating flags for this which the CI currently uses). However, it should also probably send some sort of error message back to the daemon when the NIFs aren't discovered. I guess I've been working on the assumption that they will be there as I package them for Windows and macOS - but that obviously doesn't help to make this kind of situation clear.

I mean there could be reasons other than the NIFs aren't found that they might fail to load (e.g. dynamic lib loader errors). But I'm not too concerned about it since yeah in any reasonable scenario they should be found and load successfully. Maybe we could prevent the daemon from spewing all those errors when they fail without changing too much if we adjust the default stub methods that get used before the NIFs are loaded, but it's up to how much of a problem you feel it is and whether it should be accounted for (I'm not invested either way in it)

@lilyinstarlight
Copy link
Contributor

Should I make a new patch branch and PR for the fixes or put them on the existing one?

It can be the same branch if you prefer (assuming it's been rebased or has no conflicts), but it should for-sure at least be a new PR

@rbnpi
Copy link
Contributor Author

rbnpi commented Jul 11, 2022

@SunderB the patches in teh CMake files for renaming the libs for Mac work fine.

also need to replace rm -rf server/beam/tau/priv/*.{so,dylib,dll} in the linux-clean.sh script which doesn't work

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

also need to replace rm -rf server/beam/tau/priv/*.{so,dylib,dll} in the linux-clean.sh script which doesn't work

Was the linux-clean.sh script working before? I don't see why that shouldn't work. - Edit: I've removed the '-r' as it's redundant. Does doing that make it work?

@SunderB
Copy link
Contributor

SunderB commented Jul 11, 2022

I've made a PR with the fixes so far: #3132

@lilyinstarlight
Copy link
Contributor

also need to replace rm -rf server/beam/tau/priv/*.{so,dylib,dll} in the linux-clean.sh script which doesn't work

This does work on Linux and I just tried it with Bash v3.2 (which is the ancient version macOS uses) in a container and it still seemed to work. I'm assuming you're only having issues with it on macOS?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants