This repository has been archived by the owner on Sep 20, 2024. It is now read-only.

Houdini: Farm caching submission to Deadline #4903

Merged


moonyuet
Member

@moonyuet moonyuet commented Apr 25, 2023

Changelog Description

Implements functionality to offload instances of specific families to Deadline instead of processing them locally. This increases productivity because artists can use their local machines for other tasks.
Implemented for families:

  • ass
  • redshift proxy
  • ifd
  • abc
  • bgeo
  • vdb

Additional info

Abc export via farm caching submission doesn't include any animation.
The current version of Deadline does not support vdb farm caching (tested with a manual Deadline submission).

Testing notes:

  1. Launch Houdini via launcher
  2. Enable "Submitting to Farm" and create an instance with one of the families mentioned above
  3. The selected object will be submitted to the farm
  4. After farm publishing, the loader should list the subset under its family.

image

@ynbot
Contributor

ynbot commented Apr 25, 2023

Task linked: OP-5744 Houdini farm caching - Analysis

@ynbot ynbot added the host: Houdini, module: Deadline (AWS Deadline related features), and type: feature (larger, user-affecting changes and completely new things) labels Apr 25, 2023
@moonyuet moonyuet changed the title from "farm caching implementation" to "simple farm caching implementation" Apr 25, 2023
@ynbot ynbot added the size/S (PR changes 100-499 lines, ignoring general files) label Apr 25, 2023
Collaborator

@BigRoy BigRoy left a comment

Some questions

@moonyuet moonyuet marked this pull request as ready for review May 9, 2023 07:13
@moonyuet moonyuet requested a review from antirotor May 9, 2023 07:14
@mkolar mkolar added the sponsored (client endorsed or requested) label May 9, 2023
@mkolar mkolar requested a review from MustafaJafar July 5, 2023 09:50
@mkolar mkolar assigned MustafaJafar and unassigned antirotor Jul 5, 2023
@moonyuet moonyuet requested a review from MustafaJafar July 18, 2023 06:29
@mkolar
Member

mkolar commented Jul 18, 2023

@mustafa-zarkash can you give it a test again please?

@MustafaJafar
Contributor

MustafaJafar commented Jul 18, 2023

I was testing this PR as a user
the good news:

  • it submits to deadline and caches successfully.

the bad news:


Results

output path refers to the output path in a rop node

Oh, I noticed that I wrote that none of the Deadline caches were saved; however, I remember that some families worked just fine.
I'm going to double-check.

image

@moonyuet
Member Author

moonyuet commented Jul 18, 2023

@mustafa-zarkash I am not sure if I understand your point correctly.
The output of the farm cache should be stored at the location you assigned as the render output when you created the instances (see outputFile). So if you submit to Deadline, the job is expected to write to outputFile as set on the node. This is similar to the extractor when you choose not to use the farm instance.
image

The Deadline submission in Houdini depends on which render nodes you submit (and on the scene, of course).
image

@MustafaJafar
Contributor

MustafaJafar commented Jul 18, 2023

@moonyuet
I double-checked, and here is the summary:

  1. Deadline did work (however, see no. 2)
    image

  2. Deadline writes to the paths specified in the ROP nodes, which point to the pyblish folder inside the work folder (however, see no. 3)

    {root[work]}/{project[name]}/{hierarchy}/{asset}/work/{task[name]}/pyblish
    
  3. I believe there's a missing step because I can't find anything in the publish folder

    {root[work]}/{project[name]}/{hierarchy}/{asset}/publish/{family}/{subset}/{@version}
    
  4. By contrast, when caching locally (farm checkbox unticked), everything works as expected!
    image
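To make the gap between the two locations concrete, here is a minimal sketch that fills both templates quoted above. It assumes plain `str.format` as a stand-in for OpenPype's own template resolution, treats `{@version}` as an opaque placeholder that is substituted by hand, and uses made-up project values; none of this is the PR's actual code.

```python
# Templates copied from the comment above.
WORK_STAGING_TEMPLATE = (
    "{root[work]}/{project[name]}/{hierarchy}/{asset}/work/{task[name]}/pyblish"
)
PUBLISH_TEMPLATE = (
    "{root[work]}/{project[name]}/{hierarchy}/{asset}"
    "/publish/{family}/{subset}/{@version}"
)


def fill(template, data, version_label):
    """Fill a template with str.format (assumption: stand-in for OpenPype's resolver).

    "{@version}" is replaced manually because it is an OpenPype-specific key.
    """
    return template.replace("{@version}", version_label).format(**data)


# Hypothetical context values, for illustration only.
data = {
    "root": {"work": "/mnt/projects"},
    "project": {"name": "demo"},
    "hierarchy": "shots/sq01",
    "asset": "sh010",
    "task": {"name": "fx"},
    "family": "pointcache",
    "subset": "pointcacheMain",
}

staging_dir = fill(WORK_STAGING_TEMPLATE, data, "v001")
publish_dir = fill(PUBLISH_TEMPLATE, data, "v001")
print(staging_dir)
print(publish_dir)
```

The farm job described above only ever wrote to the first path; a separate publish step is what would have to move the files to the second.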


Side Notes:
I didn't find any files in the pyblish folder when using farm caching with these families (redshift proxy and vdb)
image

Not Related to this PR:
I think there is something wrong with redshift proxy and bgeo as follows

Member

@antirotor antirotor left a comment

It is not triggering the publishing plugin submit_publish_job, and more modifications are needed for that.

@moonyuet moonyuet marked this pull request as draft July 25, 2023 10:21
@moonyuet moonyuet assigned moonyuet and unassigned MustafaJafar Jul 25, 2023
@moonyuet moonyuet requested a review from antirotor July 28, 2023 10:04
@moonyuet
Member Author

@MustafaJafar can you test it again?

@MustafaJafar MustafaJafar self-requested a review September 21, 2023 13:41
@MustafaJafar
Contributor

@moonyuet
It submits to farm and I assume it works fine.

However, cache jobs and render jobs don't inject environment variables in the same way, which is not compatible with this PR.

Oh, Cache Jobs don't inject any environments at all!

image

I think it's related to job environments.
Injection requires either OPENPYPE_RENDER_JOB or OPENPYPE_REMOTE_PUBLISH to be 1.
I think we can add one more, OPENPYPE_CACHE_JOB, and its Ayon equivalent AYON_CACHE_JOB, and then update GlobalJobPreLoad.
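A rough sketch of the gating described here, and of what the proposed extra flags would change. This is not the real GlobalJobPreLoad code: the function name and the plain-dict job environment are assumptions made for illustration, and only the four variable names come from the comment above.

```python
# Flags that currently enable environment injection, per the comment above.
INJECTION_FLAGS = ("OPENPYPE_RENDER_JOB", "OPENPYPE_REMOTE_PUBLISH")
# Flags proposed in the comment above for cache jobs.
PROPOSED_CACHE_FLAGS = ("OPENPYPE_CACHE_JOB", "AYON_CACHE_JOB")


def should_inject_environment(job_env, flags=INJECTION_FLAGS):
    """Return True if any of the given flags is set to "1" on the job.

    Hypothetical helper: `job_env` is a plain dict of the job's
    environment key/value pairs.
    """
    return any(job_env.get(flag) == "1" for flag in flags)


cache_job_env = {"OPENPYPE_CACHE_JOB": "1"}

# With only the existing flags, a cache job is skipped.
print(should_inject_environment(cache_job_env))
# With the proposed flags added, the same job qualifies for injection.
print(should_inject_environment(cache_job_env, INJECTION_FLAGS + PROPOSED_CACHE_FLAGS))
```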


I have a small question: could you tell me the difference between HoudiniCacheSubmitDeadline and ProcessSubmittedCacheJobOnFarm? I'm a little confused.

@moonyuet
Member Author

moonyuet commented Sep 21, 2023

> @moonyuet It submits to farm and I assume it works fine.
>
> However, cache jobs and render jobs don't inject environment variables in the same way, which is not compatible with this PR.
>
> Oh, Cache Jobs don't inject any environments at all!
>
> image
>
> I think it's related to job environments. Injection requires either OPENPYPE_RENDER_JOB or OPENPYPE_REMOTE_PUBLISH to be 1. I think we can add one more, OPENPYPE_CACHE_JOB, and its Ayon equivalent AYON_CACHE_JOB, and then update GlobalJobPreLoad.
>
> I have a small question: could you tell me the difference between HoudiniCacheSubmitDeadline and ProcessSubmittedCacheJobOnFarm? I'm a little confused.

There are some differences between the two. One is for rendering images from Houdini, and the other is for the publish job that moves the finished renders to the publish folder.
Maybe we need to handle this the same way as the render submission to Deadline (in terms of environment, I mean).

Contributor

@MustafaJafar MustafaJafar left a comment

This is working 🥳️🥳️🥳️
One last thing: would you add the Kitsu keys, as in #5455?

image

@moonyuet
Member Author

> This is working 🥳️🥳️🥳️ One last thing: would you add the Kitsu keys, as in #5455?

image

If Kitsu keys are needed, then ftrack keys are needed as well.
I will include both, just in case.

Contributor

@MustafaJafar MustafaJafar left a comment

Sweet!
image

@fabiaserra
Contributor

Hello, I'm sorry if this comes across as a bit harsh, but I think the approach this PR takes to support caching on the farm is wrong and over-engineered.

First of all, caching on the farm (like rendering or any other Houdini process) is already supported by third-party toolsets (Deadline, HQueue, Tractor...), and in WAY more powerful ways than this PR tries to accomplish or the OP plugin framework can manage. This duplicates all of that logic in OP and adds 1,398 more lines to the already super complex code base!! Most Houdini TDs are already familiar with those vanilla workflows, and having them learn this other "black box" approach through OP is backwards and doesn't add any benefit, in my opinion. You can see an example of a very normal submission to the farm here: #5621 (comment)

OpenPype shouldn't try to orchestrate the extract/render dependencies of the Houdini node graph; that's already done by these schedulers/submitters. We just need a means to run OP publish tasks on the generated outputs, without any gimmicks: take a path, a family, and a few other data inputs, and register them to the DB so that the other OP integrate plugins run as well, such as publishing to SG/ftrack. (Ideally, the API for doing that in OP should be super straightforward to call from anywhere! The current JSON file input isn't the best access point to that functionality.) If we wanted to help the existing vanilla submission, OP could provide a wrapper around the vanilla submitters that sets some reasonable defaults, and we could intercept the submission to run some pre-validations on the graph, set some parms that might be driven by the OP settings, or create utility HDAs that facilitate the creation of the submitted graph so that frame dependencies and chunk sizes for simulations are set correctly. But that's it; we don't need to reinvent the wheel by interpreting how the graph needs to be evaluated.
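As a purely illustrative sketch of the "a path, a family, a few other data inputs" idea, the request for a farm-side publish task could be as small as the structure below. `PublishRequest` and `serialize_request` are hypothetical names and do not exist in OpenPype; the field set and the example path are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PublishRequest:
    """Hypothetical minimal input for a farm-side publish task."""

    path: str      # files already written by the farm job
    family: str    # e.g. "vdb" or "pointcache"
    asset: str
    task: str
    extra: dict = field(default_factory=dict)  # any other data inputs


def serialize_request(request):
    """Serialize the request so any submitter can hand it to a publish job."""
    return json.dumps(asdict(request), sort_keys=True)


request = PublishRequest(
    path="/mnt/projects/demo/cache/smoke.$F4.vdb",  # made-up example path
    family="vdb",
    asset="sh010",
    task="fx",
)
payload = serialize_request(request)
print(payload)
```

The point of the sketch is only that the scheduler stays in charge of graph evaluation, while the pipeline receives a small, explicit description of what to register.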

On the other hand, I still don't quite get why the existing submit_publish_job is limited to "render" families and why it's not abstracted into a simple reusable module that any plugin needing to submit to Deadline can reuse. This PR showed how a lot of the code had to be duplicated again, with most of the lines exactly the same, doubling the technical debt. This PR https://github.com/ynput/OpenPype/pull/5451/files goes in the right direction by abstracting some of those things, but the right approach would be to remove all of the render-specific noise from submit publish job and use the same util module every time we just need to run an OP publish task on the farm. However, as I said initially, I don't think we should even take this approach for Houdini; we should just leverage the existing farm submitters' code. Still, this is relevant for any other tasks that we choose to submit to the farm: are we going to have to write a "submit_<insert_family_type>_job" set of plugins for each?

Extra notes:

  • I feel like I already mentioned this in another PR, but has anyone requested to publish IFDs? Are there any use cases for loading those after the fact? AFAIK those aren't really consumable by anything other than Houdini, and only as a pre-process before running the render; it's not like ASS files, which can be used as an interchange format by the Arnold plugins.

@moonyuet moonyuet marked this pull request as draft November 15, 2023 11:39
@moonyuet moonyuet marked this pull request as ready for review November 24, 2023 04:56
Contributor

@MustafaJafar MustafaJafar left a comment

I've successfully published these types with the default deadline options settings.

  • ifd
  • point cache abc
  • point cache bgeo
  • vdb

I can't test the other two.

OpenPype.
image

Ayon.
image


Could we add a validator that checks whether the camera exists?
I created a mantra IFD and set it to farm, and then my job failed with no clue in the log.
Although it was my fault that I didn't update the camera path on the node, I think such a validator would help.
We can create a ticket later.
image
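A minimal sketch of what such a check could look like, kept independent of Houdini so it can run anywhere. The function name and the `node_exists` callback are assumptions for illustration; a real validator would live in a publish plugin and read the camera path from the ROP node.

```python
def validate_camera_exists(camera_path, node_exists):
    """Return a list of error messages; an empty list means the check passed.

    `node_exists` is a callable that tells whether a node path exists.
    Inside Houdini it could be `lambda p: hou.node(p) is not None`
    (assumption: not taken from this PR).
    """
    if not camera_path:
        return ["No camera path is set on the ROP node."]
    if not node_exists(camera_path):
        return [f"Camera '{camera_path}' does not exist in the scene."]
    return []


# Stand-in for the scene: a set of existing node paths.
scene_nodes = {"/obj/cam1", "/obj/geo1"}

errors = validate_camera_exists("/obj/cam_old", scene_nodes.__contains__)
for message in errors:
    print(message)
```

Failing at validation time with a message like this would surface the stale camera path before the job reaches the farm.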

Member

@antirotor antirotor left a comment

I agree with @fabiaserra, and I think we must make some effort to create more lightweight Deadline support. But that is more of a long-term plan, because the vanilla Deadline support in DCCs differs vastly. There are multiple issues with farm publishing currently, from farm-specific attributes in the Publisher UI (and how we are handling local/farm rendering) to unifying Deadline-specific attributes across hosts/renderers. So I would merge it, as it adds functionality that is useful even if it is not the "final" solution.

Labels
host: Houdini, module: Deadline (AWS Deadline related features), size/S (PR changes 100-499 lines, ignoring general files), sponsored (client endorsed or requested), type: documentation, type: feature (larger, user-affecting changes and completely new things)
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Enhancement: Publish Houdini point cache through Deadline
10 participants