This PR addresses several issues:
The existing RNN layer does not train properly because a fresh StatelessScope is created inside the jax.lax.scan loop, which causes all the trainable variables to lose their mapping to the actual values in the training loop. Update the layer to reuse the parent StatelessScope when one is present. This fixes the training issue reported in #322 (nlp/lstm_seq2seq.py doesn't train with JAX backend).

RNN layers with dropout update the RNG seed inside the step function, which jax.lax.scan does not allow. We noticed this because the updated seed is traced as a non-trainable variable and raises an error when we try to apply a sharding constraint for distribution. Added a new method to pre-populate the dropout mask on the layer, so the inner loop stays stateless.
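To illustrate the constraint behind the second fix, here is a minimal sketch (not the actual Keras code) of pre-populating a dropout mask before entering jax.lax.scan, so the step function stays pure and no RNG state is updated per step; the function and variable names are hypothetical:

```python
import jax
import jax.numpy as jnp

def rnn_with_dropout(x_seq, h0, key, rate=0.5):
    # Pre-populate the dropout mask OUTSIDE the scan: jax.lax.scan requires a
    # pure step function, so the RNG seed cannot be advanced inside the loop.
    mask = jax.random.bernoulli(key, 1.0 - rate, h0.shape) / (1.0 - rate)

    def step(h, x):
        # The precomputed mask is closed over; no RNG state mutates here.
        h_new = jnp.tanh(x + h * mask)
        return h_new, h_new  # (carry, per-step output)

    h_final, outputs = jax.lax.scan(step, h0, x_seq)
    return h_final, outputs

key = jax.random.PRNGKey(0)
x_seq = jnp.ones((4, 3))  # 4 timesteps, feature dim 3
h0 = jnp.zeros((3,))
h_final, outputs = rnn_with_dropout(x_seq, h0, key)
print(outputs.shape)  # (4, 3)
```

The same mask is applied at every timestep, which matches the usual recurrent-dropout convention of sampling the mask once per sequence rather than once per step.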
During unit testing, I noticed that StackedRNNCells doesn't work with the existing RNN cells, since it unwraps the list for the state; update the call function to keep the state as a list when the input state is a list.
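The state-structure fix can be sketched as follows; this is a hypothetical minimal cell (not the Keras implementation) whose only point is to return the new state in the same structure the caller passed in:

```python
def cell_call(inputs, states):
    # Keras cells receive `states` as a list; some custom cells pass a bare
    # value instead. Preserve whichever structure the caller used so that a
    # stacked-cells wrapper, which nests per-cell states, round-trips cleanly.
    state_is_list = isinstance(states, (list, tuple))
    h = states[0] if state_is_list else states
    new_h = h + inputs  # placeholder for the real cell computation
    # Return the new state wrapped as a list only if the input state was one.
    return new_h, [new_h] if state_is_list else new_h

out, new_states = cell_call(2.0, [1.0])
print(out, new_states)  # 3.0 [3.0]
```

Keeping the output structure mirrored to the input structure avoids the unwrap/rewrap mismatch described above.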
Expose the SimpleRNN/GRU/LSTM cells in `__init__.py`, since they are public API.