Attention visualization #121
Conversation
Good job, thanks a lot! Please pay attention to the tests. One of the unit tests is failing because of the TensorBoard summaries (the issue may also be in the test itself). The integration test is failing because it cannot download the data it should work with. That should get fixed when you rebase on master.
@@ -47,6 +48,12 @@ def attention_decoder(decoder_inputs, initial_state, attention_objects,
         outputs.append(output)
         states.append(state)

+    if summary_collections:
+        for i, a in enumerate(attention_objects):
+            alignments = tf.expand_dims(tf.transpose(tf.pack(a.attentions_in_time), perm=[1, 2, 0]), -1)
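The shape manipulation in this line can be illustrated with NumPy, since `tf.pack` (renamed `tf.stack` in later TensorFlow versions), `tf.transpose`, and `tf.expand_dims` mirror the NumPy operations below. The sizes are made up for illustration; the point is the final `[batch, input_len, decoder_steps, 1]` shape, which is what an image summary expects (one single-channel alignment image per sentence).

```python
import numpy as np

# Hypothetical attention weights collected per decoder step:
# a list of T arrays, each of shape [batch, input_len].
batch, input_len, T = 2, 5, 3
attentions_in_time = [np.random.rand(batch, input_len) for _ in range(T)]

# tf.pack / tf.stack -> shape [T, batch, input_len]
packed = np.stack(attentions_in_time)

# tf.transpose with perm=[1, 2, 0] -> [batch, input_len, T]
transposed = packed.transpose(1, 2, 0)

# tf.expand_dims(..., -1) -> [batch, input_len, T, 1]:
# one greyscale "image" of the soft alignment per sentence
alignments = transposed[..., np.newaxis]
print(alignments.shape)  # (2, 5, 3, 1)
```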
This line is too long. We check the style of the files using pylint. If you run tests/lint_run.sh, it will check all Python files and give you a detailed report.
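The check being referred to is pylint's line-length message (C0301). A minimal sketch of what it does, independent of pylint itself, is to flag every line longer than the configured limit (80 characters here, per the comment below):

```python
def overlong_lines(source, limit=80):
    """Return (line_number, length) pairs for lines over the limit.

    A toy illustration of pylint's C0301 check, not its actual code.
    """
    return [(i, len(line))
            for i, line in enumerate(source.splitlines(), start=1)
            if len(line) > limit]

code = "short line\n" + "x" * 100 + "\nanother short line"
print(overlong_lines(code))  # [(2, 100)]
```

The usual fix for the flagged line in the diff above is to break the nested call across several lines, one argument per line, which pylint accepts.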
I wonder if this is general enough. If you look, e.g., at the Recurrent Neural Machine Translation paper, they use a GRU network to do the attention instead of the softmax weighting and come up with a clever way of visualizing this attention. I think that reserving the attentions_in_time field for the visualization purpose should work even for the recurrent attention, but please check whether that is so.
Just FYI, the line length I use is 80. I will add it to the documentation for developers.
@@ -223,9 +223,10 @@ def training_loop(sess, saver,

         if step % validation_period == validation_period - 1:
             decoded_val_sentences, decoded_raw_val_sentences, \
-                val_evaluation = run_on_dataset(
+                val_evaluation, val_images = run_on_dataset(
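The change above makes `run_on_dataset` return one extra value alongside the decoded sentences and evaluation results. A self-contained sketch of the changed call site, with a stub standing in for the real function (all names and return values here are illustrative, not Neural Monkey's actual API):

```python
def run_on_dataset(dataset):
    """Stub with the same shape of return value as the patched function."""
    decoded = ["a translated sentence"]
    decoded_raw = [["a", "translated", "sentence"]]
    evaluation = {"bleu": 0.0}
    images = [object()]  # e.g. one alignment matrix per sentence
    return decoded, decoded_raw, evaluation, images

# The call site must unpack the new fourth value, as the diff does:
decoded_val_sentences, decoded_raw_val_sentences, \
    val_evaluation, val_images = run_on_dataset(None)
print(len(val_images))  # 1
```

Forgetting to update any caller's unpacking would raise `ValueError: too many values to unpack`, which is why the diff touches the training loop.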
Considering that the inputs of NM can also be images (we did image captioning with it previously), the name images may be confusing. What about plots, visualizations, ...? I don't know, maybe even images is OK.
What about out_images?
I'd stick with val_plots.
@@ -29,7 +29,7 @@ name=translation
 output=tests/tmp-test-output
 overwrite_output_dir=True
 batch_size=16
-epochs=2
+epochs=5
Why 5?
@cifkao: Can you rebase the branch with master and find out whether it will fix the failing unit tests?
@jlibovicky @cifkao I think the best solution is maybe to cherry-pick the commit that adds the testing data to the repository: ab0a0ba
@jindrahelcl I agree, rebasing onto master would be difficult. But maybe it's better to cherry-pick ab0a0ba into ... Apart from that, I found that the visualization is incorrect because ...
The data should already be in the bahdanau branch (commit 5d88de3). The graph contains the attention mechanism ops twice: one set for training and the other for runtime. During validation, the training ops are not executed, so there should not be a problem.
@cifkao If you don't want to cherry-pick the commit (which is IMHO the best option), merge the bahdanau branch into this one. (Or rebase this on bahdanau, but rebasing something that has already been pushed to GitHub should be avoided.)
The problem is that the same ... This was, of course, easily fixed by taking only the last ...
Huh, this looks very much like some sort of bug. I'll look into it.
It's definitely a bug, but luckily it does not influence the performance of most of the models. It is probably the reason why the coverage model worked so poorly.
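The bug being discussed, as far as the thread reveals it, is that the same attention object is reused when the graph is built twice (once for training, once for runtime), so its `attentions_in_time` list accumulates ops from both constructions. A minimal sketch of that failure mode and of the tail-slicing fix the thread alludes to (the class and the slice length here are assumptions for illustration, not the actual Neural Monkey code):

```python
class Attention:
    """Toy stand-in: collects one entry per decoder step it attends."""
    def __init__(self):
        self.attentions_in_time = []

    def attend(self, label):
        # In the real model this would append a TF op; here, a label.
        self.attentions_in_time.append(label)

decoder_steps = 3
att = Attention()

# The same object is used while building the training graph ...
for t in range(decoder_steps):
    att.attend(("train", t))
# ... and again while building the runtime graph.
for t in range(decoder_steps):
    att.attend(("runtime", t))

# The list now holds entries from both graph constructions; keeping
# only the tail recovers the copy that was built last.
runtime_only = att.attentions_in_time[-decoder_steps:]
print(runtime_only)
```

This is also consistent with the earlier observation that the duplicated ops do not hurt most models: the extra entries only matter when the list itself is consumed, as the visualization (and apparently the coverage model) does.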
Soft alignment visualization (#108) for the bahdanau branch.