Add option to generate LM image and GC via two separate jobs #446
recognition/advanced_tree_search.py
Outdated
if separate_lmi_gc_generation:
    gc = BuildGlobalCacheJob(crp, extra_config, extra_post_config).out_global_cache

    arpa_lms = AdvancedTreeSearchLmImageAndGlobalCacheJob.find_arpa_lms(
Maybe find_arpa_lms can be moved from a classmethod to a standalone function in the lm module?
I just discovered during testing that the LM image and GC are set on the post config, not on the normal config. This means splitting the LMGC job into two does not change the hash of any existing jobs. Therefore, I think this flag can be enabled by default, or rather, we can inline the flag and make it the new default! WDYT?
To finish testing I need to run a decoding w/ the new GC and new LM, but otherwise the changes here are now tested. This is what it looks like when the flag is set in a pipeline w/ thousands of jobs (note that, because the GC is not hashed, only a few GC jobs are even run):
[screenshot not preserved]
Apparently switching the default here changes hashes at AppTek. If it is possible to live w/ the hash breakage (e.g. because it is in unused parts of the code), I'd like the flag to be on by default so as many folks as possible can profit from the changes here. If the hash breakage is unacceptable, we leave it off by default of course.
*Any existing search jobs. The graph, however, does change: all of the LMGC jobs are removed and separate LM and GC jobs are added. This is caught in the pipeline.
Unfortunately, all parts that are tested in the pipeline are used.
I have successfully run recognitions w/ this setup. This is tested now.
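The hash argument from the thread above can be illustrated with a toy sketch. It assumes, as stated in the discussion, that only the regular config enters the hash while the post config does not; `job_hash`, the dict-based configs, and the paths are hypothetical stand-ins, not the Sisyphus or i6_core implementation.

```python
import hashlib
import json


def job_hash(config: dict, post_config: dict) -> str:
    """Toy hash: only `config` contributes, `post_config` is deliberately ignored."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


config = {"lm": {"type": "ARPA", "file": "lm.gz", "scale": 10.0}}

# Image path as produced by a combined LM-image-and-GC job (hypothetical path).
post_combined = {"lm": {"image": "work/LmImageAndGlobalCacheJob/lm-1.image"}}
# Image path as produced by a dedicated LM image job (hypothetical path).
post_separate = {"lm": {"image": "work/CreateLmImageJob/lm.image"}}

hash_combined = job_hash(config, post_combined)
hash_separate = job_hash(config, post_separate)

# Where the image comes from does not affect the hash of the consuming job.
print(hash_combined == hash_separate)
```

Under this assumption the consuming search job keeps its hash, while the job graph still changes because different upstream jobs produce the image, which matches the correction made in the thread.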
recognition/advanced_tree_search.py
Outdated
@@ -190,6 +173,7 @@ def __init__(
:param lmgc_mem: Memory requirement for the AdvancedTreeSearchLmImageAndGlobalCacheJob
:param lmgc_alias: Alias for the AdvancedTreeSearchLmImageAndGlobalCacheJob
:param lmgc_scorer: Dummy scorer for the AdvancedTreeSearchLmImageAndGlobalCacheJob which is required but unused
:param separate_lm_image_gc_generation: Whether to generate the LM image and the global cache via two separate jobs for a more stable hash. Whether or not this flag is set is not part of the hash, so using separate jobs is the default.
Shouldn't this be "so NOT using separate jobs is the default"?
The AppTek hash broke after changing only a comment. Did the AppTek pipeline change in the meantime?
@curufinwe @michelwi @christophmluscher any idea?
We now require the changes that were introduced in #455, and your PR is so old that it does not include them. If you rebase onto the current main, all should be fine. Thanks.
@curufinwe @michelwi can we merge this?
Since the AppTek test passes, I see no objections from our side. I'll dismiss my old review, but I currently don't have much time to re-review it, sorry.
import i6_core.rasr as rasr


def _has_image(c: rasr.RasrConfig, pc: rasr.RasrConfig):
I think I would either like to spell config and post_config out or have a small docstring here.
I agree with @Atticus1806
    return res


def find_arpa_lms(lm_config: rasr.RasrConfig, lm_post_config=None) -> List[Tuple[rasr.RasrConfig, rasr.RasrConfig]]:
Is the post config also rasr.RasrConfig? Or some other type?
Suggested change:
- def find_arpa_lms(lm_config: rasr.RasrConfig, lm_post_config=None) -> List[Tuple[rasr.RasrConfig, rasr.RasrConfig]]:
+ def find_arpa_lms(lm_config: rasr.RasrConfig, lm_post_config: Optional[rasr.RasrConfig] = None) -> List[Tuple[rasr.RasrConfig, rasr.RasrConfig]]:
Yes, post configs should also be of type RasrConfig.
TODO: add the Optional import above and check with black for line length :P
@@ -206,6 +190,8 @@ def __init__(
    self.config,
    self.post_config,
    self.lm_gc_job,
    self.gc_job,
    self.lm_image_jobs,
) = AdvancedTreeSearchJob.create_config(**kwargs)
Is .create_config only called within this job? Otherwise we need to be careful with changing the returned variables. But I think you caught the cases here and otherwise the change is easy.
The AdvancedTreeSearchWithRescoringJob below in this file inherits from AdvancedTreeSearchJob and uses super().create_config. That should be fixed.
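A minimal sketch of the concern raised here: when a base-class `create_config` starts returning more values, a subclass that still unpacks the old tuple fails at runtime. The classes `BaseJob`, `StaleSubclass`, and `FixedSubclass` and the string placeholders are hypothetical and only mimic the shape of the change; they are not the actual job classes.

```python
class BaseJob:
    @classmethod
    def create_config(cls, **kwargs):
        # In this toy version the method returns five values (three before the change).
        return "config", "post_config", "lm_gc_job", "gc_job", "lm_image_jobs"


class StaleSubclass(BaseJob):
    @classmethod
    def create_config(cls, **kwargs):
        # Still unpacks the old three-element tuple, which raises ValueError.
        config, post_config, lm_gc_job = super().create_config(**kwargs)
        return config, post_config, lm_gc_job


class FixedSubclass(BaseJob):
    @classmethod
    def create_config(cls, **kwargs):
        # Unpack all five values and pass the new ones through unchanged.
        config, post_config, lm_gc_job, gc_job, lm_image_jobs = super().create_config(**kwargs)
        config = config + "+rescoring"  # placeholder for subclass-specific changes
        return config, post_config, lm_gc_job, gc_job, lm_image_jobs


try:
    StaleSubclass.create_config()
    stale_error = None
except ValueError as exc:
    stale_error = str(exc)

fixed_result = FixedSubclass.create_config()
print(stale_error)
print(fixed_result)
```

Returning a small named structure instead of a growing tuple would make such changes less fragile, but that is a design option beyond what this PR discusses.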
if lmgc_alias is not None:
    lm_gc.add_alias(lmgc_alias)
lm_gc.rqmt["mem"] = lmgc_mem
def add_lm_config_to_crp(crp, lm_config):
Suggested change:
- def add_lm_config_to_crp(crp, lm_config):
+ def add_lm_config_to_crp(crp: rasr.CommonRasrParameters, lm_config: rasr.RasrConfig):
    config.flf_lattice_tool.network.recognizer.lm,
    post_config.flf_lattice_tool.network.recognizer.lm,
)
for i, lm_config in enumerate(arpa_lms):
    lm_config[1].image = lm_gc.out_lm_images[i + 1]
for i, (_lm_config, lm_post_config) in enumerate(arpa_lms):
Same as above
(i + 1): lm.CreateLmImageJob(
    add_lm_config_to_crp(crp, lm_config), extra_config=extra_config, extra_post_config=extra_post_config
)
for i, (lm_config, _lm_post_config) in enumerate(arpa_lms)
Is there a reason for the _ prefix? Maybe I am overlooking something in the web view. If it's not used, you could just fully replace it with _.
I think the post_config here is unused, because
- both config and post_config are extracted from the crp
- only the config is modified
- the crp with the old post_config is still being used
+1 to "can be just _" to make clear it is unused.
Suggested change:
- for i, (lm_config, _lm_post_config) in enumerate(arpa_lms)
+ for i, (lm_config, _) in enumerate(arpa_lms)
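As a small illustration of the suggested convention, the sketch below unpacks (config, post_config) pairs and discards the second element with a bare `_`. The string placeholders stand in for the real config objects and job instances and are purely hypothetical.

```python
# Hypothetical stand-ins for the (config, post_config) pairs returned by find_arpa_lms.
arpa_lms = [("lm_config_a", "lm_post_config_a"), ("lm_config_b", "lm_post_config_b")]

# Keys start at 1, mirroring the (i + 1) indexing in the diff above.
# The bare underscore signals that the post config is intentionally unused.
lm_image_jobs = {
    (i + 1): f"CreateLmImageJob({lm_config})"
    for i, (lm_config, _) in enumerate(arpa_lms)
}
print(lm_image_jobs)
```

Python's `enumerate(arpa_lms, start=1)` would give the same keys without the explicit `i + 1`, if that reads better.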
With two approvals, yes :). I just have a few comments, then I can approve, but please ask someone else to also review.
I think many open comments should be addressed before merging.
@@ -190,6 +173,7 @@ def __init__(
:param lmgc_mem: Memory requirement for the AdvancedTreeSearchLmImageAndGlobalCacheJob
:param lmgc_alias: Alias for the AdvancedTreeSearchLmImageAndGlobalCacheJob
:param lmgc_scorer: Dummy scorer for the AdvancedTreeSearchLmImageAndGlobalCacheJob which is required but unused
:param separate_lm_image_gc_generation: Whether to generate the LM image and the global cache via two separate jobs for a more stable hash. Whether or not this flag is set is not part of the hash, so NOT using separate jobs is the default.
Suggested change:
- :param separate_lm_image_gc_generation: Whether to generate the LM image and the global cache via two separate jobs for a more stable hash. Whether or not this flag is set is not part of the hash, so NOT using separate jobs is the default.
+ :param separate_lm_image_gc_generation: Whether to generate the LM image and the global cache via two separate
+     jobs for a more stable hash. This flag is not part of the hash, using a combined job is the default.
Break the comment into two lines to respect the 120 char line length?
if separate_lm_image_gc_generation:
    gc_job = BuildGlobalCacheJob(crp, extra_config, extra_post_config)

    arpa_lms = lm.find_arpa_lms(crp.language_model_config, None)
The function lm.find_arpa_lms only returns the LMs that do not already have an lm.image, because for those that already have an image we do not need to create a new one.
But the lm.image is usually defined in the post_config. When we do not pass the post config here, all (ARPA) LMs are found and returned, so we do extra work.
Even worse: below, in lines 404/418, we call the function again, but with the post config. So if the original crp contains a mix of ARPA LMs, some of which already have images and others do not, the items in arpa_lms differ between the calls. Since we use the index to match the image to the LM, the mapping will be off and the wrong image will be assigned.
Suggested change:
- arpa_lms = lm.find_arpa_lms(crp.language_model_config, None)
+ arpa_lms = lm.find_arpa_lms(crp.language_model_config, crp.language_model_post_config)
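The index mismatch described in this comment can be reproduced with a toy model. The `find_arpa_lms` below is a simplified, hypothetical re-implementation over flat lists of dicts; it only mirrors the described behaviour (skip LMs that already have an image) and is not the i6_core function, which works on nested RasrConfig objects.

```python
from typing import Dict, List, Optional, Tuple

Config = Dict[str, str]


def find_arpa_lms(
    lm_configs: List[Config], lm_post_configs: Optional[List[Config]] = None
) -> List[Tuple[Config, Config]]:
    """Toy version: return the ARPA LMs that do not yet have an image."""
    if lm_post_configs is None:
        lm_post_configs = [{} for _ in lm_configs]
    return [
        (c, pc)
        for c, pc in zip(lm_configs, lm_post_configs)
        if c.get("type") == "ARPA" and "image" not in c and "image" not in pc
    ]


configs = [{"type": "ARPA", "file": "a.gz"}, {"type": "ARPA", "file": "b.gz"}]
post_configs = [{"image": "a.image"}, {}]  # the first LM already has an image

# First call without the post config: both LMs are reported and images are built for both.
without_post = find_arpa_lms(configs, None)
built_images = [f"new-{c['file']}.image" for c, _ in without_post]

# Second call with the post config: only the LM that lacks an image is reported.
with_post = find_arpa_lms(configs, post_configs)

# Matching by position now pairs b.gz with the image that was built for a.gz.
for i, (c, pc) in enumerate(with_post):
    pc["image"] = built_images[i]

print(post_configs[1])
```

In this toy run the second LM ends up pointing at the image built from the first LM's file, which is the wrong assignment the reviewer warns about. Passing the post config in both calls keeps the two lists identical.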

arpa_lms = AdvancedTreeSearchLmImageAndGlobalCacheJob.find_arpa_lms(
arpa_lms = lm.find_arpa_lms(
We have already called lm.find_arpa_lms above. Maybe move the above call out of the if condition and then reuse the arpa_lms from above here. This would also avoid mismatching items in the list as I outlined above.
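One way to read this suggestion is sketched below: compute the list once, outside the branch, and let the same list drive both job creation and image assignment. `plan_lm_images`, the dict configs, and the image names are hypothetical placeholders, not code from the PR.

```python
def plan_lm_images(arpa_lms, separate_jobs):
    """Toy sketch: `arpa_lms` is computed once by the caller and reused in both branches."""
    if separate_jobs:
        # One image per LM, as a dedicated LM image job would produce.
        images = {i: f"separate-{cfg['file']}.image" for i, (cfg, _) in enumerate(arpa_lms, start=1)}
    else:
        # Images indexed by position, as a combined job would produce.
        images = {i: f"combined-{i}.image" for i, _ in enumerate(arpa_lms, start=1)}
    # The same list drives job creation and image assignment, so indices cannot drift.
    for i, (_, post_cfg) in enumerate(arpa_lms, start=1):
        post_cfg["image"] = images[i]
    return images


arpa_lms = [({"file": "b.gz"}, {})]  # e.g. the single LM still lacking an image
images = plan_lm_images(arpa_lms, separate_jobs=True)
print(images, arpa_lms[0][1])
```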
Closes #430
Closes #514
Now testing...