Fix log parsing undefined variable and duplicate sequence id errors #1809

Merged


dagardner-nv
Contributor

Description

  • Fixes two related bugs that occur when running the log_parsing example with unexpected data.

  • Fix an error in LogParsingPostProcessingStage::__get_label_dicts where the new_label and new_confidence variables could be left undefined.

  • Fix an error where the tokenize_text_series method can return a sequence_ids list with duplicate ids. This happens when a text series is passed in, one of the values is longer than seq_len, and truncation=False. The fix prevents the following downstream error in the inference stage:

RuntimeError: Inconsistent ID column. Last element in 'seq_ids' tensor, [1], must not extend beyond last message, [0]
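To illustrate how duplicate ids can arise, here is a minimal sketch in plain Python. It is a toy model, not the Morpheus or cuDF implementation: `chunk_row_ids`, `find_duplicate_ids` and `validate_sequence_ids` are hypothetical helpers, and the chunking rule (one extra sequence per `seq_len` tokens, no stride) is an assumption made for illustration only.

```python
from collections import Counter


def chunk_row_ids(token_counts, seq_len):
    """Toy model (assumption): with truncation disabled, a row holding more
    than seq_len tokens overflows into extra sequences, and each sequence is
    tagged with the index of the row it came from."""
    ids = []
    for row, n_tokens in enumerate(token_counts):
        # Ceiling division; an empty row still yields one sequence.
        n_chunks = max(1, -(-n_tokens // seq_len))
        ids.extend([row] * n_chunks)
    return ids


def find_duplicate_ids(ids):
    """Return the sorted ids that appear more than once."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


def validate_sequence_ids(ids):
    """Hypothetical early check: fail here instead of in a later stage."""
    duplicates = find_duplicate_ids(ids)
    if duplicates:
        raise ValueError(
            f"Duplicate sequence ids {duplicates}: an input value is longer "
            "than seq_len; enable truncation or increase seq_len"
        )
    return ids
```

In this model, a second row of 10 tokens with `seq_len=4` produces three sequences that share row id 1, which the check reports before the data reaches inference.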

Not sure if this is the best place to catch this error.
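For the undefined-variable fix described in the first bullet, the general failure pattern can be sketched as follows. This is not the code from postprocessing.py: `pick_label_buggy`, `pick_label_fixed`, the threshold logic and the default label `"X"` are all hypothetical, and only the names `new_label` and `new_confidence` come from the description above.

```python
def pick_label_buggy(candidates, threshold=0.5):
    """Hypothetical reproduction: the names are bound only inside the branch."""
    for label, confidence in candidates:
        if confidence >= threshold:
            new_label = label
            new_confidence = confidence
            break
    # Raises UnboundLocalError when no candidate met the threshold,
    # which is what unexpected input data can trigger.
    return new_label, new_confidence


def pick_label_fixed(candidates, threshold=0.5, default_label="X"):
    """Same logic, with both names initialized before any branch runs."""
    new_label = default_label  # assumed fallback label, for illustration
    new_confidence = 0.0
    for label, confidence in candidates:
        if confidence >= threshold:
            new_label = label
            new_confidence = confidence
            break
    return new_label, new_confidence
```

Initializing both variables up front means every code path returns a defined pair, so malformed input yields a fallback value rather than an exception.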

Closes #1765

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

@dagardner-nv dagardner-nv requested a review from a team as a code owner July 11, 2024 21:14
@dagardner-nv dagardner-nv added bug Something isn't working non-breaking Non-breaking change labels Jul 11, 2024
@dagardner-nv dagardner-nv self-assigned this Jul 12, 2024
examples/log_parsing/postprocessing.py (review comment: outdated, resolved)
tests/utils/test_cudf_subword_helper.py (review comment: outdated, resolved)
@dagardner-nv
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit a527ee7 into nv-morpheus:branch-24.10 Jul 25, 2024
11 checks passed
@dagardner-nv dagardner-nv deleted the david-log-parsing-undefined-var branch July 25, 2024 18:30
@dagardner-nv dagardner-nv restored the david-log-parsing-undefined-var branch October 30, 2024 00:18
Labels
bug Something isn't working non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[BUG]: undefined variable error in examples/log_parsing/postprocessing.py
2 participants