Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Intent ID Hashes from Integer to String #8929

Merged
merged 10 commits into from
Jul 21, 2021

Conversation

b-quachtran
Copy link
Contributor

@b-quachtran b-quachtran commented Jun 21, 2021

Proposed changes:

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@joejuzl
Copy link
Contributor

joejuzl commented Jun 24, 2021

While I think a string probably makes more sense, I'm a bit worried about the further implications of this change:

  • I think this will break the event schema (rasa/shared/utils/schemas/events.py::24).
  • Could this be a problem if a custom action server relies on it being an integer?
  • Same for custom event brokers?
  • Rasa X I think is ok because it stores the whole parse_data as json text.

@b-quachtran
Copy link
Contributor Author

@joejuzl Do you think this solution still makes sense given the implications or if there's a better place in the code to cast it to avoid breaking changes?

@joejuzl
Copy link
Contributor

joejuzl commented Jun 25, 2021

@b-quachtran could it be converted in the KafkaEventBroker?

@b-quachtran
Copy link
Contributor Author

@joejuzl That would bring up object consistency issues between event brokers though right? I think this specific customer would be okay with a temporary fix using a custom Kafka Broker, but long-term does it make sense to consider changing type "id": {"type": "number"} to a string?

@wochinge
Copy link
Contributor

How about we use a config flag in Kafka now to change the ID to string and create a 3.0 issue to change it globally to string as we can do breaking changes in a major?

@b-quachtran b-quachtran requested a review from a team as a code owner June 28, 2021 22:25
@b-quachtran b-quachtran removed the request for review from dakshvar22 June 28, 2021 22:25
changelog/8929.improvement.md Outdated Show resolved Hide resolved
rasa/core/brokers/kafka.py Show resolved Hide resolved
rasa/core/brokers/kafka.py Show resolved Hide resolved
@b-quachtran b-quachtran added the area:rasa-oss/event-brokers Issues focused around the integration of rasa with different event brokers (such as RabbitMQ/Kafka) label Jun 29, 2021
@b-quachtran b-quachtran requested a review from joejuzl June 30, 2021 18:37
Copy link
Contributor

@joejuzl joejuzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one!
One, small comment, and then I think it's good to go 👍

@@ -212,6 +212,9 @@ def _convert_intent_id_to_string(self, event: Dict[Text, Any]) -> Dict[Text, Any
event["parse_data"]["intent"]["id"] = str(
event["parse_data"]["intent"]["id"]
)
for idx, parse_data in enumerate(event["parse_data"]["intent_ranking"]):
parse_data["id"] = str(parse_data["id"])
event["parse_data"]["intent_ranking"][idx] = parse_data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this line (or the enumerate) as you are modifying the dict in the line above.

Copy link
Contributor Author

@b-quachtran b-quachtran Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joejuzl intent_ranking is a separate field from intent and contains the full breakdown of intents + ID's, so the loop takes care of that piece. Does removing enumerate still make sense in this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry missed this before was merged.
I mean that parse_data is a direct reference to event["parse_data"]["intent_ranking"][idx] so when you update parse_data["id"] you are also also updating event["parse_data"]["intent_ranking"][idx]["id"] so no need to re-assign parse_data to event["parse_data"]["intent_ranking"][idx]. And thus you no longer need to know idx.

Copy link
Contributor Author

@b-quachtran b-quachtran Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joejuzl Does python pass by reference though? I understood it as you have to use the index to update list elements in a loop.

@b-quachtran b-quachtran enabled auto-merge July 21, 2021 13:51
@b-quachtran b-quachtran merged commit 3ac9530 into main Jul 21, 2021
@b-quachtran b-quachtran deleted the fix-intent-id-formatting branch July 21, 2021 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:rasa-oss/event-brokers Issues focused around the integration of rasa with different event brokers (such as RabbitMQ/Kafka)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants