Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle possible duplicates in random tags generated by tag_torrent #6820

Merged
merged 1 commit into from
Mar 21, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 15 additions & 4 deletions src/tribler/core/components/metadata_store/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,21 @@ def get_random_word(min_length=0):


def tag_torrent(infohash, tags_db, tags=None, suggested_tags=None):
tags = tags or [get_random_word(min_length=MIN_TAG_LENGTH)
for _ in range(random.randint(2, 6))]
suggested_tags = suggested_tags or [get_random_word(min_length=MIN_TAG_LENGTH)
for _ in range(random.randint(1, 3))]
if tags is None:
tags_count = random.randint(2, 6)
tags = []
while len(tags) < tags_count:
tag = get_random_word(min_length=MIN_TAG_LENGTH)
if tag not in tags:
tags.append(tag)

if suggested_tags is None:
suggested_tags_count = random.randint(1, 3)
suggested_tags = []
while len(suggested_tags) < suggested_tags_count:
tag = get_random_word(min_length=MIN_TAG_LENGTH)
if tag not in suggested_tags:
suggested_tags.append(tag)
Comment on lines +44 to +58
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT, noncritical: you use the same pattern twice. It is possible to extract this pattern:

def generate_sentence_with_fixed_word_length(words_count: int, min_word_length) -> List[str]:
    sentence = []
    while len(sentence) < words_count:
        word = get_random_word(min_length=min_word_length)
        if word not in sentence:
            sentence.append(word)

    return sentence

and refactor the code block to:

Suggested change
if tags is None:
tags_count = random.randint(2, 6)
tags = []
while len(tags) < tags_count:
tag = get_random_word(min_length=MIN_TAG_LENGTH)
if tag not in tags:
tags.append(tag)
if suggested_tags is None:
suggested_tags_count = random.randint(1, 3)
suggested_tags = []
while len(suggested_tags) < suggested_tags_count:
tag = get_random_word(min_length=MIN_TAG_LENGTH)
if tag not in suggested_tags:
suggested_tags.append(tag)
if tags is None:
tags_count = random.randint(2, 6)
tags = generate_sentence_with_fixed_word_length(tags_count, MIN_TAG_LENGTH)
if suggested_tags is None:
suggested_tags_count = random.randint(1, 3)
suggested_tags = generate_sentence_with_fixed_word_length(suggested_tags_count, MIN_TAG_LENGTH)

Copy link
Contributor Author

@kozlovsky kozlovsky Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion, but I'm not persuaded we should do it, for the following reasons:

  1. A usual "rule of thumb" is "if the same code is repeated three or more times, consider to extract it into a function". Here the same pattern is repeated only twice.

  2. The version with function does not make the code shorter.

  3. It is not obvious that the function will be reusable in other places. In other situations it may be totally ok to have repeated words in a sentence.

  4. Right now all repeated patterns are inside the same function, extracting the code to a second function makes the code "less local" and so a bit more complex to analyze (as the reader cannot assume that the new function is not used from other places).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but the rationale behind referenced "rule of thumb" is the following:

Attempting premature refactoring risks selecting a wrong abstraction, which can result in worse code as new requirements emerge and will eventually need to be refactored again.

In the code above suggested abstraction is clear and correct. It extracts exactly the same logic to the function with a very clear intention.

The version with function does not make the code shorter.

Agree, but shorter doesn't mean better.
The suggested refactoring code is easier to read because it doesn't require from reader to analyze the code blocs and to determine the pattern.

It is not obvious that the function will be reusable in other places. In other situations it may be totally ok to have repeated words in a sentence.

Agree, you can make it local.

Right now all repeated patterns are inside the same function, extracting the code to a second function makes the code "less local" and so a bit more complex to analyze (as the reader cannot assume that the new function is not used from other places).

Agree, you can make it local.


def _add_operation(_tag, _op, _key):
operation = TagOperation(infohash=infohash, tag=_tag, operation=_op, clock=0,
Expand Down