Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adagrams: Snow Leopards ( Kae & Diana) #66

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

sorindevops
Copy link

No description provided.

@kaesluder
Copy link

This passes all tests on 3.9, but fails on Learn using 3.8.8

Copy link

@anselrognlie anselrognlie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job!

All your tests are passing and your code is laid out well overall. Nice use of using some built-ins as well your own approaches. And great job catching the warnings from sample. Of course, we always recommend rolling your own approach to practice breaking down problems on your own, so even if you have found a library function to help out, do consider implementing it yourself for the practice.

@@ -1,5 +1,16 @@
# AdaGrams

## Collaboration Plan:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice documentation of your collaboration plan.

@@ -65,3 +68,64 @@ def test_letter_not_selected_too_many_times():
# Assert
for letter in letters:
assert letter_freq[letter] <= LETTER_POOL[letter]

def test_chi_square_frequencies():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I DO NOT have stats experience, so beyond being aware of the chi-square test sounding like the right thing to do, I can only say that your approach makes sense to me: tally up the observed letter counts over a "large" number of runs, and check whether the distribution looks like the expected distribution from the tile counts.

The test that we supplied really just checks that no impossible hands can be drawn, and while it would be possible for that test to be "defeated" by a faulty implementation (i.e., the implementation might not be properly enforceing the limits, but for all of the 1000 test runs, it never happens to exceed any), it will never yield a false negative. It will never fail as long as the implementation is correct.

Statistical based tests are a touchy subject. We want to be sure that unit tests NEVER have a chance of failing for working code. EVen worse if it fails sporadically. So any statistical test would need to be careful tuned to ensure that the tolerances are such that they would NEVER cause the test to fail. If there is a test that sporadically fails, that trains the programmers to ignore the test. If it fails, "it's just that test that fails sometimes," and that can mean that if it truly is failing, no one will take any notice.

Code that depends on randomness does present a problem for testing. Another approach is to avoid the randomness entirely in the test. In our code, rather than directly relying on a library function that has randomness baked in (like randint or sample) we might have our code depend on a passed in supplier of randomness. In the real code, we can provide an implementation that uses the typical provider of randomness, but in the test, we might pass in a more deterministic provider that we know will always produce a particular result, turning the test from a check of statistical probability more towards a test of expected stable behavior.

If we do still want to run some statistical tests (probably still a good idea), we might store them in a different suite of tests. There are more kinds of tests than just unit tests (short, small tests that we run frequently while developing to check basic behaviors), like integration testing, smoke testing, or fuzz testing. These aren't usually run by devs all the time, and hence, might provide a better situation for the test runner to spend more time looking into any failures without it becoming a blocker to the regular devlopment cycle. As it is, this test actually did fail for me once in the few times I ran it! 😱

adagrams/game.py Outdated
# import pprint


LETTER_POOL = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like that you pulled these large data objects outside the functions. This lets the functions just have the logic, rather than being dominated by the data definitions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using strings as the dictionary keys, I like to the use the dict() function-stlye of making a dictionary. It uses keyword argument calling to let us set string keys without needing to quote them. For example

    LETTER_POOL = dict(
        A=9,
        B=2,
        C=2,
        ...
    )

adagrams/game.py Outdated

# get the lists of letter and count for each letter
# as list (preferred by random.sample)
letter_list = list(LETTER_POOL.keys())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice. Without doing this, sample raises a deprecation warning.

adagrams/game.py Outdated
# to determine the frequency of each item.
# https://docs.python.org/3/library/random.html#random.sample
# requires python version 3.9 or greater
this_hand = sample(letter_list, k=10, counts=letter_count)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice, concise approach. Thanks for including a comment about how you understand the function to work, and how you reasearched it. Now that we've discussed Big O, it's also useful to think about what the time/space complexity of the function might be. Thinking about how we can implement this functionality ourselves can help us gain insight into what that might be.

Remember that not every language has the same built-in functions, so we should be able to implement this same functionality ourselves. In the projects, we don't ask for any features that we don't think you should be able to write yourself. For drawing a random hand, the only external function that would be "required" is a way to pick a random number (such as randint). At this stage in our coding journeys, it's often more valuable to re-implement things by hand to practice thinking about how to breakdown and approach the problem than to use a single library function to solve the problem for us.

adagrams/game.py Outdated

letters_from_word = list(word)

if 7 <= len(word) <= 10:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

adagrams/game.py Outdated
if letter in SCORE_CHART.keys():
score_total += SCORE_CHART[letter]

return(score_total)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for () around the value being returned.

adagrams/game.py Outdated
return(score_total)


def _high_score_key_helper(word):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intentionally start the function with _ to indicate this was intended for internal use?

adagrams/game.py Outdated
score_total += extra_points

for letter in letters_from_word:
if letter in SCORE_CHART.keys():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did a check to see whether a letter was in score table. This is fine and has the affect that any unexpected tiles in the word don't contribute to the score. But where would those unexpected tiles have come from? And should it be the scoring function's job to handle those? We might expect some other part of the game to validate that any words played by a player include only tiles from the tile bank, in which case this function could safely ignore the in check. In fact, we might want to let python raise a KeyError so that the program can find out that, somehow, bad data got into our word.

adagrams/game.py Outdated
# use our helper function to find the highest ranked
# word using max
high_word = max(word_list, key=_high_score_key_helper)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really clever! I can't decide whether I think this is more or less understandable than a more typical approach where we write the loop ourself, performing the comparison and tie handling behavior explcitly.

Writing it out is visually more noisy, but let's us see the decisions being made a bit more explicitly.

This approach depends on the reader understanding that tuples values are compared position by position, and that we'll only move on to the next value if the early one is a tie. So by building a tuple with the score, whether it has 10 tiles, and ultimately its length, we are providing the three criteria for the comparison in weighted order. So a higher score always wins. If there's a tied score, then 10 letter word wins (because True > False), but so long as everything had been False, we go on and check the length, and the shortest wins since you flipped the sign of the length.

So really great on concision and a fantastic use of the max higher order function, but it does leave a lot up to the reader to put it together about what's going on.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having sat with this a bit more, I'm going to go with I like this better than the explicit approach in the long run, and that my initial hesitation has more to do with the languages I've used before. But this is Python, so when in Rome...

My main hesitation come from the fact that in other languages, the way we tend to provide customization for functions like min, max, and sort comes from providing a comparison function. We explicitly write the logic to compare two instances of the data in the list. Python does support this style of customization through the use of an adapter type (functools.cmp_to_key) that can turn a comparison function into something that will work as the key function. But if we embrace the key function fully, it can effectively give us a compound key, comparing several aspects as needed.

It still feels odd to me to compare True and False in a relative sense (Java, for instance does not allow this, but we could simply map true to 1 and false to 0), but Python definitely allows it.

So again, really clever implementation, and great use of the tools of the language!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got really confused trying to figure out the if/else tree and thought about putting it into a comparison function so that we could test how the logic works just word vs. word. But python3 doesn't have comparison functions anymore, just key functions, and somewhere tuple sorting clicked analogous to (LastName, FirstName, MiddleInitial).

I agree that sorting on a Boolean feels a bit weird though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants