Names grouped by country #17
I need a slightly different output that I don't think I can get from this one. :( Something like this would work:
|
@djuxy I'm generating you that! |
@djuxy Here you go: https://drive.google.com/file/d/1wmVNXcfOYOcqhVesilI7LE0Lcm9-ivnw/view?usp=sharing. It's a ZIP with this folder structure:
For a given JSON file, I provide pairs <name, count> where count is the number of occurrences of this name for this country. It's just a count without any normalization, except for a very simple filtering that I describe here: NOTE: To save a bit of space and to reduce the noise, I kept the records with
|
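For anyone who wants to inspect one of those per-country JSON files, here is a minimal sketch. It assumes each file is a single JSON object mapping a name to its raw count; the exact layout is not shown in this thread, so treat that as a guess. `top_names` and the sample data are hypothetical.

```python
import json
from collections import Counter


def top_names(json_text: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent names from a JSON object of name -> count.

    Assumed layout (not confirmed by the thread): {"name": count, ...}
    """
    counts = Counter(json.loads(json_text))
    return counts.most_common(n)


# Made-up sample data, only to show the call.
sample = '{"Maria": 120, "Ivan": 95, "Ana": 60, "Luka": 40}'
print(top_names(sample, 2))
```

For a real file, read its contents with `Path(...).read_text(encoding="utf-8")` and pass the text to `top_names`.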
@philipperemy Amazing! Thank You! P.S. I don't need this, but probably somebody needs only male/female names. It would be great to separate male and female names if it's possible. |
I'm happy I could help here. Yeah def a great idea here. We have male/female in the FB dump so we could add a I'll think of some nice way to integrate it |
@djuxy I'm going to generate a giant CSV containing Then with pandas it will be easy for anyone to manipulate it and derive some stats or whatever metric they want. |
@philipperemy 👏 👏 👏 That would be great! Thank you! |
I gave it some thought. It's better to have 2 CSVs:
Why? |
@djuxy I only saw your message after I had generated the large CSV. Have a look: It is a folder containing one CSV per country, named with the country's ISO alpha_2 code (hint: pycountry). I also added the country code in the last column in case you want to concatenate all of them together. The uncompressed version takes around 10 GB on disk.
It should contain all the information that you need. The gender is either I agree that we can optimize it. Your suggestions make sense and I'll update all of that once I can find enough time. Have a look in the meantime and let me know! |
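Since the uncompressed folder is around 10 GB, streaming the files row by row may be more practical than loading everything at once. The sketch below counts records per country across all CSVs in a folder. It relies only on the one detail stated above, that the country code is the last column; it also assumes the files have no header row, which is not confirmed here. `rows_per_country` is a hypothetical helper.

```python
import csv
from collections import Counter
from pathlib import Path


def rows_per_country(folder: str) -> Counter:
    """Count records per country code across every CSV in `folder`.

    Assumptions: one CSV per country, no header row,
    country code stored in the last column.
    """
    totals = Counter()
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                if row:  # skip blank lines
                    totals[row[-1]] += 1
    return totals
```

Calling `rows_per_country("path/to/folder")` returns a `Counter` keyed by country code. If the files do have a header row, skip the first row of each file before counting.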
@philipperemy I just sent an email to your gmail :) |
@SHEFKEVIN I replied to you |
Hi @philipperemy,
Great work!
Is it possible to get a list of names (first/last) and % of appearance in a given country?
Cheers
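One way to get a "% of appearance" from raw <name, count> pairs for a single country is to divide each count by the country total. This is only a sketch: `name_shares` is a hypothetical helper, and because the published counts were filtered, the shares are relative to the retained records rather than to the whole population.

```python
def name_shares(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw name counts for one country into percentage shares."""
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing to normalize
    return {name: 100.0 * count / total for name, count in counts.items()}


# Made-up counts, only to show the call.
print(name_shares({"Ana": 3, "Luka": 1}))
```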