Dictionaries
Contains the name of each browser and its probability. Used to assign browsers to persons according to browser popularity. Used by BrowserDictionary.java.
Sample:
Chrome 0.279
Internet Explorer 0.232
Firefox 0.422
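The popularity-based assignment described above can be sketched as a weighted pick. This is a minimal illustration and not the code in BrowserDictionary.java: the class name `BrowserPickerSketch`, its methods, and the assumption that the probability is the last whitespace-separated token on each line are all hypothetical. Because the sample probabilities do not sum to 1, the sketch normalises by their total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: weighted browser selection from "name probability" lines.
class BrowserPickerSketch {
    private final List<String> names = new ArrayList<>();
    private final List<Double> weights = new ArrayList<>();
    private double total = 0.0;

    // Assumption: the probability is the last whitespace-separated token,
    // so names containing spaces ("Internet Explorer") stay intact.
    void addLine(String line) {
        String trimmed = line.trim();
        String[] tokens = trimmed.split("\\s+");
        String last = tokens[tokens.length - 1];
        String name = trimmed.substring(0, trimmed.length() - last.length()).trim();
        double weight = Double.parseDouble(last);
        names.add(name);
        weights.add(weight);
        total += weight;
    }

    // Maps u in [0, 1) to a browser; weights are normalised by their sum.
    String pick(double u) {
        if (names.isEmpty()) {
            throw new IllegalStateException("no browsers loaded");
        }
        double target = u * total;
        double running = 0.0;
        for (int i = 0; i < names.size(); i++) {
            running += weights.get(i);
            if (target < running) {
                return names.get(i);
            }
        }
        return names.get(names.size() - 1); // guards against rounding at the top end
    }

    public static void main(String[] args) {
        BrowserPickerSketch browsers = new BrowserPickerSketch();
        browsers.addLine("Chrome 0.279");
        browsers.addLine("Internet Explorer 0.232");
        browsers.addLine("Firefox 0.422");
        System.out.println(browsers.pick(new Random().nextDouble()));
    }
}
```

Passing the random draw in as a parameter keeps the selection deterministic for a given input, which makes it easy to test.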
Contains a country and the name of a company in that country. Used to give each person a workplace in their home country, if one is available. Used by CompanyDictionary.
Sample:
Afghanistan Kam Air
Afghanistan Balkh Airlines
Afghanistan Khyber Afghan Airlines
Afghanistan MarcoPolo Airways
Afghanistan Pamir Airways
Afghanistan Bakhtar Afghan Airlines
Afghanistan Safi Airways
Albania Ada Air
Contains a country abbreviation and the name of the country it refers to. Used to link countries to IP addresses (see Ipzones). Used by IPAddressDictionary.java.
Sample:
ac United Kingdom academic institutions
ad Andorra
ae United Arab Emirates
af Afghanistan
Contains the countryId, the celebrityId and the celebrity's cumulative probability of popularity within that country. Used to assign a person a celebrity from their own country, if available. Used by TagDictionary.java.
Sample:
0 0 0.27328605200945627
0 1 0.4884160756501182
0 2 0.649645390070922
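Since the third column is cumulative, a lookup can search for the first row whose value reaches a random draw. The following is a hedged sketch of that idea, not the logic in TagDictionary.java: `CelebrityPickerSketch` and its methods are hypothetical names, and returning -1 when nothing matches is a choice made only for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: per-country lookup over cumulative probabilities.
class CelebrityPickerSketch {
    // countryId -> parallel lists of celebrity ids and cumulative probabilities
    private final Map<Integer, List<Integer>> ids = new HashMap<>();
    private final Map<Integer, List<Double>> cumulative = new HashMap<>();

    // Expects "countryId celebrityId cumulativeProbability", rows in ascending order.
    void addLine(String line) {
        String[] f = line.trim().split("\\s+");
        int country = Integer.parseInt(f[0]);
        ids.computeIfAbsent(country, k -> new ArrayList<>()).add(Integer.parseInt(f[1]));
        cumulative.computeIfAbsent(country, k -> new ArrayList<>()).add(Double.parseDouble(f[2]));
    }

    // Returns the first celebrity whose cumulative probability is >= u,
    // or -1 if the country is unknown or u lies beyond the loaded rows.
    int pick(int country, double u) {
        List<Double> cum = cumulative.get(country);
        if (cum == null) {
            return -1;
        }
        int lo = 0;
        int hi = cum.size();
        while (lo < hi) { // binary search for the lower bound of u
            int mid = (lo + hi) >>> 1;
            if (cum.get(mid) < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo < cum.size() ? ids.get(country).get(lo) : -1;
    }

    public static void main(String[] args) {
        CelebrityPickerSketch picker = new CelebrityPickerSketch();
        picker.addLine("0 0 0.27328605200945627");
        picker.addLine("0 1 0.4884160756501182");
        picker.addLine("0 2 0.649645390070922");
        System.out.println(picker.pick(0, 0.3));
    }
}
```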
Contains the continent name, the country name, latitude, longitude, population and cumulative probability of population. Used to create the region-country hierarchy and to distribute user nationalities according to the population data. Used by LocationDictionary.java.
Sample:
Asia Afghanistan 35 69 15500000 0.0028010447
Africa Algeria 37 3 29100867 0.008059937
Africa Angola -9 13 5646177 0.0090802721
Contains the tagId, the tagClassId, the tag name and the tag foaf:name. Used in the serialization part of the software to assign names to the tags and write the tag basic class. Used by TagDictionary.java
Sample:
0 349 Hamid_Karzai Hamid Karzai
1 211 Rumi Jalal ad-Dīn Muhammad Rumi
2 98 Mahmud_of_Ghazni Yamīn al-Dawlah Abul-Qāṣim Maḥmūd Ibn Sebük Tegīn
3 336 Abbas_I_of_Persia Shah ‘Abbās I
Contains the email domain name and its probability for the most popular domains, and only the name for the rest. Used to assign email domains to users. Used by EmailDictionary.java.
Sample:
gmail.com 0.45
gmx.com 0.20
yahoo.com 0.18
hotmail.com 0.07
zoho.com 0.06
Contains the CountryName, firstName, gender, birthdate period and an unused number. Used to assign a first name to the user according to the gender and age. Used by NamesDictionary.java
Sample:
Abkhazia Diana 0 0 1
Abkhazia Maya 0 0 1
Abkhazia Diana Gurtskaya 1 0 1
Abkhazia Diana 0 1 1
Contains the country name, the university name and the city of that university. Used to create the country->city hierarchy and to assign the user a university from the same country. Used by OrganizationsDictionary.java (all data) and LocationDictionary.java (the country->city data).
Sample:
Aland_Islands Aland University of Applied Sciences Mariehamn
Abkhazia Abkhazian State University Sukhumi
Afghanistan Paktia University Gardez
Afghanistan Baghlan University Puli_Khumri
[Work In Progress] Contains the name of the country and a list of language data: the ISO 639-1 code, a * if it is an official language, and the speaker percentage (0 if unknown). Used to assign the user languages of their country. Used by LanguageDictionary.java.
Sample:
Aruba es 12.6 en 7.7 nl * 5.8
Antigua and Barbuda en * 0
United_Arab_Emirates ar * 0 fa 0 en 0 hi 0 ur 0
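The variable-length language list above (code, optional `*`, percentage) can be parsed token by token. This is only a sketch: `LanguageEntrySketch` is a hypothetical type, not part of LanguageDictionary.java, and because the sample does not show how the country name is delimited from the list, the parser takes only the part of the line that follows the country name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one parsed language entry of a country line.
class LanguageEntrySketch {
    final String code;              // ISO 639-1 code, e.g. "nl"
    final boolean official;         // true if the code was followed by "*"
    final double speakerPercentage; // 0 means unknown

    LanguageEntrySketch(String code, boolean official, double speakerPercentage) {
        this.code = code;
        this.official = official;
        this.speakerPercentage = speakerPercentage;
    }

    // Parses the language list that follows the country name,
    // e.g. "es 12.6 en 7.7 nl * 5.8".
    static List<LanguageEntrySketch> parse(String languageFields) {
        List<LanguageEntrySketch> entries = new ArrayList<>();
        String trimmed = languageFields.trim();
        if (trimmed.isEmpty()) {
            return entries;
        }
        String[] t = trimmed.split("\\s+");
        int i = 0;
        while (i < t.length) {
            String code = t[i++];
            boolean official = false;
            if (i < t.length && t[i].equals("*")) {
                official = true;
                i++;
            }
            if (i >= t.length) {
                throw new IllegalArgumentException("missing percentage after " + code);
            }
            entries.add(new LanguageEntrySketch(code, official, Double.parseDouble(t[i++])));
        }
        return entries;
    }

    public static void main(String[] args) {
        for (LanguageEntrySketch e : parse("es 12.6 en 7.7 nl * 5.8")) {
            System.out.println(e.code + " official=" + e.official + " pct=" + e.speakerPercentage);
        }
    }
}
```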
Contains the country name, the location name, the location name with spaces, latitude and longitude. Used by PopularPlacesDictionary.java.
Sample:
Afghanistan Ab-Kol Ab-Kol 36.22000122070312 68.5
Afghanistan Ab_Bazan Ab Bazan 36.93333435058594 69.94999694824219
Afghanistan Ab_Daw Ab Daw 36.25 71.16666412353516
Afghanistan Ab_Gaj Ab Gaj 36.98333358764648 72.69999694824219
Contains the names of smartphone providers. Used by UserAgentDictionary.java.
Sample:
IPhone
IPad
HTC
Samsung
LG
Contains the number of appearances of the last name, the country name and the last name. Used to assign a surname to the user. Used by NamesDictionary.java.
Sample:
2,Abkhazia,Gurtskaya
1,Abkhazia,Kopitseva
1,Adjara,Vashalomidze
7,Afghanistan,Zaland
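One way to use the appearance counts is to pick a surname within a country in proportion to how often it occurs. The source does not say that NamesDictionary.java works this way, so treat the weighting as an assumption; `SurnamePickerSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: count-weighted surname selection per country.
class SurnamePickerSketch {
    private final Map<String, List<String>> surnames = new HashMap<>();
    private final Map<String, List<Integer>> counts = new HashMap<>();
    private final Map<String, Integer> totals = new HashMap<>();

    // Expects "count,country,surname"; the limit of 3 keeps any further
    // commas inside the surname field.
    void addLine(String line) {
        String[] f = line.trim().split(",", 3);
        int count = Integer.parseInt(f[0].trim());
        String country = f[1];
        surnames.computeIfAbsent(country, k -> new ArrayList<>()).add(f[2]);
        counts.computeIfAbsent(country, k -> new ArrayList<>()).add(count);
        totals.merge(country, count, Integer::sum);
    }

    // Maps u in [0, 1) to a surname of the country, weighted by count.
    // Returns null when the country has no surnames loaded.
    String pick(String country, double u) {
        List<String> names = surnames.get(country);
        if (names == null) {
            return null;
        }
        double target = u * totals.get(country);
        int running = 0;
        for (int i = 0; i < names.size(); i++) {
            running += counts.get(country).get(i);
            if (target < running) {
                return names.get(i);
            }
        }
        return names.get(names.size() - 1);
    }

    public static void main(String[] args) {
        SurnamePickerSketch picker = new SurnamePickerSketch();
        picker.addLine("2,Abkhazia,Gurtskaya");
        picker.addLine("1,Abkhazia,Kopitseva");
        System.out.println(picker.pick("Abkhazia", Math.random()));
    }
}
```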
Contains the tagClassId, the name and the rdf label. Used to serialize the name and label of the tagClasses. Used by TagDictionary.java
Sample:
0 Thing thing
1 BasketballLeague basketball league
2 LunarCrater lunar crater
3 MilitaryPerson military person
4 AutomobileEngine automobile engine
Contains the base tagClassId and the parent tagClassId. Used to build the tag hierarchy during serialization. Used by TagDictionary.java.
Sample:
19 179
136 338
173 211
230 149
305 0
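The child-to-parent pairs above are enough to rebuild the hierarchy by following parent links. The sketch below illustrates that traversal under stated assumptions: `TagClassHierarchySketch` is a hypothetical class, not TagDictionary.java, and the cycle guard is added only to keep the example safe on malformed input.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: tag class hierarchy built from "child parent" lines.
class TagClassHierarchySketch {
    private final Map<Integer, Integer> parentOf = new HashMap<>();

    // Expects "tagClassId parentTagClassId".
    void addLine(String line) {
        String[] f = line.trim().split("\\s+");
        parentOf.put(Integer.parseInt(f[0]), Integer.parseInt(f[1]));
    }

    // Returns the chain of ancestors, nearest parent first.
    // Stops at a class with no recorded parent, or if a cycle is detected.
    List<Integer> ancestors(int tagClassId) {
        List<Integer> chain = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        seen.add(tagClassId);
        Integer current = parentOf.get(tagClassId);
        while (current != null && seen.add(current)) {
            chain.add(current);
            current = parentOf.get(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        TagClassHierarchySketch hierarchy = new TagClassHierarchySketch();
        hierarchy.addLine("19 179");
        hierarchy.addLine("305 0");
        System.out.println(hierarchy.ancestors(305));
    }
}
```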
Contains the tagId and a text. Used to assign text to posts and comments according to their tags. Used by TagDictionary.java.
Sample:
0 Hamid Karzai, GCMG (Pashto: حامد کرزی, Hāmid Karzay; born 24 December 1957) is the 12th and …
1 Jalāl ad-Dīn Muḥammad Balkhī, also known as Jalāl ad-Dīn Muḥammad Rūmī and …
2 Mahmud of Ghazni, actually Yamīn ad-Dawlah Abdul-Qāṣim …
Contains topic id 1, topic id 2, the cumulative percentage for topic 1 and the number of texts in which topic 1 and topic 2 appear together. Used to select a list of tags correlated with the user's main interest. Used by TagDictionary.java.
Sample:
2909 4870 0.0 8.0
2909 4871 2.392072671167751E-4 8.0
2909 4872 4.784145342335503E-4 2.0
Not used, but not yet deleted. Contains the name of the university, the country and a cumulative percentage.
Sample:
University of Cambridge United_Kingdom 100
Harvard University United_States 99.18
Yale University United_States 98.68
The folder resources/ipaddrByCountries contains a list of files named XX.zone, where XX is a valid country abbreviation from the countryAbbrMapping.txt dictionary. Each file contains a list of IP address blocks for that country.
Sample of ad.zone (Andorra):
85.94.160.0/19
91.187.64.0/19
109.111.96.0/19
194.158.64.0/19
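Each line of a zone file is a block in CIDR notation, so a /19 entry covers 2^13 = 8192 addresses. As a rough illustration of how one address could be drawn from such a block, the sketch below parses an entry and returns the address at a given offset. `ZoneBlockSketch` and its methods are hypothetical and are not taken from IPAddressDictionary.java.

```java
// Hypothetical sketch: an IPv4 block parsed from a .zone line such as "85.94.160.0/19".
class ZoneBlockSketch {
    final int network; // network address as a 32-bit value
    final int prefix;  // prefix length, 0..32

    private ZoneBlockSketch(int network, int prefix) {
        this.network = network;
        this.prefix = prefix;
    }

    static ZoneBlockSketch parse(String cidr) {
        String[] parts = cidr.trim().split("/");
        String[] octets = parts[0].split("\\.");
        if (parts.length != 2 || octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 CIDR block: " + cidr);
        }
        int address = 0;
        for (String octet : octets) {
            address = (address << 8) | Integer.parseInt(octet);
        }
        int prefix = Integer.parseInt(parts[1]);
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("invalid prefix length: " + prefix);
        }
        // A shift by 32 is a no-op on int in Java, so prefix 0 is handled separately.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return new ZoneBlockSketch(address & mask, prefix);
    }

    // Number of addresses covered by the block.
    long size() {
        return 1L << (32 - prefix);
    }

    // Dotted-quad address at the given offset from the network address.
    String addressAt(long offset) {
        if (offset < 0 || offset >= size()) {
            throw new IllegalArgumentException("offset outside block: " + offset);
        }
        int a = network + (int) offset;
        return ((a >>> 24) & 0xFF) + "." + ((a >>> 16) & 0xFF) + "."
                + ((a >>> 8) & 0xFF) + "." + (a & 0xFF);
    }

    public static void main(String[] args) {
        ZoneBlockSketch block = ZoneBlockSketch.parse("85.94.160.0/19");
        long offset = (long) (Math.random() * block.size());
        System.out.println(block.addressAt(offset));
    }
}
```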