Analysis of name tag data in OSM Belarus
Problem statement
Download the OSM dump
Install dependencies
Finding links to maintain referential integrity
Belarusian OSM makes wide use of both Belarusian and Russian. Each language has its own tag, name:be and name:ru, and both languages also appear in general-purpose tags such as name, addr:* and others. The issues around using one language, the other, or both are described at https://wiki.openstreetmap.org/wiki/BE:Belarus_language_issues . Whichever option is chosen, the following rules must hold: search must work in either language; it must be possible to show labels in either language (or in the original, although this rule is not currently met); and referential integrity must be preserved (which can affect the previous two points).
This analysis aims to find the categories and tags that contain Cyrillic values of the name tag, and to measure how completely the name:be and name:ru tags are filled in.
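As a minimal sketch of what "fill rate" means here, the snippet below counts objects whose name contains Cyrillic characters and reports the share that also carry name:be and name:ru. The helper names `has_cyrillic` and `fill_rates`, the sample objects, and the lowercasing step are assumptions made for illustration; they are not the notebook's own code.

```python
# Illustrative alphabet: Russian and Belarusian lowercase letters (an assumption).
CYRILLIC = frozenset('абвгдеёжзийклмнопрстуўфхцчшщьыъэюяі')


def has_cyrillic(value):
    """Return True if the string contains at least one Cyrillic letter."""
    return bool(frozenset(value.lower()) & CYRILLIC)


def fill_rates(objects):
    """Take an iterable of tag dicts; return (named, share_be, share_ru)."""
    named = [tags for tags in objects if has_cyrillic(tags.get('name', ''))]
    total = len(named)
    be = sum(1 for tags in named if has_cyrillic(tags.get('name:be', '')))
    ru = sum(1 for tags in named if has_cyrillic(tags.get('name:ru', '')))
    # "or 1" avoids division by zero when nothing is named in Cyrillic.
    return total, be / (total or 1), ru / (total or 1)


# Hypothetical objects, not taken from the dump.
sample = [
    {'name': 'Мінск', 'name:be': 'Мінск', 'name:ru': 'Минск'},
    {'name': 'Гомель', 'name:ru': 'Гомель'},
    {'name': "McDonald's"},
]
print(fill_rates(sample))
```

The third object has no Cyrillic name, so it is left out of the denominator.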
# !rm belarus-latest.osm.pbf
!wget --backups=1 -N https://download.geofabrik.de/europe/belarus-latest.osm.pbf
!cp belarus-latest.osm.pbf belarus-updated.osm.pbf
#!osmium fileinfo -e belarus-latest.osm.pbf
info = !osmium fileinfo -e belarus-latest.osm.pbf
--2022-06-19 11:39:53-- https://download.geofabrik.de/europe/belarus-latest.osm.pbf
Resolving download.geofabrik.de (download.geofabrik.de)... 116.202.112.212, 95.216.28.113
Connecting to download.geofabrik.de (download.geofabrik.de)|116.202.112.212|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘belarus-latest.osm.pbf’ not modified on server. Omitting download.
largest_node_id = [int(l.split(':')[-1].strip()) for l in info if 'Largest node ID' in l][0]
largest_way_id = [int(l.split(':')[-1].strip()) for l in info if 'Largest way ID' in l][0]
largest_rel_id = [int(l.split(':')[-1].strip()) for l in info if 'Largest relation ID' in l][0]
try:
    from belarus_utils import OverpassApiSearchEnigne
    overpass_api = OverpassApiSearchEnigne(cache=True)
    osc = overpass_api.get_updates_osc(largest_node_id, largest_way_id, largest_rel_id)
    with open('belarus-latest.osc', 'w') as h:
        h.write(osc)
except Exception as err:
    # print(err)
    raise err
!osmium apply-changes --overwrite -o belarus-updated.osm.pbf belarus-latest.osm.pbf belarus-latest.osc
[======================================================================] 100% =================================================> ] 70%
This update step is needed only if we want more up-to-date data; note that the result may be missing some relations.
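The update step starts from the largest node, way and relation IDs reported by `osmium fileinfo -e`. A small helper makes that parsing explicit and fails loudly when a line is missing. The function name `largest_id` and the sample lines are hypothetical; only the "Largest ... ID" labels come from the notebook code.

```python
def largest_id(info_lines, kind):
    """Return the 'Largest <kind> ID' value from fileinfo output lines."""
    label = f'Largest {kind} ID'
    for line in info_lines:
        if label in line:
            # The value follows the last colon on the line.
            return int(line.split(':')[-1].strip())
    raise ValueError(f'{label!r} not found in fileinfo output')


# Hypothetical sample lines; real values come from `osmium fileinfo -e`.
sample_info = [
    '  Largest node ID: 1000',
    '  Largest way ID: 200',
    '  Largest relation ID: 30',
]
ids = {kind: largest_id(sample_info, kind) for kind in ('node', 'way', 'relation')}
print(ids)
```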
!PGPASSWORD=$POSTGRES_PASSWORD psql -h $POSTGRES_HOST -p $POSTGRES_POST -U $POSTGRES_USER -d $POSTGRES_DB -c "CREATE EXTENSION IF NOT EXISTS hstore"
!PGPASSWORD=$POSTGRES_PASSWORD psql -h $POSTGRES_HOST -p $POSTGRES_POST -U $POSTGRES_USER -d $POSTGRES_DB -c "CREATE EXTENSION IF NOT EXISTS postgis"
!PGPASSWORD=$POSTGRES_PASSWORD psql -h $POSTGRES_HOST -p $POSTGRES_POST -U $POSTGRES_USER -d $POSTGRES_DB -c "DROP MATERIALIZED VIEW IF EXISTS planet_osm_region CASCADE"
NOTICE: extension "hstore" already exists, skipping
CREATE EXTENSION
NOTICE: extension "postgis" already exists, skipping
CREATE EXTENSION
NOTICE: drop cascades to materialized view planet_osm_named_data
DROP MATERIALIZED VIEW
!PGPASSWORD=$POSTGRES_PASSWORD osm2pgsql -H $POSTGRES_HOST -P $POSTGRES_POST -U $POSTGRES_USER -d $POSTGRES_DB -v -l -j -G -x --hstore-add-index -C $OSM2PGSQL_CACHE -S /usr/share/osm2pgsql/default.style belarus-updated.osm.pbf
2022-06-19 11:40:23 osm2pgsql version 1.6.0
2022-06-19 11:40:23 [0] Database version: 14.3
2022-06-19 11:40:23 [0] PostGIS version: 3.2
2022-06-19 11:40:23 [0] Reading file: belarus-updated.osm.pbf
2022-06-19 11:40:23 [0] Started pool with 4 threads.
2022-06-19 11:40:23 [0] Using projection SRS 4326 (Latlong)
2022-06-19 11:40:23 [0] Using built-in tag transformations
2022-06-19 11:40:23 [0] Middle 'ram' options:
2022-06-19 11:40:23 [0] locations: true
2022-06-19 11:40:23 [0] way_nodes: true
2022-06-19 11:40:23 [0] nodes: false
2022-06-19 11:40:23 [0] untagged_nodes: true
2022-06-19 11:40:23 [0] ways: false
2022-06-19 11:40:23 [0] relations: false
2022-06-19 11:40:23 [0] Setting up table 'planet_osm_point'
2022-06-19 11:40:23 [0] Setting up table 'planet_osm_line'
2022-06-19 11:40:23 [0] Setting up table 'planet_osm_polygon'
2022-06-19 11:40:23 [0] Setting up table 'planet_osm_roads'
2022-06-19 11:41:50 [0] Reading input files done in 87s (1m 27s).
2022-06-19 11:41:50 [0] Processed 31134348 nodes in 29s - 1074k/s
2022-06-19 11:41:50 [0] Processed 4395479 ways in 52s - 85k/s
2022-06-19 11:41:50 [0] Processed 65014 relations in 6s - 11k/s
2022-06-19 11:41:50 [0] Overall memory usage: peak=1420MByte current=1380MByte
2022-06-19 11:41:51 [0] Middle 'ram': Node locations: size=31134348 bytes=296M
2022-06-19 11:41:51 [0] Middle 'ram': Way nodes data: size=80795694 capacity=125829120 bytes=120M
2022-06-19 11:41:51 [0] Middle 'ram': Way nodes index: size=4395479 capacity=7340032 bytes=56M
2022-06-19 11:41:51 [0] Middle 'ram': Object data: size=0 capacity=1048576 bytes=1M
2022-06-19 11:41:51 [0] Middle 'ram': Object indexes: size=0 capacity=0 bytes=0M
2022-06-19 11:41:51 [0] Middle 'ram': Memory used overall: 473MBytes
2022-06-19 11:41:51 [1] Starting task...
2022-06-19 11:41:51 [3] Starting task...
2022-06-19 11:41:51 [2] Starting task...
2022-06-19 11:41:51 [4] Starting task...
2022-06-19 11:41:51 [1] Clustering table 'planet_osm_point' by geometry...
2022-06-19 11:41:51 [3] Clustering table 'planet_osm_polygon' by geometry...
2022-06-19 11:41:51 [2] Clustering table 'planet_osm_line' by geometry...
2022-06-19 11:41:51 [4] Clustering table 'planet_osm_roads' by geometry...
2022-06-19 11:41:51 [3] Using native order for clustering table 'planet_osm_polygon'
2022-06-19 11:41:51 [2] Using native order for clustering table 'planet_osm_line'
2022-06-19 11:41:51 [1] Using native order for clustering table 'planet_osm_point'
2022-06-19 11:41:51 [4] Using native order for clustering table 'planet_osm_roads'
2022-06-19 11:41:57 [4] Creating geometry index on table 'planet_osm_roads'...
2022-06-19 11:42:00 [4] Creating hstore indexes on table 'planet_osm_roads'...
2022-06-19 11:42:08 [1] Creating geometry index on table 'planet_osm_point'...
2022-06-19 11:42:11 [4] Analyzing table 'planet_osm_roads'...
2022-06-19 11:42:14 [4] Done task in 22402ms.
2022-06-19 11:42:24 [2] Creating geometry index on table 'planet_osm_line'...
2022-06-19 11:42:26 [1] Creating hstore indexes on table 'planet_osm_point'...
2022-06-19 11:42:51 [3] Creating geometry index on table 'planet_osm_polygon'...
2022-06-19 11:42:55 [1] Analyzing table 'planet_osm_point'...
2022-06-19 11:42:57 [1] Done task in 66251ms.
2022-06-19 11:42:57 [0] All postprocessing on table 'planet_osm_point' done in 66s (1m 6s).
2022-06-19 11:42:58 [2] Creating hstore indexes on table 'planet_osm_line'...
2022-06-19 11:43:37 [2] Analyzing table 'planet_osm_line'...
2022-06-19 11:43:38 [2] Done task in 107218ms.
2022-06-19 11:43:38 [0] All postprocessing on table 'planet_osm_line' done in 107s (1m 47s).
2022-06-19 11:44:05 [3] Creating hstore indexes on table 'planet_osm_polygon'...
2022-06-19 11:45:31 [3] Analyzing table 'planet_osm_polygon'...
2022-06-19 11:45:33 [3] Done task in 221919ms.
2022-06-19 11:45:33 [0] All postprocessing on table 'planet_osm_polygon' done in 221s (3m 41s).
2022-06-19 11:45:33 [0] All postprocessing on table 'planet_osm_roads' done in 22s.
2022-06-19 11:45:33 [0] Overall memory usage: peak=1420MByte current=744MByte
2022-06-19 11:45:33 [0] osm2pgsql took 310s (5m 10s) overall.
!pip install pandas matplotlib psycopg2-binary https://github.com/lechup/imposm-parser/archive/python3.zip
import os
import re
from collections import defaultdict, Counter
from imposm.parser import OSMParser
import psycopg2
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.float_format', '{:.3f}'.format)
cirylic_chars = frozenset('абвгдеёжзіийклмнопрстуўфхцчшщьыъэюяАБВГДЕЁЖЗІИІЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ')
categories_rules = {
    'admin': [
        ['admin_level', '2'],
        ['admin_level', '4'],
        ['admin_level', '6'],
        ['admin_level', '8'],
        ['admin_level', '9'],
    ],
    'place': [
        ['place', 'city'],
        ['place', 'town'],
        ['place', 'village'],
        ['place', 'hamlet'],
        ['place', 'isolated_dwelling'],
        ['admin_level', None],
        ['boundary', 'administrative'],
        ['traffic_sign', 'city_limit'],
    ],
    'allotments': [
        ['place', 'allotments'],
        ['landuse', 'allotments'],
    ],
    'locality': [
        ['place', 'locality'],
        ['abandoned:place', None],
    ],
    'suburb': [
        ['landuse', 'commercial'],
        ['landuse', 'construction'],
        ['landuse', 'industrial'],
        ['landuse', 'residential'],
        ['landuse', 'retail'],
        ['place', None],
        ['residential', None],
        ['industrial', None],
    ],
    'highway': [
        ['highway', 'motorway'],
        ['highway', 'trunk'],
        ['highway', 'primary'],
        ['highway', 'secondary'],
        ['highway', 'tertiary'],
        ['highway', 'unclassified'],
        ['highway', 'residential'],
        ['highway', 'service'],
        ['highway', 'track'],
        ['highway', None],
        ['type', 'associatedStreet'],
        ['type', 'street'],
    ],
    'public_transport': [
        ['highway', 'bus_stop'],
        ['public_transport', None],
        ['route', None],
        ['type', 'route'],
        ['railway', None],
        ['type', 'route_master'],
        ['route_master', None],
    ],
    'infrastructure': [
        ['tunnel', None],
        ['barrier', None],
        ['power', None],
        ['bridge', None],
        ['substation', None],
        ['emergency', None],
        ['ele', None],
        ['man_made', None],
        ['embankment', None],
    ],
    'religion': [
        ['religion', None],
        ['amenity', 'place_of_worship'],
        ['amenity', 'monastery'],
        ['building', 'church'],
        ['building', 'cathedral'],
        ['building', 'chapel'],
    ],
    'education': [
        ['landuse', 'education'],
        ['amenity', 'university'],
        ['amenity', 'college'],
        ['amenity', 'school'],
        ['amenity', 'kindergarten'],
        ['building', 'university'],
        ['building', 'college'],
        ['building', 'school'],
        ['building', 'kindergarten'],
    ],
    'healthcare': [
        ['healthcare', None],
        ['amenity', 'hospital'],
        ['amenity', 'pharmacy'],
        ['amenity', 'clinic'],
        ['amenity', 'doctors'],
        ['amenity', 'dentist'],
        ['building', 'hospital'],
        ['building', 'clinic'],
    ],
    'government': [
        ['amenity', 'post_office'],
        ['amenity', 'police'],
        ['amenity', 'library'],
        ['office', 'government'],
        ['government', None],
        ['landuse', 'military'],
        ['military', None],
    ],
    'office': [
        ['office', None],
    ],
    'tourism': [
        ['tourism', None],
        ['historic', None],
        ['memorial', None],
        ['ruins', None],
        ['information', None],
        ['attraction', None],
        ['resort', None],
        ['artwork_type', None],
    ],
    'amenity': [
        ['amenity', 'cafe'],
        ['amenity', 'atm'],
        ['amenity', 'bank'],
        ['amenity', 'fast_food'],
        ['amenity', 'fuel'],
        ['amenity', 'community_centre'],
        ['amenity', 'restaurant'],
        ['amenity', 'bar'],
        ['amenity', None],
        ['shop', 'convenience'],
        ['shop', 'clothes'],
        ['shop', 'car_repair'],
        ['shop', 'hairdresser'],
        ['shop', 'chemist'],
        ['shop', 'supermarket'],
        ['shop', 'car_parts'],
        ['shop', 'furniture'],
        ['shop', 'hardware'],
        ['shop', 'kiosk'],
        ['shop', 'doityourself'],
        ['shop', 'pet'],
        ['shop', 'florist'],
        ['shop', 'beauty'],
        ['shop', 'mobile_phone'],
        ['shop', 'shoes'],
        ['shop', 'newsagent'],
        ['shop', 'electronics'],
        ['shop', 'alcohol'],
        ['shop', 'jewelry'],
        ['shop', 'mall'],
        ['shop', 'butcher'],
        ['shop', 'cosmetics'],
        ['shop', None],
        ['leisure', None],
        ['sport', None],
        ['craft', 'shoemaker'],
        ['clothes', None],
    ],
    'building': [
        ['building', 'industrial'],
        ['building', 'service'],
        ['building', 'retail'],
        ['building', 'commercial'],
        ['building', 'warehouse'],
        ['building', 'public'],
        ['building', 'dormitory'],
        ['building', 'warehouse'],
        ['building', None],
    ],
    'water': [
        ['waterway', 'drain'],
        ['waterway', 'ditch'],
        ['waterway', 'stream'],
        ['waterway', 'river'],
        ['waterway', 'canal'],
        ['waterway', None],
        ['type', 'waterway'],
        ['water', None],
        ['natural', 'water'],
        ['natural', 'spring'],
    ],
    'natural': [
        ['boundary', None],
        ['natural', None],
        ['place', 'island'],
        ['place', 'islet'],
        ['landuse', None],
    ],
}
dependants = [
    'addr:region',
    'addr:district',
    'addr:subdistrict',
    'addr:city',
    'addr:place',
    'addr:street',
    'addr2:street',
    'from',
    'to',
    'via',
    'destination',
    'destination:backward',
    'destination:forward',
    'water_tank:city',
]
usage = defaultdict(set)
categories_rules2 = {}
# First pass: explicit tag=value rules; remember which values each tag already claims.
for category, group in categories_rules.items():
    if category not in categories_rules2:
        categories_rules2[category] = []
    for tag, value in group:
        if value is not None:
            categories_rules2[category].append([tag, True, {value}])
            usage[tag].add(value)
# Second pass: wildcard rules (tag=*) match any value not claimed by an explicit rule.
for category, group in categories_rules.items():
    if category not in categories_rules2:
        categories_rules2[category] = []
    for tag, value in group:
        if value is None:
            categories_rules2[category].append([tag, False, usage[tag]])
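The rule expansion can be easier to follow on a toy rule set. The sketch below is a simplified re-implementation for illustration only: the names `expand` and `classify` and the two-category rule set are hypothetical. It shows that an explicit rule such as place=city wins over the wildcard place=*, which only catches values that no explicit rule claims.

```python
from collections import defaultdict

# Toy rule set (hypothetical): None stands for "any other value".
toy_rules = {
    'place': [['place', 'city'], ['place', 'town']],
    'suburb': [['place', None]],
}


def expand(rules):
    """Turn [tag, value] pairs into (tag, eq, values) triples."""
    usage = defaultdict(set)
    for group in rules.values():
        for tag, value in group:
            if value is not None:
                usage[tag].add(value)
    expanded = {}
    for category, group in rules.items():
        explicit = [(tag, True, {value}) for tag, value in group if value is not None]
        wildcard = [(tag, False, usage[tag]) for tag, value in group if value is None]
        expanded[category] = explicit + wildcard
    return expanded


def classify(tags, expanded):
    """Return the set of categories whose rules match the tag dict."""
    matched = set()
    for category, group in expanded.items():
        for tag, eq, values in group:
            # eq=True: value must be listed; eq=False: value must not be listed.
            if tag in tags and (tags[tag] in values) == eq:
                matched.add(category)
    return matched or {'other'}


toy_expanded = expand(toy_rules)
print(classify({'place': 'city'}, toy_expanded))
print(classify({'place': 'suburb'}, toy_expanded))
print(classify({'shop': 'bakery'}, toy_expanded))
```

An object can match several categories at once, which is why the per-category totals in the statistics do not have to add up to the number of named objects.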
Computing statistics for the dump
The dump counts all the data, but the result may be slightly inaccurate because it does not account for the coarse clipping of the Belarus extract.
key_counter = defaultdict(lambda: defaultdict(list))
categories_tags = {}
categories_rules_tags_set = {}
# Map each tag key to the categories that have a rule for it.
for category, group in categories_rules2.items():
    for tag, eq, values in group:
        if tag not in categories_tags:
            categories_tags[tag] = {category}
        else:
            categories_tags[tag].add(category)
def process(params):
    for _, tags, _ in params:
        if 'name' not in tags:
            continue
        # Only objects whose name contains Cyrillic characters are counted.
        if not (frozenset(tags['name']) & cirylic_chars):
            continue
        categories = {category for tag in categories_tags.keys() & tags.keys() for category in categories_tags[tag]}
        for tag in ['name', 'name:be', 'name:ru']:
            if tag not in tags:
                continue
            value = tags[tag]
            cyr = frozenset(value) & cirylic_chars
            if not cyr:
                continue
            match = False
            for category in categories:
                group = categories_rules2[category]
                category_match = False
                for i, (k, eq, vv) in enumerate(group):
                    if k not in tags:
                        continue
                    if eq:
                        if tags[k] in vv:
                            # Count the object once per category, and once per matching rule.
                            if not category_match:
                                key_counter[(category,)][tag].append(value)
                            match = category_match = True
                            key_counter[(category, i)][tag].append(value)
                    else:
                        if tags[k] not in vv:
                            if not category_match:
                                key_counter[(category,)][tag].append(value)
                            match = category_match = True
                            key_counter[(category, i)][tag].append(value)
            if not match:
                key_counter[('other',)][tag].append(value)
OSMParser(
    nodes_callback=process,
    ways_callback=process,
    relations_callback=process,
).parse('belarus-latest.osm.pbf')
data = []
for c in list(categories_rules) + ['other']:
    name_cnt = len(key_counter[(c,)]['name'])
    name_uniq = len(set(key_counter[(c,)]['name']))
    name_be_cnt = len(key_counter[(c,)]['name:be'])
    name_be_uniq = len(set(key_counter[(c,)]['name:be']))
    name_ru_cnt = len(key_counter[(c,)]['name:ru'])
    name_ru_uniq = len(set(key_counter[(c,)]['name:ru']))
    data.append([
        '#', c,
        name_cnt, name_be_cnt, name_ru_cnt, name_be_cnt / (name_cnt or 1), name_ru_cnt / (name_cnt or 1),
        name_uniq, name_be_uniq, name_ru_uniq, name_be_uniq / (name_uniq or 1), name_ru_uniq / (name_uniq or 1),
    ])
    if c == 'other':
        continue
    for i, (k, eq, vv) in enumerate(categories_rules2[c]):
        if eq:
            tag = f'{k} = {list(vv)[0]}'
        else:
            tag = f'{k} = *'
        name_cnt = len(key_counter[(c, i)]['name'])
        name_uniq = len(set(key_counter[(c, i)]['name']))
        name_be_cnt = len(key_counter[(c, i)]['name:be'])
        name_be_uniq = len(set(key_counter[(c, i)]['name:be']))
        name_ru_cnt = len(key_counter[(c, i)]['name:ru'])
        name_ru_uniq = len(set(key_counter[(c, i)]['name:ru']))
        data.append([
            '', tag,
            name_cnt, name_be_cnt, name_ru_cnt, name_be_cnt / (name_cnt or 1), name_ru_cnt / (name_cnt or 1),
            name_uniq, name_be_uniq, name_ru_uniq, name_be_uniq / (name_uniq or 1), name_ru_uniq / (name_uniq or 1),
        ])
df = pd.DataFrame(data, columns=[
    'lvl', 'category',
    'all name', 'all name:be', 'all name:ru', 'all name:be%', 'all name:ru%',
    'uniq name', 'uniq name:be', 'uniq name:ru', 'uniq name:be%', 'uniq name:ru%',
])
(
    df
    .style
    .set_properties(subset=['category'], **{'text-align': 'left'})
    .background_gradient('YlOrRd', subset=[c for c in df.columns if c.endswith('%')], vmin=0, vmax=1)
    .format({f: '{:.3f}' for f in [c for c in df.columns if c.endswith('%')]})
    .apply(lambda row: [("font-weight: bold" if row.loc['lvl'] == '#' else '') for _ in row], axis=1)
)
| | lvl | category | all name | all name:be | all name:ru | all name:be% | all name:ru% | uniq name | uniq name:be | uniq name:ru | uniq name:be% | uniq name:ru% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | # | admin | 2389 | 2260 | 2107 | 0.946 | 0.882 | 1571 | 1482 | 1487 | 0.943 | 0.947 |
| 1 | | admin_level = 2 | 703 | 640 | 479 | 0.910 | 0.681 | 109 | 84 | 86 | 0.771 | 0.789 |
| 2 | | admin_level = 4 | 73 | 73 | 73 | 1.000 | 1.000 | 40 | 39 | 39 | 0.975 | 0.975 |
| 3 | | admin_level = 6 | 234 | 215 | 222 | 0.919 | 0.949 | 189 | 170 | 177 | 0.899 | 0.937 |
| 4 | | admin_level = 8 | 1352 | 1306 | 1307 | 0.966 | 0.967 | 1234 | 1195 | 1192 | 0.968 | 0.966 |
| 5 | | admin_level = 9 | 27 | 26 | 26 | 0.963 | 0.963 | 14 | 13 | 13 | 0.929 | 0.929 |
| 6 | # | place | 49941 | 49526 | 49404 | 0.992 | 0.989 | 17290 | 17069 | 16934 | 0.987 | 0.979 |
| 7 | | place = city | 31 | 31 | 31 | 1.000 | 1.000 | 16 | 16 | 16 | 1.000 | 1.000 |
| 8 | | place = town | 274 | 273 | 274 | 0.996 | 1.000 | 138 | 138 | 138 | 1.000 | 1.000 |
| 9 | | place = village | 5030 | 4990 | 5013 | 0.992 | 0.997 | 2229 | 2209 | 2214 | 0.991 | 0.993 |
| 10 | | place = hamlet | 38814 | 38633 | 38638 | 0.995 | 0.995 | 13589 | 13643 | 13513 | 1.004 | 0.994 |
| 11 | | place = isolated_dwelling | 1272 | 1260 | 1262 | 0.991 | 0.992 | 644 | 638 | 639 | 0.991 | 0.992 |
| 12 | | boundary = administrative | 25046 | 24834 | 24681 | 0.992 | 0.985 | 16981 | 16965 | 16823 | 0.999 | 0.991 |
| 13 | | traffic_sign = city_limit | 2064 | 2052 | 2052 | 0.994 | 0.994 | 1003 | 932 | 926 | 0.929 | 0.923 |
| 14 | | admin_level = * | 22653 | 22571 | 22571 | 0.996 | 0.996 | 15487 | 15568 | 15420 | 1.005 | 0.996 |
| 15 | # | allotments | 2872 | 2469 | 2865 | 0.860 | 0.998 | 1969 | 1560 | 1961 | 0.792 | 0.996 |
| 16 | | place = allotments | 1834 | 1833 | 1834 | 0.999 | 1.000 | 1326 | 1292 | 1324 | 0.974 | 0.998 |
| 17 | | landuse = allotments | 2698 | 2295 | 2691 | 0.851 | 0.997 | 1920 | 1510 | 1913 | 0.786 | 0.996 |
| 18 | # | locality | 12026 | 11894 | 11946 | 0.989 | 0.993 | 8437 | 8250 | 8384 | 0.978 | 0.994 |
| 19 | | place = locality | 11966 | 11834 | 11886 | 0.989 | 0.993 | 8403 | 8212 | 8350 | 0.977 | 0.994 |
| 20 | | abandoned:place = * | 3038 | 3028 | 3031 | 0.997 | 0.998 | 2206 | 2204 | 2201 | 0.999 | 0.998 |
| 21 | # | suburb | 7959 | 3963 | 7824 | 0.498 | 0.983 | 6322 | 2674 | 6200 | 0.423 | 0.981 |
| 22 | | landuse = commercial | 179 | 96 | 177 | 0.536 | 0.989 | 166 | 84 | 164 | 0.506 | 0.988 |
| 23 | | landuse = construction | 162 | 30 | 158 | 0.185 | 0.975 | 154 | 28 | 150 | 0.182 | 0.974 |
| 24 | | landuse = industrial | 4383 | 1050 | 4331 | 0.240 | 0.988 | 3778 | 696 | 3741 | 0.184 | 0.990 |
| 25 | | landuse = residential | 875 | 662 | 835 | 0.757 | 0.954 | 794 | 585 | 751 | 0.737 | 0.946 |
| 26 | | landuse = retail | 233 | 140 | 230 | 0.601 | 0.987 | 148 | 59 | 146 | 0.399 | 0.986 |
| 27 | | place = * | 2146 | 2108 | 2113 | 0.982 | 0.985 | 1502 | 1486 | 1471 | 0.989 | 0.979 |
| 28 | | residential = * | 701 | 568 | 677 | 0.810 | 0.966 | 637 | 498 | 609 | 0.782 | 0.956 |
| 29 | | industrial = * | 474 | 78 | 466 | 0.165 | 0.983 | 383 | 56 | 376 | 0.146 | 0.982 |
| 30 | # | highway | 94598 | 92170 | 94294 | 0.974 | 0.997 | 12175 | 10665 | 12070 | 0.876 | 0.991 |
| 31 | | highway = motorway | 76 | 76 | 76 | 1.000 | 1.000 | 1 | 1 | 1 | 1.000 | 1.000 |
| 32 | | highway = trunk | 1327 | 1285 | 1286 | 0.968 | 0.969 | 140 | 134 | 135 | 0.957 | 0.964 |
| 33 | | highway = primary | 7670 | 7576 | 7631 | 0.988 | 0.995 | 532 | 509 | 522 | 0.957 | 0.981 |
| 34 | | highway = secondary | 7261 | 7233 | 7261 | 0.996 | 1.000 | 748 | 729 | 747 | 0.975 | 0.999 |
| 35 | | highway = tertiary | 10684 | 10434 | 10631 | 0.977 | 0.995 | 1684 | 1527 | 1653 | 0.907 | 0.982 |
| 36 | | highway = unclassified | 3018 | 2893 | 2947 | 0.959 | 0.976 | 931 | 859 | 912 | 0.923 | 0.980 |
| 37 | | highway = residential | 56704 | 55671 | 56666 | 0.982 | 0.999 | 9718 | 8937 | 9692 | 0.920 | 0.997 |
| 38 | | highway = service | 3189 | 2926 | 3181 | 0.918 | 0.997 | 1572 | 1346 | 1564 | 0.856 | 0.995 |
| 39 | | highway = track | 792 | 688 | 781 | 0.869 | 0.986 | 214 | 152 | 204 | 0.710 | 0.953 |
| 40 | | type = associatedStreet | 1821 | 1819 | 1821 | 0.999 | 1.000 | 1174 | 1167 | 1173 | 0.994 | 0.999 |
| 41 | | type = street | 37 | 37 | 37 | 1.000 | 1.000 | 35 | 35 | 35 | 1.000 | 1.000 |
| 42 | | highway = * | 2019 | 1532 | 1976 | 0.759 | 0.979 | 1004 | 750 | 997 | 0.747 | 0.993 |
| 43 | # | public_transport | 34834 | 29185 | 34023 | 0.838 | 0.977 | 14348 | 9813 | 13105 | 0.684 | 0.913 |
| 44 | | highway = bus_stop | 17464 | 15418 | 17367 | 0.883 | 0.994 | 8493 | 6686 | 7918 | 0.787 | 0.932 |
| 45 | | type = route | 3995 | 2096 | 3920 | 0.525 | 0.981 | 3966 | 2086 | 3893 | 0.526 | 0.982 |
| 46 | | type = route_master | 591 | 100 | 122 | 0.169 | 0.206 | 542 | 98 | 119 | 0.181 | 0.220 |
| 47 | | public_transport = * | 24880 | 22610 | 24676 | 0.909 | 0.992 | 7884 | 6289 | 7269 | 0.798 | 0.922 |
| 48 | | route = * | 4047 | 2126 | 3954 | 0.525 | 0.977 | 4004 | 2103 | 3915 | 0.525 | 0.978 |
| 49 | | railway = * | 2886 | 2491 | 2866 | 0.863 | 0.993 | 1358 | 1199 | 1314 | 0.883 | 0.968 |
| 50 | | route_master = * | 598 | 100 | 124 | 0.167 | 0.207 | 549 | 98 | 121 | 0.179 | 0.220 |
| 51 | # | infrastructure | 18752 | 11072 | 18405 | 0.590 | 0.981 | 10412 | 5336 | 10105 | 0.512 | 0.971 |
| 52 | | tunnel = * | 5843 | 4529 | 5826 | 0.775 | 0.997 | 1660 | 1221 | 1644 | 0.736 | 0.990 |
| 53 | | barrier = * | 4477 | 2935 | 4424 | 0.656 | 0.988 | 3735 | 2252 | 3679 | 0.603 | 0.985 |
| 54 | | power = * | 4864 | 2326 | 4731 | 0.478 | 0.973 | 3972 | 1851 | 3867 | 0.466 | 0.974 |
| 55 | | bridge = * | 1363 | 1198 | 1357 | 0.879 | 0.996 | 497 | 409 | 491 | 0.823 | 0.988 |
| 56 | | substation = * | 2566 | 1481 | 2532 | 0.577 | 0.987 | 2404 | 1349 | 2377 | 0.561 | 0.989 |
| 57 | | emergency = * | 1155 | 159 | 1153 | 0.138 | 0.998 | 288 | 122 | 285 | 0.424 | 0.990 |
| 58 | | ele = * | 311 | 161 | 308 | 0.518 | 0.990 | 285 | 138 | 282 | 0.484 | 0.989 |
| 59 | | man_made = * | 1679 | 549 | 1549 | 0.327 | 0.923 | 1217 | 328 | 1100 | 0.270 | 0.904 |
| 60 | | embankment = * | 213 | 138 | 213 | 0.648 | 1.000 | 90 | 72 | 90 | 0.800 | 1.000 |
| 61 | # | religion | 3825 | 2703 | 3706 | 0.707 | 0.969 | 2319 | 1209 | 2176 | 0.521 | 0.938 |
| 62 | | amenity = place_of_worship | 2686 | 1769 | 2642 | 0.659 | 0.984 | 1886 | 988 | 1828 | 0.524 | 0.969 |
| 63 | | amenity = monastery | 21 | 13 | 18 | 0.619 | 0.857 | 20 | 12 | 17 | 0.600 | 0.850 |
| 64 | | building = church | 1434 | 987 | 1398 | 0.688 | 0.975 | 1107 | 633 | 1064 | 0.572 | 0.961 |
| 65 | | building = cathedral | 25 | 21 | 25 | 0.840 | 1.000 | 24 | 21 | 24 | 0.875 | 1.000 |
| 66 | | building = chapel | 267 | 197 | 261 | 0.738 | 0.978 | 163 | 96 | 153 | 0.589 | 0.939 |
| 67 | | religion = * | 3656 | 2610 | 3564 | 0.714 | 0.975 | 2211 | 1171 | 2095 | 0.530 | 0.948 |
| 68 | # | education | 5654 | 3236 | 4737 | 0.572 | 0.838 | 4048 | 1853 | 3123 | 0.458 | 0.771 |
| 69 | | landuse = education | 805 | 792 | 805 | 0.984 | 1.000 | 798 | 783 | 798 | 0.981 | 1.000 |
| 70 | | amenity = university | 200 | 145 | 128 | 0.725 | 0.640 | 188 | 134 | 119 | 0.713 | 0.633 |
| 71 | | amenity = college | 321 | 153 | 180 | 0.477 | 0.561 | 295 | 139 | 159 | 0.471 | 0.539 |
| 72 | | amenity = school | 1995 | 1173 | 1957 | 0.588 | 0.981 | 1438 | 625 | 1377 | 0.435 | 0.958 |
| 73 | | amenity = kindergarten | 1903 | 1281 | 1884 | 0.673 | 0.990 | 1367 | 829 | 1327 | 0.606 | 0.971 |
| 74 | | building = university | 165 | 65 | 55 | 0.394 | 0.333 | 157 | 61 | 53 | 0.389 | 0.338 |
| 75 | | building = college | 116 | 49 | 54 | 0.422 | 0.466 | 103 | 41 | 43 | 0.398 | 0.417 |
| 76 | | building = school | 672 | 231 | 327 | 0.344 | 0.487 | 569 | 136 | 228 | 0.239 | 0.401 |
| 77 | | building = kindergarten | 369 | 171 | 226 | 0.463 | 0.612 | 274 | 96 | 136 | 0.350 | 0.496 |
| 78 | # | healthcare | 5663 | 2314 | 4684 | 0.409 | 0.827 | 2999 | 801 | 2112 | 0.267 | 0.704 |
| 79 | | amenity = hospital | 653 | 295 | 501 | 0.452 | 0.767 | 500 | 182 | 350 | 0.364 | 0.700 |
| 80 | | amenity = pharmacy | 2502 | 1243 | 2172 | 0.497 | 0.868 | 1308 | 353 | 1045 | 0.270 | 0.799 |
| 81 | | amenity = clinic | 734 | 361 | 600 | 0.492 | 0.817 | 445 | 159 | 316 | 0.357 | 0.710 |
| 82 | | amenity = doctors | 931 | 112 | 831 | 0.120 | 0.893 | 257 | 60 | 162 | 0.233 | 0.630 |
| 83 | | amenity = dentist | 373 | 111 | 261 | 0.298 | 0.700 | 291 | 37 | 177 | 0.127 | 0.608 |
| 84 | | building = hospital | 405 | 177 | 237 | 0.437 | 0.585 | 297 | 93 | 134 | 0.313 | 0.451 |
| 85 | | building = clinic | 31 | 16 | 22 | 0.516 | 0.710 | 29 | 14 | 20 | 0.483 | 0.690 |
| 86 | | healthcare = * | 2713 | 992 | 2669 | 0.366 | 0.984 | 1419 | 456 | 1400 | 0.321 | 0.987 |
| 87 | # | government | 4980 | 1936 | 3936 | 0.389 | 0.790 | 3401 | 878 | 2609 | 0.258 | 0.767 |
| 88 | | amenity = post_office | 1083 | 538 | 593 | 0.497 | 0.548 | 405 | 115 | 132 | 0.284 | 0.326 |
| 89 | | amenity = police | 710 | 308 | 418 | 0.434 | 0.589 | 518 | 157 | 249 | 0.303 | 0.481 |
| 90 | | amenity = library | 499 | 249 | 261 | 0.499 | 0.523 | 336 | 116 | 111 | 0.345 | 0.330 |
| 91 | | office = government | 1888 | 592 | 1872 | 0.314 | 0.992 | 1565 | 368 | 1547 | 0.235 | 0.988 |
| 92 | | landuse = military | 366 | 139 | 363 | 0.380 | 0.992 | 320 | 108 | 317 | 0.338 | 0.991 |
| 93 | | government = * | 733 | 321 | 726 | 0.438 | 0.990 | 579 | 195 | 570 | 0.337 | 0.984 |
| 94 | | military = * | 566 | 174 | 560 | 0.307 | 0.989 | 382 | 68 | 376 | 0.178 | 0.984 |
| 95 | # | office | 5276 | 1471 | 5096 | 0.279 | 0.966 | 3872 | 521 | 3713 | 0.135 | 0.959 |
| 96 | | office = * | 5276 | 1471 | 5096 | 0.279 | 0.966 | 3872 | 521 | 3713 | 0.135 | 0.959 |
| 97 | # | tourism | 15315 | 7732 | 14262 | 0.505 | 0.931 | 10936 | 4279 | 9942 | 0.391 | 0.909 |
| 98 | | tourism = * | 9781 | 5548 | 9247 | 0.567 | 0.945 | 7045 | 3196 | 6523 | 0.454 | 0.926 |
| 99 | | historic = * | 6028 | 2580 | 5509 | 0.428 | 0.914 | 4444 | 1505 | 3962 | 0.339 | 0.892 |
| 100 | | memorial = * | 2313 | 641 | 1965 | 0.277 | 0.850 | 1859 | 405 | 1558 | 0.218 | 0.838 |
| 101 | | ruins = * | 263 | 86 | 263 | 0.327 | 1.000 | 208 | 75 | 207 | 0.361 | 0.995 |
| 102 | | information = * | 999 | 782 | 970 | 0.783 | 0.971 | 297 | 120 | 270 | 0.404 | 0.909 |
| 103 | | attraction = * | 173 | 79 | 164 | 0.457 | 0.948 | 151 | 64 | 139 | 0.424 | 0.921 |
| 104 | | resort = * | 142 | 29 | 140 | 0.204 | 0.986 | 134 | 27 | 132 | 0.201 | 0.985 |
| 105 | | artwork_type = * | 753 | 278 | 630 | 0.369 | 0.837 | 661 | 239 | 547 | 0.362 | 0.828 |
| 106 | # | amenity | 48078 | 22613 | 29658 | 0.470 | 0.617 | 21567 | 4629 | 7719 | 0.215 | 0.358 |
| 107 | | amenity = cafe | 2463 | 1058 | 1347 | 0.430 | 0.547 | 1702 | 497 | 714 | 0.292 | 0.420 |
| 108 | | amenity = atm | 1454 | 1366 | 1404 | 0.939 | 0.966 | 109 | 61 | 78 | 0.560 | 0.716 |
| 109 | | amenity = bank | 2075 | 1817 | 1912 | 0.876 | 0.921 | 324 | 115 | 213 | 0.355 | 0.657 |
| 110 | | amenity = fast_food | 758 | 354 | 420 | 0.467 | 0.554 | 425 | 98 | 148 | 0.231 | 0.348 |
| 111 | | amenity = fuel | 1137 | 635 | 764 | 0.558 | 0.672 | 626 | 244 | 313 | 0.390 | 0.500 |
| 112 | | amenity = community_centre | 742 | 343 | 457 | 0.462 | 0.616 | 353 | 92 | 133 | 0.261 | 0.377 |
| 113 | | amenity = restaurant | 603 | 297 | 338 | 0.493 | 0.561 | 515 | 229 | 267 | 0.445 | 0.518 |
| 114 | | amenity = bar | 471 | 161 | 221 | 0.342 | 0.469 | 422 | 127 | 183 | 0.301 | 0.434 |
| 115 | | shop = convenience | 7311 | 4100 | 5387 | 0.561 | 0.737 | 2471 | 836 | 1152 | 0.338 | 0.466 |
| 116 | | shop = clothes | 1120 | 302 | 509 | 0.270 | 0.454 | 673 | 113 | 217 | 0.168 | 0.322 |
| 117 | | shop = car_repair | 1133 | 313 | 432 | 0.276 | 0.381 | 882 | 90 | 198 | 0.102 | 0.224 |
| 118 | | shop = hairdresser | 1075 | 403 | 550 | 0.375 | 0.512 | 718 | 151 | 254 | 0.210 | 0.354 |
| 119 | | shop = chemist | 1478 | 1074 | 1308 | 0.727 | 0.885 | 87 | 42 | 40 | 0.483 | 0.460 |
| 120 | | shop = supermarket | 1250 | 861 | 1018 | 0.689 | 0.814 | 377 | 177 | 226 | 0.469 | 0.599 |
| 121 | | shop = car_parts | 911 | 248 | 454 | 0.272 | 0.498 | 486 | 44 | 133 | 0.091 | 0.274 |
| 122 | | shop = furniture | 745 | 248 | 408 | 0.333 | 0.548 | 398 | 59 | 119 | 0.148 | 0.299 |
| 123 | | shop = hardware | 724 | 263 | 399 | 0.363 | 0.551 | 519 | 117 | 211 | 0.225 | 0.407 |
| 124 | | shop = kiosk | 395 | 223 | 258 | 0.565 | 0.653 | 205 | 82 | 92 | 0.400 | 0.449 |
| 125 | | shop = doityourself | 723 | 226 | 328 | 0.313 | 0.454 | 493 | 85 | 144 | 0.172 | 0.292 |
| 126 | | shop = pet | 475 | 119 | 178 | 0.251 | 0.375 | 210 | 31 | 61 | 0.148 | 0.290 |
| 127 | | shop = florist | 525 | 211 | 267 | 0.402 | 0.509 | 262 | 52 | 87 | 0.198 | 0.332 |
| 128 | | shop = beauty | 481 | 70 | 157 | 0.146 | 0.326 | 434 | 64 | 134 | 0.147 | 0.309 |
| 129 | | shop = mobile_phone | 503 | 226 | 374 | 0.449 | 0.744 | 112 | 19 | 40 | 0.170 | 0.357 |
| 130 | | shop = shoes | 258 | 108 | 133 | 0.419 | 0.516 | 152 | 28 | 53 | 0.184 | 0.349 |
| 131 | | shop = newsagent | 535 | 378 | 433 | 0.707 | 0.809 | 65 | 24 | 29 | 0.369 | 0.446 |
| 132 | | shop = electronics | 449 | 150 | 246 | 0.334 | 0.548 | 304 | 57 | 118 | 0.188 | 0.388 |
| 133 | | shop = alcohol | 426 | 123 | 185 | 0.289 | 0.434 | 194 | 44 | 65 | 0.227 | 0.335 |
| 134 | | shop = jewelry | 342 | 133 | 151 | 0.389 | 0.442 | 81 | 25 | 33 | 0.309 | 0.407 |
| 135 | | shop = mall | 369 | 180 | 211 | 0.488 | 0.572 | 304 | 133 | 154 | 0.438 | 0.507 |
| 136 | | shop = butcher | 390 | 133 | 205 | 0.341 | 0.526 | 219 | 59 | 76 | 0.269 | 0.347 |
| 137 | | shop = cosmetics | 280 | 97 | 147 | 0.346 | 0.525 | 148 | 44 | 67 | 0.297 | 0.453 |
| 138 | | craft = shoemaker | 225 | 160 | 162 | 0.711 | 0.720 | 64 | 11 | 24 | 0.172 | 0.375 |
| 139 | | amenity = * | 6436 | 2698 | 3801 | 0.419 | 0.591 | 4328 | 1334 | 2077 | 0.308 | 0.480 |
| 140 | | shop = * | 7097 | 2470 | 3495 | 0.348 | 0.492 | 4016 | 788 | 1285 | 0.196 | 0.320 |
| 141 | | leisure = * | 3252 | 1369 | 2012 | 0.421 | 0.619 | 2386 | 772 | 1267 | 0.324 | 0.531 |
| 142 | | sport = * | 779 | 317 | 364 | 0.407 | 0.467 | 589 | 170 | 212 | 0.289 | 0.360 |
| 143 | | clothes = * | 228 | 91 | 123 | 0.399 | 0.539 | 130 | 29 | 46 | 0.223 | 0.354 |
| 144 | # | building | 30254 | 11353 | 17379 | 0.375 | 0.574 | 17824 | 4405 | 8422 | 0.247 | 0.473 |
| 145 | | building = industrial | 1617 | 287 | 779 | 0.177 | 0.482 | 1230 | 146 | 470 | 0.119 | 0.382 |
| 146 | | building = service | 1380 | 690 | 1086 | 0.500 | 0.787 | 1098 | 540 | 848 | 0.492 | 0.772 |
| 147 | | building = retail | 1947 | 1152 | 1369 | 0.592 | 0.703 | 1132 | 501 | 642 | 0.443 | 0.567 |
| 148 | | building = commercial | 593 | 189 | 285 | 0.319 | 0.481 | 541 | 154 | 244 | 0.285 | 0.451 |
| 149 | | building = warehouse | 242 | 52 | 83 | 0.215 | 0.343 | 172 | 17 | 35 | 0.099 | 0.203 |
| 150 | | building = public | 367 | 209 | 258 | 0.569 | 0.703 | 302 | 156 | 199 | 0.517 | 0.659 |
| 151 | | building = dormitory | 606 | 167 | 185 | 0.276 | 0.305 | 449 | 75 | 100 | 0.167 | 0.223 |
| 152 | | building = warehouse | 242 | 52 | 83 | 0.215 | 0.343 | 172 | 17 | 35 | 0.099 | 0.203 |
| 153 | | building = * | 23502 | 8607 | 13334 | 0.366 | 0.567 | 13766 | 3347 | 6551 | 0.243 | 0.476 |
| 154 | # | water | 24019 | 18587 | 23862 | 0.774 | 0.993 | 5527 | 3379 | 5335 | 0.611 | 0.965 |
| 155 | | waterway = drain | 430 | 124 | 425 | 0.288 | 0.988 | 119 | 45 | 117 | 0.378 | 0.983 |
| 156 | | waterway = ditch | 645 | 183 | 644 | 0.284 | 0.998 | 130 | 39 | 125 | 0.300 | 0.962 |
| 157 | | waterway = stream | 6330 | 4396 | 6312 | 0.694 | 0.997 | 1133 | 724 | 1105 | 0.639 | 0.975 |
| 158 | | waterway = river | 12240 | 11206 | 12161 | 0.916 | 0.994 | 1805 | 1420 | 1699 | 0.787 | 0.941 |
| 159 | | waterway = canal | 584 | 336 | 578 | 0.575 | 0.990 | 124 | 55 | 120 | 0.444 | 0.968 |
| 160 | | type = waterway | 2093 | 1708 | 2083 | 0.816 | 0.995 | 1846 | 1467 | 1829 | 0.795 | 0.991 |
| 161 | | natural = water | 3152 | 2085 | 3107 | 0.661 | 0.986 | 2563 | 1590 | 2516 | 0.620 | 0.982 |
| 162 | | natural = spring | 591 | 232 | 589 | 0.393 | 0.997 | 530 | 179 | 527 | 0.338 | 0.994 |
| 163 | | waterway = * | 39 | 18 | 38 | 0.462 | 0.974 | 31 | 13 | 30 | 0.419 | 0.968 |
| 164 | | water = * | 3036 | 2040 | 2997 | 0.672 | 0.987 | 2467 | 1555 | 2424 | 0.630 | 0.983 |
| 165 | # | natural | 5442 | 3109 | 5192 | 0.571 | 0.954 | 3908 | 1830 | 3665 | 0.468 | 0.938 |
| 166 | | place = island | 12 | 12 | 12 | 1.000 | 1.000 | 12 | 12 | 12 | 1.000 | 1.000 |
| 167 | | place = islet | 88 | 68 | 88 | 0.773 | 1.000 | 78 | 60 | 77 | 0.769 | 0.987 |
| 168 | | boundary = * | 878 | 685 | 763 | 0.780 | 0.869 | 839 | 657 | 726 | 0.783 | 0.865 |
| 169 | | natural = * | 1488 | 803 | 1415 | 0.540 | 0.951 | 1219 | 584 | 1159 | 0.479 | 0.951 |
| 170 | | landuse = * | 3137 | 1669 | 3074 | 0.532 | 0.980 | 1908 | 623 | 1838 | 0.327 | 0.963 |
| 171 | # | other | 2675 | 797 | 1127 | 0.298 | 0.421 | 2157 | 567 | 755 | 0.263 | 0.350 |
Computing statistics for the PostGIS export
The result will be more accurate, but it may miss relations that are not carried over into PostGIS.
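For the PostGIS variant, each category rule has to become a condition on the hstore `tags` column. The sketch below shows one way to build such a condition from a (key, eq, values) triple. The function name `rule_to_condition` and the `alias` parameter are hypothetical, and the values are sorted only to make the output deterministic. Interpolating values straight into SQL is acceptable here only because the rules are hard-coded in the notebook rather than taken from user input.

```python
def rule_to_condition(key, eq, values, alias='g'):
    """Build a SQL condition on an hstore column for one category rule."""
    column = f"{alias}.tags->'{key}'"
    if values:
        operator = 'IN' if eq else 'NOT IN'
        quoted = ','.join(f"'{v}'" for v in sorted(values))
        # NOT IN yields NULL when the tag is absent, so such rows are filtered out too.
        return f"{column} {operator} ({quoted})"
    if not eq:
        # Wildcard rule with no explicit values claimed elsewhere: the tag just has to exist.
        return f"{column} IS NOT NULL"
    raise ValueError('an explicit rule needs at least one value')


print(rule_to_condition('place', True, {'city'}))
print(rule_to_condition('place', False, {'town', 'city'}))
print(rule_to_condition('office', False, set()))
```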
query_template = """
    SELECT '{category}' AS category, {num} AS num, g.tags->'name' AS name, g.tags->'name:be' AS name_be, g.tags->'name:ru' AS name_ru
    FROM {table} g
    WHERE {condition}
"""
dependant_query_template = """
    WITH tagged_table AS (
        SELECT field, COUNT(*) AS count
        FROM (
            SELECT DISTINCT osm_type, osm_id, field
            FROM (
                SELECT 'node' AS osm_type, g.osm_id, g.tags->'{field}' AS field
                FROM planet_osm_point g
                INNER JOIN planet_osm_region p
                ON ST_Intersects(p.way, g.way)
                WHERE p.osm_id = -59065
                AND tags->'{field}' ~ '({cyr_regexp})'
                UNION ALL
                SELECT 'w+r' AS osm_type, g.osm_id, g.tags->'{field}' AS field
                FROM planet_osm_line g
                INNER JOIN planet_osm_region p
                ON ST_Intersects(p.way, g.way)
                WHERE p.osm_id = -59065
                AND tags->'{field}' ~ '({cyr_regexp})'
                UNION ALL
                SELECT 'w+r' AS osm_type, g.osm_id, g.tags->'{field}' AS field
                FROM planet_osm_roads g
                INNER JOIN planet_osm_region p
                ON ST_Intersects(p.way, g.way)
                WHERE p.osm_id = -59065
                AND tags->'{field}' ~ '({cyr_regexp})'
                UNION ALL
                SELECT 'w+r' AS osm_type, g.osm_id, g.tags->'{field}' AS field
                FROM planet_osm_polygon g
                INNER JOIN planet_osm_region p
                ON ST_Intersects(p.way, g.way)
                WHERE p.osm_id = -59065
                AND tags->'{field}' ~ '({cyr_regexp})'
            ) t1
        ) t2
        GROUP BY field
    )
    SELECT field, num, name, name_be, name_ru, SUM(count) AS count
    FROM (
        SELECT '{field}' AS field, {num} AS num, c.count,
            bool_or(p.name IS NOT NULL) AS name,
            bool_or(p.name_be IS NOT NULL) AS name_be,
            bool_or(p.name_ru IS NOT NULL) AS name_ru
        FROM tagged_table c
        LEFT JOIN {table} p
        ON (c.field = p.name OR c.field = p.name_be OR c.field = p.name_ru)
        GROUP BY c.field, c.count
    ) t
    GROUP BY field, num, name, name_be, name_ru
    ORDER BY num
"""
tables = ['planet_osm_named_data']
cyr_regexp = '|'.join(cirylic_chars)
queries = []
exclude = defaultdict(lambda: [[], []])
for category, group in categories_rules2.items():
    conditions = []
    for i, (k, eq, vv) in enumerate(group):
        if vv:
            eq_str = 'IN' if eq else 'NOT IN'
            vv_str = ','.join(f"'{v}'" for v in vv)
            condition = f"g.tags->'{k}' {eq_str} ({vv_str})"
        elif not eq:
            condition = f"g.tags->'{k}' IS NOT NULL"
        else:
            raise ValueError()
        conditions.append(condition)
        exclude[k][eq].append(condition)
        # One query per individual rule.
        for table in tables:
            query = query_template.format(category=category, num=i, table=table, condition=condition)
            queries.append(query)
    # One query for the category as a whole (num = -1).
    condition = ' OR '.join(f'({c})' for c in conditions)
    for table in tables:
        query = query_template.format(category=category, num=-1, table=table, condition=condition)
        queries.append(query)
# Everything that carries none of the keys used by the rules goes to 'other'.
condition = ' OR '.join(f"(g.tags->'{k}' IS NOT NULL)" for k, eq_c in exclude.items())
for table in tables:
    query = query_template.format(category='other', num=-1, table=table, condition=f'NOT ({condition})')
    queries.append(query)
for table in tables:
    query = query_template.format(category='TOTAL', num=-1, table=table, condition='TRUE')
    queries.append(query)
query = ' UNION ALL '.join(queries)
dependant_queries = []
for i, dependant in enumerate(dependants):
    for table in tables:
        query = dependant_query_template.format(num=i, table=table, field=dependant, cyr_regexp=cyr_regexp)
        dependant_queries.append(query)
dependant_query = ' UNION ALL '.join(dependant_queries)
print(len(queries), len(dependant_queries))
key_counter = defaultdict(lambda: defaultdict(list))
dependant_key_counter = defaultdict(Counter)
REGION_VIEW_SQL = """
CREATE MATERIALIZED VIEW IF NOT EXISTS planet_osm_region AS
SELECT
osm_id,
tags->'name' AS name,
tags->'name:be' AS name_be,
tags->'name:ru' AS name_ru,
tags->'admin_level' AS admin_level,
ST_Buffer(way, -0.000000001) AS way
FROM planet_osm_polygon
WHERE tags->'admin_level' IN ('2', '4', '6', '8', '9')
"""
REGION_WAY_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS "planet_osm_region_way_idx" ON planet_osm_region USING GIST (way)
"""
REGION_ANALYZE_SQL = "ANALYZE planet_osm_region"
OSM_DATA_VIEW_SQL = f"""
CREATE MATERIALIZED VIEW IF NOT EXISTS planet_osm_named_data AS
SELECT
g.osm_id AS osm_id,
'node' AS osm_type,
'point' AS kind,
g.tags->'name' AS name,
g.tags->'name:be' AS name_be,
g.tags->'name:ru' AS name_ru,
g.tags AS tags
FROM planet_osm_point g
INNER JOIN planet_osm_region p
ON ST_Intersects(p.way, g.way)
WHERE p.osm_id = -59065
AND g.tags->'name' ~ '({cyr_regexp})'
UNION ALL
SELECT
ABS(g.osm_id) AS osm_id,
CASE WHEN g.osm_id < 0 THEN 'relation' ELSE 'way' END AS osm_type,
'line' AS kind,
g.tags->'name' AS name,
g.tags->'name:be' AS name_be,
g.tags->'name:ru' AS name_ru,
g.tags AS tags
FROM planet_osm_line g
INNER JOIN planet_osm_region p
ON ST_Intersects(p.way, g.way)
WHERE p.osm_id = -59065
AND g.tags->'name' ~ '({cyr_regexp})'
UNION ALL
SELECT
ABS(g.osm_id) AS osm_id,
CASE WHEN g.osm_id < 0 THEN 'relation' ELSE 'way' END AS osm_type,
'road' AS kind,
g.tags->'name' AS name,
g.tags->'name:be' AS name_be,
g.tags->'name:ru' AS name_ru,
g.tags AS tags
FROM planet_osm_roads g
INNER JOIN planet_osm_region p
ON ST_Intersects(p.way, g.way)
WHERE p.osm_id = -59065
AND g.tags->'name' ~ '({cyr_regexp})'
UNION ALL
SELECT
ABS(g.osm_id) AS osm_id,
CASE WHEN g.osm_id < 0 THEN 'relation' ELSE 'way' END AS osm_type,
'poly' AS kind,
g.tags->'name' AS name,
g.tags->'name:be' AS name_be,
g.tags->'name:ru' AS name_ru,
g.tags AS tags
FROM planet_osm_polygon g
INNER JOIN planet_osm_region p
ON ST_Intersects(p.way, g.way)
WHERE p.osm_id = -59065
AND g.tags->'name' ~ '({cyr_regexp})'
"""
OSM_DATA_OSM_ID_TYPE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS "planet_osm_named_data_osm_id_type_idx" ON planet_osm_named_data (osm_id, osm_type)
"""
OSM_DATA_TAGS_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS "planet_osm_named_data_tags_idx" ON planet_osm_named_data USING GIN (tags)
"""
OSM_DATA_NAME_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS "planet_osm_named_data_name_idx" ON planet_osm_named_data (name)
"""
OSM_DATA_NAME_BE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS "planet_osm_named_data_name_be_idx" ON planet_osm_named_data (name_be)
"""
OSM_DATA_NAME_RU_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS "planet_osm_named_data_name_ru_idx" ON planet_osm_named_data (name_ru)
"""
OSM_DATA_ANALYZE_SQL = "ANALYZE planet_osm_named_data"
# Connection settings are read from the environment.
postgres_params = {
    'host': os.environ['POSTGRES_HOST'],
    'dbname': os.environ['POSTGRES_DB'],
    'user': os.environ['POSTGRES_USER'],
    'password': os.environ['POSTGRES_PASSWORD'],
}
# Fetch relations of selected types via Overpass and insert them into PostGIS.
try:
    from belarus_utils import PostgisSearchReadEngine, OverpassApiSearchEnigne
    postgis_api = PostgisSearchReadEngine(**postgres_params)
    overpass_api = OverpassApiSearchEnigne(cache=True)
    print('type = street')
    postgis_api.insert_extra_relations(overpass_api.search({'type': ['street']}))
    print('type = associatedStreet')
    postgis_api.insert_extra_relations(overpass_api.search({'type': ['associatedStreet']}))
    print('type = route')
    postgis_api.insert_extra_relations(overpass_api.search({'type': ['route']}))
    print('type = route_master')
    postgis_api.insert_extra_relations(overpass_api.search({'type': ['route_master']}))
    print('type = waterway')
    postgis_api.insert_extra_relations(overpass_api.search({'type': ['waterway']}))
except Exception as err:
    print(err)
# Create the materialized views and their indexes.
with psycopg2.connect(**postgres_params) as conn:
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute(REGION_VIEW_SQL)
        cur.execute(REGION_WAY_INDEX_SQL)
        cur.execute(REGION_ANALYZE_SQL)
        cur.execute(OSM_DATA_VIEW_SQL)
        cur.execute(OSM_DATA_OSM_ID_TYPE_INDEX_SQL)
        cur.execute(OSM_DATA_TAGS_INDEX_SQL)
        cur.execute(OSM_DATA_NAME_INDEX_SQL)
        cur.execute(OSM_DATA_NAME_BE_INDEX_SQL)
        cur.execute(OSM_DATA_NAME_RU_INDEX_SQL)
        cur.execute(OSM_DATA_ANALYZE_SQL)
with psycopg2.connect(**postgres_params) as conn:
    with conn.cursor() as cur:
        # Category queries: bucket every object by how name relates to name:be / name:ru.
        for i, query in enumerate(queries, 1):
            cur.execute(query)
            records = cur.fetchall()
            for category, num, name, name_be, name_ru in records:
                key = (category,) if num == -1 else (category, num)
                key_counter[key]['name'].append(name)
                if name_be is not None:
                    key_counter[key]['name:be'].append(name_be)
                if name_ru is not None:
                    key_counter[key]['name:ru'].append(name_ru)
                if name == name_be == name_ru:
                    key_counter[key]['res_both'].append(name)
                elif name == name_be and name_ru is not None:
                    key_counter[key]['res_be_ru'].append(name)
                elif name == name_be:
                    key_counter[key]['res_be'].append(name)
                elif name == name_ru and name_be is not None:
                    key_counter[key]['res_ru_be'].append(name)
                elif name == name_ru:
                    key_counter[key]['res_ru'].append(name)
                elif name_be is not None and name_ru is not None:
                    key_counter[key]['res_other_both'].append(name)
                elif name_be is not None:
                    key_counter[key]['res_other_be'].append(name)
                elif name_ru is not None:
                    key_counter[key]['res_other_ru'].append(name)
                else:
                    key_counter[key]['res_none'].append(name)
        # Dependant-tag queries: count values by which name fields of a named object they match.
        for i, query in enumerate(dependant_queries, 1):
            cur.execute(query)
            records = cur.fetchall()
            for field, num, name, name_be, name_ru, count in records:
                if name and name_be and name_ru:
                    dependant_key_counter[field]['all'] += count
                elif name and name_be and not name_ru:
                    dependant_key_counter[field]['be'] += count
                elif name and not name_be and name_ru:
                    dependant_key_counter[field]['ru'] += count
                elif name:
                    dependant_key_counter[field]['name'] += count
                else:
                    dependant_key_counter[field]['not found'] += count
type = street
type = associatedStreet
type = route
type = route_master
type = waterway
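The bucket rules applied in the loop above can be isolated into a small pure function, which makes them easier to read and to check without a database. This is only a sketch: `classify_names` is a hypothetical helper name that is not part of the notebook or of `belarus_utils`; the returned labels are the `res_*` keys used above.

```python
def classify_names(name, name_be, name_ru):
    """Return the res_* bucket for one (name, name:be, name:ru) triple.

    Hypothetical helper mirroring the if/elif chain of the notebook loop.
    """
    if name == name_be == name_ru:
        return 'res_both'
    if name == name_be:
        # name is the Belarusian variant; note whether name:ru is filled too.
        return 'res_be_ru' if name_ru is not None else 'res_be'
    if name == name_ru:
        # name is the Russian variant; note whether name:be is filled too.
        return 'res_ru_be' if name_be is not None else 'res_ru'
    # name matches neither translation tag.
    if name_be is not None and name_ru is not None:
        return 'res_other_both'
    if name_be is not None:
        return 'res_other_be'
    if name_ru is not None:
        return 'res_other_ru'
    return 'res_none'
```

For example, `classify_names('Минск', 'Мінск', 'Минск')` falls into `res_ru_be`: `name` repeats `name:ru`, and `name:be` is also present.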
data = []
for c in list(categories_rules) + ['other', 'TOTAL']:
    # Category-level row: totals and distinct-value counts for name, name:be, name:ru.
    name_cnt = len(key_counter[(c,)]['name'])
    name_uniq = len(set(key_counter[(c,)]['name']))
    name_be_cnt = len(key_counter[(c,)]['name:be'])
    name_be_uniq = len(set(key_counter[(c,)]['name:be']))
    name_ru_cnt = len(key_counter[(c,)]['name:ru'])
    name_ru_uniq = len(set(key_counter[(c,)]['name:ru']))
    data.append([
        '#', c,
        name_cnt, name_be_cnt, name_ru_cnt, name_be_cnt / (name_cnt or 1), name_ru_cnt / (name_cnt or 1),
        name_uniq, name_be_uniq, name_ru_uniq, name_be_uniq / (name_uniq or 1), name_ru_uniq / (name_uniq or 1),
    ])
    if c in {'other', 'TOTAL'}:
        continue
    # One row per rule inside the category.
    for i, (k, eq, vv) in enumerate(categories_rules2[c]):
        if eq:
            tag = f'{k} = {list(vv)[0]}'
        else:
            tag = f'{k} = *'
        name_cnt = len(key_counter[(c, i)]['name'])
        name_uniq = len(set(key_counter[(c, i)]['name']))
        name_be_cnt = len(key_counter[(c, i)]['name:be'])
        name_be_uniq = len(set(key_counter[(c, i)]['name:be']))
        name_ru_cnt = len(key_counter[(c, i)]['name:ru'])
        name_ru_uniq = len(set(key_counter[(c, i)]['name:ru']))
        data.append([
            '', tag,
            name_cnt, name_be_cnt, name_ru_cnt, name_be_cnt / (name_cnt or 1), name_ru_cnt / (name_cnt or 1),
            name_uniq, name_be_uniq, name_ru_uniq, name_be_uniq / (name_uniq or 1), name_ru_uniq / (name_uniq or 1),
        ])
df = pd.DataFrame(data, columns=[
    'lvl', 'category',
    'all name', 'all name:be', 'all name:ru', 'all name:be%', 'all name:ru%',
    'uniq name', 'uniq name:be', 'uniq name:ru', 'uniq name:be%', 'uniq name:ru%',
])
(
    df
    .style
    .set_properties(subset=['category'], **{'text-align': 'left'})
    .background_gradient('YlOrRd', subset=[c for c in df.columns if c.endswith('%')], vmin=0, vmax=1)
    .format({f: '{:.3f}' for f in [c for c in df.columns if c.endswith('%')]})
    .apply(lambda row: [("font-weight: bold" if row.loc['lvl'] == '#' else '') for _ in row], axis=1)
)
| | lvl | category | all name | all name:be | all name:ru | all name:be% | all name:ru% | uniq name | uniq name:be | uniq name:ru | uniq name:be% | uniq name:ru% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | # | admin | 6874 | 6874 | 6874 | 1.000 | 1.000 | 1393 | 1397 | 1392 | 1.003 | 0.999 |
| 1 | | admin_level = 2 | 4 | 4 | 4 | 1.000 | 1.000 | 3 | 3 | 3 | 1.000 | 1.000 |
| 2 | | admin_level = 4 | 323 | 323 | 323 | 1.000 | 1.000 | 32 | 31 | 31 | 0.969 | 0.969 |
| 3 | | admin_level = 6 | 1304 | 1304 | 1304 | 1.000 | 1.000 | 165 | 165 | 165 | 1.000 | 1.000 |
| 4 | | admin_level = 8 | 5139 | 5139 | 5139 | 1.000 | 1.000 | 1191 | 1195 | 1191 | 1.003 | 1.000 |
| 5 | | admin_level = 9 | 104 | 104 | 104 | 1.000 | 1.000 | 13 | 13 | 13 | 1.000 | 1.000 |
| 6 | # | place | 114282 | 114275 | 114277 | 1.000 | 1.000 | 16946 | 16971 | 16805 | 1.001 | 0.992 |
| 7 | | place = city | 119 | 119 | 119 | 1.000 | 1.000 | 16 | 16 | 16 | 1.000 | 1.000 |
| 8 | | place = town | 677 | 677 | 677 | 1.000 | 1.000 | 137 | 138 | 137 | 1.007 | 1.000 |
| 9 | | place = village | 12280 | 12277 | 12277 | 1.000 | 1.000 | 2198 | 2207 | 2195 | 1.004 | 0.999 |
| 10 | | place = hamlet | 89489 | 89488 | 89488 | 1.000 | 1.000 | 13497 | 13625 | 13492 | 1.009 | 1.000 |
| 11 | | place = isolated_dwelling | 2892 | 2890 | 2892 | 0.999 | 1.000 | 633 | 632 | 633 | 0.998 | 1.000 |
| 12 | | boundary = administrative | 89587 | 89587 | 89587 | 1.000 | 1.000 | 16720 | 16872 | 16719 | 1.009 | 1.000 |
| 13 | | traffic_sign = city_limit | 2053 | 2052 | 2052 | 1.000 | 1.000 | 997 | 932 | 926 | 0.935 | 0.929 |
| 14 | | admin_level = * | 82694 | 82694 | 82694 | 1.000 | 1.000 | 15399 | 15547 | 15399 | 1.010 | 1.000 |
| 15 | # | allotments | 2871 | 2468 | 2864 | 0.860 | 0.998 | 1969 | 1560 | 1961 | 0.792 | 0.996 |
| 16 | | place = allotments | 1833 | 1832 | 1833 | 0.999 | 1.000 | 1326 | 1292 | 1324 | 0.974 | 0.998 |
| 17 | | landuse = allotments | 2697 | 2294 | 2690 | 0.851 | 0.997 | 1920 | 1510 | 1913 | 0.786 | 0.996 |
| 18 | # | locality | 12739 | 12695 | 12734 | 0.997 | 1.000 | 8381 | 8248 | 8373 | 0.984 | 0.999 |
| 19 | | place = locality | 12678 | 12634 | 12673 | 0.997 | 1.000 | 8347 | 8210 | 8339 | 0.984 | 0.999 |
| 20 | | abandoned:place = * | 3762 | 3762 | 3762 | 1.000 | 1.000 | 2195 | 2202 | 2195 | 1.003 | 1.000 |
| 21 | # | suburb | 7981 | 4058 | 7908 | 0.508 | 0.991 | 6244 | 2665 | 6181 | 0.427 | 0.990 |
| 22 | | landuse = commercial | 178 | 96 | 176 | 0.539 | 0.989 | 165 | 84 | 163 | 0.509 | 0.988 |
| 23 | | landuse = construction | 162 | 30 | 158 | 0.185 | 0.975 | 154 | 28 | 150 | 0.182 | 0.974 |
| 24 | | landuse = industrial | 4385 | 1052 | 4333 | 0.240 | 0.988 | 3778 | 696 | 3741 | 0.184 | 0.990 |
| 25 | | landuse = residential | 881 | 707 | 877 | 0.802 | 0.995 | 755 | 584 | 747 | 0.774 | 0.989 |
| 26 | | landuse = retail | 232 | 140 | 230 | 0.603 | 0.991 | 148 | 59 | 146 | 0.399 | 0.986 |
| 27 | | place = * | 2132 | 2126 | 2123 | 0.997 | 0.996 | 1463 | 1478 | 1456 | 1.010 | 0.995 |
| 28 | | residential = * | 719 | 608 | 716 | 0.846 | 0.996 | 616 | 498 | 608 | 0.808 | 0.987 |
| 29 | | industrial = * | 474 | 78 | 467 | 0.165 | 0.985 | 383 | 56 | 377 | 0.146 | 0.984 |
| 30 | # | highway | 111068 | 108549 | 110777 | 0.977 | 0.997 | 12135 | 10665 | 12064 | 0.879 | 0.994 |
| 31 | | highway = motorway | 152 | 152 | 152 | 1.000 | 1.000 | 1 | 1 | 1 | 1.000 | 1.000 |
| 32 | | highway = trunk | 2618 | 2568 | 2570 | 0.981 | 0.982 | 137 | 133 | 134 | 0.971 | 0.978 |
| 33 | | highway = primary | 15328 | 15152 | 15262 | 0.989 | 0.996 | 529 | 509 | 522 | 0.962 | 0.987 |
| 34 | | highway = secondary | 14515 | 14465 | 14515 | 0.997 | 1.000 | 746 | 729 | 746 | 0.977 | 1.000 |
| 35 | | highway = tertiary | 10652 | 10436 | 10632 | 0.980 | 0.998 | 1665 | 1527 | 1653 | 0.917 | 0.993 |
| 36 | | highway = unclassified | 3005 | 2889 | 2941 | 0.961 | 0.979 | 923 | 859 | 911 | 0.931 | 0.987 |
| 37 | | highway = residential | 56685 | 55665 | 56651 | 0.982 | 0.999 | 9710 | 8937 | 9688 | 0.920 | 0.998 |
| 38 | | highway = service | 3184 | 2921 | 3176 | 0.917 | 0.997 | 1571 | 1345 | 1563 | 0.856 | 0.995 |
| 39 | | highway = track | 788 | 688 | 781 | 0.873 | 0.991 | 212 | 152 | 204 | 0.717 | 0.962 |
| 40 | | type = associatedStreet | 1819 | 1819 | 1819 | 1.000 | 1.000 | 1172 | 1167 | 1172 | 0.996 | 1.000 |
| 41 | | type = street | 37 | 37 | 37 | 1.000 | 1.000 | 35 | 35 | 35 | 1.000 | 1.000 |
| 42 | | highway = * | 2285 | 1757 | 2241 | 0.769 | 0.981 | 1003 | 749 | 996 | 0.747 | 0.993 |
| 43 | # | public_transport | 34477 | 29204 | 34309 | 0.847 | 0.995 | 13803 | 9731 | 13010 | 0.705 | 0.943 |
| 44 | | highway = bus_stop | 17456 | 15415 | 17365 | 0.883 | 0.995 | 8492 | 6687 | 7918 | 0.787 | 0.932 |
| 45 | | type = route | 3925 | 2095 | 3918 | 0.534 | 0.998 | 3896 | 2085 | 3891 | 0.535 | 0.999 |
| 46 | | type = route_master | 18 | 3 | 18 | 0.167 | 1.000 | 18 | 3 | 18 | 0.167 | 1.000 |
| 47 | | public_transport = * | 24074 | 21893 | 23960 | 0.909 | 0.995 | 7833 | 6278 | 7256 | 0.801 | 0.926 |
| 48 | | route = * | 3958 | 2124 | 3951 | 0.537 | 0.998 | 3918 | 2102 | 3913 | 0.536 | 0.999 |
| 49 | | railway = * | 4077 | 3396 | 4051 | 0.833 | 0.994 | 1344 | 1198 | 1306 | 0.891 | 0.972 |
| 50 | | route_master = * | 20 | 3 | 20 | 0.150 | 1.000 | 20 | 3 | 20 | 0.150 | 1.000 |
| 51 | # | infrastructure | 19699 | 11903 | 19367 | 0.604 | 0.983 | 10362 | 5305 | 10071 | 0.512 | 0.972 |
| 52 | | tunnel = * | 5862 | 4546 | 5846 | 0.776 | 0.997 | 1648 | 1216 | 1636 | 0.738 | 0.993 |
| 53 | | barrier = * | 4433 | 2898 | 4386 | 0.654 | 0.989 | 3708 | 2228 | 3658 | 0.601 | 0.987 |
| 54 | | power = * | 4918 | 2371 | 4789 | 0.482 | 0.974 | 3967 | 1850 | 3866 | 0.466 | 0.975 |
| 55 | | bridge = * | 2148 | 1916 | 2141 | 0.892 | 0.997 | 495 | 409 | 491 | 0.826 | 0.992 |
| 56 | | substation = * | 2572 | 1487 | 2538 | 0.578 | 0.987 | 2408 | 1353 | 2381 | 0.562 | 0.989 |
| 57 | | emergency = * | 1155 | 160 | 1153 | 0.139 | 0.998 | 288 | 123 | 285 | 0.427 | 0.990 |
| 58 | | ele = * | 312 | 163 | 310 | 0.522 | 0.994 | 282 | 136 | 280 | 0.482 | 0.993 |
| 59 | | man_made = * | 1692 | 556 | 1566 | 0.329 | 0.926 | 1214 | 328 | 1098 | 0.270 | 0.904 |
| 60 | | embankment = * | 335 | 218 | 335 | 0.651 | 1.000 | 90 | 72 | 90 | 0.800 | 1.000 |
| 61 | # | religion | 3808 | 2693 | 3697 | 0.707 | 0.971 | 2316 | 1209 | 2176 | 0.522 | 0.940 |
| 62 | | amenity = place_of_worship | 2674 | 1759 | 2634 | 0.658 | 0.985 | 1885 | 988 | 1828 | 0.524 | 0.970 |
| 63 | | amenity = monastery | 21 | 13 | 18 | 0.619 | 0.857 | 20 | 12 | 17 | 0.600 | 0.850 |
| 64 | | building = church | 1431 | 985 | 1397 | 0.688 | 0.976 | 1105 | 632 | 1063 | 0.572 | 0.962 |
| 65 | | building = cathedral | 25 | 21 | 25 | 0.840 | 1.000 | 24 | 21 | 24 | 0.875 | 1.000 |
| 66 | | building = chapel | 267 | 197 | 261 | 0.738 | 0.978 | 163 | 96 | 153 | 0.589 | 0.939 |
| 67 | | religion = * | 3640 | 2600 | 3555 | 0.714 | 0.977 | 2209 | 1171 | 2095 | 0.530 | 0.948 |
| 68 | # | education | 5648 | 3234 | 4735 | 0.573 | 0.838 | 4045 | 1852 | 3122 | 0.458 | 0.772 |
| 69 | | landuse = education | 804 | 791 | 804 | 0.984 | 1.000 | 797 | 782 | 797 | 0.981 | 1.000 |
| 70 | | amenity = university | 199 | 144 | 127 | 0.724 | 0.638 | 187 | 133 | 118 | 0.711 | 0.631 |
| 71 | | amenity = college | 320 | 153 | 180 | 0.478 | 0.562 | 295 | 139 | 159 | 0.471 | 0.539 |
| 72 | | amenity = school | 1993 | 1172 | 1956 | 0.588 | 0.981 | 1437 | 625 | 1377 | 0.435 | 0.958 |
| 73 | | amenity = kindergarten | 1903 | 1281 | 1884 | 0.673 | 0.990 | 1367 | 829 | 1327 | 0.606 | 0.971 |
| 74 | | building = university | 164 | 65 | 55 | 0.396 | 0.335 | 157 | 61 | 53 | 0.389 | 0.338 |
| 75 | | building = college | 116 | 49 | 54 | 0.422 | 0.466 | 103 | 41 | 43 | 0.398 | 0.417 |
| 76 | | building = school | 671 | 231 | 327 | 0.344 | 0.487 | 568 | 136 | 228 | 0.239 | 0.401 |
| 77 | | building = kindergarten | 369 | 171 | 226 | 0.463 | 0.612 | 274 | 96 | 136 | 0.350 | 0.496 |
| 78 | # | healthcare | 5657 | 2318 | 4691 | 0.410 | 0.829 | 2994 | 805 | 2119 | 0.269 | 0.708 |
| 79 | | amenity = hospital | 653 | 295 | 501 | 0.452 | 0.767 | 500 | 182 | 350 | 0.364 | 0.700 |
| 80 | | amenity = pharmacy | 2502 | 1247 | 2179 | 0.498 | 0.871 | 1308 | 357 | 1052 | 0.273 | 0.804 |
| 81 | | amenity = clinic | 734 | 361 | 600 | 0.492 | 0.817 | 445 | 159 | 316 | 0.357 | 0.710 |
| 82 | | amenity = doctors | 925 | 112 | 831 | 0.121 | 0.898 | 252 | 60 | 162 | 0.238 | 0.643 |
| 83 | | amenity = dentist | 373 | 111 | 261 | 0.298 | 0.700 | 291 | 37 | 177 | 0.127 | 0.608 |
| 84 | | building = hospital | 405 | 177 | 237 | 0.437 | 0.585 | 297 | 93 | 134 | 0.313 | 0.451 |
| 85 | | building = clinic | 31 | 16 | 22 | 0.516 | 0.710 | 29 | 14 | 20 | 0.483 | 0.690 |
| 86 | | healthcare = * | 2712 | 994 | 2672 | 0.367 | 0.985 | 1418 | 458 | 1403 | 0.323 | 0.989 |
| 87 | # | government | 4965 | 1930 | 3930 | 0.389 | 0.792 | 3393 | 877 | 2607 | 0.258 | 0.768 |
| 88 | | amenity = post_office | 1082 | 537 | 593 | 0.496 | 0.548 | 405 | 115 | 132 | 0.284 | 0.326 |
| 89 | | amenity = police | 707 | 308 | 418 | 0.436 | 0.591 | 516 | 157 | 249 | 0.304 | 0.483 |
| 90 | | amenity = library | 499 | 249 | 262 | 0.499 | 0.525 | 336 | 116 | 112 | 0.345 | 0.333 |
| 91 | | office = government | 1873 | 581 | 1860 | 0.310 | 0.993 | 1560 | 367 | 1545 | 0.235 | 0.990 |
| 92 | | landuse = military | 370 | 144 | 368 | 0.389 | 0.995 | 318 | 107 | 316 | 0.336 | 0.994 |
| 93 | | government = * | 724 | 313 | 718 | 0.432 | 0.992 | 580 | 197 | 572 | 0.340 | 0.986 |
| 94 | | military = * | 569 | 178 | 563 | 0.313 | 0.989 | 381 | 68 | 375 | 0.178 | 0.984 |
| 95 | # | office | 5272 | 1470 | 5099 | 0.279 | 0.967 | 3870 | 520 | 3715 | 0.134 | 0.960 |
| 96 | | office = * | 5272 | 1470 | 5099 | 0.279 | 0.967 | 3870 | 520 | 3715 | 0.134 | 0.960 |
| 97 | # | tourism | 15241 | 7702 | 14235 | 0.505 | 0.934 | 10872 | 4258 | 9924 | 0.392 | 0.913 |
| 98 | | tourism = * | 9716 | 5520 | 9223 | 0.568 | 0.949 | 6989 | 3177 | 6507 | 0.455 | 0.931 |
| 99 | | historic = * | 6019 | 2578 | 5506 | 0.428 | 0.915 | 4436 | 1503 | 3959 | 0.339 | 0.892 |
| 100 | | memorial = * | 2312 | 641 | 1965 | 0.277 | 0.850 | 1858 | 405 | 1558 | 0.218 | 0.839 |
| 101 | | ruins = * | 262 | 85 | 262 | 0.324 | 1.000 | 207 | 74 | 206 | 0.357 | 0.995 |
| 102 | | information = * | 983 | 778 | 966 | 0.791 | 0.983 | 284 | 117 | 267 | 0.412 | 0.940 |
| 103 | | attraction = * | 173 | 79 | 164 | 0.457 | 0.948 | 151 | 64 | 139 | 0.424 | 0.921 |
| 104 | | resort = * | 142 | 29 | 140 | 0.204 | 0.986 | 134 | 27 | 132 | 0.201 | 0.985 |
| 105 | | artwork_type = * | 753 | 279 | 630 | 0.371 | 0.837 | 661 | 240 | 547 | 0.363 | 0.828 |
| 106 | # | amenity | 48289 | 22836 | 29892 | 0.473 | 0.619 | 21545 | 4632 | 7726 | 0.215 | 0.359 |
| 107 | | amenity = cafe | 2461 | 1059 | 1350 | 0.430 | 0.549 | 1702 | 498 | 717 | 0.293 | 0.421 |
| 108 | | amenity = atm | 1454 | 1368 | 1404 | 0.941 | 0.966 | 109 | 62 | 78 | 0.569 | 0.716 |
| 109 | | amenity = bank | 2073 | 1817 | 1910 | 0.877 | 0.921 | 323 | 115 | 212 | 0.356 | 0.656 |
| 110 | | amenity = fast_food | 758 | 355 | 420 | 0.468 | 0.554 | 425 | 99 | 148 | 0.233 | 0.348 |
| 111 | | amenity = fuel | 1126 | 636 | 760 | 0.565 | 0.675 | 622 | 246 | 311 | 0.395 | 0.500 |
| 112 | | amenity = community_centre | 742 | 343 | 457 | 0.462 | 0.616 | 353 | 92 | 133 | 0.261 | 0.377 |
| 113 | | amenity = restaurant | 603 | 297 | 339 | 0.493 | 0.562 | 515 | 229 | 268 | 0.445 | 0.520 |
| 114 | | amenity = bar | 471 | 161 | 221 | 0.342 | 0.469 | 422 | 127 | 183 | 0.301 | 0.434 |
| 115 | | shop = convenience | 7313 | 4102 | 5390 | 0.561 | 0.737 | 2471 | 836 | 1153 | 0.338 | 0.467 |
| 116 | | shop = clothes | 1120 | 304 | 510 | 0.271 | 0.455 | 673 | 114 | 218 | 0.169 | 0.324 |
| 117 | | shop = car_repair | 1133 | 313 | 433 | 0.276 | 0.382 | 882 | 90 | 199 | 0.102 | 0.226 |
| 118 | | shop = hairdresser | 1075 | 403 | 551 | 0.375 | 0.513 | 718 | 151 | 255 | 0.210 | 0.355 |
| 119 | | shop = chemist | 1478 | 1074 | 1308 | 0.727 | 0.885 | 87 | 42 | 40 | 0.483 | 0.460 |
| 120 | | shop = supermarket | 1249 | 860 | 1018 | 0.689 | 0.815 | 377 | 177 | 227 | 0.469 | 0.602 |
| 121 | | shop = car_parts | 911 | 248 | 457 | 0.272 | 0.502 | 486 | 44 | 135 | 0.091 | 0.278 |
| 122 | | shop = furniture | 745 | 248 | 408 | 0.333 | 0.548 | 398 | 59 | 119 | 0.148 | 0.299 |
| 123 | | shop = hardware | 724 | 263 | 399 | 0.363 | 0.551 | 519 | 117 | 211 | 0.225 | 0.407 |
| 124 | | shop = kiosk | 395 | 223 | 258 | 0.565 | 0.653 | 205 | 82 | 92 | 0.400 | 0.449 |
| 125 | | shop = doityourself | 723 | 226 | 329 | 0.313 | 0.455 | 493 | 85 | 145 | 0.172 | 0.294 |
| 126 | | shop = pet | 475 | 119 | 178 | 0.251 | 0.375 | 210 | 31 | 61 | 0.148 | 0.290 |
| 127 | | shop = florist | 525 | 211 | 267 | 0.402 | 0.509 | 262 | 52 | 87 | 0.198 | 0.332 |
| 128 | | shop = beauty | 481 | 70 | 159 | 0.146 | 0.331 | 434 | 64 | 136 | 0.147 | 0.313 |
| 129 | | shop = mobile_phone | 503 | 226 | 375 | 0.449 | 0.746 | 112 | 19 | 41 | 0.170 | 0.366 |
| 130 | | shop = shoes | 258 | 110 | 134 | 0.426 | 0.519 | 152 | 29 | 54 | 0.191 | 0.355 |
| 131 | | shop = newsagent | 534 | 377 | 432 | 0.706 | 0.809 | 65 | 24 | 29 | 0.369 | 0.446 |
| 132 | | shop = electronics | 449 | 150 | 246 | 0.334 | 0.548 | 304 | 57 | 118 | 0.188 | 0.388 |
| 133 | | shop = alcohol | 426 | 123 | 185 | 0.289 | 0.434 | 194 | 44 | 65 | 0.227 | 0.335 |
| 134 | | shop = jewelry | 342 | 133 | 151 | 0.389 | 0.442 | 81 | 25 | 33 | 0.309 | 0.407 |
| 135 | | shop = mall | 369 | 180 | 211 | 0.488 | 0.572 | 304 | 133 | 154 | 0.438 | 0.507 |
| 136 | | shop = butcher | 390 | 133 | 205 | 0.341 | 0.526 | 219 | 59 | 76 | 0.269 | 0.347 |
| 137 | | shop = cosmetics | 280 | 97 | 147 | 0.346 | 0.525 | 148 | 44 | 67 | 0.297 | 0.453 |
| 138 | | craft = shoemaker | 225 | 160 | 162 | 0.711 | 0.720 | 64 | 11 | 24 | 0.172 | 0.375 |
| 139 | | amenity = * | 6433 | 2699 | 3802 | 0.420 | 0.591 | 4325 | 1334 | 2077 | 0.308 | 0.480 |
| 140 | | shop = * | 7095 | 2468 | 3497 | 0.348 | 0.493 | 4016 | 787 | 1289 | 0.196 | 0.321 |
| 141 | | leisure = * | 3483 | 1584 | 2232 | 0.455 | 0.641 | 2371 | 769 | 1257 | 0.324 | 0.530 |
| 142 | | sport = * | 779 | 317 | 365 | 0.407 | 0.469 | 589 | 170 | 213 | 0.289 | 0.362 |
| 143 | | clothes = * | 228 | 92 | 123 | 0.404 | 0.539 | 130 | 30 | 46 | 0.231 | 0.354 |
| 144 | # | building | 30223 | 11353 | 17379 | 0.376 | 0.575 | 17801 | 4402 | 8421 | 0.247 | 0.473 |
| 145 | | building = industrial | 1617 | 287 | 780 | 0.177 | 0.482 | 1230 | 146 | 471 | 0.119 | 0.383 |
| 146 | | building = service | 1386 | 696 | 1092 | 0.502 | 0.788 | 1100 | 542 | 850 | 0.493 | 0.773 |
| 147 | | building = retail | 1947 | 1152 | 1370 | 0.592 | 0.704 | 1131 | 501 | 642 | 0.443 | 0.568 |
| 148 | | building = commercial | 593 | 189 | 286 | 0.319 | 0.482 | 541 | 154 | 245 | 0.285 | 0.453 |
| 149 | | building = warehouse | 242 | 52 | 83 | 0.215 | 0.343 | 172 | 17 | 35 | 0.099 | 0.203 |
| 150 | | building = public | 367 | 209 | 258 | 0.569 | 0.703 | 302 | 156 | 199 | 0.517 | 0.659 |
| 151 | | building = dormitory | 607 | 168 | 186 | 0.277 | 0.306 | 450 | 76 | 101 | 0.169 | 0.224 |
| 152 | | building = warehouse | 242 | 52 | 83 | 0.215 | 0.343 | 172 | 17 | 35 | 0.099 | 0.203 |
| 153 | | building = * | 23464 | 8600 | 13324 | 0.367 | 0.568 | 13741 | 3341 | 6545 | 0.243 | 0.476 |
| 154 | # | water | 23561 | 18307 | 23540 | 0.777 | 0.999 | 5356 | 3355 | 5260 | 0.626 | 0.982 |
| 155 | | waterway = drain | 431 | 125 | 426 | 0.290 | 0.988 | 119 | 45 | 117 | 0.378 | 0.983 |
| 156 | | waterway = ditch | 640 | 182 | 640 | 0.284 | 1.000 | 128 | 38 | 125 | 0.297 | 0.977 |
| 157 | | waterway = stream | 6242 | 4332 | 6241 | 0.694 | 1.000 | 1104 | 718 | 1095 | 0.650 | 0.992 |
| 158 | | waterway = river | 11987 | 11005 | 11984 | 0.918 | 1.000 | 1736 | 1407 | 1675 | 0.810 | 0.965 |
| 159 | | waterway = canal | 579 | 331 | 573 | 0.572 | 0.990 | 122 | 53 | 118 | 0.434 | 0.967 |
| 160 | | type = waterway | 2058 | 1692 | 2058 | 0.822 | 1.000 | 1817 | 1455 | 1809 | 0.801 | 0.996 |
| 161 | | natural = water | 3047 | 2076 | 3043 | 0.681 | 0.999 | 2479 | 1583 | 2468 | 0.639 | 0.996 |
| 162 | | natural = spring | 590 | 232 | 588 | 0.393 | 0.997 | 529 | 179 | 527 | 0.338 | 0.996 |
| 163 | | waterway = * | 37 | 17 | 37 | 0.459 | 1.000 | 29 | 12 | 29 | 0.414 | 1.000 |
| 164 | | water = * | 2938 | 2031 | 2935 | 0.691 | 0.999 | 2388 | 1548 | 2378 | 0.648 | 0.996 |
| 165 | # | natural | 6205 | 3896 | 5992 | 0.628 | 0.966 | 3784 | 1817 | 3619 | 0.480 | 0.956 |
| 166 | | place = island | 12 | 12 | 12 | 1.000 | 1.000 | 12 | 12 | 12 | 1.000 | 1.000 |
| 167 | | place = islet | 87 | 68 | 87 | 0.782 | 1.000 | 77 | 60 | 76 | 0.779 | 0.987 |
| 168 | | boundary = * | 1663 | 1472 | 1567 | 0.885 | 0.942 | 731 | 644 | 681 | 0.881 | 0.932 |
| 169 | | natural = * | 1488 | 818 | 1428 | 0.550 | 0.960 | 1204 | 584 | 1157 | 0.485 | 0.961 |
| 170 | | landuse = * | 3134 | 1671 | 3076 | 0.533 | 0.981 | 1906 | 623 | 1838 | 0.327 | 0.964 |
| 171 | # | other | 1471 | 523 | 657 | 0.356 | 0.447 | 1279 | 409 | 527 | 0.320 | 0.412 |
| 172 | # | TOTAL | 417028 | 336436 | 383675 | 0.807 | 0.920 | 114459 | 57443 | 87309 | 0.502 | 0.763 |
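The percentage columns in the table above are plain ratios of list lengths. The `uniq` ratios divide two independently counted sets of distinct values, so they can exceed 1 (for example 1.003 for `admin`) whenever objects sharing one `name` carry different `name:be` values. The following is a minimal sketch of that arithmetic, assuming each row is a `(name, name_be, name_ru)` tuple; `fill_rates` is a hypothetical helper, not part of the notebook.

```python
def fill_rates(rows):
    """Compute name:be / name:ru fill ratios for (name, name_be, name_ru) rows."""
    names = [name for name, _, _ in rows]
    names_be = [be for _, be, _ in rows if be is not None]
    names_ru = [ru for _, _, ru in rows if ru is not None]
    name_cnt = len(names)
    name_uniq = len(set(names))
    # `or 1` guards against division by zero for empty categories,
    # the same way the notebook code does.
    return {
        'all name:be%': len(names_be) / (name_cnt or 1),
        'all name:ru%': len(names_ru) / (name_cnt or 1),
        'uniq name:be%': len(set(names_be)) / (name_uniq or 1),
        'uniq name:ru%': len(set(names_ru)) / (name_uniq or 1),
    }


# Illustrative data only: three objects share one name but differ in name:be.
rows = [
    ('Новая', 'Новая', 'Новая'),
    ('Новая', 'Новы', 'Новая'),
    ('Новая', 'Новае', 'Новая'),
    ('Лесная', None, 'Лесная'),
]
rates = fill_rates(rows)
```

With this sample, three of four objects have `name:be`, so the overall fill rate is 0.75, while the three distinct `name:be` values against two distinct `name` values give a unique ratio of 1.5.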
data = []
for c in list(categories_rules) + ['other', 'TOTAL']:
    # Category-level row: size of every res_* bucket and its share of the total.
    name_both_cnt = len(key_counter[(c,)]['res_both'])
    name_be_ru_cnt = len(key_counter[(c,)]['res_be_ru'])
    name_be_cnt = len(key_counter[(c,)]['res_be'])
    name_ru_be_cnt = len(key_counter[(c,)]['res_ru_be'])
    name_ru_cnt = len(key_counter[(c,)]['res_ru'])
    name_other_both_cnt = len(key_counter[(c,)]['res_other_both'])
    name_other_be_cnt = len(key_counter[(c,)]['res_other_be'])
    name_other_ru_cnt = len(key_counter[(c,)]['res_other_ru'])
    name_none_cnt = len(key_counter[(c,)]['res_none'])
    total = (
        name_both_cnt + name_be_ru_cnt + name_be_cnt + name_ru_be_cnt + name_ru_cnt +
        name_other_both_cnt + name_other_be_cnt + name_other_ru_cnt +
        name_none_cnt
    )
    data.append([
        '#', c,
        name_both_cnt,
        name_be_ru_cnt, name_be_cnt,
        name_ru_be_cnt, name_ru_cnt,
        name_other_both_cnt, name_other_be_cnt, name_other_ru_cnt,
        name_none_cnt,
        name_both_cnt / (total or 1),
        name_be_ru_cnt / (total or 1), name_be_cnt / (total or 1),
        name_ru_be_cnt / (total or 1), name_ru_cnt / (total or 1),
        name_other_both_cnt / (total or 1), name_other_be_cnt / (total or 1), name_other_ru_cnt / (total or 1),
        name_none_cnt / (total or 1),
    ])
    if c in {'other', 'TOTAL'}:
        continue
    # One row per rule inside the category.
    for i, (k, eq, vv) in enumerate(categories_rules2[c]):
        if eq:
            tag = f'{k} = {list(vv)[0]}'
        else:
            tag = f'{k} = *'
        name_both_cnt = len(key_counter[(c, i)]['res_both'])
        name_be_ru_cnt = len(key_counter[(c, i)]['res_be_ru'])
        name_be_cnt = len(key_counter[(c, i)]['res_be'])
        name_ru_be_cnt = len(key_counter[(c, i)]['res_ru_be'])
        name_ru_cnt = len(key_counter[(c, i)]['res_ru'])
        name_other_both_cnt = len(key_counter[(c, i)]['res_other_both'])
        name_other_be_cnt = len(key_counter[(c, i)]['res_other_be'])
        name_other_ru_cnt = len(key_counter[(c, i)]['res_other_ru'])
        name_none_cnt = len(key_counter[(c, i)]['res_none'])
        total = (
            name_both_cnt + name_be_ru_cnt + name_be_cnt + name_ru_be_cnt + name_ru_cnt +
            name_other_both_cnt + name_other_be_cnt + name_other_ru_cnt +
            name_none_cnt
        )
        data.append([
            '', tag,
            name_both_cnt,
            name_be_ru_cnt, name_be_cnt,
            name_ru_be_cnt, name_ru_cnt,
            name_other_both_cnt, name_other_be_cnt, name_other_ru_cnt,
            name_none_cnt,
            name_both_cnt / (total or 1),
            name_be_ru_cnt / (total or 1), name_be_cnt / (total or 1),
            name_ru_be_cnt / (total or 1), name_ru_cnt / (total or 1),
            name_other_both_cnt / (total or 1), name_other_be_cnt / (total or 1), name_other_ru_cnt / (total or 1),
            name_none_cnt / (total or 1),
        ])
df = pd.DataFrame(data, columns=[
    'lvl', 'category',
    'name be=ru', 'name be+ru', 'name be', 'name ru+be', 'name ru',
    'other both', 'other be', 'other ru', 'no lang',
    'name be=ru%', 'name be+ru%', 'name be%', 'name ru+be%', 'name ru%',
    'other both%', 'other be%', 'other ru%', 'no lang%',
])
(
    df
    .style
    .set_properties(subset=['category'], **{'text-align': 'left'})
    .set_properties(subset=['name be=ru', 'name be+ru', 'name be'], **{'background-color': '#d9ead3'})
    .set_properties(subset=['name ru+be'], **{'background-color': '#fff2cc'})
    .set_properties(subset=['name ru', 'other both', 'other be', 'other ru', 'no lang'], **{'background-color': '#f4cccc'})
    .background_gradient('YlOrRd', subset=[c for c in df.columns if c.endswith('%')], vmin=0, vmax=1)
    .format({f: '{:.3f}' for f in [c for c in df.columns if c.endswith('%')]})
    .apply(lambda row: [("font-weight: bold" if row.loc['lvl'] == '#' else '') for _ in row], axis=1)
)
lvl
category
name be=ru
name be+ru
name be
name ru+be
name ru
other both
other be
other ru
no lang
name be=ru%
name be+ru%
name be%
name ru+be%
name ru%
other both%
other be%
other ru%
no lang%
0
#
admin
146
4
0
6724
0
0
0
0
0
0.021
0.001
0.000
0.978
0.000
0.000
0.000
0.000
0.000
1
admin_level = 2
1
2
0
1
0
0
0
0
0
0.250
0.500
0.000
0.250
0.000
0.000
0.000
0.000
0.000
2
admin_level = 4
34
2
0
287
0
0
0
0
0
0.105
0.006
0.000
0.889
0.000
0.000
0.000
0.000
0.000
3
admin_level = 6
51
0
0
1253
0
0
0
0
0
0.039
0.000
0.000
0.961
0.000
0.000
0.000
0.000
0.000
4
admin_level = 8
60
0
0
5079
0
0
0
0
0
0.012
0.000
0.000
0.988
0.000
0.000
0.000
0.000
0.000
5
admin_level = 9
0
0
0
104
0
0
0
0
0
0.000
0.000
0.000
1.000
0.000
0.000
0.000
0.000
0.000
6
#
place
9364
198
0
104713
2
0
0
0
5
0.082
0.002
0.000
0.916
0.000
0.000
0.000
0.000
0.000
7
place = city
22
0
0
97
0
0
0
0
0
0.185
0.000
0.000
0.815
0.000
0.000
0.000
0.000
0.000
8
place = town
56
0
0
621
0
0
0
0
0
0.083
0.000
0.000
0.917
0.000
0.000
0.000
0.000
0.000
9
place = village
1186
0
0
11091
0
0
0
0
3
0.097
0.000
0.000
0.903
0.000
0.000
0.000
0.000
0.000
10
place = hamlet
7588
4
0
81896
0
0
0
0
1
0.085
0.000
0.000
0.915
0.000
0.000
0.000
0.000
0.000
11
place = isolated_dwelling
222
0
0
2668
2
0
0
0
0
0.077
0.000
0.000
0.923
0.001
0.000
0.000
0.000
0.000
12
boundary = administrative
7267
4
0
82316
0
0
0
0
0
0.081
0.000
0.000
0.919
0.000
0.000
0.000
0.000
0.000
13
traffic_sign = city_limit
165
190
0
1697
0
0
0
0
1
0.080
0.093
0.000
0.827
0.000
0.000
0.000
0.000
0.000
14
admin_level = *
7121
0
0
75573
0
0
0
0
0
0.086
0.000
0.000
0.914
0.000
0.000
0.000
0.000
0.000
15
#
allotments
361
3
0
2101
395
2
1
2
6
0.126
0.001
0.000
0.732
0.138
0.001
0.000
0.001
0.002
16
place = allotments
270
3
0
1559
1
0
0
0
0
0.147
0.002
0.000
0.851
0.001
0.000
0.000
0.000
0.000
17
landuse = allotments
330
2
0
1959
395
2
1
2
6
0.122
0.001
0.000
0.726
0.146
0.001
0.000
0.001
0.002
18
#
locality
1299
12
1
11380
40
3
0
0
4
0.102
0.001
0.000
0.893
0.003
0.000
0.000
0.000
0.000
19
place = locality
1298
12
1
11320
40
3
0
0
4
0.102
0.001
0.000
0.893
0.003
0.000
0.000
0.000
0.000
20
abandoned:place = *
331
0
0
3431
0
0
0
0
0
0.088
0.000
0.000
0.912
0.000
0.000
0.000
0.000
0.000
21
#
suburb
337
75
20
3614
3857
11
1
14
52
0.042
0.009
0.003
0.453
0.483
0.001
0.000
0.002
0.007
22
landuse = commercial
4
4
0
88
79
0
0
1
2
0.022
0.022
0.000
0.494
0.444
0.000
0.000
0.006
0.011
23
landuse = construction
3
0
0
26
128
1
0
0
4
0.019
0.000
0.000
0.160
0.790
0.006
0.000
0.000
0.025
24
landuse = industrial
72
8
10
956
3281
6
0
10
42
0.016
0.002
0.002
0.218
0.748
0.001
0.000
0.002
0.010
25
landuse = residential
81
43
1
581
170
1
0
1
3
0.092
0.049
0.001
0.659
0.193
0.001
0.000
0.001
0.003
26
landuse = retail
2
4
1
132
91
1
0
0
1
0.009
0.017
0.004
0.569
0.392
0.004
0.000
0.000
0.004
27
place = *
179
19
8
1917
6
2
1
0
0
0.084
0.009
0.004
0.899
0.003
0.001
0.000
0.000
0.000
28
residential = *
70
42
1
494
108
1
0
1
2
0.097
0.058
0.001
0.687
0.150
0.001
0.000
0.001
0.003
29
industrial = *
4
1
4
68
388
1
0
5
3
0.008
0.002
0.008
0.143
0.819
0.002
0.000
0.011
0.006
30
#
highway
238
195
11
107923
2239
181
1
1
279
0.002
0.002
0.000
0.972
0.020
0.002
0.000
0.000
0.003
31
highway = motorway
152
0
0
0
0
0
0
0
0
1.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
32
highway = trunk
58
88
0
2410
2
12
0
0
48
0.022
0.034
0.000
0.921
0.001
0.005
0.000
0.000
0.018
33
highway = primary
0
44
0
15084
110
24
0
0
66
0.000
0.003
0.000
0.984
0.007
0.002
0.000
0.000
0.004
34
highway = secondary
0
4
0
14461
50
0
0
0
0
0.000
0.000
0.000
0.996
0.003
0.000
0.000
0.000
0.000
35
highway = tertiary
3
7
5
10421
201
0
0
0
15
0.000
0.001
0.000
0.978
0.019
0.000
0.000
0.000
0.001
36
highway = unclassified
1
0
0
2875
52
13
0
0
64
0.000
0.000
0.000
0.957
0.017
0.004
0.000
0.000
0.021
37
highway = residential
3
0
5
55655
992
1
1
0
28
0.000
0.000
0.000
0.982
0.018
0.000
0.000
0.000
0.000
38
highway = service
0
13
0
2894
255
14
0
0
8
0.000
0.004
0.000
0.909
0.080
0.004
0.000
0.000
0.003
39
highway = track
0
0
1
570
93
117
0
1
6
0.000
0.000
0.001
0.723
0.118
0.148
0.000
0.001
0.008
40
type = associatedStreet
0
0
0
1819
0
0
0
0
0
0.000
0.000
0.000
1.000
0.000
0.000
0.000
0.000
0.000
41
type = street
0
0
0
37
0
0
0
0
0
0.000
0.000
0.000
1.000
0.000
0.000
0.000
0.000
0.000
42
highway = *
21
39
0
1697
484
0
0
0
44
0.009
0.017
0.000
0.743
0.212
0.000
0.000
0.000
0.019
43
#
public_transport
2179
13260
104
13626
5129
35
0
80
64
0.063
0.385
0.003
0.395
0.149
0.001
0.000
0.002
0.002
44
highway = bus_stop
1363
7358
62
6632
1943
0
0
69
29
0.078
0.422
0.004
0.380
0.111
0.000
0.000
0.004
0.002
45
type = route
10
52
7
2007
1829
19
0
1
0
0.003
0.013
0.002
0.511
0.466
0.005
0.000
0.000
0.000
46
type = route_master
0
0
0
2
15
1
0
0
0
0.000
0.000
0.000
0.111
0.833
0.056
0.000
0.000
0.000
47
public_transport = *
1790
12261
58
7774
2071
10
0
54
56
0.074
0.509
0.002
0.323
0.086
0.000
0.000
0.002
0.002
48
route = *
14
52
7
2031
1833
20
0
1
0
0.004
0.013
0.002
0.513
0.463
0.005
0.000
0.000
0.000
49
railway = *
173
475
20
2724
675
4
0
0
6
0.042
0.117
0.005
0.668
0.166
0.001
0.000
0.000
0.001
50
route_master = *
0
0
0
2
17
1
0
0
0
0.000
0.000
0.000
0.100
0.850
0.050
0.000
0.000
0.000
51
#
infrastructure
2071
324
27
9350
7508
92
39
22
266
0.105
0.016
0.001
0.475
0.381
0.005
0.002
0.001
0.014
52
tunnel = *
978
69
0
3481
1300
18
0
0
16
0.167
0.012
0.000
0.594
0.222
0.003
0.000
0.000
0.003
53
barrier = *
238
233
13
2380
1497
34
0
4
34
0.054
0.053
0.003
0.537
0.338
0.008
0.000
0.001
0.008
54
power = *
815
1
1
1504
2443
12
38
14
90
0.166
0.000
0.000
0.306
0.497
0.002
0.008
0.003
0.018
55
bridge = *
29
7
6
1864
231
10
0
0
1
0.014
0.003
0.003
0.868
0.108
0.005
0.000
0.000
0.000
56
substation = *
592
0
1
890
1044
2
2
10
31
0.230
0.000
0.000
0.346
0.406
0.001
0.001
0.004
0.012
57
emergency = *
1
0
2
147
992
10
0
3
0
0.001
0.000
0.002
0.127
0.859
0.009
0.000
0.003
0.000
58
ele = *
34
1
0
125
147
3
0
0
2
0.109
0.003
0.000
0.401
0.471
0.010
0.000
0.000
0.006
59
man_made = *
59
13
5
474
1014
5
0
1
121
0.035
0.008
0.003
0.280
0.599
0.003
0.000
0.001
0.072
60
embankment = *
4
0
0
212
117
2
0
0
0
0.012
0.000
0.000
0.633
0.349
0.006
0.000
0.000
0.000
61
#
religion
40
127
57
2436
1062
23
10
9
44
0.011
0.033
0.015
0.640
0.279
0.006
0.003
0.002
0.012
62
amenity = place_of_worship
36
55
30
1620
899
18
0
6
10
0.013
0.021
0.011
0.606
0.336
0.007
0.000
0.002
0.004
63
amenity = monastery
0
0
0
12
6
0
1
0
2
0.000
0.000
0.000
0.571
0.286
0.000
0.048
0.000
0.095
64
building = church
0
22
17
930
432
11
5
2
12
0.000
0.015
0.012
0.650
0.302
0.008
0.003
0.001
0.008
65
building = cathedral
0
0
0
21
4
0
0
0
0
0.000
0.000
0.000
0.840
0.160
0.000
0.000
0.000
0.000
66
building = chapel
18
6
2
170
66
1
0
0
4
0.067
0.022
0.007
0.637
0.247
0.004
0.000
0.000
0.015
67
religion = *
39
117
49
2366
1004
23
6
6
30
0.011
0.032
0.013
0.650
0.276
0.006
0.002
0.002
0.008
68 | # | education | 363 | 115 | 35 | 2611 | 1581 | 43 | 67 | 22 | 811 | 0.064 | 0.020 | 0.006 | 0.462 | 0.280 | 0.008 | 0.012 | 0.004 | 0.144
69 |  | landuse = education | 4 | 7 | 0 | 778 | 13 | 2 | 0 | 0 | 0 | 0.005 | 0.009 | 0.000 | 0.968 | 0.016 | 0.002 | 0.000 | 0.000 | 0.000
70 |  | amenity = university | 1 | 6 | 2 | 108 | 7 | 4 | 23 | 1 | 47 | 0.005 | 0.030 | 0.010 | 0.543 | 0.035 | 0.020 | 0.116 | 0.005 | 0.236
71 |  | amenity = college | 1 | 6 | 1 | 127 | 38 | 6 | 12 | 2 | 127 | 0.003 | 0.019 | 0.003 | 0.397 | 0.119 | 0.019 | 0.037 | 0.006 | 0.397
72 |  | amenity = school | 294 | 36 | 22 | 811 | 797 | 9 | 0 | 9 | 15 | 0.148 | 0.018 | 0.011 | 0.407 | 0.400 | 0.005 | 0.000 | 0.005 | 0.008
73 |  | amenity = kindergarten | 19 | 52 | 9 | 1182 | 609 | 19 | 0 | 3 | 10 | 0.010 | 0.027 | 0.005 | 0.621 | 0.320 | 0.010 | 0.000 | 0.002 | 0.005
74 |  | building = university | 3 | 0 | 1 | 42 | 8 | 1 | 18 | 1 | 90 | 0.018 | 0.000 | 0.006 | 0.256 | 0.049 | 0.006 | 0.110 | 0.006 | 0.549
75 |  | building = college | 0 | 2 | 0 | 37 | 12 | 3 | 7 | 0 | 55 | 0.000 | 0.017 | 0.000 | 0.319 | 0.103 | 0.026 | 0.060 | 0.000 | 0.474
76 |  | building = school | 49 | 10 | 1 | 161 | 101 | 1 | 9 | 5 | 334 | 0.073 | 0.015 | 0.001 | 0.240 | 0.151 | 0.001 | 0.013 | 0.007 | 0.498
77 |  | building = kindergarten | 0 | 3 | 0 | 166 | 56 | 0 | 2 | 1 | 141 | 0.000 | 0.008 | 0.000 | 0.450 | 0.152 | 0.000 | 0.005 | 0.003 | 0.382
78 | # | healthcare | 66 | 55 | 20 | 1954 | 2379 | 160 | 63 | 77 | 883 | 0.012 | 0.010 | 0.004 | 0.345 | 0.421 | 0.028 | 0.011 | 0.014 | 0.156
79 |  | amenity = hospital | 7 | 6 | 4 | 248 | 224 | 13 | 17 | 3 | 131 | 0.011 | 0.009 | 0.006 | 0.380 | 0.343 | 0.020 | 0.026 | 0.005 | 0.201
80 |  | amenity = pharmacy | 36 | 35 | 4 | 1040 | 887 | 124 | 8 | 57 | 311 | 0.014 | 0.014 | 0.002 | 0.416 | 0.355 | 0.050 | 0.003 | 0.023 | 0.124
81 |  | amenity = clinic | 3 | 5 | 8 | 317 | 254 | 14 | 14 | 7 | 112 | 0.004 | 0.007 | 0.011 | 0.432 | 0.346 | 0.019 | 0.019 | 0.010 | 0.153
82 |  | amenity = doctors | 13 | 1 | 0 | 88 | 727 | 0 | 10 | 2 | 84 | 0.014 | 0.001 | 0.000 | 0.095 | 0.786 | 0.000 | 0.011 | 0.002 | 0.091
83 |  | amenity = dentist | 0 | 5 | 1 | 95 | 149 | 5 | 5 | 7 | 106 | 0.000 | 0.013 | 0.003 | 0.255 | 0.399 | 0.013 | 0.013 | 0.019 | 0.284
84 |  | building = hospital | 6 | 2 | 3 | 153 | 73 | 3 | 10 | 0 | 155 | 0.015 | 0.005 | 0.007 | 0.378 | 0.180 | 0.007 | 0.025 | 0.000 | 0.383
85 |  | building = clinic | 0 | 1 | 0 | 14 | 6 | 1 | 0 | 0 | 9 | 0.000 | 0.032 | 0.000 | 0.452 | 0.194 | 0.032 | 0.000 | 0.000 | 0.290
86 |  | healthcare = * | 22 | 42 | 12 | 852 | 1647 | 65 | 1 | 44 | 27 | 0.008 | 0.015 | 0.004 | 0.314 | 0.607 | 0.024 | 0.000 | 0.016 | 0.010
87 | # | government | 101 | 237 | 22 | 1431 | 2075 | 49 | 90 | 37 | 923 | 0.020 | 0.048 | 0.004 | 0.288 | 0.418 | 0.010 | 0.018 | 0.007 | 0.186
88 |  | amenity = post_office | 1 | 220 | 4 | 278 | 77 | 12 | 22 | 5 | 463 | 0.001 | 0.203 | 0.004 | 0.257 | 0.071 | 0.011 | 0.020 | 0.005 | 0.428
89 |  | amenity = police | 3 | 1 | 3 | 254 | 140 | 10 | 37 | 10 | 249 | 0.004 | 0.001 | 0.004 | 0.359 | 0.198 | 0.014 | 0.052 | 0.014 | 0.352
90 |  | amenity = library | 2 | 6 | 4 | 203 | 46 | 3 | 31 | 2 | 202 | 0.004 | 0.012 | 0.008 | 0.407 | 0.092 | 0.006 | 0.062 | 0.004 | 0.405
91 |  | office = government | 59 | 8 | 10 | 482 | 1271 | 22 | 0 | 18 | 3 | 0.032 | 0.004 | 0.005 | 0.257 | 0.679 | 0.012 | 0.000 | 0.010 | 0.002
92 |  | landuse = military | 22 | 1 | 1 | 118 | 225 | 2 | 0 | 0 | 1 | 0.059 | 0.003 | 0.003 | 0.319 | 0.608 | 0.005 | 0.000 | 0.000 | 0.003
93 |  | government = * | 40 | 6 | 4 | 249 | 406 | 14 | 0 | 3 | 2 | 0.055 | 0.008 | 0.006 | 0.344 | 0.561 | 0.019 | 0.000 | 0.004 | 0.003
94 |  | military = * | 22 | 3 | 0 | 153 | 383 | 0 | 0 | 2 | 6 | 0.039 | 0.005 | 0.000 | 0.269 | 0.673 | 0.000 | 0.000 | 0.004 | 0.011
95 | # | office | 244 | 9 | 16 | 1174 | 3590 | 26 | 1 | 56 | 156 | 0.046 | 0.002 | 0.003 | 0.223 | 0.681 | 0.005 | 0.000 | 0.011 | 0.030
96 |  | office = * | 244 | 9 | 16 | 1174 | 3590 | 26 | 1 | 56 | 156 | 0.046 | 0.002 | 0.003 | 0.223 | 0.681 | 0.005 | 0.000 | 0.011 | 0.030
97 | # | tourism | 362 | 259 | 327 | 5993 | 6798 | 748 | 13 | 75 | 666 | 0.024 | 0.017 | 0.021 | 0.393 | 0.446 | 0.049 | 0.001 | 0.005 | 0.044
98 |  | tourism = * | 243 | 172 | 170 | 4208 | 3832 | 717 | 10 | 51 | 313 | 0.025 | 0.018 | 0.017 | 0.433 | 0.394 | 0.074 | 0.001 | 0.005 | 0.032
99 |  | historic = * | 118 | 108 | 159 | 2152 | 3064 | 39 | 2 | 25 | 352 | 0.020 | 0.018 | 0.026 | 0.358 | 0.509 | 0.006 | 0.000 | 0.004 | 0.058
100 |  | memorial = * | 27 | 30 | 43 | 524 | 1359 | 16 | 1 | 9 | 303 | 0.012 | 0.013 | 0.019 | 0.227 | 0.588 | 0.007 | 0.000 | 0.004 | 0.131
101 |  | ruins = * | 5 | 16 | 0 | 62 | 177 | 2 | 0 | 0 | 0 | 0.019 | 0.061 | 0.000 | 0.237 | 0.676 | 0.008 | 0.000 | 0.000 | 0.000
102 |  | information = * | 8 | 17 | 10 | 113 | 198 | 629 | 1 | 1 | 6 | 0.008 | 0.017 | 0.010 | 0.115 | 0.201 | 0.640 | 0.001 | 0.001 | 0.006
103 |  | attraction = * | 13 | 3 | 7 | 52 | 93 | 3 | 1 | 0 | 1 | 0.075 | 0.017 | 0.040 | 0.301 | 0.538 | 0.017 | 0.006 | 0.000 | 0.006
104 |  | resort = * | 2 | 0 | 0 | 26 | 110 | 1 | 0 | 1 | 2 | 0.014 | 0.000 | 0.000 | 0.183 | 0.775 | 0.007 | 0.000 | 0.007 | 0.014
105 |  | artwork_type = * | 27 | 18 | 29 | 174 | 370 | 27 | 4 | 14 | 90 | 0.036 | 0.024 | 0.039 | 0.231 | 0.491 | 0.036 | 0.005 | 0.019 | 0.120
106 | # | amenity | 4610 | 688 | 254 | 14982 | 7987 | 1065 | 1237 | 560 | 16906 | 0.095 | 0.014 | 0.005 | 0.310 | 0.165 | 0.022 | 0.026 | 0.012 | 0.350
107 |  | amenity = cafe | 159 | 20 | 14 | 746 | 353 | 32 | 88 | 40 | 1009 | 0.065 | 0.008 | 0.006 | 0.303 | 0.143 | 0.013 | 0.036 | 0.016 | 0.410
108 |  | amenity = atm | 840 | 2 | 1 | 512 | 38 | 12 | 1 | 0 | 48 | 0.578 | 0.001 | 0.001 | 0.352 | 0.026 | 0.008 | 0.001 | 0.000 | 0.033
109 |  | amenity = bank | 1104 | 19 | 1 | 615 | 88 | 74 | 4 | 10 | 158 | 0.533 | 0.009 | 0.000 | 0.297 | 0.042 | 0.036 | 0.002 | 0.005 | 0.076
110 |  | amenity = fast_food | 43 | 1 | 10 | 252 | 102 | 15 | 34 | 7 | 294 | 0.057 | 0.001 | 0.013 | 0.332 | 0.135 | 0.020 | 0.045 | 0.009 | 0.388
111 |  | amenity = fuel | 68 | 3 | 0 | 496 | 110 | 52 | 17 | 31 | 349 | 0.060 | 0.003 | 0.000 | 0.440 | 0.098 | 0.046 | 0.015 | 0.028 | 0.310
112 |  | amenity = community_centre | 171 | 11 | 2 | 129 | 138 | 1 | 29 | 7 | 254 | 0.230 | 0.015 | 0.003 | 0.174 | 0.186 | 0.001 | 0.039 | 0.009 | 0.342
113 |  | amenity = restaurant | 46 | 14 | 5 | 196 | 66 | 6 | 30 | 11 | 229 | 0.076 | 0.023 | 0.008 | 0.325 | 0.109 | 0.010 | 0.050 | 0.018 | 0.380
114 |  | amenity = bar | 36 | 6 | 4 | 89 | 80 | 4 | 22 | 6 | 224 | 0.076 | 0.013 | 0.008 | 0.189 | 0.170 | 0.008 | 0.047 | 0.013 | 0.476
115 |  | shop = convenience | 696 | 96 | 82 | 2689 | 1335 | 426 | 113 | 148 | 1728 | 0.095 | 0.013 | 0.011 | 0.368 | 0.183 | 0.058 | 0.015 | 0.020 | 0.236
116 |  | shop = clothes | 42 | 23 | 2 | 215 | 208 | 7 | 15 | 15 | 593 | 0.037 | 0.021 | 0.002 | 0.192 | 0.186 | 0.006 | 0.013 | 0.013 | 0.529
117 |  | shop = car_repair | 12 | 3 | 0 | 256 | 127 | 13 | 29 | 22 | 671 | 0.011 | 0.003 | 0.000 | 0.226 | 0.112 | 0.011 | 0.026 | 0.019 | 0.592
118 |  | shop = hairdresser | 30 | 13 | 3 | 315 | 185 | 6 | 36 | 2 | 485 | 0.028 | 0.012 | 0.003 | 0.293 | 0.172 | 0.006 | 0.033 | 0.002 | 0.451
119 |  | shop = chemist | 2 | 1 | 1 | 1013 | 261 | 26 | 31 | 5 | 138 | 0.001 | 0.001 | 0.001 | 0.685 | 0.177 | 0.018 | 0.021 | 0.003 | 0.093
120 |  | shop = supermarket | 260 | 18 | 3 | 410 | 141 | 164 | 5 | 25 | 223 | 0.208 | 0.014 | 0.002 | 0.328 | 0.113 | 0.131 | 0.004 | 0.020 | 0.179
121 |  | shop = car_parts | 2 | 1 | 0 | 199 | 237 | 6 | 40 | 12 | 414 | 0.002 | 0.001 | 0.000 | 0.218 | 0.260 | 0.007 | 0.044 | 0.013 | 0.454
122 |  | shop = furniture | 20 | 2 | 0 | 203 | 143 | 12 | 11 | 28 | 326 | 0.027 | 0.003 | 0.000 | 0.272 | 0.192 | 0.016 | 0.015 | 0.038 | 0.438
123 |  | shop = hardware | 23 | 13 | 5 | 196 | 150 | 8 | 18 | 9 | 302 | 0.032 | 0.018 | 0.007 | 0.271 | 0.207 | 0.011 | 0.025 | 0.012 | 0.417
124 |  | shop = kiosk | 9 | 35 | 4 | 149 | 54 | 9 | 17 | 2 | 116 | 0.023 | 0.089 | 0.010 | 0.377 | 0.137 | 0.023 | 0.043 | 0.005 | 0.294
125 |  | shop = doityourself | 22 | 2 | 8 | 178 | 113 | 4 | 12 | 10 | 374 | 0.030 | 0.003 | 0.011 | 0.246 | 0.156 | 0.006 | 0.017 | 0.014 | 0.517
126 |  | shop = pet | 4 | 0 | 0 | 94 | 74 | 2 | 19 | 4 | 278 | 0.008 | 0.000 | 0.000 | 0.198 | 0.156 | 0.004 | 0.040 | 0.008 | 0.585
127 |  | shop = florist | 18 | 3 | 1 | 171 | 64 | 4 | 14 | 7 | 243 | 0.034 | 0.006 | 0.002 | 0.326 | 0.122 | 0.008 | 0.027 | 0.013 | 0.463
128 |  | shop = beauty | 11 | 0 | 2 | 39 | 96 | 2 | 16 | 11 | 304 | 0.023 | 0.000 | 0.004 | 0.081 | 0.200 | 0.004 | 0.033 | 0.023 | 0.632
129 |  | shop = mobile_phone | 140 | 6 | 0 | 78 | 147 | 1 | 1 | 3 | 127 | 0.278 | 0.012 | 0.000 | 0.155 | 0.292 | 0.002 | 0.002 | 0.006 | 0.252
130 |  | shop = shoes | 19 | 3 | 0 | 79 | 29 | 2 | 7 | 2 | 117 | 0.074 | 0.012 | 0.000 | 0.306 | 0.112 | 0.008 | 0.027 | 0.008 | 0.453
131 |  | shop = newsagent | 30 | 59 | 17 | 248 | 71 | 23 | 0 | 1 | 85 | 0.056 | 0.110 | 0.032 | 0.464 | 0.133 | 0.043 | 0.000 | 0.002 | 0.159
132 |  | shop = electronics | 63 | 2 | 1 | 74 | 96 | 0 | 10 | 11 | 192 | 0.140 | 0.004 | 0.002 | 0.165 | 0.214 | 0.000 | 0.022 | 0.024 | 0.428
133 |  | shop = alcohol | 8 | 4 | 2 | 94 | 72 | 5 | 10 | 2 | 229 | 0.019 | 0.009 | 0.005 | 0.221 | 0.169 | 0.012 | 0.023 | 0.005 | 0.538
134 |  | shop = jewelry | 3 | 2 | 1 | 85 | 45 | 16 | 26 | 0 | 164 | 0.009 | 0.006 | 0.003 | 0.249 | 0.132 | 0.047 | 0.076 | 0.000 | 0.480
135 |  | shop = mall | 22 | 5 | 2 | 120 | 46 | 15 | 16 | 3 | 140 | 0.060 | 0.014 | 0.005 | 0.325 | 0.125 | 0.041 | 0.043 | 0.008 | 0.379
136 |  | shop = butcher | 16 | 6 | 6 | 82 | 92 | 4 | 19 | 5 | 160 | 0.041 | 0.015 | 0.015 | 0.210 | 0.236 | 0.010 | 0.049 | 0.013 | 0.410
137 |  | shop = cosmetics | 1 | 2 | 0 | 71 | 58 | 5 | 18 | 10 | 115 | 0.004 | 0.007 | 0.000 | 0.254 | 0.207 | 0.018 | 0.064 | 0.036 | 0.411
138 |  | craft = shoemaker | 1 | 2 | 0 | 137 | 17 | 5 | 15 | 0 | 48 | 0.004 | 0.009 | 0.000 | 0.609 | 0.076 | 0.022 | 0.067 | 0.000 | 0.213
139 |  | amenity = * | 368 | 193 | 25 | 1930 | 1223 | 37 | 146 | 51 | 2460 | 0.057 | 0.030 | 0.004 | 0.300 | 0.190 | 0.006 | 0.023 | 0.008 | 0.382
140 |  | shop = * | 245 | 90 | 39 | 1737 | 1287 | 86 | 271 | 52 | 3288 | 0.035 | 0.013 | 0.005 | 0.245 | 0.181 | 0.012 | 0.038 | 0.007 | 0.463
141 |  | leisure = * | 89 | 37 | 14 | 1330 | 746 | 11 | 103 | 19 | 1134 | 0.026 | 0.011 | 0.004 | 0.382 | 0.214 | 0.003 | 0.030 | 0.005 | 0.326
142 |  | sport = * | 15 | 8 | 4 | 249 | 86 | 4 | 37 | 3 | 373 | 0.019 | 0.010 | 0.005 | 0.320 | 0.110 | 0.005 | 0.047 | 0.004 | 0.479
143 |  | clothes = * | 4 | 9 | 0 | 75 | 31 | 1 | 3 | 3 | 102 | 0.018 | 0.039 | 0.000 | 0.329 | 0.136 | 0.004 | 0.013 | 0.013 | 0.447
144 | # | building | 2276 | 539 | 217 | 7406 | 6615 | 381 | 534 | 162 | 12093 | 0.075 | 0.018 | 0.007 | 0.245 | 0.219 | 0.013 | 0.018 | 0.005 | 0.400
145 |  | building = industrial | 32 | 8 | 2 | 226 | 506 | 4 | 15 | 4 | 820 | 0.020 | 0.005 | 0.001 | 0.140 | 0.313 | 0.002 | 0.009 | 0.002 | 0.507
146 |  | building = service | 618 | 1 | 12 | 58 | 413 | 2 | 5 | 0 | 277 | 0.446 | 0.001 | 0.009 | 0.042 | 0.298 | 0.001 | 0.004 | 0.000 | 0.200
147 |  | building = retail | 252 | 36 | 36 | 665 | 267 | 116 | 47 | 34 | 494 | 0.129 | 0.018 | 0.018 | 0.342 | 0.137 | 0.060 | 0.024 | 0.017 | 0.254
148 |  | building = commercial | 23 | 9 | 4 | 129 | 115 | 8 | 16 | 2 | 287 | 0.039 | 0.015 | 0.007 | 0.218 | 0.194 | 0.013 | 0.027 | 0.003 | 0.484
149 |  | building = warehouse | 19 | 1 | 0 | 30 | 33 | 0 | 2 | 0 | 157 | 0.079 | 0.004 | 0.000 | 0.124 | 0.136 | 0.000 | 0.008 | 0.000 | 0.649
150 |  | building = public | 23 | 10 | 4 | 158 | 60 | 5 | 9 | 2 | 96 | 0.063 | 0.027 | 0.011 | 0.431 | 0.163 | 0.014 | 0.025 | 0.005 | 0.262
151 |  | building = dormitory | 0 | 3 | 1 | 109 | 71 | 2 | 53 | 1 | 367 | 0.000 | 0.005 | 0.002 | 0.180 | 0.117 | 0.003 | 0.087 | 0.002 | 0.605
152 |  | building = warehouse | 19 | 1 | 0 | 30 | 33 | 0 | 2 | 0 | 157 | 0.079 | 0.004 | 0.000 | 0.124 | 0.136 | 0.000 | 0.008 | 0.000 | 0.649
153 |  | building = * | 1309 | 471 | 158 | 6031 | 5150 | 244 | 387 | 119 | 9595 | 0.056 | 0.020 | 0.007 | 0.257 | 0.219 | 0.010 | 0.016 | 0.005 | 0.409
154 | # | water | 3733 | 177 | 0 | 14246 | 5228 | 151 | 0 | 5 | 21 | 0.158 | 0.008 | 0.000 | 0.605 | 0.222 | 0.006 | 0.000 | 0.000 | 0.001
155 |  | waterway = drain | 14 | 2 | 0 | 109 | 301 | 0 | 0 | 0 | 5 | 0.032 | 0.005 | 0.000 | 0.253 | 0.698 | 0.000 | 0.000 | 0.000 | 0.012
156 |  | waterway = ditch | 25 | 0 | 0 | 153 | 457 | 4 | 0 | 1 | 0 | 0.039 | 0.000 | 0.000 | 0.239 | 0.714 | 0.006 | 0.000 | 0.002 | 0.000
157 |  | waterway = stream | 1042 | 46 | 0 | 3214 | 1909 | 30 | 0 | 0 | 1 | 0.167 | 0.007 | 0.000 | 0.515 | 0.306 | 0.005 | 0.000 | 0.000 | 0.000
158 |  | waterway = river | 2584 | 105 | 0 | 8233 | 978 | 83 | 0 | 1 | 3 | 0.216 | 0.009 | 0.000 | 0.687 | 0.082 | 0.007 | 0.000 | 0.000 | 0.000
159 |  | waterway = canal | 11 | 1 | 0 | 312 | 242 | 7 | 0 | 0 | 6 | 0.019 | 0.002 | 0.000 | 0.539 | 0.418 | 0.012 | 0.000 | 0.000 | 0.010
160 |  | type = waterway | 375 | 10 | 0 | 1248 | 364 | 59 | 0 | 2 | 0 | 0.182 | 0.005 | 0.000 | 0.606 | 0.177 | 0.029 | 0.000 | 0.001 | 0.000
161 |  | natural = water | 29 | 8 | 0 | 2013 | 965 | 26 | 0 | 2 | 4 | 0.010 | 0.003 | 0.000 | 0.661 | 0.317 | 0.009 | 0.000 | 0.001 | 0.001
162 |  | natural = spring | 28 | 10 | 0 | 193 | 355 | 1 | 0 | 1 | 2 | 0.047 | 0.017 | 0.000 | 0.327 | 0.602 | 0.002 | 0.000 | 0.002 | 0.003
163 |  | waterway = * | 0 | 4 | 0 | 13 | 20 | 0 | 0 | 0 | 0 | 0.000 | 0.108 | 0.000 | 0.351 | 0.541 | 0.000 | 0.000 | 0.000 | 0.000
164 |  | water = * | 22 | 5 | 0 | 1978 | 902 | 26 | 0 | 2 | 3 | 0.007 | 0.002 | 0.000 | 0.673 | 0.307 | 0.009 | 0.000 | 0.001 | 0.001
165 | # | natural | 353 | 110 | 80 | 3308 | 2177 | 33 | 12 | 11 | 121 | 0.057 | 0.018 | 0.013 | 0.533 | 0.351 | 0.005 | 0.002 | 0.002 | 0.020
166 |  | place = island | 1 | 3 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0.083 | 0.250 | 0.000 | 0.667 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
167 |  | place = islet | 3 | 1 | 0 | 62 | 19 | 2 | 0 | 0 | 0 | 0.034 | 0.011 | 0.000 | 0.713 | 0.218 | 0.023 | 0.000 | 0.000 | 0.000
168 |  | boundary = * | 107 | 1 | 4 | 1336 | 106 | 12 | 12 | 5 | 80 | 0.064 | 0.001 | 0.002 | 0.803 | 0.064 | 0.007 | 0.007 | 0.003 | 0.048
169 |  | natural = * | 112 | 33 | 49 | 620 | 656 | 4 | 0 | 3 | 11 | 0.075 | 0.022 | 0.033 | 0.417 | 0.441 | 0.003 | 0.000 | 0.002 | 0.007
170 |  | landuse = * | 137 | 74 | 27 | 1416 | 1429 | 17 | 0 | 3 | 31 | 0.044 | 0.024 | 0.009 | 0.452 | 0.456 | 0.005 | 0.000 | 0.001 | 0.010
171 | # | other | 67 | 19 | 6 | 401 | 159 | 7 | 23 | 4 | 785 | 0.046 | 0.013 | 0.004 | 0.273 | 0.108 | 0.005 | 0.016 | 0.003 | 0.534
172 | # | TOTAL | 24590 | 15579 | 989 | 290971 | 49027 | 2537 | 1770 | 971 | 30594 | 0.059 | 0.037 | 0.002 | 0.698 | 0.118 | 0.006 | 0.004 | 0.002 | 0.073
# `pd`, `dependants` and `dependant_key_counter` come from earlier cells.
# Build one summary row per dependent tag (addr:*, from, to, destination, ...).
data = []
for dependant in list(dependants):
    name_all_cnt = dependant_key_counter[dependant]['all']
    name_be_cnt = dependant_key_counter[dependant]['be']
    name_ru_cnt = dependant_key_counter[dependant]['ru']
    name_name_cnt = dependant_key_counter[dependant]['name']
    name_no_cnt = dependant_key_counter[dependant]['not found']
    total = (name_all_cnt + name_be_cnt + name_ru_cnt + name_name_cnt + name_no_cnt)
    data.append([
        '#', dependant,
        name_all_cnt,
        name_be_cnt,
        name_ru_cnt,
        name_name_cnt,
        name_no_cnt,
        # `total or 1` avoids division by zero for tags with no values.
        name_all_cnt / (total or 1),
        name_be_cnt / (total or 1),
        name_ru_cnt / (total or 1),
        name_name_cnt / (total or 1),
        name_no_cnt / (total or 1),
    ])

df = pd.DataFrame(data, columns=[
    'lvl', 'category',
    'name all', 'name be', 'name ru', 'name only', 'not found',
    'name all%', 'name be%', 'name ru%', 'name only%', 'not found%',
])
# Share columns get a colour gradient and three-decimal formatting.
pct_cols = [c for c in df.columns if c.endswith('%')]
(
    df
    .style
    .set_properties(subset=['category'], **{'text-align': 'left'})
    .set_properties(subset=['name all', 'name be'], **{'background-color': '#d9ead3'})
    .set_properties(subset=['name ru', 'name only'], **{'background-color': '#fff2cc'})
    .set_properties(subset=['not found'], **{'background-color': '#f4cccc'})
    .background_gradient('YlOrRd', subset=pct_cols, vmin=0, vmax=1)
    .format({c: '{:.3f}' for c in pct_cols})
    # Rows marked '#' in the lvl column are rendered in bold.
    .apply(lambda row: [('font-weight: bold' if row.loc['lvl'] == '#' else '') for _ in row], axis=1)
)
 | lvl | category | name all | name be | name ru | name only | not found | name all% | name be% | name ru% | name only% | not found%
0 | # | addr:region | 54988 | 0 | 0 | 0 | 0 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
1 | # | addr:district | 54682 | 0 | 0 | 0 | 108 | 0.998 | 0.000 | 0.000 | 0.000 | 0.002
2 | # | addr:subdistrict | 12994 | 0 | 0 | 0 | 0 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
3 | # | addr:city | 114361 | 0 | 130 | 0 | 2 | 0.999 | 0.000 | 0.001 | 0.000 | 0.000
4 | # | addr:place | 36954 | 0 | 2 | 0 | 54 | 0.998 | 0.000 | 0.000 | 0.000 | 0.001
5 | # | addr:street | 860120 | 11 | 1507 | 81 | 354 | 0.998 | 0.000 | 0.002 | 0.000 | 0.000
6 | # | addr2:street | 629 | 0 | 0 | 0 | 0 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
7 | # | from | 724 | 0 | 15 | 2 | 271 | 0.715 | 0.000 | 0.015 | 0.002 | 0.268
8 | # | to | 705 | 0 | 12 | 3 | 189 | 0.776 | 0.000 | 0.013 | 0.003 | 0.208
9 | # | via | 68 | 0 | 0 | 0 | 40 | 0.630 | 0.000 | 0.000 | 0.000 | 0.370
10 | # | destination | 847 | 0 | 0 | 0 | 680 | 0.555 | 0.000 | 0.000 | 0.000 | 0.445
11 | # | destination:backward | 57 | 0 | 0 | 0 | 102 | 0.358 | 0.000 | 0.000 | 0.000 | 0.642
12 | # | destination:forward | 88 | 0 | 0 | 0 | 94 | 0.484 | 0.000 | 0.000 | 0.000 | 0.516
13 | # | water_tank:city | 1 | 0 | 0 | 0 | 0 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
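Each share column in the table above is a count divided by the row total, with the same `total or 1` guard the cell uses for empty rows. The sketch below recomputes the shares for the `from` row as a sanity check. The helper name `name_shares` is hypothetical and not part of the notebook; the counts are copied from the table.

```python
def name_shares(counts):
    """Return each count as a fraction of the row total.

    `name_shares` is an illustrative helper, not a notebook function.
    """
    total = sum(counts.values())
    # Same guard as the notebook cell: divide by 1 when the row is empty.
    return {key: value / (total or 1) for key, value in counts.items()}


# Counts copied from the `from` row of the table above.
from_row = {'all': 724, 'be': 0, 'ru': 15, 'name': 2, 'not found': 271}
shares = name_shares(from_row)
print({key: f'{value:.3f}' for key, value in shares.items()})
# {'all': '0.715', 'be': '0.000', 'ru': '0.015', 'name': '0.002', 'not found': '0.268'}
```

The printed values match the `name all%` to `not found%` cells of that row, so the table can be spot-checked without rerunning the full analysis.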