-
Notifications
You must be signed in to change notification settings - Fork 819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggested additional indexes #207
Comments
Paul, that's interesting research. Do you know how much additional indexes decrease diff update time and if they degrade after a while when applying diffs? |
No. They obviously do some amount - all indexes increase insert time, including the standard geometry ones. Unfortunately, the test sequence for this is load the full planet data, build indexes, time applying a large set of diffs, repeat, then do the same for a different set of indexes. I do not expect these would have a significant negative impact on update time, and for a server like the tile.osm.org ones where rendering is most of the load, it should be hugely helpful.
All indexes do - it's essential to See postgresql index storage parameters for more information. |
Did you try the ferry index too? These are the index commands I used when the DB was reloaded recently on Yevaud. They were first derived a couple of years ago and may no longer match the queries generated by the Carto style but they did seem to reduce the database disk IO. gis=# CREATE INDEX ferry_idx ON planet_osm_line USING gist (way)
WHERE (route = 'ferry'::text);
CREATE INDEX
Time: 212686.663 ms
gis=# CREATE INDEX water_lines_idx ON planet_osm_line USING gist (way)
WHERE (waterway = 'river'::text);
CREATE INDEX
Time: 245527.051 ms
gis=# CREATE INDEX national_park_idx ON planet_osm_polygon USING gist (way)
WHERE (boundary = 'national_park'::text);
CREATE INDEX
Time: 200265.090 ms
gis=# CREATE INDEX water_areas_idx ON planet_osm_polygon USING gist (way)
WHERE (((waterway IS NOT NULL)
OR (landuse = ANY (ARRAY['reservoir'::text, 'water'::text, 'basin'::text])))
OR ("natural" IS NOT NULL));
CREATE INDEX
Time: 362130.281 ms
gis=# CREATE INDEX leisure_polygon_idx ON planet_osm_polygon USING gist (way)
WHERE (((((((((landuse IS NOT NULL)
OR (leisure IS NOT NULL))
OR (aeroway IS NOT NULL))
OR (amenity IS NOT NULL))
OR (military IS NOT NULL))
OR ("natural" IS NOT NULL))
OR (power IS NOT NULL))
OR (tourism IS NOT NULL))
OR (highway IS NOT NULL));
CREATE INDEX
Time: 692723.212 ms
gis=# analyze planet_osm_polygon;
ANALYZE
Time: 3631.508 ms
gis=# ANALYZE planet_osm_line;
ANALYZE
Time: 9027.941 ms
gis=# |
Working my way there :)
I'm hoping that the Can you check the number of scans with |
I did spend a while thinking about using 'building is null' once before. I think I came to the conclusion that it would not be useful on its own.
gis=# SELECT relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch FROM pg_stat_all_indexes WHERE
schemaname = 'public';
relname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
--------------------+--------------------------+------------+---------------+---------------
spatial_ref_sys | spatial_ref_sys_pkey | 58 | 42451 | 34
planet_osm_nodes | planet_osm_nodes_pkey | 0 | 0 | 0
planet_osm_rels | planet_osm_rels_parts | 113021697 | 26921754 | 0
planet_osm_rels | planet_osm_rels_idx | 79139 | 2323268 | 1039537
planet_osm_rels | planet_osm_rels_pkey | 2503925 | 3850694 | 2385537
planet_osm_ways | planet_osm_ways_nodes | 88829336 | 49482411 | 0
planet_osm_ways | planet_osm_ways_idx | 78981 | 139448491 | 9345884
planet_osm_ways | planet_osm_ways_pkey | 478998833 | 483055856 | 470937294
planet_osm_roads | planet_osm_roads_pkey | 26146323 | 11303403 | 2586441
planet_osm_roads | planet_osm_roads_index | 81408599 | 14710768229 | 2301551862
planet_osm_point | planet_osm_point_pkey | 97114726 | 1498889 | 1356590
planet_osm_point | planet_osm_point_index | 780215985 | 293482811310 | 42034817139
planet_osm_line | water_lines_idx | 220329 | 18794281 | 9230207
planet_osm_line | ferry_idx | 19912711 | 8027972 | 6257047
planet_osm_line | planet_osm_line_pkey | 26146330 | 47317105 | 26643291
planet_osm_line | planet_osm_line_index | 1244288415 | 1209981428900 | 211810654512
planet_osm_polygon | leisure_polygon_idx | 117129517 | 29672063052 | 7492540844
planet_osm_polygon | water_areas_idx | 118773853 | 9762978659 | 2286235650
planet_osm_polygon | national_park_idx | 20787338 | 6961639 | 3075306
planet_osm_polygon | planet_osm_polygon_pkey | 26146335 | 27158346 | 4420594
planet_osm_polygon | planet_osm_polygon_index | 830025980 | 879337830585 | 136794125324
(21 rows) |
Well, I was unsure, that's why I ran the tests, which showed the noticeable improvement. For reference, the 5 layers with |
Indexes ruled out: Anything general on boundary, anything on boundary on the roads table, partial gist on man_made=cutline. Giving postgres enough |
Ferry is the top time-consuming SQL for me. BEFORE
AFTER CREATE INDEX ferry_idx ON planet_osm_line USING gist (way)
|
That should be doing |
It depends entirely on the zoom. But yes, at this zoom, it should be index scanning. If the index exists, you're getting a horrible query plan, either from bad statistics or cost parameters. |
the index does exist, but I don't why postgres does not use it. this is the index before I created a ferry one CREATE INDEX planet_osm_line_index
ON planet_osm_line
USING gist
(way); I am using ubuntu 14.04, everything from the packages, and imported the whole planet using
|
I found a similar issue gis=> EXPLAIN ANALYZE SELECT ST_AsBinary("way") AS geom,"barrier" FROM (select way,barrier from planet_osm_polygon where barrier is not null) as area_barriers WHERE "way" && ST_SetSRID('BOX3D(-19940891.93904749 -6576030.417433815,-19918878.07490134 -6554016.553287672)'::box3d, 900913);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on planet_osm_polygon (cost=0.00..6879888.83 rows=29933854 width=64) (actual time=52122.745..52122.745 rows=0 loops=1)
Filter: ((barrier IS NOT NULL) AND (way && '010300002031BF0D000100000005000000A95606BF610473C1523CB79AE71559C1A95606BF610473C1B2106923680059C1BFCB32E101FF72C1B2106923680059C1BFCB32E101FF72C1523CB79AE71559C1A95606BF610473C1523CB79AE71559C1'::geometry))
Rows Removed by Filter: 150421368
Total runtime: 52122.775 ms
(4 rows)
gis=>
gis=> set enable_seqscan = off;
SET
gis=> EXPLAIN ANALYZE SELECT ST_AsBinary("way") AS geom,"barrier" FROM (select way,barrier from planet_osm_polygon where barrier is not null) as area_barriers WHERE "way" && ST_SetSRID('BOX3D(-19940891.93904749 -6576030.417433815,-19918878.07490134 -6554016.553287672)'::box3d, 900913);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on planet_osm_polygon (cost=1534352.07..6910027.15 rows=29933854 width=64) (actual time=0.070..0.070 rows=0 loops=1)
Recheck Cond: (way && '010300002031BF0D000100000005000000A95606BF610473C1523CB79AE71559C1A95606BF610473C1B2106923680059C1BFCB32E101FF72C1B2106923680059C1BFCB32E101FF72C1523CB79AE71559C1A95606BF610473C1523CB79AE71559C1'::geometry)
Filter: (barrier IS NOT NULL)
Rows Removed by Filter: 1
-> Bitmap Index Scan on planet_osm_polygon_index (cost=0.00..1526868.61 rows=30084275 width=0) (actual time=0.062..0.062 rows=1 loops=1)
Index Cond: (way && '010300002031BF0D000100000005000000A95606BF610473C1523CB79AE71559C1A95606BF610473C1B2106923680059C1BFCB32E101FF72C1B2106923680059C1BFCB32E101FF72C1523CB79AE71559C1A95606BF610473C1523CB79AE71559C1'::geometry)
Total runtime: 0.610 ms
(7 rows) |
Finally found the reason: the problem is index not used by query planner, and query planner need table statistics to work, but when autovacuum is off, ANALYZE is NOT run automatically. this is on by default so most people not encounter this problem. so the solution for slow rendering is: do a manual ANALYZE ANALYZE; everything works. |
As I indicated above, these don't appear to be issues with the stylesheet, but with your database statistics, or cost parameters in postgresql.conf. |
For #1736 I loaded data into my testing server, and created some partial indexes, to see what is tested. The indexes created, paramaterized for furry-sanza are
|
I pre-rendered in two stages: z0-z6 worldwide and z7-z12 in the US northeast. I captured the index usage stats, took differences, and got the index stats for each stage. z0-z6
z7-z12
|
Updated polygon way area recommendations. It turns out that you need a bigger difference between index size than I realized for it to matter, except for cache contention. 233 is 0.01*0.9999 of a z10 pixel, and z5 for 239000. For the planet I had, the indexes are 10GB and 494 MB. The non-partial index is 20GB. |
These custom indexes should speed up rendering, particularly at low and middle zooms. Fixes gravitystorm#207
#16 suggests non-standard indexes.
What non-standard indexes should we be suggesting for performance reasons?
The text was updated successfully, but these errors were encountered: