TileMap update_bitmask_region() poor performance #25731

danboo · 2019-02-09T07:33:16Z

Godot version:
3.1 beta 3

OS/device including version:
Windows 10

Issue description:
Working with a large (millions of cells) TileMap I found that updating the autotile bitmask using the easiest, built-in method was slower than I could afford. After trying some alternate methods I found faster ways to update, but the result was unintuitive.

On my computer with a 1000x1000 TileMap I got the following results:

tilemap.update_bitmask_region() - 7.1 seconds

This is letting the engine update the entire TileMap by calling the method with no arguments. I'd actually expect this to be the fastest way to update the tilemap since it isn't iterating through a GDScript loop and calling into engine code. It is wholly contained in the C++ code.
tilemap.update_bitmask_region(Vector2(min_x, min_y), Vector2(max_x, max_y)) - 1.7 seconds

I thought this would have the same performance as the default call in example 1, just with explicit bounds.
tilemap.update_bitmask_area(Vector2(col_i, row_i)) - 0.9 seconds

In this method I iterate through the rows and columns in GDScript, calling this method for every 3rd cell. I thought this would be the slowest method since it has the overhead of the GDScript loop and method calls. The gist of it is:

	for i in range(0, dimensions.y+1, 3):
		for j in range(0, dimensions.x+1, 3):
			tilemap.update_bitmask_area(Vector2(j,i))

I'm not surprised that updating a million tiles takes a measurable amount of time, just that the method of iterating and calling methods in GDScript was faster than the single-call methods that stay within the C++ code. It seems like implementing method 3 in C++ might even be that much more efficient.

Steps to reproduce:
Install the referenced project and use the 3 buttons to try each method.

Minimal reproduction project:
https://github.com/danboo/Godot-SlowTileMapBitmaskUpdate

The text was updated successfully, but these errors were encountered:

zzz-assault · 2020-03-18T22:27:21Z

The issue is still the same in 3.2.1

sian2005 · 2020-04-18T15:48:33Z

Yeah this is pretty laggy, it takes my pc 7 seconds to update the bitmask on a 500 by 500 tiles area.

zzz-assault · 2020-04-19T19:12:04Z

Just had a look onto the source code (I'm no C++ programmer, but it's understandable). Regarding the huge difference between point 1 and 2, this is related to a fallback if no vector2 (start & end) are given, which is using the tilemap.update_bitmask_area() for all tiles, therefor resulting in a nested double loop in the already existing double loop of tilemap.update_bitmask_region().

Triggered by the second part of the if statement (p_end.x == p_start.x && p_end.y == p_start.y).

void TileMap::update_bitmask_region(const Vector2 &p_start, const Vector2 &p_end) {

	if ((p_end.x < p_start.x || p_end.y < p_start.y) || (p_end.x == p_start.x && p_end.y == p_start.y)) {

		int i;

		Array a = get_used_cells();

		for (i = 0; i < a.size(); i++) {

			// update_bitmask_area() in order to update cells adjacent to the

			// current cell, since ordering in array may not be reliable

			Vector2 vector = (Vector2)a[i];

			update_bitmask_area(Vector2(vector.x, vector.y));

		}

		return;

	}

zzz-assault · 2020-04-26T22:33:03Z

For 2 a short workaround / performance trick :)

As long as the tilemap is not loaded to scene tree (add child) it's thread save, which means, we can use multi threading via GD-Script (THREAD). This also applies to the bitmask function, so we can use the below code, to boost the performance by the nr. of available CPU threads.

Sample Code for use of THREAD:

const map_size : Vector2 = Vector2(1024, 1024)
var map_threads : Array = []
var threads_finished_count : int = 0
var sample_tilemap : TileMap = load("res://...")

func _ready() -> void:
	for i in range (10):
		map_threads.append(Thread.new())
	_update_bitmask(sample_tilemap)

func _update_bitmask(sample_tilemap : TileMap) -> void:
	threads_finished_count = 0
	for i in range (map_threads.size()):
		var thread_data : Array = [] as Array
		thread_data.append(Tilemap)
		thread_data.append(i)
		map_threads[i].start(self, "_update_bitmask_part", thread_data)

func _update_bitmask_part(thread_data : Array) -> void:
	thread_data[0].update_bitmask_region(Vector2(0, int(map_size.y / map_threads.size() * (thread_data[1]))), Vector2(map_size.x, int(map_size.y / map_threads.size() * (thread_data[1] + 1))))
	call_deferred("_update_bitmask_finished", thread_data)
	
func _update_bitmask_finished(thread_data : Array) -> void:
	map_threads[thread_data[1]].wait_to_finish()
	threads_finished_count += 1
	if threads_finished_count == map_threads.size():
		get_parent().add_child(thread_data[0])

This is copied out of my project (slightly simplified), I'm sure it would be easier to show, but thats basicly how it works :)

zzz-assault · 2020-05-02T20:55:19Z

Regarding issue 1, I thought about the code and I think the "update_bitmask_area()" in the fallback makes no sense (it will update the cell + all adjacent cells), but with "get_used_cells()" we will anyhow get all cells which have a tile used and also the ordering is irrelevant, as "update_cell_bitmask()" is not affected by the ordering of the array. Breaking it down, the use of "update_bitmask_area()" is unnecessarily updating plenty of cells mutliple times -> therefore the high computing time!

So I would recommend to change "update_bitmask_area()" in the fallback to "update_cell_bitmask()", then the computing time between issue 1 & 2 should be the same.

danboo · 2020-05-07T05:31:15Z

I had a chance to pull and build the latest 3.2 branch and confirmed that the performance of method 1 was much improved. On my build method 1, at 1.1 seconds, was actually faster than method 2, at 1.6 seconds, implying that fetching and iterating over the array is faster than the double nested for loop.

Thanks @zzz-assault !

clayjohn added enhancement topic:core labels Apr 21, 2019

Calinou added the confirmed label Mar 18, 2020

zzz-assault mentioned this issue May 2, 2020

Tilemap: Fix performance issue in fallback of update_bitmask_region #38420

Merged

akien-mga added this to the 4.0 milestone May 3, 2020

danboo closed this as completed May 7, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TileMap update_bitmask_region() poor performance #25731

TileMap update_bitmask_region() poor performance #25731

danboo commented Feb 9, 2019 •

edited

Loading

zzz-assault commented Mar 18, 2020

sian2005 commented Apr 18, 2020

zzz-assault commented Apr 19, 2020

zzz-assault commented Apr 26, 2020 •

edited

Loading

zzz-assault commented May 2, 2020

danboo commented May 7, 2020

TileMap update_bitmask_region() poor performance #25731

TileMap update_bitmask_region() poor performance #25731

Comments

danboo commented Feb 9, 2019 • edited Loading

zzz-assault commented Mar 18, 2020

sian2005 commented Apr 18, 2020

zzz-assault commented Apr 19, 2020

zzz-assault commented Apr 26, 2020 • edited Loading

zzz-assault commented May 2, 2020

danboo commented May 7, 2020

danboo commented Feb 9, 2019 •

edited

Loading

zzz-assault commented Apr 26, 2020 •

edited

Loading