Adapt calculation of flats per building #7
What's this about?
Hey there,
looking through the great work on the district generator, I noticed that the calculation of the number of flats per building was (in my opinion) lacking a scientific basis. So I dug into the Zensus 2011 data that the IWU used for the TABULA typical buildings and came up with a new heuristic. This is just a proposal; if you have other plans or this doesn't fit the goal of the software, I completely understand if you decline it.
How was it handled before?
Previously, the number of flats per SFH/TH building was hardcoded to 1. The number of flats for MFH/AB buildings was either hardcoded or calculated from the building's floor area, on the assumption that each flat has around 100 m² of floor space.
What was the problem with this implementation?
The problem (from my maybe limited point of view) was twofold:
How is it implemented now?
SFH/TH
From the Zensus 2011 data (Link), I calculated the proportions of one-flat and two-flat buildings within the SFH category. Based on these proportions and a random number between 0 and 1, the building is assigned either one or two flats. The calculation is no longer based on the floor area, because the Zensus data shows that there are flats of every size in every category, from smaller than 30 m² to larger than 180 m².
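A minimal sketch of this draw, not the code from the PR. The function name `flats_sfh` and the parameter `p_single` are hypothetical, and the value 0.8 in the usage line is a placeholder, not the Zensus 2011 proportion:

```python
import random


def flats_sfh(p_single, rng=random):
    """Return 1 or 2 flats for an SFH/TH building.

    p_single is the share of one-flat buildings in the SFH category.
    It must be supplied by the caller (e.g. derived from Zensus data).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so the branch below is taken
    # with probability p_single.
    return 1 if rng.random() < p_single else 2


# Usage with a placeholder proportion (NOT the Zensus figure).
example = flats_sfh(0.8, random.Random(42))
```

Passing a seeded `random.Random` instance keeps a generated district reproducible.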
MFH
The TABULA building type MFH contains houses with 3 to 12 flats, but this range is split into two categories in the Zensus 2011 data, so only the number of houses with 3-6 flats and with 7-12 flats is known. I calculated the proportion of both categories, and one of the two is selected based on a random number. After that, the number of flats is drawn from a uniform distribution within the limits of the selected category (i.e. between 3 and 6 flats or between 7 and 12 flats).
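The two-stage draw could look roughly like the sketch below. The name `flats_mfh` and the parameter `p_small` (share of MFH buildings with 3-6 flats) are hypothetical, the value 0.7 is a placeholder rather than the Zensus proportion, and the sketch assumes the uniform draw is over whole numbers with both limits included:

```python
import random


def flats_mfh(p_small, rng=random):
    """Return a flat count between 3 and 12 for an MFH building.

    p_small is the share of MFH buildings in the 3-6 flats category;
    the remaining share falls into the 7-12 flats category.
    """
    if not 0.0 <= p_small <= 1.0:
        raise ValueError("p_small must lie in [0, 1]")
    # Stage 1: pick the Zensus category.
    if rng.random() < p_small:
        low, high = 3, 6
    else:
        low, high = 7, 12
    # Stage 2: uniform integer draw, both limits inclusive.
    return rng.randint(low, high)


# Usage with a placeholder proportion (NOT the Zensus figure).
example = flats_mfh(0.7, random.Random(42))
```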
AB
The Zensus dataset provides the least information about apartment blocks. The only helpful thing to know is that there are flats of every size, so making a decision based on size does not feel appropriate to me. Therefore, I went out on a limb here and assumed a truncated Pareto distribution, scaled to a minimum value of 13 (the TABULA category for AB is 13+ flats) and a maximum value of 100 (pure assumption on my part, definitely willing to change that). I've used scipy.stats.truncpareto for it, as scipy is already in the dependencies. The parameters b and scale are both set to 1.
This is the most arbitrary decision and I am open to suggestions here. Seeing that AB type buildings only make up 1.2% of all buildings in Germany (according to the Zensus dataset), I accepted the margin of error. For a project primarily concerned with AB type buildings, this might not be ideal. I wasn't able to find any more information on this distribution. Maybe the Zensus 2022 data will hold more detailed information.
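To illustrate the shape of such a distribution, here is a standard-library sketch that samples a truncated Pareto by inverting its CDF. It deliberately does not call scipy.stats.truncpareto and may not match the parameterisation used in this PR; the function names, the choice to place the Pareto minimum directly at 13, and the rounding down to whole flats are all assumptions:

```python
import random


def truncated_pareto_ppf(u, b=1.0, lo=13.0, hi=100.0):
    """Inverse CDF of a Pareto(shape b, minimum lo) truncated at hi.

    CDF:  F(x) = (1 - (lo / x)**b) / (1 - (lo / hi)**b)  for lo <= x <= hi
    """
    tail = 1.0 - (lo / hi) ** b
    return lo * (1.0 - u * tail) ** (-1.0 / b)


def flats_ab(rng=random, b=1.0, lo=13, hi=100):
    """Return a flat count in [lo, hi] for an AB building."""
    x = truncated_pareto_ppf(rng.random(), b, float(lo), float(hi))
    # Assumption: round down to a whole number of flats and clamp
    # to guard against floating-point overshoot at the upper end.
    return max(lo, min(hi, int(x)))


example = flats_ab(random.Random(42))
```

With a shape of 1, most draws land close to the lower limit and large blocks are rare, which is the qualitative behaviour described above.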