Abstract: Traditional urban design is labor-intensive and time-consuming. While existing urban computing models are mostly predictive, generative models present new opportunities to design urban landscapes automatically, reducing the labor and time costs of traditional urban design. This work presents a state-of-the-art diffusion model under the ControlNet framework that automatically generates urban landscapes conditioned on text descriptions that lay out target land-use patterns and on image constraints that represent existing infrastructure and the natural environment. It can efficiently generate a large number of alternative urban designs and can transfer generated urban landscapes across cities. To address challenges in data availability, this work spatially matches satellite imagery to OpenStreetMap, thereby establishing the association between satellite imagery, land-use descriptions, and image constraints. To align with design practice, the generated urban landscapes are evaluated by both urban experts and the general public, who score the images' consistency with the descriptions, the constraints, and reality. The generated images receive scores similar to those of the real ones across all three dimensions. Additionally, an unlabelled side-by-side comparison shows that, in both user groups, the majority of the generated images are selected as more consistent with the text descriptions and image constraints.
This repository is forked from the original ControlNet.
The descriptions can be accessed here. The target satellite and constraint images can be accessed here and here.
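As a minimal sketch of how the three data sources above might be combined for training, the snippet below pairs each description with its constraint image and target satellite image and writes one JSON record per line. The `source`/`target`/`prompt` field names follow the layout of the upstream ControlNet training tutorial as we understand it and should be treated as an assumption. The tile ids, directory names, `.png` suffix, and the helpers `build_records` and `write_jsonl` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def build_records(descriptions, constraint_dir, target_dir, suffix=".png"):
    """Pair each tile id with its constraint image, target image, and prompt.

    `descriptions` maps a (hypothetical) tile id to its land-use text.
    Tiles missing either image are skipped rather than guessed.
    """
    constraint_dir, target_dir = Path(constraint_dir), Path(target_dir)
    records = []
    for tile_id in sorted(descriptions):
        source = constraint_dir / f"{tile_id}{suffix}"  # constraint image
        target = target_dir / f"{tile_id}{suffix}"      # satellite image
        if source.is_file() and target.is_file():
            records.append({
                "source": source.as_posix(),
                "target": target.as_posix(),
                "prompt": descriptions[tile_id],
            })
    return records


def write_jsonl(records, path):
    """Write one JSON object per line (assumed prompt-file layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

A call such as `write_jsonl(build_records(descriptions, "constraints", "targets"), "prompt.json")` would then produce a prompt file, with the directory and file names adjusted to wherever the downloaded data is stored.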