reference repo and clean few typo
defuneste committed May 8, 2024
1 parent cf93d61 commit 9160f85
Showing 1 changed file with 14 additions and 12 deletions.
26 changes: 14 additions & 12 deletions ms-eda.qmd
@@ -11,31 +11,35 @@ Why are we doing it?

> Premises and Premise Counts
Currently we have FCC total count locations at census block level.
Currently we have FCC total count of locations (BSL) at the census block level.

# MS Building Footprint

## Overview

MS used satellite data (from multiple campains/vintage) to get the footprint of buildings.
Microsoft (MS) used satellite data (from multiple campaigns/vintages) to get the footprint of buildings.

They classify pixels that are supposed to be part of a building (segmentation using a neural network) and then convert those pixels to a shape.

It exists [worldwide](https://github.com/microsoft/GlobalMLBuildingFootprints) and for [US states](https://github.com/Microsoft/USBuildingFootprints?tab=readme-ov-file).

- We have processed the 51 states

- Additional works will be required for PR and US territories (parts of workflow from the US states can be reused).
- Additional work will be required for Puerto Rico and the US territories (parts of the workflow from the US states can be reused).

The precision of the model vary depending on the region: the Carribean is at 92,2% and the US at 98.5%. The rate of false positive is 1% for the US and 1.8% for the Carribean. (Oceania was not provided)
The precision of their model varies depending on the region: the Caribbean is at 92.2% and the US at 98.5%. The false positive rate is 1% for the US and 1.8% for the Caribbean. (Oceania was not provided.)

The license is [ODbL](https://github.com/Microsoft/USBuildingFootprints?tab=License-1-ov-file).

## Buildings to BSL?

Buildings are shapes, BSL are points. We converted each building to a single point (arbitrarily: the first vertex of the shape) to lower the amount of data. Hence, we now have "buildings" summarized as points (lat/long).
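
As a rough illustration of that reduction (not the actual pipeline), the sketch below keeps only the first vertex of a building shape. It assumes the footprints arrive as GeoJSON-style geometries, where positions are stored as `[longitude, latitude]`. The function name `first_vertex` and the sample coordinates are made up for this example.

```python
def first_vertex(geometry):
    """Return (lon, lat) of the first vertex of a GeoJSON Polygon or MultiPolygon."""
    coords = geometry["coordinates"]
    if geometry["type"] == "Polygon":
        ring = coords[0]       # exterior ring of the polygon
    elif geometry["type"] == "MultiPolygon":
        ring = coords[0][0]    # exterior ring of the first polygon
    else:
        raise ValueError(f"unsupported geometry type: {geometry['type']}")
    lon, lat = ring[0][:2]     # ignore an optional altitude value
    return lon, lat


# Hypothetical building footprint (coordinates invented for the example).
building = {
    "type": "Polygon",
    "coordinates": [[
        [-72.58, 44.26], [-72.58, 44.27], [-72.57, 44.27],
        [-72.57, 44.26], [-72.58, 44.26],
    ]],
}
point = first_vertex(building)
```

The first vertex is cheap to extract but sits on the building's outline, so a building that straddles a block boundary may be assigned to either side.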

We do not have access to lat/long of BSL (fabrics). Our assumptions is if a count at block match they are describing the same reality (we can't do the "on the ground verification").
:::{.column-margin}
Our pipeline can be found [here](https://github.com/ruralinnovation/data-MS-buildings) (private repo)
:::

We do not have access to the lat/long of BSL (this is only part of the fabric data). Our assumption is that if the counts for a block match, they are describing the same reality (we can't do "on the ground" verification).

The number of buildings reported for 51 states is: 130 099 920

@@ -53,11 +57,11 @@ While is the number of BSL is: 114 074 438

**What could be the use cases?**

Integrating those informations will have cost: information "overload", documentation about is needed, and depending
Integrating this information will have a cost: information "overload", and documentation about it is needed.

We can:

- Provide the count per census block of MS building footprint and the user
- Provide the count per census block of MS building footprint and let the user decide on the quality of the data around their specific geography

- Add the dot to the map

@@ -69,7 +73,7 @@ How this data helps BEAD applicant?

## MS building footprint in VT

We can count those points per block and compare to the number of location than FCC is descriving.
We can count those points per block and compare them to the number of locations that the FCC is describing.

After that, we build a small model that provides both an estimate of BSL given the MS footprints and how confident the model is.

@@ -116,10 +120,8 @@ lines(pot_val, conf_interval[, "upr"], col =" blue", lty = 2)

:::

- my model (just a linear one) is probably bad (log should correct that)

- still strong relation
- Strong relation

- the model is "overconfidant", and the reality is more "spread"
- the model is "overconfident", and the reality is more "spread" (what could possibly be the cause of it?)

- VT MS has also **more** locations (285333, versus 352618)
