a bit of cleaning / update
defuneste committed May 3, 2024
1 parent 68a79e2 commit 8c522d6
Showing 1 changed file with 21 additions and 24 deletions.
45 changes: 21 additions & 24 deletions isp_eda.qmd
@@ -117,7 +117,7 @@ count_and_clean <- function(vec) {
num_brand_name <- count_and_clean(isp[["brand_name"]])
```

Removing all capitalization and change underscore for white space help lower tthe number of unique brand names to: `r num_brand_name`
Removing all capitalization and change underscore for white space help lower the number of unique brand names to: `r num_brand_name`

```{r}
isp[["clean_name"]] <- tolower(trimws(gsub("_", " ", isp[["brand_name"]])))
@@ -196,9 +196,7 @@ table_with_options(more_frn_than_provider)

Unique provider_id + brand_name are kind of "green" (for one time frame):

```{r}
sprintf("Number of green isp: %s", nrow(isp[isp$unique_brand_name_by_provider_id == 1,]))
```
Number of green isp: `r nrow(isp[isp$unique_brand_name_by_provider_id == 1,])`

We can have one `provider_id` with multiple `frn` and same or not `brand_name` (see TSC for example / 150266)

@@ -285,7 +283,7 @@ table_with_options(isp)
```


A good example could be `131167` and how we can discriminate Orbitel communications. We can also prob raise the bar of "few locations".
A good example could be `131167` and how we can discriminate Orbitel communications. We can also prob. raise the bar of "few locations".

A quick summary of where we are:

@@ -298,7 +296,9 @@ table(isp[["rdy_to_go"]])

# Typology of ISP

The data was generated from June 23 FCC release and assumed that an FRN = ISP. Can we guess who is a small ISP?
The data was generated from June 23 FCC release and assumed that an FRN = ISP.

Can we guess who is a small ISP?

```{r}
# con <- cori.db::connect_to_db("proj_calix")
@@ -316,28 +316,27 @@ table_with_options(frn_desc)

```{r}
cnt_locations <- frn_desc[["cnt_locations"]]
summary(cnt_locations)
```


```{r}
#| column: margin
boxplot(cnt_locations)
boxplot(cnt_locations, horizontal = TRUE, col = 2, border = 2, frame = F, main = "Count of locations per ISP")
```

Some ISPs declare coverage of a huge number of locations. Some low counts are probably errors.

Count of FRN with a less than 10 locations: `r nrow(frn_desc[frn_desc$cnt_locations < 10,])`

Count of FRN with more than 500 000 locations: `r nrow(frn_desc[frn_desc$cnt_locations > 500000,])`

```{r}
#| label: removing big and small isp
sprintf("FRN with a less than 10 locations: %s", nrow(frn_desc[frn_desc$cnt_locations < 10,]))
sprintf("FRN with more than 500000 locations: %s", nrow(frn_desc[frn_desc$cnt_locations > 500000,]))
frn_desc$n_states <- lengths(strsplit(gsub("\\{|\\}", "", frn_desc$states), ","))
```
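
The `n_states` line above counts the states listed for each FRN by stripping the curly braces from `states` and splitting on commas. A minimal sketch on made-up values, assuming `states` is stored as a brace-delimited string such as `{VA,NC}` (the sample values and the `states_example` / `n_states_example` names are hypothetical, not from the original data):

```r
# Hypothetical sample values; the real column is frn_desc$states
states_example <- c("{VA}", "{VA,NC,TN}", "{OR,WA}")

# Drop the braces, split on commas, then count the pieces per element
n_states_example <- lengths(strsplit(gsub("\\{|\\}", "", states_example), ","))
print(n_states_example)
```

Computing this once on `frn_desc` means every filtered subset inherits the column, which is presumably why the per-tab copies of the line were dropped.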

If we filter them out (removing 110 cases):


::: {.panel-tabset}

## 100 000
@@ -350,8 +349,6 @@ frn <- frn_desc[frn_desc$cnt_locations >= 10 & frn_desc$cnt_locations <= locatio
hist(frn$cnt_locations, col = 2,
main = sprintf("Less than %s", location_filter), xlab = "count locations")
frn$n_states <- lengths(strsplit(gsub("\\{|\\}", "", frn$states), ","))
```

## 10 000
@@ -364,11 +361,7 @@ frn <- frn_desc[frn_desc$cnt_locations >= 10 & frn_desc$cnt_locations <= locatio
hist(frn$cnt_locations, col = 2,
main = sprintf("Less than %s", location_filter), xlab = "count locations")
frn$n_states <- lengths(strsplit(gsub("\\{|\\}", "", frn$states), ","))
```


:::
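
Both tabs apply the same filter: keep FRNs with at least 10 locations and at most an upper bound (apparently `location_filter`, although the line is truncated in the hunk header). A minimal sketch of that pattern, using a hypothetical helper `filter_by_locations()` and made-up counts standing in for `frn_desc`:

```r
# Hypothetical helper: keep rows with lower <= cnt_locations <= upper
filter_by_locations <- function(df, upper, lower = 10) {
  df[df$cnt_locations >= lower & df$cnt_locations <= upper, , drop = FALSE]
}

# Made-up data standing in for frn_desc
frn_example <- data.frame(
  frn = c("a", "b", "c", "d"),
  cnt_locations = c(3, 250, 42000, 750000)
)

# Apply the two upper bounds used in the tabs
for (location_filter in c(100000, 10000)) {
  kept <- filter_by_locations(frn_example, location_filter)
  cat(sprintf("upper bound %s: %d rows kept\n",
              format(location_filter, scientific = FALSE), nrow(kept)))
}
```

Wrapping the bounds in a function would avoid repeating the subsetting expression in each tab, at the cost of one more name to maintain.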

List of ISP that the Broadband team that are good reference of small provider:
@@ -385,15 +378,19 @@ List of ISP that the Broadband team that are good reference of small provider:
| Salsgiver|0011167079|29941|
| All Points Broadband|0023524705|107803|
| Marquette-Adams Telephone co-op |0003774023|130783 |
| USI fiber |||
| USI fiber |0017096538|71466|
| Scott county telephone co | 0002069862|7829|
| PANGAEA |0016202236| 8410|
| Blue Mountain Networks |0005450507|310013|

Side notes

Side notes:

Newport Utilities = NUconnect
- Newport Utilities = NUconnect

- SandyNet, OR = City of Sandy, OR

SandyNet, OR = City of Sandy, OB
- USI FIber =

Blue Mountain Networks = Blue Ridge Mountain Electric Membership Corporation
- Blue Mountain Networks = Blue Ridge Mountain Electric Membership Corporation
