diff --git a/_quarto.yml b/_quarto.yml index badae23..304b02f 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -2,7 +2,7 @@ project: type: website website: - title: "FCC NBM EDA" + title: "FCC BDC EDA" repo-url: https://github.com/ruralinnovation/proj-fcc-report repo-actions: [edit, issue] navbar: diff --git a/about.qmd b/about.qmd index a90bec3..2b9334b 100644 --- a/about.qmd +++ b/about.qmd @@ -4,4 +4,4 @@ title: "About" Sharing quick EDA about FCC data. -FCC NBM: June 2023 release, downloaded 21-11-2023 +FCC BDC: June 2023 release, downloaded 21-11-2023 diff --git a/isp_eda.qmd b/isp_eda.qmd index 5729423..64af7da 100644 --- a/isp_eda.qmd +++ b/isp_eda.qmd @@ -22,7 +22,9 @@ table_with_options <- function(x){DT::datatable(x, rownames = FALSE, )} ``` -We are starting a first exploratory data analysis around ISPs in the FCC NBM data set. It should be kept in mind that an ISP can be multiple time in the same location (offering multiple service). +We are starting a first exploratory data analysis around ISPs in the FCC BDC data set. Keep in mind that an ISP can appear multiple times at the same location (offering multiple services). + +TODO: I am at location level: add a small example to illustrate The query that generated this first pass at it is here: @@ -97,7 +99,7 @@ ORDER BY cnt_services desc; isp_list <- read.csv("data/isp_prov.csv") isp_list$ct <- 1 isp_list$name_id <- ave(isp_list$ct, isp_list$provider_id, FUN = sum) -#View(isp_list) +#View(isp_list[!is.na(isp_list$new_name),]) ``` #### TCT @@ -139,9 +141,20 @@ isp_list$new_name[isp_list$provider_id == 130008] <- "acentek" For now I will go with attributing them to Acentek but an other option will be to just remove them. +#### Mediacom - Bolt + +```{r} +table_with_options(isp_list[grepl("Mediacom|Bolt", isp_list$brand_name) ,]) +isp_list$new_name[isp_list$provider_id == 130804] <- "mediacom_bolt" +``` +It appears that Bolt and Mediacom share the same `provider_id` and appear together in some `brand_name` values. 
I think we should regroup them, but this definitely requires more domain knowledge than I have! +```{r} +table_with_options(isp_list[isp_list$provider_id == 131378,]) +``` ## TODO list: -[ ] To check: 586211 \ No newline at end of file +[ ] Provider_id: 586211 +[ ] Provider_id: 131413 \ No newline at end of file diff --git a/zero_dl_up.qmd b/zero_dl_up.qmd index 0a6a571..90a85b2 100644 --- a/zero_dl_up.qmd +++ b/zero_dl_up.qmd @@ -42,6 +42,7 @@ The data used to provide most of the analysis was done with this 2 SQL queries. SELECT state_abbr, brand_name, + technology, count(brand_name) FROM staging.june23 @@ -55,6 +56,7 @@ GROUP BY brand_name, state_abbr, technology; SELECT state_abbr, brand_name, + technology, count(brand_name) FROM staging.june23 @@ -92,14 +94,15 @@ table_with_options(technology) We do not mind too much `70` (Unlicensed Terrestrial Fixed Wireless) because we are filtering it out but we are keeping `71` (Licensed Terrestrial Fixed Wireless) , `72` (Licensed-by-Rule Terrestrial Fixed Wireless)and `10` (Copper Wire). -To take that into account I will filter out Unlicensed Terrestrial Fixed Wireless for the rest of this document. +To take that into account I will filter out Unlicensed Terrestrial Fixed Wireless for the rest of this document. I also filtered out 60 and 61 to be consistent with our pipelines. ## Summary by ISP ```{r} -#| label: ISP with 0/0 -zero_loc <- zero_loc[which(zero_loc$technology != 70), ] -not_zero <- not_zero[which(not_zero$technology != 70), ] +#| label: ISP with 0/0 +filter_sat <- c(60, 61, 70) +zero_loc <- zero_loc[which(! zero_loc$technology %in% filter_sat), ] +not_zero <- not_zero[which(! not_zero$technology %in% filter_sat), ] agg <- agg_count(zero_loc, "brand_name") agg_not <- agg_count(not_zero, "brand_name")