2023 05 18 automated briefings #9

Open
wants to merge 111 commits into base: master
111 commits
b4f36ce
Regex slidepack
MHWauben Mar 4, 2020
a5042b1
Add exercises
MHWauben Mar 5, 2020
78fea54
Importing data slides start
MHWauben Mar 5, 2020
79513c6
Exercise and final plot
MHWauben Mar 9, 2020
0caa5d7
Better libraries, add why bother, add transitions
MHWauben Mar 9, 2020
e1a48e9
Create understanding_and_using_R_functions.rmd
andreassot10 Jul 31, 2020
c93f48c
Initial commit
tcrisford Aug 10, 2020
a127e8b
Knitted html file
tcrisford Aug 10, 2020
77c1b37
added: DBI basics- connection, create table add data to table. Union …
Aug 12, 2020
b5dbe1c
.gitignore updated
Aug 12, 2020
a546418
Reorganising file structure
tcrisford Aug 14, 2020
fee4026
Merge branch 'tc_workshops' into intro_to_sql
tcrisford Aug 28, 2020
755bcfb
Error handling workshop - initial commit
tcrisford Aug 28, 2020
398edea
Updates once workshop is planned
MHWauben Sep 3, 2020
3f55252
Rename R functions workshop folder
MHWauben Sep 3, 2020
45ed8e9
Merge pull request #4 from DataS-DHSC/r_functions_workshop
andreassot10 Sep 3, 2020
35f02ff
Update readme for coffee & coding
MHWauben Sep 3, 2020
06b1d74
Merge pull request #3 from DataS-DHSC/import_data
MHWauben Sep 10, 2020
4b3a61f
testing code tutorial added
Sep 17, 2020
916684e
updated dir name
Sep 17, 2020
8344398
Merge pull request #5 from DataS-DHSC/error_handling_in_R
tcrisford Oct 8, 2020
144edb8
Renaming folder to match convention
tcrisford Oct 8, 2020
8c5abfd
Merge pull request #6 from DataS-DHSC/error-handling-in-R
tcrisford Oct 8, 2020
2c7e4c4
Adding section on the View(...) function
tcrisford Oct 8, 2020
56b4e11
Adding further info on debugging within RStudio
tcrisford Oct 8, 2020
ef6e6b8
Adding short note on the purrr error handling functions
tcrisford Oct 8, 2020
a19969f
Adding acknowledgements section
tcrisford Oct 8, 2020
2eff147
Fixing indents
tcrisford Oct 8, 2020
6243f55
Merge pull request #7 from DataS-DHSC/error-handling-in-R-additions
tcrisford Oct 8, 2020
9f8ef90
Interactive graphs with plotly.
jriley-dhsc Oct 14, 2020
539d470
Merge pull request #1 from DataS-DHSC/master
jriley-dhsc Oct 14, 2020
237917a
Merge pull request #2 from jriley-dhsc/plotly
jriley-dhsc Oct 14, 2020
d74fbb2
Remove excel file
MHWauben Oct 22, 2020
1ecd7ff
Merge pull request #8 from jriley-dhsc/master
MHWauben Nov 17, 2020
cab1e38
Merge pull request #9 from DataS-DHSC/testing_your_code
MHWauben Nov 17, 2020
d3ba0db
Create folder for Al
MHWauben Nov 26, 2020
528c0bc
Add files via upload
abrodlie Nov 26, 2020
acdc5b0
Merge pull request #10 from DataS-DHSC/joins
MHWauben Nov 26, 2020
c653201
Create folder for Lucy's workshop
MHWauben Dec 10, 2020
3bb38fb
Add date for workshop
MHWauben Dec 10, 2020
324a576
Commenting out disconnection line so that we can knit the markdown do…
Dec 31, 2020
df30a88
Knitting the markdown document
Dec 31, 2020
05eb082
Adapting my sections to use sqlite database, instead of querying the …
Dec 31, 2020
33ce337
Rewording new sections slightly to fit flow of presentation
Dec 31, 2020
515afb2
Adding explanation of CASE WHEN and * to the 'create new column' section
Dec 31, 2020
fefa82b
Adding note on queries being case sensitive
Dec 31, 2020
612cc47
Correcting small typos
Dec 31, 2020
9383463
Renaming folder to match convention
Jan 7, 2021
8b3a240
Merge pull request #12 from DataS-DHSC/intro_to_sql
tcrisford Jan 7, 2021
6faad5a
"New workshop added 2021-01-21 Typical errors in R"
ailsaprise Jan 21, 2021
ec148e7
Intro to Regex added
ailsaprise Feb 3, 2021
10f0a2b
Tiny modification to R exercise 3.
ailsaprise Feb 4, 2021
5c84340
Create 2021-03-04 Organising survey data for beginners
externality44 Mar 4, 2021
b9aa3b2
Create R intermediate training
externality44 Mar 4, 2021
1e6dd1d
R intermediate training topics
externality44 Mar 4, 2021
44981ca
Webscraping in R loading directly into PowerBi
externality44 May 27, 2021
a69b550
Add readme doc
MHWauben Jun 23, 2021
7546c22
Add workshop files
MHWauben Jun 23, 2021
62b9efa
Re-name git folder
MHWauben Jun 23, 2021
069d663
Merge pull request #13 from DataS-DHSC/git-dhsc
MHWauben Jun 24, 2021
f7b4569
Add Gitignore information
MHWauben Jun 29, 2021
7641646
Merge pull request #14 from DataS-DHSC/git-ignore-add
MHWauben Jun 29, 2021
92f7f82
Remove link to DataS-DH
MHWauben Jul 9, 2021
263f09d
Add image
MHWauben Jul 9, 2021
7736591
Add TidyR / LearnR folder
MHWauben Jul 9, 2021
4c4a686
Add TidyR / LearnR files
MHWauben Jul 9, 2021
851a308
Add random forest spotlight seminar
MHWauben Jul 9, 2021
20a963e
Add files for random forest seminar
MHWauben Jul 9, 2021
46f980c
Add files via upload
externality44 Jul 22, 2021
fd94300
Add folder for workshop
MHWauben Jul 22, 2021
23dd465
Upload RStudio IDE workshop
MHWauben Jul 22, 2021
099907b
Merge pull request #17 from DataS-DHSC/rstudio-ide
MHWauben Jul 22, 2021
c580b97
Add 2021-09-09 PHE Indicator Automation RAP presentation
GeorgieAnderson Sep 9, 2021
bf78418
Merge pull request #18 from PHEgeorginaanderson/master
MHWauben Sep 9, 2021
cc3556e
Add readme for Chris workshop
MHWauben Sep 23, 2021
73ee2dc
Merge pull request #19 from DataS-DHSC/shiny-production
MHWauben Sep 23, 2021
66bb9c5
adding script from C&C session on gmaps distance matrix api
lisa-lon Oct 26, 2021
4a8c935
Create 2021-11-04 Sharepoint lists and Powerbi
externality44 Nov 4, 2021
bbfb100
Update 2021-11-04 Sharepoint lists and Powerbi
externality44 Nov 4, 2021
38e36a6
Update 2021-11-04 Sharepoint lists and Powerbi
externality44 Nov 4, 2021
9969a8a
Update 2021-11-04 Sharepoint lists and Powerbi
externality44 Nov 4, 2021
a927c12
Renv tutorial by James
jamescrosbie Dec 2, 2021
26861d2
Python basics workshop
Mar 21, 2022
a29ade2
2022-04-28 Intermediate R/
Apr 28, 2022
aa350e5
Delete Intermediate R.R
Apr 28, 2022
2ab0e86
Adding R scripts from session
TomDougall Apr 28, 2022
3f86a1b
Create readme.me
Apr 28, 2022
1fa88a1
Example ggplot2 code (R markdown and html)
AndyABaker May 17, 2022
b79c9b5
html swapped with github-friendly .md file
AndyABaker May 17, 2022
f501e1d
Slidepack
AndyABaker May 24, 2022
f7db6af
Add files via upload
Jun 30, 2022
b416771
Update readme.md
Jun 30, 2022
78bc3df
Added dataset.py and dataset template workbook
James-Osmond Sep 7, 2022
2c8a8b5
Added if __name__ == __main__ to dataset.py
James-Osmond Sep 8, 2022
f45b9cd
Added example initialisation template
James-Osmond Sep 8, 2022
090b49f
Added desk notes
James-Osmond Sep 8, 2022
a78c9e6
Merge pull request #21 from DataS-DHSC/python-accessible-datasets
MarianaBazely Sep 8, 2022
877d4a4
Spatial Analysis slides added
ailsaprise Oct 4, 2022
387def6
Adding slides and python code for the session 'Code QA and best pract…
ailsaprise Dec 9, 2022
d58a655
Environment C&C session
TomDougall Jan 18, 2023
30f4425
Added 2023-02-23 folder and content for a11ytables
GeorgieAnderson Feb 23, 2023
c3579b8
Create 2023-04-27 Basics of SQL
drlspencer Apr 27, 2023
b566e1c
Delete 2023-04-27 Basics of SQL
drlspencer Apr 27, 2023
5807421
Add files via upload
drlspencer Apr 27, 2023
7b79e8e
Merge pull request #22 from PHEgeorginaanderson/master
drlspencer May 2, 2023
af380b1
Merge pull request #2 from DataS-DHSC/regex_mw
drlspencer May 2, 2023
4cd7dea
Merge pull request #11 from DataS-DHSC/ggplot-gganimate
drlspencer May 2, 2023
844064b
Merge pull request #15 from DataS-DHSC/move-old
drlspencer May 2, 2023
da9704f
Merge pull request #20 from jamescrosbie/renv
drlspencer May 2, 2023
6b7f62b
added code for this weeks C&C on Plotly Dash
samtaylor-dhsc May 16, 2023
fcf4109
added demo app
samtaylor-dhsc May 18, 2023
548 changes: 548 additions & 0 deletions 2018-09-05 Random forests/Sitrep 12-16.csv

Large diffs are not rendered by default.

110 changes: 110 additions & 0 deletions 2018-09-05 Random forests/Sitrep 16-17.csv
@@ -0,0 +1,110 @@
trust_code,GAbeds,Norovirus,diverts,ACC,attendance,performancesy
RXP,89.21,0,0,79.7,6.32,0.895299
RTR,96.19,0,0,86.99,5.39,0.931424
RLN,87.33,0.01,0,70.7,3.1,0.890376
RTD,87.56,0.44,0,80.44,8.92,0.919367
RR7,94.17,0,0,88.75,1.4,0.946481333
RTF,94.15,0.08,0,95.03,4.92,0.893035333
RE9,93.35,0.02,0,79.08,0.59,0.909940667
RWD,96.72,0.01,0,77.39,4.09,0.761819667
RX1,93.43,0.34,0,84.63,8.38,0.774517
RCX,97.46,0.04,0,90.95,0.64,0.899322667
RC9,94.91,0,0,77.02,2.31,0.984394
RK5,93.28,0.03,0,88.69,2.49,0.936054667
RTG,87.07,0.04,0,81.67,5.7,0.860491333
RFS,94.46,0.01,0,87.07,1.15,0.828021
RWJ,90.51,0.03,0,82.56,1.84,0.717097333
RJN,96.98,0,0,70.92,0.34,0.765705333
RHQ,95.49,0.37,0,80.28,8.42,0.819239333
RJF,98.07,0.03,0,71.86,1.05,0.894674667
R1F,97.53,0,0,80.23,0.39,0.842843667
RYJ,93.71,0,0,97.36,6.74,0.862219667
R1K,94.91,0.06,0,76.52,9.47,0.803925333
RWG,91.69,0,0,88.95,2.57,0.771880333
RQW,99.65,0.02,0,86.47,0.97,0.666211667
RRV,77.39,0.02,0,82.78,3.87,0.871944
RAP,99.64,0,0,87.96,2.11,0.792240667
RM1,99.58,0.07,0,81.51,3.91,0.777558333
RGP,96.26,0.06,0,77.3,0.81,0.882759333
RHW,94.05,0.01,0,90.04,2.2,0.904404333
RTH,93.73,0,0,72.96,4.48,0.860474333
RDE,97.96,0.01,0,76.89,1.89,0.831785667
R1H,94.77,0.05,0,87.72,22.53,0.819526667
RDD,99.59,0,0,88.24,2.69,0.894008333
RAJ,94.95,0,0,89.76,1.51,0.799368667
RF4,94.49,0,0,85.76,8.44,0.832420333
RQQ,97.44,0.02,0,90.46,0.26,0.747513
RNQ,98.89,0,0,88.97,1.45,0.756472
RKB,96.5,0.03,0,95.14,5.44,0.788562
RWE,90.79,0.07,0,73.34,10.66,0.791167667
RNS,99.46,0,0,82.78,2.09,0.809344
RD8,96.28,0,0,79.33,0,0.875429333
RLT,96.44,0,0,71.84,0.61,0.808379
RRK,98.39,0,0,96.85,3.27,0.770758667
RQ3,88.75,0,0,55.71,0.4,0.947426667
RJC,94.39,0.05,0,75.91,0.83,0.939936
RL4,90.71,0.06,0,64.5,4.85,0.892209667
RNA,92.6,0.02,0,94.61,3.61,0.902644333
RLQ,93,0.01,0,83,0.59,0.844752333
RWP,99.05,0.15,0.01,92.12,3.11,0.768087667
RBK,96.97,0.07,0,89.26,1.53,0.785969
RXW,93.67,0,0,88.69,2.72,0.758751667
RTE,93.16,0,0,88.15,3.25,0.752257333
RTK,98.16,0.01,0,83.11,1.36,0.829655333
RKE,97.4,0,0,84.22,0.6,0.850969333
RN7,94.97,0,0,92.25,1.67,0.806435
RJ2,97.87,0.04,0,91.7,6.42,0.822741667
RJZ,98.84,0.29,0,96.2,10.49,0.783680667
RJ7,93.46,0.08,0,83.68,4.06,0.887711667
RCF,94.59,0.01,0,62.89,0.59,0.899442667
RAE,94.83,0.06,0,82.9,2.55,0.863046667
RR8,95.96,0.17,0,72.3,13.34,0.812935
RFR,93.98,0.09,0,69.8,2.08,0.821598
RP5,92.55,0.02,0,89.85,3.46,0.867936333
RFF,96.93,0,0,84.05,0.83,0.852187667
RWY,91.1,0.07,0,72.51,2.95,0.927092
RCD,91.79,0.03,0,77.77,0.45,0.937470667
RCB,91.63,0.19,0,87.65,4.49,0.802483667
RWA,86.5,0.11,0,74.99,3.98,0.840269
RN5,86.16,0.02,0,74.7,2.62,0.849109
RHU,95.77,0,0,85.24,3.78,0.740002667
RA2,94.86,0.02,0,96.13,0.73,0.874913
RN3,93.47,0.02,0,79.13,1.36,0.813659667
RBD,90.04,0,0,74.88,0.33,0.976974667
RDZ,90.95,0.01,0,67.4,1.38,0.928473333
RD3,95.08,0.02,0,80.75,1.07,0.914599
RD1,94.68,0.07,0,80.77,1.44,0.797075
RVJ,97.3,0,0,88.47,2,0.783172333
RA7,85.46,0,0,84.78,2.41,0.802455
RA3,97.81,0.02,0,89.55,0.33,0.666578333
RH8,82.6,0.08,0,80.09,2.01,0.917079333
RBA,88.68,0.02,0,76.2,0.94,0.919198667
REF,91.29,0.01,0,73.86,1.5,0.804558667
RK9,98.07,0.01,0,70.21,2.26,0.830004667
RC1,93.91,0,0,68.36,0.77,0.909231333
RJE,98.31,0.19,0,87.15,7.8,0.755551
RXR,96.21,0,0,75.54,5.42,0.774975333
RXN,94.33,0,0,80.43,0,0.814520667
RXL,95.89,0.03,0,75.39,4.03,0.838749667
RW3,92.43,0.13,0,91.18,10.02,0.896629667
RMC,92.18,0.01,0,86.08,1.69,0.814044667
RW6,94.41,0.05,0.01,89.49,8.49,0.775423
RM2,90.86,0.05,0,91.15,2.41,0.849619667
RMP,98.02,0,0,88.12,1.06,0.799364
RBN,94.4,0,0,91.68,3.09,0.830070667
RWW,88.5,0.01,0,86.85,1.61,0.851599
RNL,91.68,0.16,0,74.25,2.7,0.841071667
REP,73.34,0,0,41.58,0.03,0.969193667
REM,96.08,0.1,0,83.94,3.16,0.815655
RQ6,95.51,0,0,92.82,0,0.863516333
RBT,92.73,0.08,0,72.72,1.12,0.890214
RVY,87.75,0.01,0,87.77,1.41,0.8916
RBL,98.3,0.11,0,89.05,2.56,0.799897333
RJR,93.92,0.04,0,74.64,0.92,0.842648333
RVR,98.97,0.01,0,88.4,2.61,0.946300667
RJ6,98.78,0,0,92.58,1.49,0.847868667
RXH,96.85,0.05,0,87.52,3.44,0.792876333
RXC,95.28,0.06,0,81.17,1.94,0.756502333
RTP,96.19,0.03,0,93.75,1.65,0.892421667
RPA,96.68,0,0,96.65,1.89,0.739128333
RWF,96.55,0,0,88.47,3.31,0.804745667
87 changes: 87 additions & 0 deletions 2018-09-05 Random forests/random forest-demo-09-18.R
@@ -0,0 +1,87 @@
if (!require(rpart)) install.packages('rpart') # DECISION TREE PACKAGE
library(rpart)
if (!require(rpart.plot)) install.packages('rpart.plot') # PLOTTING PACKAGE
library(rpart.plot)
if (!require(rattle)) install.packages('rattle') # PLOTTING PACKAGE
library(rattle)
if (!require(RColorBrewer)) install.packages('RColorBrewer') # PLOTTING PACKAGE
library(RColorBrewer)
if (!require(randomForest)) install.packages('randomForest') # RANDOM FOREST PACKAGE
library(randomForest)
if (!require(caret)) install.packages('caret') # ML SUPPORT PACKAGE
library(caret)
if (!require(tidyverse)) install.packages('tidyverse') # DATA MANIPULATION AND PLOTTING PACKAGE
library(tidyverse)

# set random seed for reproducibility
set.seed(111)

#read in input data
setwd("~/spotlight seminar 06.09.18")

Sitrep_12_16 <- read_csv('Sitrep 12-16.csv') # 4 WINTERS OF SITREP DATA

Sitrep_16_17 <- read_csv('Sitrep 16-17.csv') # SITREP DATA FROM WINTER 16-17


# create (stratified) test/train splits
# ==========================================================================

# create index for test/train split
index <- createDataPartition(Sitrep_12_16$performanceny, # stratify according to performance in the next year
                             p = 0.8,                     # split 80:20
                             list = FALSE)

Sitrep_12_16_train <- Sitrep_12_16[index,] # create training set

Sitrep_12_16_test <- Sitrep_12_16[-index,] # create test set

# confirm stratification
mean(Sitrep_12_16_train$performanceny)
mean(Sitrep_12_16_test$performanceny)

#USE WINTERS 12-16 TO TRAIN AND EVALUATE TREE AND FOREST
# ==========================================================================

tree <- rpart(performanceny ~ .,    # model formula
              Sitrep_12_16_train,   # training data
              method = 'anova')     # regression tree

fancyRpartPlot(tree) # visualize tree

forest <- randomForest(performanceny ~ .,   # model formula
                       Sitrep_12_16_train,  # training data
                       ntree = 1000,        # 1000 tree forest
                       importance = TRUE)   # calculate variable importance

varImpPlot(forest) # visualize variable importance
importance(forest) # list variable importance

tree_p <- predict(tree,Sitrep_12_16_test) # make tree predictions
tree_error <- tree_p - Sitrep_12_16_test$performanceny # calculate tree error
rmse_tree <- sqrt(mean(tree_error^2)) # calculate tree RMSE

forest_p <- predict(forest,Sitrep_12_16_test) # make forest predictions
forest_error <- forest_p - Sitrep_12_16_test$performanceny # calculate forest error
rmse_forest <- sqrt(mean(forest_error^2)) # calculate forest RMSE



#TRAIN FINAL MODEL ON FULL DATASET (TRAIN + TEST) AND MAKE PREDICTIONS + PLOT SUMMARY
# ============================================================================================

forest_final <- randomForest(performanceny ~ .,
                             Sitrep_12_16,
                             ntree = 1000,
                             importance = TRUE) # grow forest on all data



#MAKE FINAL PREDICTIONS
# =========================================================================================

predictions <- data.frame(trust_code = Sitrep_16_17$trust_code,
                          performance_16_17 = Sitrep_16_17$performancesy,
                          predictions_17_18 = predict(forest_final, Sitrep_16_17)) # make final predictions

write_csv(predictions,'predictions.csv')
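
The script above stores `rmse_tree` and `rmse_forest` but never prints or compares them. A minimal sketch of that comparison in base R follows. The `rmse()` helper and all numbers are hypothetical illustrations; they are not taken from the Sitrep data or from the original script.

```r
# Root-mean-square error between predictions and observed values
# (hypothetical helper; the script computes the same quantity inline)
rmse <- function(predicted, observed) {
  stopifnot(length(predicted) == length(observed))
  sqrt(mean((predicted - observed)^2))
}

# Made-up values purely for illustration (NOT from the Sitrep files)
observed <- c(0.90, 0.85, 0.80, 0.75)
tree_p   <- c(0.88, 0.80, 0.84, 0.70)
forest_p <- c(0.89, 0.86, 0.81, 0.73)

rmse_tree   <- rmse(tree_p, observed)
rmse_forest <- rmse(forest_p, observed)

# Show both errors side by side; the lower value indicates the better fit
print(c(tree = rmse_tree, forest = rmse_forest))
```

In the real script the same `print()` call could be placed after `rmse_forest` is computed, so the held-out comparison is visible before the final forest is grown on all the data.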
3 changes: 3 additions & 0 deletions 2018-09-05 Random forests/readme.md
@@ -0,0 +1,3 @@
# spotlight-seminar-random-forest

The input files and R code for a basic random forest.
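
The demo script calls `setwd("~/spotlight seminar 06.09.18")`, a path that probably exists only on the presenter's machine. One possible alternative, assuming the script is run from the repository root, is to build the file paths from the folder name shown in this PR. The names `data_dir`, `train_path` and `new_path` are hypothetical, and `read.csv` is used only to keep the sketch free of package dependencies (the script itself uses `read_csv`).

```r
# Folder name as it appears in this PR; assumes the working directory
# is the repository root (an assumption, not something the script states)
data_dir <- "2018-09-05 Random forests"

train_path <- file.path(data_dir, "Sitrep 12-16.csv")  # four winters of sitrep data
new_path   <- file.path(data_dir, "Sitrep 16-17.csv")  # winter 16-17 data

# Only read the files when they are actually present, so the sketch
# does not fail when run somewhere else
if (file.exists(train_path) && file.exists(new_path)) {
  Sitrep_12_16 <- read.csv(train_path)
  Sitrep_16_17 <- read.csv(new_path)
} else {
  message("Sitrep files not found under: ", data_dir)
}
```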
Binary file added 2019-09-20 Iteration/purrr cheatsheet.png
4 changes: 0 additions & 4 deletions 2019-09-20 Iteration/readme.md
@@ -1,5 +1 @@
Material from the R coding club session led by Matt Malcher on 20th September 2019.

Any other materials for the R coding club session on iteration can be found here:

## [R Coding Club - Iteration](https://github.com/DataS-DH/RCC-Iteration)
13 changes: 13 additions & 0 deletions 2019-10-04 TidyR LearnR/RCC_tidyr_learnr.Rproj
@@ -0,0 +1,13 @@
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX
3 changes: 3 additions & 0 deletions 2019-10-04 TidyR LearnR/readme.md
@@ -0,0 +1,3 @@
A quick tutorial for the R Coding Club demonstrating how to use `learnr`

App deployed at: https://matthew-malcher.shinyapps.io/tidyr_learnr/
4 changes: 4 additions & 0 deletions 2019-10-04 TidyR LearnR/tidyr_learnr/data/preg.csv
@@ -0,0 +1,4 @@
"name","treatmenta","treatmentb"
"John Smith",NA,18
"Jane Doe",4,1
"Mary Johnson",6,7
3 changes: 3 additions & 0 deletions 2019-10-04 TidyR LearnR/tidyr_learnr/data/preg2.csv
@@ -0,0 +1,3 @@
"treatment","John Smith","Jane Doe","Mary Johnson"
"a",NA,4,6
"b",18,1,7
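
`preg.csv` and `preg2.csv` hold the same six values, with rows and columns swapped. A minimal sketch of reshaping the first layout into one row per person and treatment, assuming the `tidyr` package (version 1.0.0 or later) is installed, is shown below. The data are inlined from `preg.csv` so the snippet runs on its own; the output column name `value` is a hypothetical choice, and this is not necessarily the approach the tutorial itself takes.

```r
library(tidyr)

# Contents of preg.csv, inlined so the example is self-contained
preg <- read.csv(text = '"name","treatmenta","treatmentb"
"John Smith",NA,18
"Jane Doe",4,1
"Mary Johnson",6,7')

# One row per (name, treatment) pair; names_prefix strips the leading
# "treatment" so the new column holds just "a" or "b"
preg_long <- pivot_longer(
  preg,
  cols = c(treatmenta, treatmentb),
  names_to = "treatment",
  names_prefix = "treatment",
  values_to = "value"   # hypothetical column name
)

print(preg_long)
```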