Upgraded to new script library that provides foundation for R key words #8707

Merged
merged 9 commits into from
Jan 11, 2024


lloyddewit
Contributor

@rdstern Please ignore this PR for now, it's just to test if converting the RScript library to C# created any compatibility issues.
Thanks

@lloyddewit lloyddewit marked this pull request as ready for review December 27, 2023 09:02
@lloyddewit lloyddewit added the skip-releasenotes PRs that don't affect functionality and should not be included in the release notes label Dec 27, 2023
@lloyddewit lloyddewit changed the title Test C# version of RScript NuGet package Upgraded to new script library that provides foundation for R key words Dec 27, 2023
@lloyddewit
Contributor Author

@rdstern This PR is ready for testing.

I replaced the RScript library with the new library that I've been working on. It is a different library but the version in this PR should provide the same functionality as RScript. This version intentionally does not contain new features. I first wanted to ensure that it is 100% compatible with R-Instat.

Please could you test that the script window works the same as before, especially in relation to:

  • executing multi-line commands
  • skipping over multi-line commands using ctrl-enter
  • comments before/after commands/scripts; end line comments in multi-line commands
  • different operators including bracket operators that include commas (a[b], a[b,c], a[,,] etc.)
  • function calls where the parameters are also function calls or operator expressions
  • any other strange R code that you can think of that worked before!

This PR doesn't add new functionality but it's important because subsequent versions of this library will add R key words to the script window.
Thanks for testing!

@rdstern
Collaborator

rdstern commented Dec 28, 2023

@lloyddewit exciting and will start testing today!

@rdstern
Collaborator

rdstern commented Dec 28, 2023

@lloyddewit my first example is working - and your ctrl-enter (initially) seems fine - and, and, and I now have the cursor visible, so already have a bonus! Though perhaps I am coming down to earth now. Maybe we always had that? When I press Run there is no cursor. But it is there after ctrl enter. Oh well - live and learn. On we go!

I am taking the opportunity to explore new scripts at the same time. I am happy with the ease with which the Insert dialog in the script window gives access to the scripts from the packages. I am wondering where this should figure in the Help, and should there be another ordinary menu item somewhere that gives access to this dialog?

agricolae is an important package. It is in R-Instat and will continue to have features that will only run from a script. So I am starting there:
a) ### Name: strip.plot all ran fine. Nothing need be added to run it in R-Instat. Ordinary graphs.
b) ### Name: AMMI ditto
c) ### Name: audpc this is the first script that gave me a small problem. So I give the script below.

### Name: audpc
### Title: Calculating the absolute or relative value of the AUDPC
### Aliases: audpc
### Keywords: manip

### ** Examples

library(agricolae)
dates<-c(14,21,28) # days
# example 1: evaluation - vector
evaluation<-c(40,80,90)
audpc(evaluation,dates)
# example 2: evaluation: dataframe nrow=1
evaluation<-data.frame(E1=40,E2=80,E3=90) # percentages
plot(dates,evaluation,type="h",ylim=c(0,100),col="red",axes=FALSE)
title(cex.main=0.8,main="Absolute or Relative AUDPC\nTotal area = 100*(28-14)=1400")
lines(dates,evaluation,col="red")
text(dates,evaluation+5,evaluation)
text(18,20,"A = (21-14)*(80+40)/2")
text(25,60,"B = (28-21)*(90+80)/2")
text(25,40,"audpc = A+B = 1015")
text(24.5,33,"relative = audpc/area = 0.725")
abline(h=0)
axis(1,dates)
axis(2,seq(0,100,5),las=2)
lines(rbind(c(14,40),c(14,100)),lty=8,col="green")
lines(rbind(c(14,100),c(28,100)),lty=8,col="green")
lines(rbind(c(28,90),c(28,100)),lty=8,col="green")
# It calculates audpc absolute
absolute<-audpc(evaluation,dates,type="absolute")
print(absolute)
rm(evaluation, dates, absolute)
# example 3: evaluation dataframe nrow>1
data(disease)
dates<-c(1,2,3) # week
evaluation<-disease[,c(4,5,6)]
# It calculates audpc relative
index <-audpc(evaluation, dates, type = "relative")
# Correlation between the yield and audpc
correlation(disease$yield, index, method="kendall")
# example 4: days infile
data(CIC)
comas <- CIC$comas
oxapampa <- CIC$oxapampa
dcomas <- names(comas)[9:16]
days<- as.numeric(substr(dcomas,2,3))
AUDPC<- audpc(comas[,9:16],days)
relative<-audpc(comas[,9:16],days,type = "relative")
h1<-graph.freq(AUDPC,border="red",density=4,col="blue")
table.freq(h1)
h2<-graph.freq(relative,border="red",density=4,col="blue",
frequency=2, ylab="relative frequency")

The problem - if it is one - is in lines 15 (starting plot and the first line of the plot) to line 29 which is where the plot is finished. Here is the plot at the end:

image

It is quite nice and using the ordinary R graphs.
a) R-Instat does not recognise that the lines after the plot line are all part of the same graph. I am not sure how it can. So it works fine if you select lines including the plot line downwards, and it is then quite nice you can manually build up to the plot above. Or you can select the whole block and get the complete graph.

I am not sure there is a problem here, i.e. is there a way you could know that these are all a multi-line command? I think you may have the same problem in RStudio if you run line-by-line?

So, @lloyddewit I suspect this may not be your problem? If that is the case there are 3 (at least) follow-up questions.
a) Could I have coded it differently, so it would be obvious that these are the same multi-line statement.
b) When this arises what could I add as a comment in the code (and maybe in the help) so users treat it as one statement, or always select, starting with the plot, if I want to build the plot gradually.
c) Would this be a good example to try ggplotify - to make it into a ggplot graph.

I haven't finished testing, but this is the end of my first comment.

@rdstern
Collaborator

rdstern commented Dec 29, 2023

Here are 3 examples.

First is the alpha designs still from agricolae package. It seems fine with [,1] but is really not happy with [,,i]. I suspect this may not be the new system, but may have been an issue before, though we never had this.

Looking at the code below, I wonder if there are examples with more complex stuff inside square brackets?
By the way, there are just 2 lines with that problem. It is very content with the rest of the code, if these are commented out.

### Name: design.alpha
### Title: Alpha design type (0,1)
### Aliases: design.alpha
### Keywords: design

### ** Examples

library(agricolae)
#Example one
trt<-1:30
t <- length(trt)
# size block k
k<-3
# Blocks s
s<-t/k
# replications r
r <- 2
outdesign<- design.alpha(trt,k,r,serie=2)
book<-outdesign$book
plots<-book[,1]
dim(plots)<-c(k,s,r)
for (i in 1:r) print(t(plots[,,i]))
outdesign$sketch
# Example two 
trt<-letters[1:12] 
t <- length(trt)
k<-3
r<-3
s<-t/k
outdesign<- design.alpha(trt,k,r,serie=2)
book<-outdesign$book
plots<-book[,1]
dim(plots)<-c(k,s,r)
for (i in 1:r) print(t(plots[,,i]))
outdesign$sketch

Second is carolina also from agricolae. It gives an odd message caused by lines like this: output[][-1]. It seems to run ok but with that error message (by the non-strict method) while running. The full code is below:

### Name: carolina
### Title: North Carolina Designs I, II and III
### Aliases: carolina
### Keywords: models

### ** Examples


library(agricolae)
data(DC)
carolina1 <- DC$carolina1
# str(carolina1)
output<-carolina(model=1,carolina1)
output[][-1]

carolina2 <- DC$carolina2
# str(carolina2)
majes<-subset(carolina2,carolina2[,1]==1)
majes<-majes[,c(2,5,4,3,6:8)]
output<-carolina(model=2,majes[,c(1:4,6)])
output[][-1]

carolina3 <- DC$carolina3
# str(carolina3)
output<-carolina(model=3,carolina3)
output[][-1]

Third is an example with a single line that doesn't run. I think it needs your new stuff, before it can. Interesting though that it is the first example I have found where the problem is not detected at the top - when I try the first Run. Is that because the problem is a single line? It is here: by(book,book[2],function(x) paste(x[,1],"-",as.character(x[,3])))

### Name: design.dau
### Title: Augmented block design
### Aliases: design.dau
### Keywords: design

### ** Examples

library(agricolae)
# 4 treatments and 5 blocks
T1<-c("A","B","C","D")
T2<-letters[20:26]
outdesign <-design.dau(T1,T2, r=5,serie=2)
# field book
book<-outdesign$book
by(book,book[2],function(x) paste(x[,1],"-",as.character(x[,3])))
# write in hard disk
# write.table(book,"dau.txt", row.names=FALSE, sep="\t")
# file.show("dau.txt")
# Augmented designs in Completely Randomized Design
trt<-c(T1,T2)
r<-c(4,4,4,4,1,1,1,1,1,1,1)
outdesign <- design.crd(trt,r)
outdesign$book

@lloyddewit
Contributor Author

@rdstern Thank you for your testing and comments.

Your comment from 28 Dec:

  • This PR should have no effect on the cursor (dis)appearing in the script window. This is a separate issue related to window focus and could be investigated by one of the developers.
  • Lines 15-29 are part of the same graph but are separate statements. So R-Instat is behaving correctly when it executes the lines statement by statement. I tested in RStudio and it behaves in the same way. So I don't think we should change anything here.
  • R allows compound statements. So if the user wants the whole block to execute as a single statement, then the user can enclose the block in braces ({}). Round brackets also work (()). This works in RStudio but not yet in R-Instat. I am currently implementing this anyway because it's needed for keywords and the Crimea example you gave in PR "More sample scripts and the third has piping which may work now?" #8533.
  • For questions about ggplotify, probably best to ask someone like Lily.
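
As a minimal sketch of the compound-statement point above, base R's `parse()` can count how many top-level statements a piece of script contains. The snippets are made-up stand-ins for the plot block, not code from the audpc script, and nothing is drawn because the code is only parsed, not evaluated.

```r
# Two separate top-level statements: a statement-by-statement runner
# would execute these one at a time.
separate <- "plot(1:3)\nlines(1:3)"
n_separate <- length(parse(text = separate))

# The same calls wrapped in braces form one compound statement.
braced <- "{\n  plot(1:3)\n  lines(1:3)\n}"
n_braced <- length(parse(text = braced))

# Round brackets wrap a single expression, which may span several lines.
bracketed <- "(\n  1 +\n  2\n)"
n_bracketed <- length(parse(text = bracketed))

print(c(separate = n_separate, braced = n_braced, bracketed = n_bracketed))
```

This only illustrates how R itself groups the text; how the script window decides where a statement ends is a matter for the library.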

@lloyddewit
Contributor Author

@rdstern All these examples are brilliant, thank you

Your comment from 29 Dec:

  • print(t(plots[,,i])) executes correctly. The problem is the for loop. for is a key word and is therefore not yet implemented (working on this now). If we comment out the for loop part, then the square bracket statement works correctly. See screenshot below.
  • output[][-1] should work but does not. I will fix in this PR and will let you know when you can retest.

image
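
For reference, a small sketch of what the two bracket forms discussed above do in plain R. The array and list are illustrative stand-ins (not the agricolae objects), so only the indexing behaviour carries over.

```r
# A small 3-dimensional array standing in for `plots` (values are illustrative).
plots <- array(1:12, dim = c(2, 3, 2))

# Empty index positions mean "everything along this dimension",
# so plots[, , 1] is the first 2 x 3 slice.
slice <- plots[, , 1]
print(dim(slice))

# The for loop prints each transposed slice in turn.
for (i in 1:2) print(t(plots[, , i]))

# x[] selects every element, so x[][-1] gives the same result as x[-1].
output <- list(a = 1, b = 2, c = 3)
same <- identical(output[][-1], output[-1])
print(same)
```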

@rdstern
Collaborator

rdstern commented Dec 29, 2023

@lloyddewit I added a third example above - after you saw the first 2. I have continued, partly as a test of the new script-system and it seems to be holding up very well. So here is the opposite, namely code that worked and I didn't expect it to:

### Name: hcut
### Title: Cut tree of consensus
### Aliases: hcut
### Keywords: cluster

### ** Examples

library(agricolae)
data(pamCIP)
# Save data frame(s) "pamCIP"
data_book$import_data(data_tables=list(pamCIP=pamCIP))


# only code
rownames(pamCIP)<-substr(rownames(pamCIP),1,6)
# groups of clusters
output<-consensus(pamCIP,nboot=100)
hcut(output,h=0.4,group=5,main="Group 5")
# 
hcut(output,h=0.4,group=8,type="t",edgePar=list(lty=1:2,col=2:1),main="group 8"
,col.text="blue",cex.text=1)


In the code as downloaded the graphs are commented out so I ran the last-but-one line, with the last line (which starts with a comma!) commented out, and it ran fine. Then I uncommented (Is that a word?) the last line and it correctly included it in the statement. Neat. I don't think we could do that earlier?

By the way I have run quite a number to also check what happens with multiple files

image

So far, so good.

@rdstern
Collaborator

rdstern commented Dec 29, 2023

@lloyddewit perhaps this example needs to wait and we can try once the loops are in. It complains even with the replacement method, but then seems to run ok? It is just the lines at the end that cause the problem.

### Name: lateblight
### Title: LATEBLIGHT - Simulator for potato late blight Version LB2004
### Aliases: lateblight
### Keywords: models

### ** Examples

library(agricolae)
f <- system.file("external/weather.csv", package="agricolae")
weather <- read.csv(f,header=FALSE)
f <- system.file("external/severity.csv", package="agricolae")
severity <- read.csv(f)
weather[,1]<-as.Date(weather[,1],format = "%m/%d/%Y")
# Parameters dates
dates<-c("2000-03-25","2000-04-09","2000-04-12","2000-04-16","2000-04-22")
dates<-as.Date(dates)
EmergDate <- as.Date('2000/01/19')
EndEpidDate <- as.Date("2000-04-22")
dates<-as.Date(dates)
NoReadingsH<- 1
RHthreshold <- 90
WS<-weatherSeverity(weather,severity,dates,EmergDate,EndEpidDate,
NoReadingsH,RHthreshold)
# Parameters Lateblight
InocDate<-"2000-03-18"
LGR <- 0.00410
IniSpor <- 0
SR <- 292000000
IE <- 1.0
LP <- 2.82
InMicCol <- 9
Cultivar <- 'NICOLA'
ApplSys <- "NOFUNGICIDE"
main<-"Cultivar: NICOLA"
#--------------------------
model<-lateblight(WS, Cultivar,ApplSys, InocDate, LGR,IniSpor,SR,IE, LP,
MatTime='LATESEASON',InMicCol,main=main,type="l",xlim=c(65,95),lwd=1.5,
xlab="Time (days after emergence)", ylab="Severity (Percentage)")
# reproduce graph
x<- model$Ofile$nday
y<- model$Ofile$SimSeverity
w<- model$Gfile$nday
z<- model$Gfile$MeanSeverity
Min<-model$Gfile$MinObs
Max<-model$Gfile$MaxObs
plot(x,y,type="l",xlim=c(65,95),lwd=1.5,xlab="Time (days after emergence)",
ylab="Severity (Percentage)")
points(w,z,col="blue",cex=1,pch=19)
npoints <- length(w)
for ( i in 1:npoints){
segments(w[i],Min[i],w[i],Max[i],lwd=1.5,col="blue")
}
legend("topleft",c("Disease progress curves","Weather-Severity"),
title="Description",lty=1,pch=c(3,19),col=c("black","blue"))

@rdstern
Collaborator

rdstern commented Dec 29, 2023

@lloyddewit I have been exploring the scripts in agricolae and am now up to about 50. It has done well so far - including lines with comments at the end, and also lines with multiple statements per line. Here is another example with a one line code that means it doesn't yet work.

### Name: stability.nonpar
### Title: Nonparametric stability analysis
### Aliases: stability.nonpar
### Keywords: nonparametric

### ** Examples

library(agricolae)
data(haynes)
stability.nonpar(haynes,"AUDPC",ranking=TRUE,console=TRUE)
# Example 2
data(CIC)
data1<-CIC$comas[,c(1,6,7,17,18)]
data2<-CIC$oxapampa[,c(1,6,7,19,20)]
cic <- rbind(data1,data2)

means <- by(cic[,5], cic[,c(2,1)], function(x) mean(x,na.rm=TRUE))
means <-as.data.frame(means[,])
cic.mean<-data.frame(genotype=row.names(means),means)
cic.mean<-delete.na(cic.mean,"greater")
out<-stability.nonpar(cic.mean)
out$ranking
out$statistics

@rdstern
Collaborator

rdstern commented Dec 30, 2023

@lloyddewit and now for something completely different. In the tokenizers package (which includes the text of Moby Dick which I was looking for!) one example was giving me some problems for a new reason, namely it includes \n in the example. I am still not sure how this should be handled? I was having problems putting the text into a data frame? I now am ok, and it has turned into a very nice example for R-Instat! I am not sure you need concern yourself with this.

### Name: basic-tokenizers
### Title: Basic tokenizers
### Aliases: basic-tokenizers tokenize_characters tokenize_words
###   tokenize_sentences tokenize_lines tokenize_paragraphs tokenize_regex

### ** Examples

song <-  paste0("How many roads must a man walk down\n",
                "Before you call him a man?\n",
                "How many seas must a white dove sail\n",
                "Before she sleeps in the sand?\n",
                "\n",
                "How many times must the cannonballs fly\n",
                "Before they're forever banned?\n",
                "The answer, my friend, is blowin' in the wind.\n",
                "The answer is blowin' in the wind.\n")

tokenize_words(song)
tokenize_words(song, strip_punct = FALSE)
tokenize_sentences(song)
tokenize_paragraphs(song)
tokenize_lines(song)
tokenize_characters(song)

I continue here, but now think it might be ok, and an "interesting" example. I added:

song <- as.data.frame(song)

# Save data frame(s) "song"

data_book$import_data(data_tables=list(song=song))

So it imports into R-Instat. It seemed to give me just the last line

image

But I now find that it is actually all there. I used the Edit > Wordwrap dialog, and found it all! So far so good. Now I wanted to illustrate that when data are messy initially, then an analysis could start in R - with the script - and only move into R-Instat when it is a bit less "extreme". This is a trivial example.
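
A small base-R sketch of why the imported data frame looks like a single value: the song is one string with embedded `\n` characters, so it lands in one cell. The two-line `song` below is a shortened stand-in, and `song_lines` and `lines_df` are names chosen for illustration only.

```r
# A shortened stand-in for the song text: one string with embedded newlines.
song <- paste0("How many roads must a man walk down\n",
               "Before you call him a man?\n")

# As a data frame this is a single cell, so a grid may show only part of it.
song_df <- as.data.frame(song)
print(dim(song_df))

# Splitting on the newline gives one element per line of text.
song_lines <- strsplit(song, "\n", fixed = TRUE)[[1]]
lines_df <- data.frame(line = song_lines)
print(lines_df)
```

Splitting in the script first is one possible way to get one row per line before importing; the tokenizers functions in the example above offer others.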

@lloyddewit
Contributor Author

@rdstern All this testing is wonderful, thank you.

> In the code as downloaded the graphs are commented out so I ran the last-but-one line, with the last line (which starts with a comma!) commented out, and it ran fine. Then I uncommented (Is that a word?) the last line and it correctly included it in the statement. Neat. I don't think we could do that earlier?

I understand that you ran hcut(output,h=0.4,group=8,type="t",edgePar=list(lty=1:2,col=2:1),main="group 8" and it worked? This is invalid R because it's missing the closing bracket. However, this version of the library will run it because it automatically adds closing brackets to the script it sends to the R environment. The next version of the library will work differently and will send the exact script displayed in the script window. So script missing a closing bracket will fail (which is correct). It's better to send the exact script displayed to avoid inconsistencies that could theoretically cause the cursor to move to an incorrect position after executing a statement.
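
The difference can be checked in plain R with `parse()`, which reports a syntax error for the first line on its own but accepts the two lines together as one statement. This is only a sketch: `is_complete` is a hypothetical helper, not part of the library, and it returns FALSE for any syntax error, not just a missing bracket.

```r
# Hypothetical helper: TRUE if the text parses without a syntax error.
is_complete <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

first_line <- 'hcut(output,h=0.4,group=8,type="t",edgePar=list(lty=1:2,col=2:1),main="group 8"'
second_line <- ',col.text="blue",cex.text=1)'

# The first line alone is missing its closing bracket.
alone <- is_complete(first_line)

# With the continuation line the call is complete and is a single statement.
both <- paste(first_line, second_line, sep = "\n")
together <- is_complete(both)
n_statements <- length(parse(text = both))

print(c(alone = alone, together = together))
```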

> perhaps this example needs to wait and we can try once the loops are in. It complains even with the replacement method, but then seems to run ok? It is just the lines at the end that cause the problem.

Yes, this script should work with the new library when key words are implemented. As you say, it runs in the non-strict method but it displays the error message twice and does not display the for loop script in the output window. I fixed this. However please note that the non-strict method does not know which parts of the scripts are comments so it displays all the script in the same font and colour. The non-strict method should be needed less and less so I hope this limitation is acceptable. Please could you retest?

> I have been exploring the scripts in agricolae and am now up to about 50. It has done well so far - including lines with comments at the end, and also lines with multiple statements per line. Here is another example with a one line code that means it doesn't yet work.

This script fails because it contains the function key word and key words are not yet implemented. However it does not fail gracefully. It should recognise that it can't process the key word and execute in the non-strict mode. I will fix this and ask you to retest when I am done.

@rdstern
Collaborator

rdstern commented Dec 31, 2023

@lloyddewit

a) I first tried the lateblight, which is a useful function to have. It now - I think - runs perfectly for the current code. So where there is a loop it complains that it will have to use the "reserve" method. This now runs with no complaint to give this graph:

image

b) You mentioned this code with the closing bracket missing:

# 
hcut(output,h=0.4,group=8,type="t",edgePar=list(lty=1:2,col=2:1),main="group 8"
,col.text="blue",cex.text=1)

I hadn't noticed that aspect and am very happy that code with incorrect number of brackets will give an error. I assume, with the code above, that it would check that the statement continues to the next line, so will be happy with the statement as a whole. (In my unfair test above I had commented out the second line!)

c) The alpha design example is much better, but there is still an "interesting" point. Here is the code:

### Name: design.alpha
### Title: Alpha design type (0,1)
### Aliases: design.alpha
### Keywords: design

### ** Examples

library(agricolae)
#Example one
trt<-1:30
t <- length(trt)
# size block k
k<-3
# Blocks s
s<-t/k
# replications r
r <- 2
outdesign<- design.alpha(trt,k,r,serie=2)
book<-outdesign$book
plots<-book[,1]
dim(plots)<-c(k,s,r)
for (i in 1:r) print(t(plots[,,i]))
outdesign$sketch
# Example two 
trt<-letters[1:12] 
t <- length(trt)
k<-3
r<-3
s<-t/k
outdesign<- design.alpha(trt,k,r,serie=2)
book<-outdesign$book
plots<-book[,1]
dim(plots)<-c(k,s,r)
for (i in 1:r) print(t(plots[,,i]))
outdesign$sketch

One example is in the last 2 lines. If I run the for line by itself it uses the reserve method and seems to work fine. But if I select the 2 lines together then the reserve method gives an error and there is no output.

d) The carolina example has the same - or similar problem:

image

If I run those 2 lines together, as shown, then it complains and then lists the 2 lines of code, with no further complaint - but there is also no output.
That's different to running the lines one at a time. There is no problem with the first of the two lines. Then, when I run the second line by itself, it uses the reserve method and works fine - giving the results.

@rdstern
Collaborator

rdstern commented Jan 1, 2024

@lloyddewit happy New Year.

I here have a code I really would like to run. It fits perfectly with the current presentation, where I am introducing your scripts stuff. It is the one function in the janeaustenr package. Here is the code as supplied:

### Name: austen_books
### Title: Tidy data frame of Jane Austen's 6 completed, published novels
### Aliases: austen_books

### ** Examples

## Don't show: 
if (requireNamespace("dplyr", quietly = TRUE)) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
## End(Don't show)

library(dplyr)

austen_books() %>% 
    group_by(book) %>%
    summarise(total_lines = n())
## Don't show: 
}) # examplesIf
## End(Don't show)

I took the comment away from the 2 lines with examplesIf. I added the library(janeaustenr) line though I wasn't sure it is needed - it can't do any harm.

I can't seem to get it to run?

@lloyddewit
Contributor Author

@rdstern Thanks for the extra feedback and happy new year to you too. :)

> One example is in the last 2 lines. If I run the for line by itself it uses the reserve method and seems to work fine. But if I select the 2 lines together then the reserve method gives an error and there is no output.

I experimented with this. For me:

  • Run All worked in non-strict mode
  • If I highlighted sections and clicked run, then everything worked with the new library apart from the 2 for loops (lines 22 and 34). Lines 22 and 34 worked in non-strict mode. I could not recreate the problem you reported when highlighting the last 2 lines.

This script should work with the new library when key words are implemented so I suggest we don't spend too much time investigating the non-strict mode (we know it will never work perfectly and should be obsolete at one point anyway).

> If I run those 2 lines together, as shown, then it complains and then lists the 2 lines of code, with no further complaint - but there is also no output.
> That's different to running the lines one at a time. There is no problem with the first of the two lines. Then, when I run the second line by itself, it uses the reserve method and works fine - giving the results.

I could not recreate this problem but I anyway fixed the output[][-1] bug in the library. I will let you know when I've upgraded this PR to the new library, and you can retest.

> I took the comment away from the 2 lines with examplesIf. I added the library(janeaustenr) line though I wasn't sure it is needed - it can't do any harm. I can't seem to get it to run?

The If parts won't run with the new library yet because if is a key word, and the script is too complex for the non-strict method. The script won't work in RStudio either because it's missing the library(janeaustenr) statement. If you install the janeaustenr package and use the script below, then it should run correctly in the R-Instat script window.

library(dplyr)
library(janeaustenr)

austen_books() %>% 
    group_by(book) %>%
    summarise(total_lines = n())

@rdstern
Collaborator

rdstern commented Jan 5, 2024

@lloyddewit should I be testing this?

@lloyddewit
Contributor Author

> @lloyddewit should I be testing this?

@rdstern No, not yet. I'll let you know when it's ready.

@lloyddewit
Contributor Author

@rdstern
All the scripts above should now work, apart from the scripts containing key words (if and for). If you comment out the if and for lines, then these scripts should also work.
When you test, please could you also test the last script in PR #8533 (shown below). This should hopefully work now but I couldn't test because I don't have the data set.
Thanks!

data(Nightingale)

# For some graphs, it is more convenient to reshape death rates to long format
#  keep only Date and death rates
require(reshape)
Night<- Nightingale[,c(1,8:10)]
melted <- melt(Night, "Date")
names(melted) <- c("Date", "Cause", "Deaths")
melted$Cause <- sub("\\.rate", "", melted$Cause)
melted$Regime <- ordered( rep(c(rep('Before', 12), rep('After', 12)), 3), 
                          levels=c('Before', 'After'))
Night <- melted

# subsets, to facilitate separate plotting
Night1 <- subset(Night, Date < as.Date("1855-04-01"))
Night2 <- subset(Night, Date >= as.Date("1855-04-01"))

# sort according to Deaths in decreasing order, so counts are not obscured [thx: Monique Graf]
Night1 <- Night1[order(Night1$Deaths, decreasing=TRUE),]
Night2 <- Night2[order(Night2$Deaths, decreasing=TRUE),]

# merge the two sorted files
Night <- rbind(Night1, Night2)


require(ggplot2)
# Before plot
cxc1 <- ggplot(Night1, aes(x = factor(Date), y=Deaths, fill = Cause)) +
		# do it as a stacked bar chart first
   geom_bar(width = 1, position="identity", stat="identity", color="black") +
		# set scale so area ~ Deaths	
   scale_y_sqrt() 
		# A coxcomb plot = bar chart + polar coordinates
cxc1 + coord_polar(start=3*pi/2) + 
	ggtitle("Causes of Mortality in the Army in the East") + 
	xlab("")

# After plot
cxc2 <- ggplot(Night2, aes(x = factor(Date), y=Deaths, fill = Cause)) +
   geom_bar(width = 1, position="identity", stat="identity", color="black") +
   scale_y_sqrt()
cxc2 + coord_polar(start=3*pi/2) +
	ggtitle("Causes of Mortality in the Army in the East") + 
	xlab("")

## Not run: 
# do both together, with faceting
cxc <- ggplot(Night, aes(x = factor(Date), y=Deaths, fill = Cause)) +
 geom_bar(width = 1, position="identity", stat="identity", color="black") + 
 scale_y_sqrt() +
 facet_grid(. ~ Regime, scales="free", labeller=label_both)
cxc + coord_polar(start=3*pi/2) +
	ggtitle("Causes of Mortality in the Army in the East") + 
	xlab("")

## End(Not run)

## What if she had made a set of line graphs?

# these plots are best viewed with width ~ 2 * height 
colors <- c("blue", "red", "black")
with(Nightingale, {
	plot(Date, Disease.rate, type="n", cex.lab=1.25, 
		ylab="Annual Death Rate", xlab="Date", xaxt="n",
		main="Causes of Mortality of the British Army in the East");
	# background, to separate before, after
	rect(as.Date("1854/4/1"), -10, as.Date("1855/3/1"), 
		1.02*max(Disease.rate), col=gray(.90), border="transparent");
	text( as.Date("1854/4/1"), .98*max(Disease.rate), "Before Sanitary\nCommission", pos=4);
	text( as.Date("1855/4/1"), .98*max(Disease.rate), "After Sanitary\nCommission", pos=4);
	# plot the data
	points(Date, Disease.rate, type="b", col=colors[1], lwd=3);
	points(Date, Wounds.rate, type="b", col=colors[2], lwd=2);
	points(Date, Other.rate, type="b", col=colors[3], lwd=2)
	}
)
# add custom Date axis and legend
axis.Date(1, at=seq(as.Date("1854/4/1"), as.Date("1856/3/1"), "3 months"), format="%b %Y")
legend(as.Date("1855/10/20"), 700, c("Preventable disease", "Wounds and injuries", "Other"),
	col=colors, fill=colors, title="Cause", cex=1.25)

# Alternatively, show each cause of death as percent of total
Nightingale <- within(Nightingale, {
	Total <- Disease + Wounds + Other
	Disease.pct <- 100*Disease/Total
	Wounds.pct <- 100*Wounds/Total
	Other.pct <- 100*Other/Total
	})

colors <- c("blue", "red", "black")
with(Nightingale, {
	plot(Date, Disease.pct, type="n",  ylim=c(0,100), cex.lab=1.25,
		ylab="Percent deaths", xlab="Date", xaxt="n",
		main="Percentage of Deaths by Cause");
	# background, to separate before, after
	rect(as.Date("1854/4/1"), -10, as.Date("1855/3/1"), 
		1.02*max(Disease.rate), col=gray(.90), border="transparent");
	text( as.Date("1854/4/1"), .98*max(Disease.pct), "Before Sanitary\nCommission", pos=4);
	text( as.Date("1855/4/1"), .98*max(Disease.pct), "After Sanitary\nCommission", pos=4);
	# plot the data
	points(Date, Disease.pct, type="b", col=colors[1], lwd=3);
	points(Date, Wounds.pct, type="b", col=colors[2], lwd=2);
	points(Date, Other.pct, type="b", col=colors[3], lwd=2)
	}
)
# add custom Date axis and legend
axis.Date(1, at=seq(as.Date("1854/4/1"), as.Date("1856/3/1"), "3 months"), format="%b %Y")
legend(as.Date("1854/8/20"), 60, c("Preventable disease", "Wounds and injuries", "Other"),
	col=colors, fill=colors, title="Cause", cex=1.25)

@rdstern
Collaborator

rdstern commented Jan 8, 2024

@lloyddewit will do.

But you do have access to the Nightingale data and code, if you did also wish to look yourself. It is in the HistData (historical data) package in the library.

@lloyddewit
Contributor Author

@rdstern Thanks for the hint. I tested the Nightingale script.
It all seemed to work apart from line 52. R-Instat tried to send this line to the output system which failed. I may ask @Patowhiz 's advice about which script lines should and should not go to the output system.
It is a strange statement anyway. It would be more normal just to include the statement in the statement above. A workaround is shown below. Then I think everything works.

image

@rdstern
Collaborator

rdstern commented Jan 8, 2024

@lloyddewit agreed it is just the coord_polar (with the facets) that seems to give a problem. I would like to know what happens in RStudio? Your code above runs, but if you then have a line with just cxc to display the plot, then you get the same error message as before. Great that the rest runs nicely. I'll keep checking and with some other scripts!

@rdstern
Collaborator

rdstern commented Jan 8, 2024

I have found the HistData sets have useful code. So I am trying them in turn.
a) Up to Bowley they run ok.
b) Bowley and Cavendish have a line with function in. I assume that needs the keywords.
c) Chestsizes runs
d) Cholera is "interesting". The require sections need to wait. But the lines are not needed. A workaround is to use library(car) command instead - and library(effects). Then it runs fine.
e) CushnyPeebles by 2. workaround as in d)
f) Dactyl fine. I note a general limitation that plots only appear one at a time, and you must do something about them immediately. That's a general issue.
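
A short sketch of how `require()` and `library()` differ in plain R, which is why swapping one for the other in item d) does not change what the script loads. The package name `notARealPackage12345` is a deliberately made-up name assumed not to be installed.

```r
# A package name assumed not to be installed (hypothetical).
missing_pkg <- "notARealPackage12345"

# require() returns FALSE when the package is unavailable...
has_pkg <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)

# ...whereas library() stops with an error.
library_failed <- tryCatch({
  library(missing_pkg, character.only = TRUE)
  FALSE
}, error = function(e) TRUE)

# For an installed package both attach it; require() also returns TRUE.
has_stats <- require(stats)

print(c(has_pkg = has_pkg, library_failed = library_failed, has_stats = has_stats))
```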

Break to be with grandchildren, but these seem interesting datasets for the analyses that are provided with them.

@rdstern
Collaborator

rdstern commented Jan 8, 2024

@lloyddewit it was going so well too.
This is DrinksWages, also from HistData. I needed to add library(HistData) into the code provided - and that's fine. Then at the bottom running line-by-line it is stuck, as you can see. That's when I did the plot command. I have to stop R-Instat and start again.

image

Let's be clear on the problem. This runs fine, i.e. doing the 3 lines together and gives me a nice plot.

image

The problem comes from running a line at a time.
I also checked with the last released version 0.7.17 and the problem is the same. So it isn't from what was added recently.

@lloyddewit
Contributor Author

@rdstern Thank you for finding this bug.
I loaded the HistData/DrinksWages data and ran the code below. I executed the code in different ways (e.g. run all, statement-by-statement, highlighting sections, repeating statements/sections etc.). I used the debugger and in all cases, the script window sent the correct code to the R environment.

The problem seems to be in the output system. I can make R-Instat hang with the steps below. It's possible that R-Instat's memory is corrupted.

I don't think we should try and fix this problem in this PR. I suggest that we move this into a separate issue. @Patowhiz has expertise with the output system and he is probably the best person to assign to this.


I can recreate the issue with these steps:

  • Import HistData/DrinksWages data from library
  • Open script window and paste code below
  • If I don't touch the 'R Graphics Device' window then everything seems to work correctly
  • Execute the script statement-by-statement and then close the 'R Graphics Device' window just before the plot(mod.sober) statement.
  • Execute plot(mod.sober).
  • It is then no longer possible to execute R statements. R-Instat eventually freezes and needs to be restarted.
data(DrinksWages)
plot(DrinksWages)
# plot proportion sober vs. wage | class
with(DrinksWages, plot(wage, sober/n, col=c("blue","red","green")[class]))
# fit logistic regression model of sober on wage
mod.sober <- glm(cbind(sober, n) ~ wage, family=binomial, data=DrinksWages)
summary(mod.sober)
op <- par(mfrow=c(2,2))
plot(mod.sober)
par(op)
# TODO: plot fitted model

@lloyddewit
Contributor Author

@rdstern Thank you for all the wonderful testing you did. I understand that the script window in this PR now handles more scripts than the current master; there is also no known regression.
If possible, I would like to merge this PR (to reduce risk of merge conflicts) and upgrade to the next version of the library (supporting R key words) in a new PR.
What's your view?
Thanks

Collaborator

@rdstern rdstern left a comment


@lloyddewit that would be great. @N-thony could you also check and then merge. I note that @N-thony's change with the windows is merged already and is looking nice!
Antoine - as soon as you like.

@N-thony N-thony merged commit 819984c into IDEMSInternational:master Jan 11, 2024
2 checks passed
@lloyddewit lloyddewit deleted the testRInsight branch January 14, 2024 07:42
Labels
skip-releasenotes PRs that don't affect functionality and should not be included in the release notes
3 participants