% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
\PassOptionsToPackage{dvipsnames,svgnames,x11names}{xcolor}
%
\documentclass[
letterpaper,
DIV=11,
numbers=noendperiod]{scrreprt}
\usepackage{amsmath,amssymb}
\usepackage{iftex}
\ifPDFTeX
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
\usepackage{lmodern}
\ifPDFTeX\else
% xetex/luatex font selection
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
\KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\ifLuaTeX
\usepackage{luacolor}
\usepackage[soul]{lua-ul}
\else
\usepackage{soul}
\fi
\setlength{\emergencystretch}{3em} % prevent overfull lines
\setcounter{secnumdepth}{5}
% Make \paragraph and \subparagraph free-standing
\makeatletter
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}{
\@ifstar
\xxxParagraphStar
\xxxParagraphNoStar
}
\newcommand{\xxxParagraphStar}[1]{\oldparagraph*{#1}\mbox{}}
\newcommand{\xxxParagraphNoStar}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}{
\@ifstar
\xxxSubParagraphStar
\xxxSubParagraphNoStar
}
\newcommand{\xxxSubParagraphStar}[1]{\oldsubparagraph*{#1}\mbox{}}
\newcommand{\xxxSubParagraphNoStar}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
\makeatother
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}\usepackage{longtable,booktabs,array}
\usepackage{calc} % for calculating minipage widths
% Correct order of tables after \paragraph or \subparagraph
\usepackage{etoolbox}
\makeatletter
\patchcmd\longtable{\par}{\if@noskipsec\mbox{}\fi\par}{}{}
\makeatother
% Allow footnotes in longtable head/foot
\IfFileExists{footnotehyper.sty}{\usepackage{footnotehyper}}{\usepackage{footnote}}
\makesavenoteenv{longtable}
\usepackage{graphicx}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
% Set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\KOMAoption{captions}{tableheading}
\makeatletter
\@ifpackageloaded{bookmark}{}{\usepackage{bookmark}}
\makeatother
\makeatletter
\@ifpackageloaded{caption}{}{\usepackage{caption}}
\AtBeginDocument{%
\ifdefined\contentsname
\renewcommand*\contentsname{Table of contents}
\else
\newcommand\contentsname{Table of contents}
\fi
\ifdefined\listfigurename
\renewcommand*\listfigurename{List of Figures}
\else
\newcommand\listfigurename{List of Figures}
\fi
\ifdefined\listtablename
\renewcommand*\listtablename{List of Tables}
\else
\newcommand\listtablename{List of Tables}
\fi
\ifdefined\figurename
\renewcommand*\figurename{Figure}
\else
\newcommand\figurename{Figure}
\fi
\ifdefined\tablename
\renewcommand*\tablename{Table}
\else
\newcommand\tablename{Table}
\fi
}
\@ifpackageloaded{float}{}{\usepackage{float}}
\floatstyle{ruled}
\@ifundefined{c@chapter}{\newfloat{codelisting}{h}{lop}}{\newfloat{codelisting}{h}{lop}[chapter]}
\floatname{codelisting}{Listing}
\newcommand*\listoflistings{\listof{codelisting}{List of Listings}}
\makeatother
\makeatletter
\makeatother
\makeatletter
\@ifpackageloaded{caption}{}{\usepackage{caption}}
\@ifpackageloaded{subcaption}{}{\usepackage{subcaption}}
\makeatother
\ifLuaTeX
\usepackage{selnolig} % disable illegal ligatures
\fi
\usepackage{bookmark}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\urlstyle{same} % disable monospaced font for URLs
\hypersetup{
pdftitle={R-Instat Climatic Guide},
pdfauthor={Roger Stern, Danny Parsons, David Stern, Francis Torgbor \& James Musyoka},
colorlinks=true,
linkcolor={blue},
filecolor={Maroon},
citecolor={Blue},
urlcolor={Blue},
pdfcreator={LaTeX via pandoc}}
\title{R-Instat Climatic Guide}
\author{Roger Stern, Danny Parsons, David Stern, Francis Torgbor \&
James Musyoka}
\date{2025-01-06}
\begin{document}
\maketitle
\renewcommand*\contentsname{Table of contents}
{
\hypersetup{linkcolor=}
\setcounter{tocdepth}{2}
\tableofcontents
}
\bookmarksetup{startatroot}
\chapter*{Acknowledgments}\label{acknowledgments}
\addcontentsline{toc}{chapter}{Acknowledgments}
\markboth{Acknowledgments}{Acknowledgments}
\hl{To be added.}
\bookmarksetup{startatroot}
\chapter{About this guide}\label{about-this-guide}
\section{Who is this guide for?}\label{who-is-this-guide-for}
This guide is concerned with the analysis of climatic data. It is for
four types of reader. The first is those concerned with the collection
and subsequent use of their climatic data. This includes staff of
national meteorological services (NMSs), who are often the custodians of
the historical climatic data for their country. There are many others
who collect climatic data, for example schools and colleges, farms,
agricultural institutes and many individuals.
The second is the users who need results from an analysis of historical
climatic data. They may undertake analyses themselves, or, at least,
need to know what is possible from the data. They are in many walks of
life, including agriculture, health, flood prevention, water supply,
renewable energy, building, tourism and insurance.
The other two groups are concerned more with teaching and learning
statistics. Looking at climatic data is an application of interest to
many people, partly because of the effects that climate has on many
areas and partly because of the many issues of climate change.
So, the third group is those who teach statistics. This guide shows how
simple statistical ideas are used in solving practical problems in one
application area. The key concepts of sensible data handling are the
same whatever the area of application.
The final group consists of those who have to learn statistics. Many
people recognize that they need statistics skills for their work but
sometimes find their statistics courses are difficult to relate to
real-life applications. The materials here complement such courses by
starting with the application and then considering the statistical ideas
that are needed to process the data.
These groups overlap. For example, many users of climatic data are also
conscious of their need for further training in statistics.
\section{Why is it needed?}\label{why-is-it-needed}
Many organizations have devoted more effort to collecting climatic data
than to their subsequent analysis. This is like other areas where
monitoring data are collected routinely. One way that climatic data are
perhaps different is that many of them have an important immediate
use. They are an input for the short-term forecasts and for other immediate
monitoring of the current season. This might be termed a ``spatial
need'' in that these applications benefit from lots of data from
different places at the same time. These data are then stored, and this
guide is for a ``time series need''. Most of the analyses in this guide
are for long records in time. They may be for one, or more, points in
space.
Sometimes the excuse for the lack of analysis is that the quality of the
data is suspect. This is not a good reason, because one way to improve
data quality is to analyze the existing data to demonstrate their
importance and shortcomings.
It is also useful if those who collect data can do their own analysis,
or at least be involved in the analysis. This is highly motivating for
staff and an excellent way to encourage good data quality.
Some familiarity with the use of software (under Windows) is assumed.
Knowledge of statistics is useful, but not essential for most chapters.
Indeed, though this guide cannot substitute for a conventional
statistics book, users can learn many general ideas through seeing where
different techniques are useful.
\section{What software is used?}\label{what-software-is-used}
This guide uses the R Statistical system (R Core Team, 2018). It mainly
uses R through a graphical user interface, called R-Instat. An earlier
version of this guide (Stern, Rijks, Dale, \& Knock, 2006) was for a
simple statistics package, called Instat. Like the original Instat,
R-Instat combines a general statistics package with a special additional
menu to simplify the analysis of climatic data.
R-Instat is designed to support improved learning of statistics in
general as well as providing a wide range of special dialogues to
simplify, and hence facilitate, the analysis of climatic data.
\section{What is in this guide?}\label{what-is-in-this-guide}
Illustrations and `route maps' are provided in all chapters for those
who just wish to study particular topics. We hope that some users will
enjoy the way the ideas unfold in successive chapters but do not assume
that readers will wish to look at every chapter. Most chapters are in a
``tutorial'' style, so readers can follow, and practice at the same
time. There is considerable repetition, to support users to ``dip into''
the chapters they need.
How much practice is needed depends on the user's current experience in
statistical computing. Those who are relatively inexperienced, or have
never used a statistics package, may be surprised at how easy the ideas
and the software are. However, beginners need practice, so just reading
the guide will be less effective.
Those with experience of a statistics package should find that R-Instat
is like many other statistics packages. Then practice is not so
important, because they should be able to visualize the results from
just reading the text.
We assume R-Instat has already been installed; see
\href{http://r-instat.org/index.html}{\ul{http://r-instat.org/index.html}}.
Chapters 2 and 3 provide practice in using R-Instat in general. Climatic
data are used, but not the special climatic menu. They assume initial
knowledge of R-Instat that could be from the two initial tutorials.
Beginners should go through these tutorials, while others may just need
to see the corresponding videos.
Chapters 4 to 7 show the use of the R-Instat climatic menu. The
structure of the climatic menu mirrors the main menus. It has items in
the order that is usually needed in an analysis. First is
\textbf{\emph{File}}, to input the data, then \textbf{\emph{Prepare}},
to organise them for analysis, then \textbf{\emph{Describe}}, to analyse
the data without assuming a particular (statistical) model. Lastly, the
menu includes some special modelling items.
If your need is to analyze your own climatic data as quickly as
possible, then you may choose to omit Chapters 2 and 3 initially.
Chapter 4 describes the R-Instat ``climatic system''. This is then
assumed in later chapters. Chapter 5 is on initial exploration of the
data and on quality control. Chapter 6 produces and analyses
``standard'' summaries, such as rainfall totals, while Chapter 7
examines ``tailored products'' such as the start of the rains.
Chapter 8 examines further features of R-Instat, particularly for users
of climatic data who may wish to migrate from R-Instat to using R
itself. There are inevitable limits to the efficient processing of data
with a menu-driven package. This idea is already introduced in Chapter 3
where Section 3.5 is titled ``Don't let the computer laugh at you''.
Hence, for example, Chapter 8 considers how particular climatic analyses
can be done, using tools in R that are currently absent from R-Instat.
The remaining chapters examine more specialised topics. Chapter 9 is on
the input and analysis of gridded satellite and reanalysis data. Chapter
10 is on mapping and 11 introduces the important area of extremes.
The PICSA (Participatory Integrated Climate Services for Agriculture)
project is described briefly in Section 1.6 and in more detail in
Chapter 12.
The final chapters are on a range of further topics including the
statistical approach to the seasonal forecast, the use of stochastic
models, the processing of within-day data (e.g.~from automatic stations)
and the analysis of circular data, particularly for wind direction.
All the data sets used for illustration are supplied in the R-Instat
``library''. Chapter 3 describes how the data are organized, so readers
can substitute their own data for the examples in later chapters.
The analysis of the climatic records is often a two-stage process. The
first stage reduces the raw, often daily, data to a semi-processed form
with key summaries that correspond to users\textquotesingle{} needs. The
second stage involves processing these summaries.
This two-stage process is typical of the processing of many types of
data and is one reason why users often find their statistics training
did not seem relevant to real-world problems. Many courses use only
small sets of semi-processed data that are tailored to the topic being
taught. However, the real world starts with primary data, and these are
often quite large.
\section{Climatology, statistics, and
computing?}\label{climatology-statistics-and-computing}
Readers who are not confident in statistics should recognize the three
different subjects that are in this guide, namely climatology,
statistics, and computing. The material becomes easier if you separate
these subjects as far as possible.
R is a programming language and those who are already adept in R may
find they do not need the R-Instat menus and dialogues but can use
RStudio more efficiently for the same analyses. In contrast, beginners
in R sometimes find it difficult to use in their statistics courses.
They are still trying to master the computing ideas, and this becomes
mixed with the statistical objectives.
Computing ideas are raised at various points in this guide, because
users sometimes limit the analyses they conduct, by not exploiting the
software fully. So, we show how R-Instat can be used in different ways,
to solve problems raised by users in their needs for data analysis.
These sections should be recognized as largely computing topics and
perhaps omitted initially by those who have less interest in using R
itself.
Statistics and climatology have two features in common. Both are
relevant to a wide range of applications and many specialists in those
application areas treat both statisticians and climatologists as an
unwelcome nuisance! Perhaps by working together, they can be welcomed
more.
\section{Climate Services for
Agriculture}\label{climate-services-for-agriculture}
Many countries have projects on climate and agriculture. These projects
often concentrate on sharing the information on the short-term and
seasonal forecasts with producers. We outline one such project, called
PICSA (Participatory Integrated Climate Services for Agriculture). More
information on PICSA is here:
\href{https://research.reading.ac.uk/picsa/}{\ul{https://research.reading.ac.uk/picsa/}}.
The different components of PICSA are shown in Fig. 1.6a.
\begin{longtable}[]{@{}
>{\raggedright\arraybackslash}p{(\columnwidth - 0\tabcolsep) * \real{0.9730}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\raggedright
\textbf{\emph{Fig. 1.6a The PICSA project}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=5.64347in,height=4.15038in]{figures/Fig1.6a.png} \\
\end{longtable}
A distinguishing feature of PICSA is the first panel in Fig. 1.6a, which
is in addition to the forecasting activities. This panel covers
aspects that are based on an analysis of the historical climatic
records and that are shared with small-scale farmers before the seasonal
forecast is available. The National Met Service (NMS) is a key partner
in each country and provides analyses of the historical data. These
analyses use the methods described in Chapters 6 and 7 of this guide and
are from the climatic stations that are as close as possible to
different groups of farmers. PICSA is described in Chapter 12.
\section{The Climsoft Climate Data Management
System}\label{the-climsoft-climate-data-management-system}
Climsoft is a free and open source system for the entry and management
of primary climatic data. The initial screen is shown in Fig. 1.7a. It
has facilities for the entry and checking of data from paper records,
and for the transfer of data from previous systems, and from automatic
stations. The data and metadata are currently held in a MySQL database.
Climsoft is designed particularly for National Met Services (NMSs), but
can be used by any other organisation that has to manage historical
climatic or other related data. A wide range of elements are
pre-defined, but others can be added for hydrology, pollution or other
aspects. Data can be at any scale, e.g.~daily, 10-minute.
Climsoft includes some products, but not many. Instead, R-Instat can
read data directly from Climsoft, or exported from Climsoft, and is
designed as the products' partner to Climsoft. In later versions of
Climsoft the plan is for some of the R-routines in R-Instat to become
part of Climsoft.
\begin{longtable}[]{@{}
>{\raggedright\arraybackslash}p{(\columnwidth - 0\tabcolsep) * \real{0.9730}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\raggedright
\textbf{\emph{Fig. 1.7a The main Climsoft menu}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=5.75635in,height=3.96937in]{figures/Fig1.7a.png} \\
\end{longtable}
\bookmarksetup{startatroot}
\chapter{More Practice with R-Instat}\label{more-practice-with-r-instat}
\section{Introduction}\label{introduction}
To use climatic data fully it is important to be able to deliver
products. The two examples in this chapter describe the steps and the
endpoint in this process. Data are supplied in the right form for the
analysis. The objectives are specified, and your task is to prepare the
tables and graphs for a report and a presentation.
Some familiarity with R-Instat is assumed. There are two initial
tutorials and following those is enough preparation. If you have already
used a statistics package before, then the examples below may be
sufficient for you, even without the tutorials. This chapter is also
designed to provide practice with R-Instat.
The first problem builds on a study in Southern Zambia. This is the most
drought-prone area of the country. Everyone knew that there was
\textquotesingle climate change\textquotesingle! Some farmers were
migrating north, citing climate change as their reason. However, a
local non-governmental organization (NGO), called the Conservation
Farming Unit, questioned this reasoning for the rainfall data. They were
not convinced that any climate change had necessarily affected the
farming practices. They therefore commissioned a study that used daily
climatic data from several stations in Southern Zambia. The results were
supplied as a report, and presentations of the results were also made to
the NGO and to the local FAO Officers. The results confirmed evidence of
climate change in the temperature data, but not in the rainfall. The key
conclusions were later made into short plays that were broadcast on
local radio and played at village meetings.
Here we use data from Moorings, a site in Southern Zambia. The daily
rainfall data are from 1922 to 2009. Partly for simplicity,
we largely use the monthly summaries.
For the work, we draw an analogy with the preparation of a meal. The
first key requirement is that you have the food, which here is the
climatic data. In a real meal, the food may be supplied in a form that
is ready for cooking, or it may need preparation prior to cooking. Here
the data are in pre-packed form, so the analysis can proceed quickly.
You also need the right tools. In a kitchen, they are the saucepans,
etc, while here they are just the computer, together with the required
software.
You need some general cooking skills. These are the basic computing
skills, plus initial skills in R-Instat, at least from the tutorials.
Finally, your objectives must be clear. This corresponds to having a
specific meal in mind so that a recipe can be used. Of course, you may
have to adapt slightly as you go along. You might find some oddities in
the data, just as cooks must improvise if they suddenly find that one of
the ingredients is not available.
If everything is well organized, the cook can prepare the meal very
quickly. This is just what is done in the products in this chapter. This
leaves time to make sure the dishes, for us the results, are presented
attractively. Then users will enjoy consuming what is presented.
Section 2.2 describes the data for this first task. Trends in the
rainfall are examined in Section 2.3. A second problem, in Section 2.4,
examines whether satellite data on sunshine hours resemble
corresponding station data. Daily data from Dodoma, Tanzania, are used.
The data for each of these case studies are in the R-Instat library. The
presentation is designed so users can repeat the analyses on their
laptops.
Graphs are produced in each of these sections and the general methods
for graphics in R-Instat are outlined in Section 2.5. Section 3.5 then
adds a warning. R-Instat provides an easy-to-use click-and-point way of
using the R programming language. It should help users to solve many
problems. But a click-and-point system is not the right tool for all
problems. We describe a problem that may require more programming
skills, at least if you wish to prevent your computer from laughing at
you!
This chapter demonstrates R-Instat as a simple general statistics
package and the File, Prepare and Describe menus are used. It
illustrates that a general statistics package is an appropriate tool for
many climatic problems. It is also designed to consolidate your
experience in using R-Instat. The special climatic menu is introduced in
Chapter 4.
\section{The Moorings data}\label{the-moorings-data}
Monthly data are used in this part of the chapter. Daily data are the
starting point in most of this guide because many of the objectives
require daily data. But here the emphasis is on objectives for which the
monthly data are suitable.
The data are already in an R-Instat file. Hence, they can be opened from
the R-Instat library.
From the opening screen in R-Instat, select \textbf{\emph{File
\textgreater{} Open From Library}} as shown in Fig. 2.2a. Choose
\textbf{\emph{Load From Instat Collection}}, then \textbf{\emph{Browse}}
to the \textbf{\emph{Climatic}} directory then to
\textbf{\emph{Zambia}}. \textbf{\emph{Select}} the file called
\textbf{\emph{Moorings\_July.rds}} to give the screen shown in Fig.
2.2b. Press \textbf{\emph{Ok}}.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4828}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5172}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2a File \textgreater{} Open from Library}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2b Ready to import Moorings.RDS}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Climatic \textgreater{} Zambia \textgreater{}
Moorings.RDS}} & \\
\includegraphics[width=3.36in,height=3.06in]{figures/Fig2.2a.png} &
\includegraphics[width=2.43in,height=2.75in]{figures/Fig2.2b.png} \\
\end{longtable}
The resulting data are shown in Fig. 2.2c. There are two data frames. The
one called Moorings has the daily data.
\textbf{\emph{Move to the second data frame}}, shown in Fig. 2.2c,
which has the monthly summaries. These are the total rainfall in mm and
the total number of rain days. A rain day was defined as a day with more
than 0.85mm\footnote{This threshold is like the value of 1mm sometimes
suggested by WMO. The smallest value usually recorded is 0.1mm, but we
find stations are not equally conscientious in recording very small
amounts. So, if 0.1mm is used, then it is harder to compare the
pattern of rainfall at different stations. Records also differ in
their attitudes on rounding. So, some records have far fewer values of
0.9 or 1.1mm than others. There is also an old issue that some
stations used to measure in inches and the smallest value was then
0.01 inches. This translates to 0.3mm and higher values are 0.5, 0.8
and 1mm. So 0.9mm is not possible in data that used to be in inches.
Hence the threshold of 0.85mm is a practical way of implementing ``1mm
and above''.}.
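
The rain day definition above can be expressed in a few lines of R. The
sketch below is illustrative only: the data frame \texttt{daily}, its
column names and its values are invented for the example and are not the
Moorings data. It counts rain days with the 0.85mm threshold and checks
the inch conversions quoted in the footnote.

\begin{verbatim}
# Invented daily rainfall (mm) for two months; NOT the Moorings data.
daily <- data.frame(
  month = rep(c("Dec", "Jan"), each = 5),
  rain  = c(0, 0.3, 0.8, 1.0, 12.4, 0.5, 0.9, 0, 25.1, 3.2)
)

threshold <- 0.85  # practical version of "1mm and above"
daily$rainday <- daily$rain > threshold

# Monthly total rainfall and monthly number of rain days
totals   <- tapply(daily$rain, daily$month, sum)
raindays <- tapply(daily$rainday, daily$month, sum)

# Old records in inches: 0.01 to 0.04 inches, converted to mm
inches_mm <- round(c(0.01, 0.02, 0.03, 0.04) * 25.4, 1)
\end{verbatim}

With this threshold a value of 0.8mm (0.03 inches) is not a rain day,
while 1.0mm (0.04 inches) is, whichever unit the observer used.
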
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2c The Moorings monthly data}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2d Boxplot dialogue on the Describe menu}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
& \textbf{\emph{Describe \textgreater{} Specific \textgreater{}
Boxplot}} \\
\includegraphics[width=1.844in,height=3.474in]{figures/Fig2.2c.png} &
\includegraphics[width=3.096in,height=2.329in]{figures/Fig2.2d.png} \\
\end{longtable}
Rainfall in Southern Zambia is from November to April. Hence, we analyze
the data by season, rather than by year. There are 88 seasons from 1922
to 2009 and 1056 monthly values, as indicated in Fig. 2.2c.
The task is to write a short report that describes the patterns of
rainfall. One aim is to assess whether there is obvious evidence of
change in the pattern of rainfall. This evidence might justify
requesting the data from multiple stations, to undertake a more detailed
study. The first step is to explore the data, and then consider how
appropriate results could be presented. To explore we start with a
boxplot to show the seasonal pattern of the rainfall totals.
Choose the Boxplot dialogue from the Describe menu, with
\textbf{\emph{Describe \textgreater{} Specific \textgreater{} Boxplot}},
as shown in Fig. 2.2d. Complete the dialogue as shown in Fig. 2.2e. The
resulting graph is shown in Fig. 2.2f\footnote{The body of the boxplot
is not blue by default. For this, choose \textbf{\emph{Boxplot
Options}} in Fig. 2.2e, then choose the \textbf{\emph{Geom
Parameters}} tab and change the \textbf{\emph{Fill colour}} to your
choice.}. This shows that the total rainfall was typically 200mm in each of
December to February. There was always some rain in each of these
months, and the highest monthly totals were over 500mm.
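
For readers who are curious about the R side, a similar boxplot can be
sketched directly with the ggplot2 package. This is a minimal sketch
under stated assumptions, not the command that the dialogue generates:
the data frame \texttt{monthly} is simulated, and only the column names
\texttt{month} and \texttt{rain} are chosen to resemble the monthly data
described above.

\begin{verbatim}
library(ggplot2)

# Simulated monthly totals (mm) for 20 seasons; NOT the Moorings data.
set.seed(1)
monthly <- data.frame(
  month = factor(rep(month.abb, times = 20), levels = month.abb),
  rain  = round(rgamma(240, shape = 2, scale = 50), 1)
)

# One box per month, filled blue as in Fig. 2.2f
p <- ggplot(monthly, aes(x = month, y = rain)) +
  geom_boxplot(fill = "blue") +
  labs(x = "Month", y = "Monthly rainfall (mm)")
print(p)
\end{verbatim}

Declaring \texttt{month} as a factor with explicit levels keeps the boxes
in the stated order rather than in alphabetical order.
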
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2e Completed boxplot dialogue}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2f Boxplot of monthly rainfall totals}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Describe \textgreater{} Specific \textgreater{} Boxplot}}
& \\
\includegraphics[width=3.101in,height=3.54in]{figures/Fig2.2e.png} &
\includegraphics[width=2.966in,height=3.095in]{figures/Fig2.2f.png} \\
\end{longtable}
Change the variable from rain to \textbf{\emph{raindays}} in Fig. 2.2e
to give the corresponding boxplots for the number of raindays in the
month, Fig. 2.2g. This shows that typically one day in two is rainy in
December to February. Occasionally most of the days are rainy.
Boxplots are essentially a 5-number summary of the data (with potential
outliers also shown). The \textbf{\emph{Prepare \textgreater{} Column:
Reshape \textgreater{} Column Summaries}} dialogue, Fig. 2.2h, can
provide the same summaries numerically.
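For readers who want to check the numbers outside the menus, the five-number summary by month can be sketched in Python with pandas. This is only an illustration: the data frame, its column names (\texttt{month}, \texttt{rain}) and the values are assumptions, not the Moorings record, and it is not the code that R-Instat runs.

```python
import pandas as pd

# Hypothetical monthly totals (mm): names and values are invented for
# illustration and are not the Moorings data.
monthly = pd.DataFrame({
    "month": ["Dec"] * 5 + ["Jan"] * 5,
    "rain": [120.0, 180.0, 200.0, 260.0, 510.0,
             90.0, 150.0, 210.0, 240.0, 330.0],
})

# Five-number summary for each month: minimum, lower quartile, median,
# upper quartile and maximum.
summary = monthly.groupby("month", sort=False)["rain"].agg(
    min="min",
    q1=lambda s: s.quantile(0.25),
    median="median",
    q3=lambda s: s.quantile(0.75),
    max="max",
)
print(summary)
```

Each row of \texttt{summary} corresponds to one box in the boxplot, apart from any points that the plot draws separately as outliers.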
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2g The number of rain days}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2h Summary dialogue on the Prepare menu}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=3.281in,height=3.44in]{figures/Fig2.2g.png} &
\includegraphics[width=2.724in,height=2.691in]{figures/Fig2.2h.png} \\
\end{longtable}
Summarise both the monthly totals and the number of raindays, with the
month as the factor, as shown in Fig. 2.2i. Then choose the Summaries
button and complete the sub-dialogue as shown in Fig. 2.2j.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2i Summary dialogue}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2j Summaries sub-dialogue}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Prepare \textgreater{} Column: Reshape \textgreater{}
Column Summaries}} & \\
\includegraphics[width=3.206in,height=3.343in]{figures/Fig2.2i.png} &
\includegraphics[width=2.734in,height=3.674in]{figures/Fig2.2j.png} \\
\end{longtable}
The results are in a third data frame. It has just 12 rows, one for each
month, as shown in Fig. 2.2k. The summaries are clearer if the columns are
in a sensible order (this was already done for Fig. 2.2k).
\textbf{\emph{Right-click in the name field}} of this data frame and
choose the option to \textbf{\emph{Reorder columns}}, Fig. 2.2l.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2k Resulting summary data}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2l Right-click menu to reorder columns}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=3.517in,height=2.443in]{figures/Fig2.2k.png} &
\includegraphics[width=2.419in,height=2.464in]{figures/Fig2.2l.png} \\
\end{longtable}
In the Reorder dialogue, use the \textbf{\emph{arrow keys}} to change
the position of the columns in the data frame.
Once the summaries are in a sensible order, they can be transferred to the
results (output) window.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2m Reorder the resulting columns}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2n Simplify column names}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Right-click \textgreater{} Reorder column(s)}} &
\textbf{\emph{Right-click \textgreater{} Rename column(s)}} \\
\includegraphics[width=3.098in,height=2.219in]{figures/Fig2.2m.png} &
\includegraphics[width=3.026in,height=2.083in]{figures/Fig2.2n.png} \\
\end{longtable}
Before this, we renamed some of the columns to give shorter names. This
again used the \textbf{\emph{right-click}} menu, Fig. 2.2l. The rename
dialogue is shown in Fig. 2.2n.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2o View Data dialogue}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.2p The Monthly number of rain days}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Prepare \textgreater{} Data Frame \textgreater{} View
Data}} & \\
\includegraphics[width=2.712in,height=2.72in]{figures/Fig2.2o.png} &
\includegraphics[width=3.378in,height=2.302in]{figures/Fig2.2p.png} \\
\end{longtable}
Now use the \textbf{\emph{Prepare \textgreater{} Data Frame
\textgreater{} View Data}} dialogue, Fig. 2.2o, to transfer the rainfall
totals and then the number of rain days to the results window. The
results for the number of rain days are shown in Fig. 2.2p.
\section{The objectives}\label{the-objectives}
Section 2.2 explored the data and examined the seasonal pattern of the
rainfall at Moorings. It also made use of the three menus, File, Prepare
and Describe, as well as the right-click menu. The main objective,
however, was to see if there is evidence of rainfall change rather than
to investigate the seasonal pattern.
We first examine the annual totals and the total number of rain days.
These are the totals from July to June, so they cover each season.
A little ``housekeeping'' comes first. The 3\textsuperscript{rd}
data-frame is no longer needed. Right-click on the bottom tab and
choose the option to delete, Fig. 2.3a. The dialogue shown in Fig. 2.3b
opens. Just press Ok.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3a Right-click on the bottom tab}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3b Delete a data frame}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=2.15in,height=1.688in]{figures/Fig2.3a.png} &
\includegraphics[width=3.808in,height=2.323in]{figures/Fig2.3b.png} \\
\end{longtable}
Use \textbf{\emph{Prepare \textgreater{} Column: Reshape \textgreater{}
Column Summaries}} and complete the dialogue and sub-dialogue as shown
in Fig. 2.3c and Fig. 2.3d to produce the seasonal totals.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3c Produce the annual totals}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3d The Summaries sub-dialogue}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Prepare \textgreater{} Column: Reshape \textgreater{}
Column Summaries}} & \\
\includegraphics[width=3.194in,height=3.34in]{figures/Fig2.3c.png} &
\includegraphics[width=2.831in,height=3.806in]{figures/Fig2.3d.png} \\
\end{longtable}
The results are shown in Fig. 2.3e after the steps explained below.
First, notice in Fig. 2.3e that there were only 4 months in the first
season, and the annual summary was therefore set to missing\footnote{This
is partly because the ``Omit Missing Values'' checkbox in Fig. 2.3c
was left unchecked. Coping with missing values in the data, when
summaries are calculated, is a complex issue. It is discussed in
detail in Chapter 6.}.
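The logic of the July to June totals, including an incomplete first season being set to missing, can be sketched in Python with pandas. The sketch rests on assumptions: the column names (\texttt{year}, \texttt{month}, \texttt{rain}, \texttt{season\_start}) are hypothetical, every month is given an invented total of 10mm, and the rule used here (a season needs all 12 months and no missing values) is one simple choice rather than the rule R-Instat applies.

```python
import numpy as np
import pandas as pd

# Hypothetical record from March 1922 to June 1924, with an invented total
# of 10 mm in every month so that the arithmetic is easy to follow.
dates = pd.period_range("1922-03", "1924-06", freq="M")
monthly = pd.DataFrame({
    "year": dates.year,
    "month": dates.month,
    "rain": 10.0,
})

# A season runs from July to June, so January to June belong to the season
# that started in the previous calendar year.
monthly["season_start"] = np.where(
    monthly["month"] >= 7, monthly["year"], monthly["year"] - 1
)

grouped = monthly.groupby("season_start")["rain"]
seasonal = pd.DataFrame({
    "n_months": grouped.size(),
    # skipna=False: a missing monthly value makes the seasonal total missing.
    "rain": grouped.agg(lambda s: s.sum(skipna=False)),
})

# Assumed rule: a season with fewer than 12 monthly rows is also set to missing.
seasonal.loc[seasonal["n_months"] < 12, "rain"] = np.nan
print(seasonal)
```

In this invented record the first season has only four months (March to June 1922), so its total is missing, while the two complete seasons each total 120mm.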
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3e Resulting annual data}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3f Menu for a text substring}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=2.941in,height=3.522in]{figures/Fig2.3e.png} &
\includegraphics[width=3.028in,height=2.812in]{figures/Fig2.3f.png} \\
\end{longtable}
A numeric column for the year (season) is needed for the time series
graphs. Hence, as shown below, we produce the second column, called
s\_yr, also shown in Fig. 2.3e.
Use \textbf{\emph{Prepare \textgreater{} Column: Text \textgreater{}
Transform}}, Fig. 2.3f. Complete the resulting dialogue, as shown in
Fig. 2.3g, to give just the starting year of the season. The resulting
variable is shown in Fig. 2.3e.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3g The Substring Option}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3h Convert Column to Numeric}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Prepare \textgreater{} Column: Text \textgreater{}
Transform}} & \\
\includegraphics[width=3.46in,height=3.001in]{figures/Fig2.3c.png} &
\includegraphics[width=2.004in,height=2.891in]{figures/Fig2.3d.png} \\
\end{longtable}
Use the \textbf{\emph{right-click}} menu, Fig. 2.3h, to convert the
resulting s\_yr column to numeric.
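The two steps above, taking a substring and then converting it to a number, can be sketched in Python with pandas. The layout of the season label (for example ``1922-1923'') and the column name \texttt{season} are assumptions made for illustration; the actual labels in the data frame may look different.

```python
import pandas as pd

# Hypothetical season labels: the "1922-1923" layout is an assumption.
annual = pd.DataFrame({"season": ["1922-1923", "1923-1924", "1924-1925"]})

# Step 1: keep the first four characters, i.e. the starting year, as text.
start_year_text = annual["season"].str[:4]

# Step 2: convert the text to a numeric column for use on a time axis.
annual["s_yr"] = pd.to_numeric(start_year_text)
print(annual)
```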
After a little further housekeeping from the right-click menu, to
\textbf{\emph{rename}}, \textbf{\emph{re-order}} and
\textbf{\emph{delete}} columns, the annual data are as shown in Fig.
2.3e above.
Now for the time-series graphs. They can be produced using the
\textbf{\emph{Describe \textgreater{} Specific \textgreater{} Line
Plot}} dialogue, but this type of graph is just what is needed for the
PICSA-style rainfall graphs, so we use the special climatic menu for the
first time.
Use \textbf{\emph{Climatic \textgreater{} PICSA \textgreater{} Rainfall
Graph}}. Complete as shown in Fig. 2.3i. Press the \textbf{PICSA
Options} button and complete the Lines tab as shown in Fig. 2.3j to add
(and label) a horizontal line for the mean.
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3i PICSA Rainfall graph dialogue}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3j Add a line showing the mean}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\textbf{\emph{Climatic \textgreater{} PICSA \textgreater{} Rainfall
Graph}} & \\
\includegraphics[width=2.796in,height=3.34in]{figures/Fig2.3c.png} &
\includegraphics[width=3.264in,height=2.318in]{figures/Fig2.3d.png} \\
\end{longtable}
The resulting graph is shown in Fig. 2.3k\footnote{The graph in Fig. 2.3k
also starts from 0. This used the Yaxis tab in Fig. 2.3j.}.
\textbf{\emph{Return to the dialogue}} and put \textbf{\emph{raindays}}
as the y-variable to give the results in Fig. 2.3l.
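A graph of this general kind, a time series with a labelled horizontal line at the mean and a y-axis starting from 0, can be sketched in Python with pandas and matplotlib. The values, the column names and the output file name are assumptions made for illustration, and the styling does not attempt to match the PICSA dialogue.

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical seasonal totals (mm): the values are invented for illustration.
annual = pd.DataFrame({
    "s_yr": [1922, 1923, 1924, 1925, 1926],
    "rain": [810.0, 640.0, 905.0, 720.0, 775.0],
})
mean_rain = annual["rain"].mean()

fig, ax = plt.subplots(figsize=(7, 3.5))
ax.plot(annual["s_yr"], annual["rain"], marker="o")

# Horizontal line at the mean, with a label just above it.
ax.axhline(mean_rain, linestyle="--", color="grey")
ax.annotate(
    f"Mean: {mean_rain:.0f} mm",
    xy=(annual["s_yr"].min(), mean_rain),
    xytext=(0, 4),
    textcoords="offset points",
)

ax.set_ylim(bottom=0)  # start the y-axis from 0
ax.set_xlabel("Season (starting year)")
ax.set_ylabel("Seasonal rainfall (mm)")
fig.savefig("seasonal_rainfall.png", dpi=150)  # hypothetical file name
```

Replacing \texttt{rain} with a column of rain-day counts would give the companion graph in the same way.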
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3k Seasonal rainfall totals}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3l Number of rain days}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=3.017in,height=1.502in]{figures/Fig2.3k.png} &
\includegraphics[width=2.872in,height=1.429in]{figures/Fig2.3l.png} \\
\end{longtable}
These graphs indicate large inter-annual variability, but they don't
seem to show a trend. That is important because, if you can attribute
your farming problems to climate change, then there may be nothing you
can do. But coping with the variability is what farmers have always had
to do.
With results such as shown in Fig. 2.3k and 2.3l you can start comparing
risks for different options in your farming and in other enterprises.
That sort of idea is discussed in PICSA workshops.
Some may find the graph shown above to be convincing evidence that, with
rainfall, the pressing problem is variability, rather than change. We
stress that there IS climate change, and similar graphs with temperature
data show a trend. If the temperatures have changed, then the ``system''
has changed, and it follows that other elements including rainfall will
be affected. Currently, however, with this sort of analysis, it is
usually not yet possible to determine which way the pattern of rainfall
may change. It is difficult to detect a small change when the
inter-annual variability is so large. And, even if a change is detected,
coping as well as possible with the variability must be a good thing to
do.
Some people are not convinced by graphs such as those shown above. A
common statement is that the annual totals might still be similar,
but the season is shorter, because planting is delayed, etc. We examine
this in more detail in Chapter 7. There the daily data are used to
define the start, end and length of the season as well as to examine dry
spells and extremes during the season. With the monthly totals, the
examination can start by repeating the analysis above, but just for
November and December, when the season starts.
\textbf{\emph{Return to the monthly data frame}} and
\textbf{\emph{filter}} to examine just those months. First make sure you
are on the monthly data. Then \textbf{\emph{right-click}} as usual and choose
\textbf{\emph{Filter}}, Fig. 2.3m.
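The effect of such a filter, followed by totals for each season, can be sketched in Python with pandas. The column names (\texttt{s\_yr}, \texttt{month}, \texttt{rain}), the numeric coding of the months and the values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical monthly totals (mm) for two seasons: values are invented.
monthly = pd.DataFrame({
    "s_yr": [1922] * 4 + [1923] * 4,
    "month": [11, 12, 1, 2, 11, 12, 1, 2],
    "rain": [60.0, 190.0, 210.0, 180.0, 95.0, 230.0, 160.0, 200.0],
})

# Keep only November and December, the months in which the season starts.
start_months = monthly[monthly["month"].isin([11, 12])]

# Total for the two months in each season.
early_totals = start_months.groupby("s_yr")["rain"].sum()
print(early_totals)
```

The resulting series has one value per season and can be graphed against the season in the same way as the annual totals.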
\begin{longtable}[]{@{}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.4421}}
>{\centering\arraybackslash}p{(\columnwidth - 2\tabcolsep) * \real{0.5579}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3m Right-click for Filter}}
\end{minipage} & \begin{minipage}[b]{\linewidth}\centering
\textbf{\emph{Fig. 2.3n The filter dialogue}}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
\includegraphics[width=2.886in,height=6.25in]{figures/Fig2.3c.png} &
\includegraphics[width=6.944in,height=3.659in]{figures/Fig2.3l.png} \\
\end{longtable}