Title: | Multiscale Climate Model Assessment |
---|---|
Description: | Provides functions allowing for (1) easy aggregation of multivariate time series into custom time scales, (2) comparison of statistical summaries between different data sets at multiple time scales (e.g. observed and bias-corrected data), (3) comparison of relations between variables and/or different data sets at multiple time scales (e.g. correlation of precipitation and temperature in control and scenario simulation) and (4) transformation of time series at custom time scales. |
Authors: | Martin Hanel [aut, cre] |
Maintainer: | Martin Hanel <[email protected]> |
License: | GPL-2 |
Version: | 0.1.3 |
Built: | 2025-02-03 03:54:17 UTC |
Source: | https://github.com/hanel/musica |
Contains functions for flexible assessment of climate model bias and changes at multiple time scales. See documentation for decomp
, compare
and vcompare
. In addition, musica provides functions for multiscale transformations of time series (see msTrans_abs
and msTrans_dif
)
Following option(s) are available:
At several places the package compares values. The character vector additive_values
specifies for which variables difference should be used for comparison instead of ratio. Defaults to additive_values = "TAS"
. See options
for setting or examining options.
Martin Hanel [email protected]
Hanel, M., Kozin, R. (2016) Bias correction for hydrological modelling, submitted.
A list of three data.tables with observed (obs_ctrl
) and RCM simulated data for the control (sim_ctrl
) and scenario (sim_scen
) periods for Oslava basin (downto Cucice) in the Czech Republic. The basin average precipitation and temperature were obtained from griddedb observations and RCM simulation (EUR-11_CNRM-CERFACS-CNRM-CM5_rcp45_r1i1p1_CLMcom-CCLM4-8-17 simulation conducted within the CORDEX project).
basin_PT
basin_PT
List of 3 data.tables:
observed data for the basin for a period 1981-01-01 – 2005-21-31
simulated data for the basin for a period 1981-01-01 – 2005-21-31
simulated data for the basin for a period 2070-01-01 – 2099-21-31
Each data.table contains 3 variables:
date
precipitation, mm
temperature, degrees C
Conversion between period specification and codes
period2code(periods) code2period(code)
period2code(periods) code2period(code)
periods |
period specification |
code |
period code |
Periods are specified using keywords "day", "month", "year" preceded by an integer and a space and optionally followed by "s" (the specification is further passed to cut.Date
, see cut.Date
for details). To fit in figures and for simplicity, periods can be also specified by codes, i.e. by D
, M
, Y
(for "day", "month" and "year", respectively) and folowed by integer specifying the number of intervals. The functions period2code
and code2period
provide conversion between the two alternatives.
period2code(c('1 day', '23 days', '3 month', '2 years')) code2period(c('D1', 'D23', 'M3', 'Y2'))
period2code(c('1 day', '23 days', '3 month', '2 years')) code2period(c('D1', 'D23', 'M3', 'Y2'))
The function evaluates distance between statistical characteristics of specified data sets. Distance is measured as difference for variables included in getOption('additive_variables')
, i.e. temperature (TAS
) by default, and as a ratio for other variables.
compare(x, compare_to, fun = mean, wet_int_only = TRUE, wet_int_thr = 0.1, exclude_below = 0.9)
compare(x, compare_to, fun = mean, wet_int_only = TRUE, wet_int_thr = 0.1, exclude_below = 0.9)
x |
List of decomposed variables to be compared |
compare_to |
Decomposed variable used as a reference |
fun |
Function used for comparison |
wet_int_only |
(logical) Should only the wet intervals be considered? |
wet_int_thr |
Numeric value specifying the minimum depth to be considered wet |
exclude_below |
Some of the intervals might not be of required length, e.g. D10 interval may have less than 10 days available. The |
data.table summarizing the differences with columns:
factor indicating the variable
specification of the averaging length with 'D' - day(s), 'M' - month(s), 'Y' - year(s) and 'G1' - the overall mean
averaging length in hours
indication of the aggregating scale specified by agg_by
argument
factor indicating the data sets from x
with labels given by names(x)
distance between data sets from x
and compare_to
. Distance is measured as difference for variables included in getOption('additive_variables')
, i.e. temperature (TAS
) by default, and as a ratio for other variables, see dif
library(ggplot2) data(basin_PT) ## Not run: dobs = decomp(basin_PT[['obs_ctrl']]) dctrl = decomp(basin_PT[['sim_ctrl']]) dscen = decomp(basin_PT[['sim_scen']]) d = compare(x = list(CTRL = dctrl, SCEN = dscen), compare_to = dobs, fun = max) ggplot(d) + geom_line(aes(x = TS, y = DIF, col = factor(sub_period))) + facet_grid(variable ~ comp, scale = 'free') + scale_x_log10() ## End(Not run)
library(ggplot2) data(basin_PT) ## Not run: dobs = decomp(basin_PT[['obs_ctrl']]) dctrl = decomp(basin_PT[['sim_ctrl']]) dscen = decomp(basin_PT[['sim_scen']]) d = compare(x = list(CTRL = dctrl, SCEN = dscen), compare_to = dobs, fun = max) ggplot(d) + geom_line(aes(x = TS, y = DIF, col = factor(sub_period))) + facet_grid(variable ~ comp, scale = 'free') + scale_x_log10() ## End(Not run)
Compare distribution function of variables
dcompare(x, compare_to, p = seq(0, 1, 0.01), wet_int_only = TRUE, wet_int_thr = 0.1, exclude_below = 0.9)
dcompare(x, compare_to, p = seq(0, 1, 0.01), wet_int_only = TRUE, wet_int_thr = 0.1, exclude_below = 0.9)
x |
List of decomposed objects |
wet_int_only |
(logical) Should only the wet intervals be considered? |
wet_int_thr |
Numeric value specifying the minimum depth to be consider wet |
exclude_below |
Some of the intervals might not be of required length, e.g. D10 interval may have less than 10 days available. The |
data.table summarizing the relation with columns:
factor indicating the variable
specification of the averaging length with 'D' - day(s), 'M' - month(s), 'Y' - year(s) and 'G1' - the overall mean
averaging length in hours
indication of the aggregating scale specified by agg_by
argument
factor indicating the data sets from x
with labels given by names(x)
distance between quantiles of x
and compare_to
. Distance is measured as difference for variables included in getOption('additive_variables')
, i.e. temperature (TAS
) by default, and as a ratio for other variables, see dif
library(ggplot2) data(basin_PT) ## Not run: dobs = decomp(basin_PT[['obs_ctrl']]) dctrl = decomp(basin_PT[['sim_ctrl']]) d = dcompare(x = list(CTRL = dctrl), compare_to = dobs) ggplot(d[variable=='TAS' & period!='G1']) + geom_line(aes(x = p, y = DIF, col = comp)) + facet_grid(sub_period~period, scale = 'free') + theme(legend.position = 'top', axis.text.x = element_text(angle = 90, vjust = .5)) ## End(Not run)
library(ggplot2) data(basin_PT) ## Not run: dobs = decomp(basin_PT[['obs_ctrl']]) dctrl = decomp(basin_PT[['sim_ctrl']]) d = dcompare(x = list(CTRL = dctrl), compare_to = dobs) ggplot(d[variable=='TAS' & period!='G1']) + geom_line(aes(x = p, y = DIF, col = comp)) + facet_grid(sub_period~period, scale = 'free') + theme(legend.position = 'top', axis.text.x = element_text(angle = 90, vjust = .5)) ## End(Not run)
Calculate series of averages over the periods specified in the period
argument into the inpur data.table.
decomp(x, period = c("Y1", "M6", "M3", "M1", "D15", "D1"), agg_by = quarter, full_return = FALSE, remove_incomplete = TRUE, year_starts = months(2))
decomp(x, period = c("Y1", "M6", "M3", "M1", "D15", "D1"), agg_by = quarter, full_return = FALSE, remove_incomplete = TRUE, year_starts = months(2))
x |
data.table with columns |
period |
The periods over which the averages will be calculated, see Details |
agg_by |
Function for specification of the period (season, month) to be additionaly included in output, see Details |
full_return |
(logical) Should the average be repeated for each scale along with original time series? Default is FALSE (e.g. for M1 only monthly and not daily time series is returned) |
remove_incomplete |
Should the incomplete years be removed from results? Default is TRUE. For use with monthly data always set to FALSE. |
year_starts |
Indication of the start of the year - determines how the months will be grouped into the seasons. Note that the sub_period output variable is with respect to the |
The original time series in daily time step is decomposed into series of averages ove periods specified in periods
argument using letter codes 'D' - day(s), 'M' - month(s), 'Y' - year(s) followed by number corresponding to number of periods and 'G1' the overall mean. The periods must be given in order from longest to shortest, the overall mean is always included (and needs not to be specified in period
). Shorter periods are always identified within the closest longer periods, i.e. each shorter period is included in exactly one longer period. As a result, the averages may be calculated over shorter periods than specified. This is due to varying length of "month" and "year" periods. The actual length used for averaging is included in the output. To make further assessment of the decomposed objects easier, indicator of period within the year (e.g. quarter or month) as specified by agg_by
argument is included in the output.
data.table with variables:
factor indicating the variable
date
specification of the averaging length with 'D' - day(s), 'M' - month(s), 'Y' - year(s) and 'G1' - the overall mean
value of the variable for given averaging length
indication of the aggregating scale specified by agg_by
argument
average date of the interval
real length of the vectors used for calculating averages
averaging length in hours
data(basin_PT) str(basin_PT) basin_PT[['obs_ctrl']] dobs = decomp(basin_PT[['obs_ctrl']], period = c('1 year', '1 month', '1 day'))
data(basin_PT) str(basin_PT) basin_PT[['obs_ctrl']] dobs = decomp(basin_PT[['obs_ctrl']], period = c('1 year', '1 month', '1 day'))
Functions for evaluating distance between variables
dif(x, y, var) rev_dif(x, y, var) rev_difv(x, var)
dif(x, y, var) rev_dif(x, y, var) rev_difv(x, var)
x , y
|
variables to be compared |
var |
variable code |
Difference or ratio of x
and y
(for dif
) and sum or product (for rev_dif
and rev_difv
). Distance is measured as difference for variables included in getOption('additive_variables')
, i.e. temperature (TAS
) by default, and as a ratio for other variables.
While rev_dif
returns sum(x, y)
or prod(x, y)
, rev_difv
takes single vector x
and returns sum(x)
or prod(x)
.
Used mainly in other functions of the package.
getOption('additive_variables') # calculate distance of 2 vectors dif(c(10, 20, 30), c(11, 18, 3), 'TAS') dif(c(10, 20, 30), c(11, 18, 3), 'PR') # inverse for 2 vectors rev_dif(c(10, 20, 30), c(11, 18, 3), 'TAS') # inverse for 1 vector rev_difv(c(10, 1.1, .9), 'TAS')
getOption('additive_variables') # calculate distance of 2 vectors dif(c(10, 20, 30), c(11, 18, 3), 'TAS') dif(c(10, 20, 30), c(11, 18, 3), 'PR') # inverse for 2 vectors rev_dif(c(10, 20, 30), c(11, 18, 3), 'TAS') # inverse for 1 vector rev_difv(c(10, 1.1, .9), 'TAS')
Indication of a season
month2sea(dtm, year_starts = months(0)) sscale2sea(sub_scale, year_starts = months(-1))
month2sea(dtm, year_starts = months(0)) sscale2sea(sub_scale, year_starts = months(-1))
dtm |
a |
year_starts |
Month object indicating the start of the year |
sub_scale |
integer indicating the season |
3 letter code (as DJF, JJA etc.) specifying the season
month2sea(as.Date('2000-01-01') + months(1:10) ) sscale2sea(c(1, 1, 2, 2, 2, 3, 3), year_starts = months(-1))
month2sea(as.Date('2000-01-01') + months(1:10) ) sscale2sea(c(1, 1, 2, 2, 2, 3, 3), year_starts = months(-1))
Applies standard quantile mapping at custom time scales.
msTrans_abs(dta, agg_by = month, wet_int_thr = 0.1, maxiter = 10, tol = 1e-04, qstep = 0.001, period = c("G1", "Y1", "M3", "M1", "D1"))
msTrans_abs(dta, agg_by = month, wet_int_thr = 0.1, maxiter = 10, tol = 1e-04, qstep = 0.001, period = c("G1", "Y1", "M3", "M1", "D1"))
dta |
List with components |
agg_by |
Function for specification of the period (season, month) to be additionaly included in output, see Details |
wet_int_thr |
Numeric value specifying the minimum depth to be considered wet |
maxiter |
Maximum number of iterations, see Details |
tol |
Stoping criterion of the iteration cycle, see Details |
qstep |
A numeric value between 0 and 1. The quantile mapping is fitted only for the quantiles defined by quantile(0,1,probs=seq(0,1,by=qstep). Passed to |
period |
Specification of the aggregation lengths the correction is applied at with 'D' - day(s), 'M' - month(s), 'Y' - year(s) and 'G1' - the overall mean |
The procedure utilizes standard quantile mapping from the qmap
-package, but at multiple time scales. Since correction at particular temporal scale influences values at other aggregations, the procedure is applied iterativelly until the maximum number of iterations (maxiter
) is reached or the difference between succesive iteration step is smaller than tol
. Differences between corrected and uncorrected variable at longer time scales are used to modify daily values after each iteration step (see e.g. Mehrorta and Sharma, 2016; Pegram et al. 2009). To make further assessment of the decomposed objects easier, indicator of period within the year (e.g. quarter or month) as specified by agg_by
argument is included in the output.
data.table with corrected data
Hanel, M., Kozin, R., 2016. Bias and projected changes in climate model simulations at multiple time scales: consequences for hydrological impact assessment. Environmental Modelling and Software, submitted.
Mehrotra, R., Sharma, A., 2016. A multivariate quantile-matching bias correction approach with auto-and cross-dependence across multiple time scales: Implications for downscaling. Journal of Climate 29, 3519-3539.
Pegram, G.G., et al., 2009. A nested multisite daily rainfall stochastic generation model. Journal of Hydrology 371, 142-153.
data("basin_PT") scen = basin_PT$sim_scen ctrl = basin_PT$sim_ctrl obs = basin_PT$obs_ctrl dta = list(TO = obs, FROM = ctrl, NEWDATA = scen) ## Not run: msTrans_abs(dta, maxiter = 10, period = 'D1') ## End(Not run)
data("basin_PT") scen = basin_PT$sim_scen ctrl = basin_PT$sim_ctrl obs = basin_PT$obs_ctrl dta = list(TO = obs, FROM = ctrl, NEWDATA = scen) ## Not run: msTrans_abs(dta, maxiter = 10, period = 'D1') ## End(Not run)
Transforms observed data such that the changes in summary statistics of variables at custom time scales are similar to those obtained from climate model simulation. Number of functions can be used to summarize the variables.
msTrans_dif(dta, model = "const", model_par = list(NULL), agg_by = month, wet_int_thr = 0.1, maxiter = 10, tol = 1e-04, period = c("G1", "Y1", "M1", "D1"), qstep = 0.001)
msTrans_dif(dta, model = "const", model_par = list(NULL), agg_by = month, wet_int_thr = 0.1, maxiter = 10, tol = 1e-04, period = c("G1", "Y1", "M1", "D1"), qstep = 0.001)
dta |
List with components |
model |
One of |
model_par |
optional parameters of the |
agg_by |
Function for specification of the period (season, month) to be additionaly included in output, see Details |
wet_int_thr |
Numeric value specifying the minimum depth to be considered wet |
maxiter |
Maximum number of iterations, see Details |
tol |
Stoping criterion of the iteration cycle, see Details |
period |
Specification of the aggregation lengths the correction is applied at with 'D' - day(s), 'M' - month(s), 'Y' - year(s) and 'G1' - the overall mean |
qstep |
A numeric value between 0 and 1. The ecdf is calculated only for the quantiles defined by quantile(0, 1, probs = seq(0, 1, by = qstep). |
transformed data.table
Hanel, M., Kozin, R., 2016. Bias and projected changes in climate model simulations at multiple time scales: consequences for hydrological impact assessment. Environmental Modelling and Software, submitted.
data("basin_PT") scen = basin_PT$sim_scen ctrl = basin_PT$sim_ctrl obs = basin_PT$obs_ctrl dta = list(TO = scen, FROM = ctrl, NEWDATA = obs) ## Not run: msTrans_dif(dta, maxiter = 10, period = 'D1') ## End(Not run)
data("basin_PT") scen = basin_PT$sim_scen ctrl = basin_PT$sim_ctrl obs = basin_PT$obs_ctrl dta = list(TO = scen, FROM = ctrl, NEWDATA = obs) ## Not run: msTrans_dif(dta, maxiter = 10, period = 'D1') ## End(Not run)
Evaluation of empirical cumulative distribution function
prob(x)
prob(x)
x |
vector of values |
value of the empirical distribution function evaluated at x
prob(rnorm(10))
prob(rnorm(10))
The typical use is in compare
to avoid anonymous functions in specification of its fun
argument.
Q(p, ...)
Q(p, ...)
p |
Specification of the quantile |
... |
other arguments passed to |
function calculating the p-th quantile
q90 = Q(.9) class(q90) q90(rnorm(10))
q90 = Q(.9) class(q90) q90(rnorm(10))
Provides the letter code for months
sscale2sea(sub_scale, year_starts = months(-1))
sscale2sea(sub_scale, year_starts = months(-1))
sub_scale |
Typically the |
year_starts |
The start of the year |
Typical workflow is to set year_starts
in the decomp function e.g. to months(-1)
or months(2)
. These both result in climatological seasons (December-January-February - DJF, etc.). The latter in addition results in grouping of warm and cold seasons together at M6 scale. The sub_period field of the decomposed object is with respect to year_starts
, i.e. when year_starts = months(-1)
then sub_period = 1
corresponds to December. To obtain the three-letter codes back, sscale2sea
is used. The function is typically used for plotting.
Vector of three-letter codes for seasons
sscale2sea(1:12, year_starts = months(-1))
sscale2sea(1:12, year_starts = months(-1))
Period durations are calculated by the lubridate
package.
tscale(x, nyears = 30)
tscale(x, nyears = 30)
x |
Vector of the averaging period codes |
nyears |
Overall number of years - used for conversion of the overall mean |
numerical vector of durations in hours
tscale('M1') tscale('G1', nyears = 25)
tscale('M1') tscale('G1', nyears = 25)
Assess the relations between two decomposed variables
vcompare(x, fun = cor, wet_int_only = TRUE, wet_int_thr = 0.1, exclude_below = 0.9)
vcompare(x, fun = cor, wet_int_only = TRUE, wet_int_thr = 0.1, exclude_below = 0.9)
x |
List of decomposed objects |
fun |
Function to sumarize dependence (like |
wet_int_only |
(logical) Should only the wet intervals be considered? |
wet_int_thr |
Numeric value specifying the minimum depth to be consider wet |
exclude_below |
Some of the intervals might not be of required length, e.g. D10 interval may have less than 10 days available. The |
vcompare
compares the relation between all pairs of variables included in x
, typically precipitation and temperature, but other variables may be included also (e.g. runoff).
data.table summarizing the relation with columns:
factor indicating the variable
specification of the averaging length with 'D' - day(s), 'M' - month(s), 'Y' - year(s) and 'G1' - the overall mean
averaging length in hours
indication of the aggregating scale specified by agg_by
argument
factor indicating the data sets from x
with labels given by names(x)
value of fun
library(ggplot2) data(basin_PT) ## Not run: dobs = decomp(basin_PT[['obs_ctrl']]) dctrl = decomp(basin_PT[['sim_ctrl']]) d = vcompare(x = list(OBS = dobs, CTRL = dctrl), fun = cov) ggplot(d[period!='G1']) + geom_line(aes(x = TS, y = value, col = factor(sub_period))) + facet_grid(VARS~ID) + scale_x_log10() ## End(Not run)
library(ggplot2) data(basin_PT) ## Not run: dobs = decomp(basin_PT[['obs_ctrl']]) dctrl = decomp(basin_PT[['sim_ctrl']]) d = vcompare(x = list(OBS = dobs, CTRL = dctrl), fun = cov) ggplot(d[period!='G1']) + geom_line(aes(x = TS, y = value, col = factor(sub_period))) + facet_grid(VARS~ID) + scale_x_log10() ## End(Not run)