Title: | Transformations for Unit-Level Small Area Models |
---|---|
Description: | The aim of this package is to offer new methodology for unit-level small area models under transformations and limited population auxiliary information. In addition to this new methodology, the widely used nested error regression model without transformations (see "An Error-Components Model for Prediction of County Crop Areas Using Survey and Satellite Data" by Battese, Harter and Fuller (1988) <doi:10.1080/01621459.1988.10478561>) and its well-known uncertainty estimate (see "The estimation of the mean squared error of small-area estimators" by Prasad and Rao (1990) <doi:10.1080/01621459.1995.10476570>) are provided. In this package, the log transformation and the data-driven log-shift transformation are provided. If a transformation is selected, an appropriate method is chosen depending on the respective input of the population data: Individual population data (see "Empirical best prediction under a nested error model with log transformation" by Molina and Martín (2018) <doi:10.1214/17-aos1608>) but also aggregated population data (see "Estimating regional income indicators under transformations and access to limited population auxiliary information" by Würz, Schmid and Tzavidis <unpublished>) can be entered. Especially under limited data access, new methodologies are provided in saeTrafo. Several options are available to assess the used model and to judge, present and export its results. For a detailed description of the package and the methods used see the corresponding vignette. |
Authors: | Nora Würz [aut] |
Maintainer: | Nora Würz <[email protected]> |
License: | GPL-2 |
Version: | 1.0.2 |
Built: | 2025-02-21 05:16:26 UTC |
Source: | https://github.com/norawuerz/saetrafo |
Function compare_plot
is a generic function used to produce plots
comparing point and existing MSE/CV estimates of direct and model-based
estimation for the Mean.
Methods compare_plot.NER
produce plots comparing point and existing
MSE/CV estimates of direct and model-based estimation from NER_Trafo
.
The direct and model-based point estimates are compared by a scatter plot and
a line plot. If the input arguments MSE and CV are set to TRUE, two extra
plots are created, respectively: the MSE/CV estimates of the direct and
model-based estimates are compared by boxplots and scatter plots.
compare_plot( model, direct, MSE = FALSE, CV = FALSE, label = "orig", color = c("blue", "lightblue3"), shape = c(16, 16), line_type = c("solid", "solid"), gg_theme = NULL, ... ) ## S3 method for class 'NER' compare_plot( model = NULL, direct = NULL, MSE = FALSE, CV = FALSE, label = "orig", color = c("blue", "lightblue3"), shape = c(16, 16), line_type = c("solid", "solid"), gg_theme = NULL, ... )
compare_plot( model, direct, MSE = FALSE, CV = FALSE, label = "orig", color = c("blue", "lightblue3"), shape = c(16, 16), line_type = c("solid", "solid"), gg_theme = NULL, ... ) ## S3 method for class 'NER' compare_plot( model = NULL, direct = NULL, MSE = FALSE, CV = FALSE, label = "orig", color = c("blue", "lightblue3"), shape = c(16, 16), line_type = c("solid", "solid"), gg_theme = NULL, ... )
model |
a model object of type "NER", representing point and optional MSE estimates. |
direct |
an object of type "direct" from "emdi", representing point
and MSE estimates. For more information on how to generate direct estimates,
please see |
MSE |
optional logical. If |
CV |
optional logical. If |
label |
argument that enables to customize title and axis labels. There are three options to label the evaluation plots: (i) original labels ("orig"), (ii) axis labels but no title ("no_title"), (iii) neither axis labels nor title ("blank"). |
color |
a vector with two elements. The first color determines the color for the regression line in the scatter plot and the color for the direct estimates in the remaining plots. The second color specifies the color of the intersection line in the scatter plot and the color for the model-based estimates in the remaining plots. Defaults to c("blue", "lightblue3"). |
shape |
a numeric vector with two elements. The first shape determines the shape of the points in the scatterplot and the shape of the points for the direct estimates in the remaining plots. The second shape determines the shape for the points for the model-based estimates. The options are numbered from 0 to 25. Defaults to c(16, 16). |
line_type |
a character vector with two elements. The first line type determines the line type for the regression line in the scatter plot and the line type for the direct estimates in the remaining plots. The second line type specifies the line type of the intersection line in the scatter plot and the line type for the model-based estimates in the remaining plots. The options are: "twodash", "solid", "longdash", "dotted", "dotdash", "dashed" and "blank". Defaults to c("solid", "solid"). |
gg_theme |
|
... |
further arguments passed to or from other methods. |
Since all of the comparisons need a direct estimator, the plots are only created for in-sample domains.
Plots comparing direct and model-based estimators for the Mean
obtained by ggplot
.
A scatter plot and a line plot comparing direct and model-based
estimators for each selected indicator obtained by
ggplot
. If the input arguments MSE and CV are set to
TRUE
two extra plots are created, respectively: the MSE/CV estimates
of the direct and model-based estimates are compared by boxplots and scatter
plots.
saeTrafoObject
, direct
,
NER_Trafo
# Examples for creating plots to compare the saeTrafo object with direct # estimates (produced by the package emdi) # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Get direct estimates from the R-package emdi require(emdi) library(emdi) emdi_direct <- direct(y = "eqIncome", smp_data = eusilcA_smp, smp_domains = "district", weights = "weight", var = TRUE, na.rm = TRUE) # Please detach emdi or use saeTrafo::compare_plot # Example 1: Comparison plots with uncertainty assessment plots # (for MSE and CV) saeTrafo::compare_plot(model = NER_model, direct = emdi_direct, MSE = TRUE, CV = TRUE) # Example 2: Personalize comparison plots using the options provided with # this function and ggplot themes require(ggplot2) library(ggplot2) saeTrafo::compare_plot(model = NER_model, direct = emdi_direct, MSE = TRUE, CV = TRUE, label = "no_title", color = c("orange", "green"), shape = c(1,2), line_type = c("dotted", "dashed"), gg_theme = theme( text = element_text(size = 20, color = "blue"), panel.border = element_rect(linetype = "dashed", fill = "NA")))
# Examples for creating plots to compare the saeTrafo object with direct # estimates (produced by the package emdi) # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Get direct estimates from the R-package emdi require(emdi) library(emdi) emdi_direct <- direct(y = "eqIncome", smp_data = eusilcA_smp, smp_domains = "district", weights = "weight", var = TRUE, na.rm = TRUE) # Please detach emdi or use saeTrafo::compare_plot # Example 1: Comparison plots with uncertainty assessment plots # (for MSE and CV) saeTrafo::compare_plot(model = NER_model, direct = emdi_direct, MSE = TRUE, CV = TRUE) # Example 2: Personalize comparison plots using the options provided with # this function and ggplot themes require(ggplot2) library(ggplot2) saeTrafo::compare_plot(model = NER_model, direct = emdi_direct, MSE = TRUE, CV = TRUE, label = "no_title", color = c("orange", "green"), shape = c(1,2), line_type = c("dotted", "dashed"), gg_theme = theme( text = element_text(size = 20, color = "blue"), panel.border = element_rect(linetype = "dashed", fill = "NA")))
Function compare_pred
is a generic function used to compare
predictions of two model objects.
Method compare_pred.saeTrafo
compares predictions of two saeTrafo
objects or a saeTrafo object and a emdi object.
compare_pred(object1, object2, MSE = FALSE, ...) ## S3 method for class 'saeTrafo' compare_pred(object1, object2, MSE = FALSE, ...)
compare_pred(object1, object2, MSE = FALSE, ...) ## S3 method for class 'saeTrafo' compare_pred(object1, object2, MSE = FALSE, ...)
object1 |
an object of type "saeTrafo". |
object2 |
an object of type "saeTrafo" or "emdi"
( |
MSE |
optional logical. If |
... |
further arguments passed to or from other methods. |
Data frame containing the point estimates or the MSE estimates
(if MSE
is set to TRUE
) of both objects. If column names are
duplicated, the suffixes "_1" and "_2" are added to their names. "_1" and
"_2" standing for object1 and object2, respectively.
emdi
, NER_Trafo
,
saeTrafoObject
# Example comparing two saeTrafo objects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model 1 NER_model_1 <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Nested error regression model 2 NER_model_2 <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Generate a data frame for the comparison of point estimates compare_pred(NER_model_1, NER_model_2) # Generate a data frame for the comparison of MSE estimates compare_pred(NER_model_1, NER_model_2, MSE = TRUE)
# Example comparing two saeTrafo objects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model 1 NER_model_1 <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Nested error regression model 2 NER_model_2 <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Generate a data frame for the comparison of point estimates compare_pred(NER_model_1, NER_model_2) # Generate a data frame for the comparison of MSE estimates compare_pred(NER_model_1, NER_model_2, MSE = TRUE)
Function estimators
is a generic function used to present point and
mean squared error (MSE) estimates. Furthermore, it calculates from both the
coefficients of variation (CV).
Method estimators.saeTrafo
presents point and MSE estimates.
Coefficients of variation are calculated using these estimators. The returned
object is suitable for printing with the method
print.estimators.saeTrafo
.
estimators(object, MSE, CV, ...) ## S3 method for class 'saeTrafo' estimators(object, MSE = FALSE, CV = FALSE, ...)
estimators(object, MSE, CV, ...) ## S3 method for class 'saeTrafo' estimators(object, MSE = FALSE, CV = FALSE, ...)
object |
an object of type "saeTrafo", representing point and, if chosen, MSE estimates. |
MSE |
optional logical. If |
CV |
optional logical. If |
... |
other parameters that can be passed to function |
Objects of class "estimators.saeTrafo" have methods for following
generic functions: head
and tail
(for default documentation,
see head
, tail
),
as.matrix
(for default documentation, see matrix
),
as.data.frame
(for default documentation, see
as.data.frame
), subset
(for default documentation,
see subset
).
The return of estimators.saeTrafo
is an object of type
"estimators.saeTrafo" with point and/or MSE estimates and/or calculated CV's
from saeTrafoObject$ind
and, if chosen, saeTrafoObject$MSE
.
These objects contain two elements, one data frame
ind
and a character naming the indicator or indicator group
ind_name
.
# Example for presenting point, MSE, and CV estimates for a saeTrafo object # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) sae_mean <- estimators(NER_model, MSE = TRUE, CV = TRUE) class(sae_mean) # use generic functions for estimators.saeTrafo object print(sae_mean) head(sae_mean) tail(sae_mean) as.matrix(sae_mean) as.data.frame(sae_mean) subset(sae_mean)
# Example for presenting point, MSE, and CV estimates for a saeTrafo object # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) sae_mean <- estimators(NER_model, MSE = TRUE, CV = TRUE) class(sae_mean) # use generic functions for estimators.saeTrafo object print(sae_mean) head(sae_mean) tail(sae_mean) as.matrix(sae_mean) as.data.frame(sae_mean) subset(sae_mean)
The data set is synthetic EU-SILC data based on the data set
eusilcP
from package simFrame. The data set is
reduced to 17 variables containing three regional variables for the states
and districts.
eusilcA_pop
eusilcA_pop
A data frame with 25000 observations and 17 variables:
numeric; a simplified version of the equivalized household income.
numeric; the equivalized household size according to the modified OECD scale.
factor; the person's gender (levels: male and female).
numeric; employee cash or near cash income (net).
numeric; cash benefits or losses from self-employment (net).
numeric; unemployment benefits (net).
numeric; old-age benefits (net).
numeric; survivor's benefits (net).
numeric; sickness benefits (net).
numeric; disability benefits (net).
numeric; income from rental of a property or land (net).
numeric; family/children related allowances (net).
numeric; housing allowances (net).
numeric; interest, dividends, profit from capital investments in unincorporated business (net).
numeric; repayments/receipts for tax adjustment (net).
factor; state (nine levels).
factor; districts (94 levels).
The data set is a simple random sample of data set eusilcA_pop
which is based on eusilcP
from package
simFrame.
eusilcA_smp
eusilcA_smp
A data frame with 1000 observations and 18 variables:
numeric; a simplified version of the equivalized household income.
numeric; the equivalized household size according to the modified OECD scale.
factor; the person's gender (levels: male and female).
numeric; employee cash or near cash income (net).
numeric; cash benefits or losses from self-employment (net).
numeric; unemployment benefits (net).
numeric; old-age benefits (net).
numeric; survivor's benefits (net).
numeric; sickness benefits (net).
numeric; disability benefits (net).
numeric; income from rental of a property or land (net).
numeric; family/children related allowances (net).
numeric; housing allowances (net).
numeric; interest, dividends, profit from capital investments in unincorporated business (net).
numeric; repayments/receipts for tax adjustment (net).
factor; state (nine levels).
factor; districts (94 levels).
numeric; constant weight.
Method fixef.NER
extracts the fixed effects from an saeTrafo object of
class "NER".
## S3 method for class 'NER' fixef(object, ...) ## S3 method for class 'NER' fixed.effects(object, ...)
## S3 method for class 'NER' fixef(object, ...) ## S3 method for class 'NER' fixed.effects(object, ...)
object |
an object of type "NER". |
... |
additional arguments that are not used in this method. |
The alias fixed.effects
can also be used instead of
fixef
. The generic function fixef
is imported from package
nlme and re-exported to make the S3-methods available, even though the
nlme package itself is not loaded or attached. For default
documentation,
see fixed.effects
.
A vector containing the fixed effects is returned.
# Example to extract fixed effects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) fixef(NER_model)
# Example to extract fixed effects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) fixef(NER_model)
Method getData.NER
extracts the data frame used to fit the model.
## S3 method for class 'NER' getData(object, ...)
## S3 method for class 'NER' getData(object, ...)
object |
an object of type "NER". |
... |
additional arguments that are not used in this method. |
The generic function getData
is imported from package
nlme and re-exported to make the S3-methods available, even though the
nlme package itself is not loaded or attached. For default
documentation, see getData
.
Data frame used to fit the model. For "NER" the (untransformed) sample data is returned.
# Example to extract object data # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getData(NER_model)
# Example to extract object data # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getData(NER_model)
Method getGroups.NER
extracts grouping factors from a saeTrafo object.
## S3 method for class 'NER' getGroups(object, ...)
## S3 method for class 'NER' getGroups(object, ...)
object |
an object of type "NER". |
... |
additional arguments that are not used in this method. |
The generic function getGroups
is imported from package
nlme and re-exported to make the S3-methods available, even though
the nlme package itself is not loaded or attached. For default
documentation, see getGroups
.
A vector containing the grouping factors.
# Example to extract grouping factors # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getGroups(NER_model)
# Example to extract grouping factors # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getGroups(NER_model)
Method getGroupsFormula.NER
extracts the grouping formula from an
saeTrafo object.
## S3 method for class 'NER' getGroupsFormula(object, ...)
## S3 method for class 'NER' getGroupsFormula(object, ...)
object |
an object of type "NER". |
... |
additional arguments that are not used in this method. |
The generic function getGroupsFormula
is imported from
package nlme and re-exported to make the S3-methods available, even
though the nlme package itself is not loaded or attached. For default
documentation, see getGroupsFormula
.
A one-sided formula.
# Example to extract grouping formula # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getGroupsFormula(NER_model)
# Example to extract grouping formula # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getGroupsFormula(NER_model)
Method getResponse.NER
extracts the response variable from a saeTrafo
object.
## S3 method for class 'NER' getResponse(object, ...)
## S3 method for class 'NER' getResponse(object, ...)
object |
an object of type "NER". |
... |
additional arguments that are not used in this method. |
The generic function getResponse
is imported from package
nlme and re-exported to make the S3-methods available, even though
the nlme package itself is not loaded or attached. For default
documentation, see getResponse
.
Vector containing the response variable.
# Example to extract the response variable # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getResponse(NER_model)
# Example to extract the response variable # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getResponse(NER_model)
Method getVarCov.NER
extracts the variance-covariance matrix from a
fitted model of class "NER".
## S3 method for class 'NER' getVarCov(obj, individuals = 1, type = "random.effects", ...)
## S3 method for class 'NER' getVarCov(obj, individuals = 1, type = "random.effects", ...)
obj |
an object of type "NER". |
individuals |
vector of levels of the in-sample domains can be specified
for the types " |
type |
a character that determines the type of variance-covariance
matrix. Types that can be chosen
(i) random-effects variance-covariance matrix (" |
... |
additional arguments that are not used in this method. |
The generic function getVarCov
is imported from package
nlme and re-exported to make the S3-methods available, even though the
nlme package itself is not loaded or attached. For default
documentation, see getVarCov
.
A variance-covariance matrix or a list of variance-covariance
matrices, if more than one individual is selected. For method
getVarCov.NER
, the dimensions of the matrices are 1 x 1 for type
"random.effects
" and number of in-sample domains x number of
in-sample domains for types "conditional
" and "marginal
".
# Example to extract variance-covariance matrix # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getVarCov(NER_model)
# Example to extract variance-covariance matrix # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) getVarCov(NER_model)
Method intervals.NER
provides the approximate confidence intervals on
the coefficients (fixed effects) of an saeTrafo object.
## S3 method for class 'NER' intervals(object, level = 0.95, parm = NULL, ...)
## S3 method for class 'NER' intervals(object, level = 0.95, parm = NULL, ...)
object |
an object of type "NER". |
level |
an optional numeric value with the confidence level for the intervals. Defaults to 0.95. |
parm |
vector of names to specify which parameters are to be given
confidence intervals. If |
... |
additional arguments that are not used in this method. |
The generic function intervals
is imported from package
nlme and re-exported to make the S3-methods available, even though the
nlme package itself is not loaded or attached. For default
documentation, see intervals
.
A matrix with rows corresponding to the parameters and columns containing the lower confidence limits (lower), the estimated values (est.), and upper confidence limits (upper).
# Example to extract confidence intervals on coefficients # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) intervals(NER_model)
# Example to extract confidence intervals on coefficients # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) intervals(NER_model)
The function simplifies to load the shape file for Austrian districts.
load_shapeaustria()
load_shapeaustria()
The shape file contains the borders of Austrian districts. Thus, it can be used for the visualization of estimation results for Austrian districts.
A shape file of class sf
.
Function map_plot
creates spatial visualizations of the estimates
obtained by small area estimation methods.
map_plot( object, MSE = FALSE, CV = FALSE, map_obj = NULL, map_dom_id = NULL, map_tab = NULL, color = c("white", "red4"), scale_points = NULL, guide = "colourbar", return_data = FALSE )
map_plot( object, MSE = FALSE, CV = FALSE, map_obj = NULL, map_dom_id = NULL, map_tab = NULL, color = c("white", "red4"), scale_points = NULL, guide = "colourbar", return_data = FALSE )
object |
an object of type |
MSE |
optional logical. If |
CV |
optional logical. If |
map_obj |
an |
map_dom_id |
a character string containing the name of a variable in
|
map_tab |
a |
color |
a |
scale_points |
a numeric vector of length two. This scale will be used for every plot. |
guide |
character passed to |
return_data |
if set to |
Creates the plots demanded and, if selected, a fortified data.frame containing the mapdata and chosen indicators.
# Examples for creating maps to visualize the saeTrafo estimates # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Load shape file load_shapeaustria() # Example 1: Map plots with uncertainty plots (for MSE and CV) map_plot(NER_model, MSE = TRUE, CV = TRUE, map_obj = shape_austria_dis, map_dom_id = "PB") # Example 2: Personalize map plot for point estimates map_plot(NER_model, map_obj = shape_austria_dis, map_dom_id = "PB", color = c("white", "darkblue"), scale_points = c(0, max(NER_model$ind$Mean))) # Example 3: More options to personalize map plot by using return_data = TRUE # and ggplot2 require(ggplot2) library(ggplot2) data_plot <- map_plot(NER_model, map_obj = shape_austria_dis, map_dom_id = "PB", return_data = TRUE) ggplot(data_plot, aes(long = NULL, lat = NULL, group = "PB", fill = Mean))+ geom_sf(color = "black") + theme_void() + ggtitle("Personalized map") + scale_fill_gradient2(low = "red", mid = "white", high = "darkgreen", midpoint = 20000) # Example 4: Create a suitable mapping table to use numerical identifiers of # the shape file # First find the right order dom_ord <- match(shape_austria_dis$PB, NER_model$ind$Domain) #Create the mapping table based on the order obtained above map_tab <- data.frame(pop_data_id = NER_model$ind$Domain[dom_ord], shape_id = shape_austria_dis$BKZ) # Create map plot for mean indicator - point and CV estimates but no MSE # using the numerical domain identifiers of the shape file map_plot(object = NER_model, MSE = FALSE, CV = TRUE, map_obj = shape_austria_dis, map_dom_id = "BKZ", map_tab = map_tab)
# Examples for creating maps to visualize the saeTrafo estimates # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Load shape file load_shapeaustria() # Example 1: Map plots with uncertainty plots (for MSE and CV) map_plot(NER_model, MSE = TRUE, CV = TRUE, map_obj = shape_austria_dis, map_dom_id = "PB") # Example 2: Personalize map plot for point estimates map_plot(NER_model, map_obj = shape_austria_dis, map_dom_id = "PB", color = c("white", "darkblue"), scale_points = c(0, max(NER_model$ind$Mean))) # Example 3: More options to personalize map plot by using return_data = TRUE # and ggplot2 require(ggplot2) library(ggplot2) data_plot <- map_plot(NER_model, map_obj = shape_austria_dis, map_dom_id = "PB", return_data = TRUE) ggplot(data_plot, aes(long = NULL, lat = NULL, group = "PB", fill = Mean))+ geom_sf(color = "black") + theme_void() + ggtitle("Personalized map") + scale_fill_gradient2(low = "red", mid = "white", high = "darkgreen", midpoint = 20000) # Example 4: Create a suitable mapping table to use numerical identifiers of # the shape file # First find the right order dom_ord <- match(shape_austria_dis$PB, NER_model$ind$Domain) #Create the mapping table based on the order obtained above map_tab <- data.frame(pop_data_id = NER_model$ind$Domain[dom_ord], shape_id = shape_austria_dis$BKZ) # Create map plot for mean indicator - point and CV estimates but no MSE # using the numerical domain identifiers of the shape file map_plot(object = NER_model, MSE = FALSE, CV = TRUE, map_obj = shape_austria_dis, map_dom_id = "BKZ", map_tab = map_tab)
Function NER_Trafo
estimates small area means based on the
(transformed) nested error regression (NER) model
(Battese et al., 1988).
In contrast to the empirical best predictor of Molina and Rao (2010),
which is implemented in the package emdi (ebp
), no
unit-level population data are required.NER_Trafo
supports the log as well as the data-driven log-shift
transformation. Especially for skewed variables, (data-driven)
transformations are useful to meet the model assumptions for the error terms.
If a transformation is chosen and aggregates (means and covariance) are
simultaneously provided for the population, point estimates are produced by
the method of Wuerz et al. (2022), which uses kernel density
estimation to resolve the issue of not having access to population
micro-data.
In the case that population data are available at unit-level and the log or
log-shift transformation is selected, the bias-correction of
Berg and Chandra (2014) and Molina and Martín (2018) is
applied. For this data situation, more methods and options are provided in
the package emdi.
If only population means are available and the log or log-shift
transformation is selected, a bias-correction due to
the transformation is added but for the lack of access to population data no
correction is available. Therefore, a part of the bias is
disregarded.
Additionally, analytically mean squared errors (MSE) are calculated in the
case of no transformation following Prasad and Rao (1990).
For the log and log-shift transformation, a parametric bootstrap procedure
proposed by Wuerz et al. (2022) following
Gonzalez-Manteiga et al. (2008) is applied. Please note that this can
only be determined if covariance data are also provided.
If population data is available on unit-level a bootstrap procedure as
described in Molina and Martín (2018) is applied.
NER_Trafo( fixed, pop_area_size = NULL, pop_mean = NULL, pop_cov = NULL, pop_data = NULL, pop_domains = NULL, smp_data, smp_domains, threshold = 30, B = 50, transformation = "log.shift", interval = "default", MSE = FALSE, parallel_mode = ifelse(grepl("windows", .Platform$OS.type), "socket", "multicore"), cpus = 1, seed = 123 )
NER_Trafo( fixed, pop_area_size = NULL, pop_mean = NULL, pop_cov = NULL, pop_data = NULL, pop_domains = NULL, smp_data, smp_domains, threshold = 30, B = 50, transformation = "log.shift", interval = "default", MSE = FALSE, parallel_mode = ifelse(grepl("windows", .Platform$OS.type), "socket", "multicore"), cpus = 1, seed = 123 )
fixed |
a two-sided linear formula object describing the
fixed-effects part of the nested error linear regression model with the
dependent variable on the left of a ~ operator and the explanatory
variables on the right, separated by + operators. The argument corresponds
to the argument |
pop_area_size |
a named numeric vector containing the number of individuals within each domain. This numeric vector is named with the domain names. |
pop_mean |
a named list. Each element of the list contains the
population means for the p covariates for a specicfic domain. The list is
named with the respective domain name. The numeric vector within the list is
named with the covariate names. The covariates right of the ~ operator in
|
pop_cov |
a named list. Each element of the list contains the
domain-specific covariance matrice for p covariates for a specicfic domain.
The list is named with the respective domain name. The matrix within the list
has row and column names with the respective covariate names. The covariates
right of the ~ operator in |
pop_data |
a data frame that needs to comprise the variables
named on the right of the ~ operator in |
pop_domains |
a character string containing the name of a variable that
indicates domains in the population data. The variable can be numeric or
a factor but needs to be of the same class as the variable named in
|
smp_data |
a data frame that needs to comprise all variables named in
|
smp_domains |
a character string containing the name of a variable
that indicates domains in the sample data. The variable can be numeric or a
factor but needs to be of the same class as the variable named in
|
threshold |
a numeric value indicating the threshold for using pooled domain data (for domains with sample sizes below the threshold) or non pooled domain data (for domains with sample sizes above the threshold) for the density estimation within the approach of Wuerz et al. (2022). Defaults to 30. |
B |
a number determining the number of bootstrap replications in the parametric bootstrap approach. The number must be greater than 1. Defaults to 50. For practical applications, values larger than 200 are recommended. |
transformation |
a character string. Three different transformation
types for the dependent variable can be chosen (i) no transformation ("no");
(ii) log transformation ("log"); (iii) Log-Shift transformation
("log.shift"). Defaults to |
interval |
a string equal to 'default' or a numeric vector containing a
lower and upper limit determining an interval for the estimation of the
optimal parameter for the log-shift transformation. The interval is passed to
function |
MSE |
optional logical. If |
parallel_mode |
modus of parallelization, defaults to an automatic
selection of a suitable mode, depending on the operating system, if the
number of |
cpus |
number determining the kernels that are used for the
parallelization. Defaults to 1. For details, see
|
seed |
an integer to set the seed for the random number generator. For
the usage of random number generation, see Details. If seed is set to
|
For the parametric bootstrap and the density estimation
approach random number generation is used. Thus, a seed is set by the
argument seed
.
An object of class "NER", "saeTrafo" that provides estimators for
regional means optionally corresponding MSE estimates. Several generic
functions have methods for the returned object. For a full list and
descriptions of the components of objects of class "saeTrafo", see
saeTrafoObject
.
Battese, G.E., Harter, R.M. and Fuller, W.A. (1988). An Error-Components
Model for Predictions of County Crop Areas Using Survey and Satellite Data.
Journal of the American Statistical Association, Vol.83, No. 401,
28-36.
Berg, E. and Chandra, H. (2014). Small area prediction for a unit-level
lognormal model. Computational Statistics & Data Analysis, Vol.78,
159–175.
González-Manteiga, W., Lombardía, M. J., Molina, I., Morales, D. and
Santamaría, L. (2008). Analytic and bootstrap approximations of prediction
errors under a multivariate Fay–Herriot model. Computational Statistics &
Data Analysis, Vol. 52, No. 12, 5242-5252.
Molina, I. and Martín, N. (2018). Empirical best prediction under a nested
error model with log transformation. The Annals of Statistics, Vol.46, No. 5,
1961–1993.
Molina, I. and Rao, J.N.K. (2010). Small area estimation of poverty
indicators. The Canadian Journal of Statistics, Vol. 38, No.3,
369-385.
Prasad, N.N., Rao, J.N.K. (1990). The estimation of the mean squared error of
small-area estimators. Journal of the American statistical association,
Vol. 85, No. 409, 163-171.
Wuerz, N., Schmid, T., and Tzavidis, N. (2022) Estimating regional income
indicators under transformations and access to limited population auxiliary
information. Journal of the Royal Statistical Society: Series A
(Statistics in Society), Vol. 185, No. 4, 1679-1706.
saeTrafoObject
, lme
,
estimators.saeTrafo
, plot.saeTrafo
,
summaries.saeTrafo
# Examples for (transformed) nested error regression model # Load Data data("eusilcA_pop") data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # formula object for all examples formula <- eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj # For all four examples, no MSEs/variances are determined in order to avoid # long run times. These can be obtained with MSE = TRUE. # Example 1: No transformation - classical NER NER_model_1 <- NER_Trafo(fixed = formula, transformation = "no", smp_domains = "district", smp_data = eusilcA_smp, pop_area_size = pop_area_size, pop_mean = pop_mean) # Example 2: Log-shift transformation and population aggregates # (means and covariances) with changed threshold NER_model_2 <- NER_Trafo(fixed = formula, smp_domains = "district", smp_data = eusilcA_smp, pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, threshold = 50) # Example 3: Log-shift transformation and population data # A bias-corrections which need unit-level population data are applied NER_model_3 <- NER_Trafo(fixed = formula, smp_domains = "district", smp_data = eusilcA_smp, pop_data = eusilcA_pop, pop_domains = "district") # Example 4: Log-shift transformation and population aggregates # (only means (!) - Therefore, no MSE estimation is available, bias is # disregarded) NER_model_4 <- NER_Trafo(fixed = formula, smp_domains = "district", smp_data = eusilcA_smp, pop_area_size = pop_area_size, pop_mean = pop_mean)
# Examples for (transformed) nested error regression model # Load Data data("eusilcA_pop") data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # formula object for all examples formula <- eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj # For all four examples, no MSEs/variances are determined in order to avoid # long run times. These can be obtained with MSE = TRUE. # Example 1: No transformation - classical NER NER_model_1 <- NER_Trafo(fixed = formula, transformation = "no", smp_domains = "district", smp_data = eusilcA_smp, pop_area_size = pop_area_size, pop_mean = pop_mean) # Example 2: Log-shift transformation and population aggregates # (means and covariances) with changed threshold NER_model_2 <- NER_Trafo(fixed = formula, smp_domains = "district", smp_data = eusilcA_smp, pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, threshold = 50) # Example 3: Log-shift transformation and population data # A bias-corrections which need unit-level population data are applied NER_model_3 <- NER_Trafo(fixed = formula, smp_domains = "district", smp_data = eusilcA_smp, pop_data = eusilcA_pop, pop_domains = "district") # Example 4: Log-shift transformation and population aggregates # (only means (!) - Therefore, no MSE estimation is available, bias is # disregarded) NER_model_4 <- NER_Trafo(fixed = formula, smp_domains = "district", smp_data = eusilcA_smp, pop_area_size = pop_area_size, pop_mean = pop_mean)
Diagnostic plots of the nested error regression model
(see also NER_Trafo
) are obtained. These include Q-Q plots and
density plots of residuals and random effects, a Cook's distance plot for
detecting outliers and the log-likelihood of the estimation of the
optimal parameter in log-shift transformations. The return depends on the
transformation, such that a plot for the optimal parameter is only returned
in case if a transformation with transformation parameter is chosen.
The range of the x-axis is optional but necessary to change if there are
convergence problems. All plots are obtained by
ggplot
.
## S3 method for class 'saeTrafo' plot( x, label = "orig", color = c("blue", "lightblue3"), gg_theme = NULL, cooks = TRUE, range = NULL, ... )
## S3 method for class 'saeTrafo' plot( x, label = "orig", color = c("blue", "lightblue3"), gg_theme = NULL, cooks = TRUE, range = NULL, ... )
x |
an object of type "NER", representing point
and, if chosen, MSE estimates obtained by the (transformed) nested error
regression model (see also |
label |
argument that enables to customize title and axis labels. There
are three instant options to label the diagnostic plot: (i) original labels
("orig"), (ii) axis lables but no title ("no_title"), (iii) neither axis
labels nor title ("blank"), (iv) individual labels by a list that needs to
have below structure.
Six elements can be defined called |
color |
a character vector with two elements. The first element defines the color for the line in the QQ-plots, for the Cook's Distance plot and for the optimal parameter plot. The second element defines the color for the densities. |
gg_theme |
|
cooks |
optional logical. If |
range |
optional sequence determining the range of the x-axis for plots
of the optimal transformation parameter that defaults to |
... |
optional arguments passed to generic function. |
The default settings of the label
argument are as follows:
c(title="Error term", y_lab="Quantiles of pearson residuals", x_lab="Theoretical quantiles"),
c(title="Random effect", y_lab="Quantiles of random effects", x_lab="Theoretical quantiles"),
c(title="Density - Pearson residuals", y_lab="Density", x_lab="Pearson residuals"),
c(title="Density - Standardized random effects", y_lab="Density", x_lab="Standardized random effects"),
c(title="Cook's Distance Plot", y_lab="Cook's Distance", x_lab="Index"),
c(title="Log-Shift - REML", y_lab="Log-Likelihood", x_lab="expression(lambda)"))
Two Q-Q plots in one grid, two density plots, a Cook's distance plot
and a likelihood plot for the optimal parameter of transformations with
transformation parameter obtained by ggplot
.
# Examples for diagnostic plots # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) # Example 1: Default diagnostic plot plot(NER_model) # Example 2: Creation of diagnostic plots without labels and titles, # different colors and without Cook's distance plot. plot(NER_model, label = "no_title", color = c("red", "yellow"), cooks = FALSE)
# Examples for diagnostic plots # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) # Example 1: Default diagnostic plot plot(NER_model) # Example 2: Creation of diagnostic plots without labels and titles, # different colors and without Cook's distance plot. plot(NER_model, label = "no_title", color = c("red", "yellow"), cooks = FALSE)
This data contains aggregates from eusilcA_pop
which is based on eusilcP
from package
simFrame.
pop_area_size
pop_area_size
A named numeric vector containing the number of individuals within each domain. This numeric vector is named with the domain names.
This data contains aggregates from eusilcA_pop
which is based on eusilcP
from package
simFrame.
pop_cov
pop_cov
A named list. Each element of the list contains the
domain-specific covariance matrice for p covariates for a specicfic domain.
The list is named with the respective domain name. The matrix within the list
has row and column names with the respective covariate names. The covariates
right of the ~ operator in fixed
need to comprise.
This data contains aggregates from eusilcA_pop
which is based on eusilcP
from package
simFrame.
pop_mean
pop_mean
A named list. Each element of the list contains the
population means for the p covariates for a specicfic domain. The list is
named with the respective domain name. The numeric vector within the list is
named with the covariate names. The covariates right of the ~ operator in
fixed
need to comprise.
Method predict.NER
extracts the direct estimates, the empirical
best linear unbiased or empirical best predictors for all domains from an
saeTrafo object.
## S3 method for class 'NER' predict(object, ...)
## S3 method for class 'NER' predict(object, ...)
object |
an object of type "saeTrafo". |
... |
additional arguments that are not used in this method. |
Data frame with domain predictors.
# Examples for Predictions from saeTrafo objects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) predict(NER_model)
# Examples for Predictions from saeTrafo objects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) predict(NER_model)
Normal quantile-quantile plots of the underlying model (see
NER_Trafo
) are obtained. The plots are obtained by
ggplot
.
## S3 method for class 'saeTrafo' qqnorm(y, color = c("blue", "lightblue3"), gg_theme = NULL, ...)
## S3 method for class 'saeTrafo' qqnorm(y, color = c("blue", "lightblue3"), gg_theme = NULL, ...)
y |
a model object of type "saeTrafo" (see |
color |
a character vector with two elements. The first element defines the color for the line in the Q-Q plots, for the Cook's Distance plot and for the Box-Cox plot. The second element defines the color for the densities. |
gg_theme |
|
... |
optional arguments passed to generic function. |
Two Q-Q plots in one grid obtained by ggplot
.
# Examples for Quantile-quantile plots # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) # Example 1: Default Quantile-quantile plots qqnorm(NER_model) # Example 2: Personalized plot using theme require("ggplot2") library(ggplot2) qqnorm(NER_model, color = c("red", "darkgreen"), gg_theme = theme(panel.background = element_rect(fill = NA), panel.grid.major = element_line(colour = "grey50"), panel.ontop = TRUE) )
# Examples for Quantile-quantile plots # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) # Example 1: Default Quantile-quantile plots qqnorm(NER_model) # Example 2: Personalized plot using theme require("ggplot2") library(ggplot2) qqnorm(NER_model, color = c("red", "darkgreen"), gg_theme = theme(panel.background = element_rect(fill = NA), panel.grid.major = element_line(colour = "grey50"), panel.ontop = TRUE) )
Method ranef.NER
extracts the random effects from an saeTrafo object.
## S3 method for class 'NER' ranef(object, ...) ## S3 method for class 'NER' random.effects(object, ...)
## S3 method for class 'NER' ranef(object, ...) ## S3 method for class 'NER' random.effects(object, ...)
object |
an object of type "NER". |
... |
additional arguments that are not used in this method. |
The alias random.effects
can also be used instead of
ranef
. The generic function ranef
is imported from package
nlme and re-exported to make the S3-methods available, even though the
nlme package itself is not loaded or attached. For default
documentation, see random.effects
.
A vector containing the estimated random effects at domain level is returned.
# Example to extract random effects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) ranef(NER_model)
# Example to extract random effects # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp) ranef(NER_model)
The package saeTrafo supports estimating regional means based on the
nested error regression model (Battese et al., 1988).
Therefore, point estimation and mean squared error estimation
(Prasad and Rao, 1990) for the classical model is offered.
In addition to the classical model, the logarithmic and the data-driven
log-shift transformation are provided.
The core function NER_Trafo
allows several options to enter
population data: Either individual population data or only aggregates can be
entered. If full population data is given, the method of
Molina and Martín (2018) is applied.
Compared to other small area packages, these transformations are
accessible in the absence of population micro-data. Only population
aggregates (mean values, population sizes and preferably also covariances)
need to be supplied. The methodology for point and mean squared error
estimates is described in Wuerz et al. (2022) and is made available
in a user-friendly way within saeTrafo.
The estimation function is called NER_Trafo
. For this
function, several methods are available such as
estimators.saeTrafo
and
summaries.saeTrafo
. For a full list, please see
saeTrafoObject
.
Furthermore, functions map_plot
and write.excel
help to visualize and export results. An overview of all currently provided
functions can be requested by library(help=saeTrafo)
.
Battese, G.E., Harter, R.M. and Fuller, W.A. (1988). An Error-Components
Model for Predictions of County Crop Areas Using Survey and Satellite Data.
Journal of the American Statistical Association, Vol.83, No. 401,
28-36.
Molina, I. and Martín, N. (2018). Empirical best prediction under a nested
error model with log transformation. The Annals of Statistics, Vol.46, No. 5,
1961–1993.
Prasad, N.N., Rao, J.N.K. (1990). The estimation of the mean squared error of
small-area estimators. Journal of the American statistical association,
Vol.85, No. 409, 163-171.
Wuerz, N., Schmid, T., Tzavidis, N. (2022). Estimating regional income
indicators under transformations and access to limited population auxiliary
information. Unpublished.
An object of class saeTrafo that represents point predictions of domain-specific means. Optionally, it also contains corresponding MSE estimates. Objects of these classes have methods for various generic functions. See Details for more information.
Objects of class "saeTrafo" and subclass "NER_Trafo" have the following
methods:
compare_pred
, estimators
,
plot.saeTrafo
,
predict.NER
,
qqnorm.saeTrafo
,
compare_plot
,
getData
,
getGroups
, getGroupsFormula
,
getResponse
,
plot
(for documentation, see plot.saeTrafo
),
print
, qqnorm
(for documentation, see
qqnorm.saeTrafo
) and
summary
(for documentation, see
summaries.saeTrafo
),
coef
(for default documentation, see coef
),
confint
(for default documentation, see confint
),
family
(for default documentation, see family
),
fitted
(for default documentation, see
fitted.values
),
fixef
,
formula
(for default documentation, see formula
),
getVarCov
,
intervals
,
logLik
(for default documentation, see logLik
),
nobs
(for default documentation, see nobs
),
ranef
,
residuals
(for default documentation, see
residuals
),
terms
(for default documentation, see terms
),
vcov
(for default documentation, see vcov
)
sigma
(for default documentation, see sigma
)
The following components are always included in an saeTrafo object but not always filled:
call |
the function call that produced the object. |
fixed |
for details, see |
framework |
a list with components that describe the data setup, e.g., number of domains in the sample. |
ind |
data frame containing estimates for the mean per domain. |
method |
character returning the method for the estimation approach used to fit the linear mixed model and for the the optimal lambda, in our case "reml". |
model |
list containing a selection of model components. |
MSE |
data frame containing MSE estimates corresponding to the
mean predictions in |
transformation |
character or list containing information about applied transformation. |
transform_param |
a list with two elements, |
successful_bootstraps |
a numeric returning the number of
successful bootstraps. If |
Additional information about the data and model in small area estimation
methods and components of an saeTrafo object are extracted. The returned
object is suitable for printing with the print
function.
## S3 method for class 'NER' summary(object, ...)
## S3 method for class 'NER' summary(object, ...)
object |
an object of type "NER", representing point and MSE estimates. |
... |
additional arguments that are not used in this method. |
an object of type "summary.NER" with information about the sample and population data, the usage of transformation, normality tests and information of the model fit.
saeTrafoObject
, NER_Trafo
,
r.squaredGLMM
, skewness
,
kurtosis
, shapiro.test
Function write.excel
enables the user to export point and MSE
estimates as well as diagnostics from the summary
to an Excel file.
The user can choose if the results should be reported in one or several Excel
sheets. Furthermore, a selection of indicators can be specified.
Respectively the function write.ods
enables the export to OpenDocument
Spreadsheets. Note that while write.exel
will create a single document
write.ods
will create a group of files.
write.excel( object, file = "excel_output.xlsx", MSE = FALSE, CV = FALSE, split = FALSE ) write.ods( object, file = "ods_output.ods", MSE = FALSE, CV = FALSE, split = FALSE )
write.excel( object, file = "excel_output.xlsx", MSE = FALSE, CV = FALSE, split = FALSE ) write.ods( object, file = "ods_output.ods", MSE = FALSE, CV = FALSE, split = FALSE )
object |
an object of type "saeTrafo", representing point and MSE estimates. |
file |
path and filename of the spreadsheet to create. It should end on .xlsx or .ods respectively. |
MSE |
logical. If |
CV |
logical. If |
split |
logical. If |
These functions create an Excel file via the package openxlsx and ODS files via the package readODS. Both packages require a zip application to be available to R. If this is not the case the authors of openxlsx suggest the first of the following two ways.
Install Rtools from: http://cran.r-project.org/bin/windows/Rtools/ and modify the system PATH during installation.
If Rtools is installed, but no system path variable is set. One can
set such a variable temporarily to R by a command like:
Sys.setenv("R_ZIPCMD" = "PathToTheRToolsFolder/bin/zip.exe")
.
To check if a zip application is available they recommend the command
shell("zip")
.
An Excel file is created in your working directory, or at the given path. Alternatively multiple ODS files are created at the given path.
# Examples for exporting saeTrafoObject to an Excel file or OpenDocument # Spreadsheet ## Not run: # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Example 1: Export estimates for all indicators and uncertainty measures and # diagnostics to Excel write.excel(NER_model, file = "excel_output.xlsx", MSE = TRUE, CV = TRUE) # Example 2: Single Excel sheets for point, MSE and CV estimates write.excel(NER_model, file = "excel_output_split.xlsx", MSE = TRUE, CV = TRUE, split = TRUE) # Example 3: Same as example 1 but for an ODS output write.ods(NER_model, file = "ods_output_all.ods", MSE = TRUE, CV = TRUE) ## End(Not run)
# Examples for exporting saeTrafoObject to an Excel file or OpenDocument # Spreadsheet ## Not run: # Load Data data("eusilcA_smp") data("pop_area_size") data("pop_mean") data("pop_cov") # Nested error regression model NER_model <- NER_Trafo(fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow + house_allow + cap_inv + tax_adj, smp_domains = "district", pop_area_size = pop_area_size, pop_mean = pop_mean, pop_cov = pop_cov, smp_data = eusilcA_smp, MSE = TRUE) # Example 1: Export estimates for all indicators and uncertainty measures and # diagnostics to Excel write.excel(NER_model, file = "excel_output.xlsx", MSE = TRUE, CV = TRUE) # Example 2: Single Excel sheets for point, MSE and CV estimates write.excel(NER_model, file = "excel_output_split.xlsx", MSE = TRUE, CV = TRUE, split = TRUE) # Example 3: Same as example 1 but for an ODS output write.ods(NER_model, file = "ods_output_all.ods", MSE = TRUE, CV = TRUE) ## End(Not run)