This function binds together the data frames created by create_metric_df() using rbindlist() from the data.table package. For each entry in model_list, a model of the type given by method is fitted to the training data using that entry's predictor variables, and the specified metric is then calculated on the test data.
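The fit, score, and bind flow described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: score_model() is a hypothetical helper standing in for create_metric_df(), the metric is fixed to RMSE, and do.call(rbind, ...) is used in place of data.table::rbindlist() so that the sketch has no dependencies.

```r
data(mtcars)
train_df <- mtcars[1:16, ]
test_df <- mtcars[17:32, ]

# Hypothetical helper (not part of the package): fit one lm on the training
# data, score it on the test data, and return a one-row data frame.
score_model <- function(formula, label) {
  fit <- lm(formula, data = train_df)
  pred <- predict(fit, newdata = test_df)
  data.frame(
    outcome = "gear",
    predictor = label,
    metric = "rmse",
    metric_value = sqrt(mean((test_df$gear - pred)^2)),
    method = "lm",
    stringsAsFactors = FALSE
  )
}

# One result row per model
rows <- list(
  score_model(gear ~ mpg, "mpg"),
  score_model(gear ~ disp + am, "disp + am")
)

# Base-R stand-in for data.table::rbindlist(rows)
results <- do.call(rbind, rows)
print(results)
```

Because every helper call returns the same columns, the rows stack cleanly into a single table with one row per model.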

Usage

metric_bind(
  train_df,
  test_df,
  metric,
  method,
  kmin = "NA",
  target_variable,
  model_list
)

Arguments

train_df

A data frame containing the training data

test_df

A data frame containing the test data

metric

A character string indicating the metric to be calculated. Possible values are "rmse", "mae", "r2"
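For reference, the three metric names correspond to standard error measures, which might be written as follows. The function names are chosen for this sketch only, and the r2 definition (1 minus the ratio of residual to total sum of squares) is an assumption; the package may compute it differently, for example as a squared correlation.

```r
# Root mean squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Mean absolute error
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Assumption: R-squared as 1 - SS_res / SS_tot
r2 <- function(actual, predicted) {
  1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
}

actual <- c(3, 4, 5, 4)
predicted <- c(3.5, 4, 4.5, 4)

rmse(actual, predicted)  # 0.3535534
mae(actual, predicted)   # 0.25
r2(actual, predicted)    # 0.75
```

Lower values are better for rmse and mae, while higher values are better for r2.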

method

A character string indicating the type of model to be fitted. Possible values are "lm" (for linear regression) and "glm" (for generalized linear models). The examples below also pass "kknn", together with an integer kmin.

kmin

An integer indicating the minimum value of k for k-fold cross-validation, or the character string "NA" (the default), in which case no cross-validation is performed.

target_variable

A character string indicating the name of the target variable.

model_list

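A list in which each element is a character vector naming the predictor variables for one model. A single name gives a one-predictor model; a vector such as c("mpg", "cyl") gives a model with several predictors.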
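One way to picture how an element of model_list could map to a model is through a formula built with stats::reformulate(). This is a sketch of the idea rather than a description of the package internals; the " + " labels simply mirror the predictor column shown in the example output.

```r
multiple_predictors <- list(c("mpg", "cyl"), c("disp", "am"), c("cyl", "am"))

# One formula per list element, with gear as the response
formulas <- lapply(multiple_predictors, reformulate, response = "gear")

# Human-readable label for each predictor set
labels <- vapply(multiple_predictors, paste, character(1), collapse = " + ")

print(formulas[[1]])  # gear ~ mpg + cyl
print(labels)         # "mpg + cyl" "disp + am" "cyl + am"
```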

Value

A data.table with one row per element of model_list and the columns outcome, predictor, metric, metric_value, method, and kmin.
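Since the result is one row per model, it can be ordered by metric_value to compare predictors. The sketch below rebuilds the single-predictor lm output from the Examples section by hand as a plain data frame (the values are copied from that printed output), so it runs without the package.

```r
# Values copied from the single-predictor lm example output
results <- data.frame(
  outcome = "gear",
  predictor = c("mpg", "cyl", "disp", "hp", "am"),
  metric = "rmse",
  metric_value = c(0.8595481, 0.8876402, 0.8426984, 1.1698191, 0.5899178),
  method = "lm",
  kmin = NA,
  stringsAsFactors = FALSE
)

# Lower RMSE is better, so sort ascending
ranked <- results[order(results$metric_value), ]
ranked$predictor[1]  # "am"
```

For a metric where higher is better, such as r2, the order would be reversed with decreasing = TRUE.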

Examples

# load data
data(mtcars)

train_df <- mtcars[1:16, ]
test_df <- mtcars[17:32, ]

single_predictors <- list("mpg", "cyl", "disp", "hp", "am")
multiple_predictors <- list(c("mpg", "cyl"), c("disp", "am"), c("cyl", "am"))

# Single predictor lm regression model
metric_bind(train_df=train_df, test_df=test_df, metric="rmse", method="lm", kmin="NA", target_variable='gear', model_list=single_predictors)
#>    outcome predictor metric metric_value method kmin
#> 1:    gear       mpg   rmse    0.8595481     lm   NA
#> 2:    gear       cyl   rmse    0.8876402     lm   NA
#> 3:    gear      disp   rmse    0.8426984     lm   NA
#> 4:    gear        hp   rmse    1.1698191     lm   NA
#> 5:    gear        am   rmse    0.5899178     lm   NA

# Single predictor kknn regression model
metric_bind(train_df=train_df, test_df=test_df, metric="rmse", method="kknn", kmin=4, target_variable='gear', model_list=single_predictors)
#>    outcome predictor metric metric_value method kmin
#> 1:    gear       mpg   rmse    0.9291293   kknn    4
#> 2:    gear       cyl   rmse    0.8860904   kknn    4
#> 3:    gear      disp   rmse    0.8660254   kknn    4
#> 4:    gear        hp   rmse    1.0532687   kknn    4
#> 5:    gear        am   rmse    0.7126096   kknn    4

# Multiple predictors lm regression model
metric_bind(train_df=train_df, test_df=test_df, metric="rmse", method="lm", kmin="NA", target_variable='gear', model_list=multiple_predictors)
#>    outcome predictor metric metric_value method kmin
#> 1:    gear mpg + cyl   rmse    0.8839327     lm   NA
#> 2:    gear disp + am   rmse    0.7345654     lm   NA
#> 3:    gear  cyl + am   rmse    0.7427264     lm   NA

# Multiple predictors kknn regression model
metric_bind(train_df=train_df, test_df=test_df, metric="rmse", method="kknn", kmin=8, target_variable='gear', model_list=multiple_predictors)
#>    outcome predictor metric metric_value method kmin
#> 1:    gear mpg + cyl   rmse    0.9153637   kknn    8
#> 2:    gear disp + am   rmse    0.7673768   kknn    8
#> 3:    gear  cyl + am   rmse    0.8483670   kknn    8