
Creates a model specification for either linear regression or k-nearest neighbor regression. For k-nearest neighbor regression, the number of neighbors can be supplied through kmin. If kmin is left at its default of "NA", the function runs a grid search for the k value that gives the lowest root mean squared error (RMSE) under 5-fold cross-validation of the training data.
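The grid search described above can be illustrated with a small base-R sketch. It is not the package's implementation: knn_predict() and cv_rmse() are hypothetical helpers, the predictors and the grid of 1 to 10 neighbors are chosen only for illustration, and the features are scaled once up front for simplicity.

```r
# Predict each test row as the unweighted mean of the targets of its k nearest
# training rows (what a "rectangular" weight function amounts to).
knn_predict <- function(train_x, train_y, test_x, k) {
  apply(test_x, 1, function(row) {
    dists <- sqrt(colSums((t(train_x) - row)^2))  # Euclidean distance to every training row
    mean(train_y[order(dists)[seq_len(k)]])
  })
}

# Mean RMSE of k-nearest-neighbor regression across the folds in fold_id.
cv_rmse <- function(x, y, k, fold_id) {
  fold_rmse <- sapply(sort(unique(fold_id)), function(f) {
    held_out <- fold_id == f
    pred <- knn_predict(x[!held_out, , drop = FALSE], y[!held_out],
                        x[held_out, , drop = FALSE], k)
    sqrt(mean((y[held_out] - pred)^2))
  })
  mean(fold_rmse)
}

set.seed(1)
train_df <- mtcars[1:16, ]
x <- scale(as.matrix(train_df[, c("mpg", "hp", "wt")]))  # illustrative predictors
y <- train_df$gear
fold_id <- sample(rep(1:5, length.out = nrow(x)))         # random 5-fold assignment

k_grid <- 1:10
rmse_by_k <- sapply(k_grid, function(k) cv_rmse(x, y, k, fold_id))
kmin <- k_grid[which.min(rmse_by_k)]                      # k with the lowest CV RMSE
```

Here kmin plays the same role as the value the function would select when its kmin argument is left as "NA".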

Usage

create_spec_kmin(
  df,
  model_recipe,
  method,
  kmin = "NA",
  metric,
  target_variable,
  weight_func = "rectangular",
  mode = "regression"
)

Arguments

df

A data frame containing the training data.

model_recipe

A recipe object created using the create_recipe function.

method

A character string indicating the type of regression method to be used: "lm" for linear regression or "kknn" for k-nearest neighbor regression.

kmin

A numeric value giving the number of neighbors (k) to use for k-nearest neighbor regression. If left as the character string "NA" (the default), the function performs a grid search to find the optimal k value.

metric

A character string specifying the performance metric to calculate: "rmse", "rsq", or "mae".
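For orientation, the three metric names correspond to the following conventional definitions. This is a sketch with hypothetical *_sketch helper names; whether the package computes them this way (for example through yardstick) is an assumption.

```r
# Root mean squared error and mean absolute error.
rmse_sketch <- function(truth, estimate) sqrt(mean((truth - estimate)^2))
mae_sketch  <- function(truth, estimate) mean(abs(truth - estimate))
# R-squared as the squared correlation, the definition yardstick::rsq() uses.
rsq_sketch  <- function(truth, estimate) cor(truth, estimate)^2

truth    <- c(3, 4, 4, 5)
estimate <- c(3, 4, 5, 5)
rmse_sketch(truth, estimate)  # 0.5
mae_sketch(truth, estimate)   # 0.25
rsq_sketch(truth, estimate)
```

Lower is better for "rmse" and "mae", while higher is better for "rsq".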

target_variable

A character string indicating the name of the target variable to be predicted.

weight_func

A character string indicating the weight function used for the k-nearest neighbor regression. Default is "rectangular".

mode

A character string indicating the type of regression task. Default is "regression".

Value

A list of two elements: the model specification and the kmin value. When kmin does not apply, as in the linear regression example, the second element is the string "NA".
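Because the placeholder is the character string "NA" and not a missing value, is.na() will not detect it. The sketch below uses a stand-in list that mirrors the structure shown in the printed examples; in a real call the first element would be the model specification object.

```r
# Stand-in for a returned list (hypothetical values, same two-element shape).
result <- list("model specification placeholder", "NA")

spec <- result[[1]]
kmin <- result[[2]]

is.na(kmin)            # FALSE: the string "NA" is not a missing value
identical(kmin, "NA")  # TRUE: compare against the string instead
```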

Examples

train_df <- mtcars[1:16, ]
target_df <- target_df(train_df, "gear")
model_recipe <- create_recipe(target_df, "gear")
create_spec_kmin(train_df, model_recipe, "lm", metric = "rmse", target_variable = "gear")
#> [[1]]
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm 
#> 
#> 
#> [[2]]
#> [1] "NA"
#> 
create_spec_kmin(train_df, model_recipe, "kknn", metric = "rmse", kmin = 5, target_variable = "gear")
#> [[1]]
#> K-Nearest Neighbor Model Specification (regression)
#> 
#> Main Arguments:
#>   neighbors = kmin
#>   weight_func = weight_func
#> 
#> Computational engine: kknn 
#> 
#> 
#> [[2]]
#> [1] 5
#>