Preprocess the training data: convert the data to a data frame and encode categorical variables as numeric values.
preprocess_training(x, y)
x: A data frame of all training predictors.
y: A vector of all training responses.
A list containing two datasets, along with the information needed to reproduce the preprocessing.
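The following is a minimal sketch of what this kind of preprocessing could look like, not the package's actual implementation. The function name preprocess_training_sketch, the list element names (original, encoded, levels), the choice of integer codes as the encoding, and the example data are all assumptions made for illustration.

```r
# Hypothetical sketch; the real preprocess_training() may differ in
# both its encoding scheme and the structure of its return value.
preprocess_training_sketch <- function(x, y) {
  # Convert the predictors to a data frame if they are not one already.
  x <- as.data.frame(x)

  # Identify categorical (factor or character) columns.
  is_cat <- vapply(
    x,
    function(col) is.factor(col) || is.character(col),
    logical(1)
  )

  # Record the levels of each categorical column so that the same
  # encoding can be applied to new data later.
  levels_map <- lapply(x[is_cat], function(col) levels(as.factor(col)))

  # Replace each categorical column with its integer codes.
  x_encoded <- x
  for (name in names(levels_map)) {
    x_encoded[[name]] <- as.integer(
      factor(x[[name]], levels = levels_map[[name]])
    )
  }

  # Return two datasets plus the information describing the encoding.
  list(
    original = cbind(x, y = y),
    encoded = cbind(x_encoded, y = y),
    levels = levels_map
  )
}

# Example data (made up for illustration).
train_x <- data.frame(
  size = c(1.5, 2.0, 3.2),
  colour = c("red", "blue", "red"),
  stringsAsFactors = FALSE
)
train_y <- c(0, 1, 0)

result <- preprocess_training_sketch(train_x, train_y)
str(result$encoded)
```

Storing the level mapping alongside the data is what would allow a matching test-time step to encode new observations consistently with the training set.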