lightgbm train with setting categorical feature as un-exist column names, and then error when predicting after reload model from file? #6692

mangolzy · 2024-10-22T09:27:33Z

LightGBM version: 4.5.0

Error: ValueError: train and valid dataset categorical_feature do not match.

Setup:
clf = lgb.train(params=params, train_set=lgb_train,
                valid_sets=[lgb_train, lgb_test],
                valid_names=['train', 'test'],
                feval=ks_metric,
                categorical_feature=cflist)
When categorical_feature is set to a list (listA) containing column names that are not among the train_set columns (listB), training and predicting in the same session work fine.
But after saving the model to a file with save_model and reloading it with lgb.Booster(), calling predict(X) on a new dataframe that has the proper feature list (listB) used in training raises the error above. The error persists even if I add the listA columns to X.
So, is it possible to make the current model work for prediction? Are there parameters I should add?

jmoralez · 2024-11-12T17:42:21Z

Hey @mangolzy, thanks for using LightGBM.

That error usually means that the columns in your input dataframe that are expected to be categoricals are not. Can you make sure that they are? e.g. X[listB] = X[listB].astype('category').
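A minimal sketch of that cast, using only pandas. The helper name cast_to_category, the example columns, and the contents of listB are hypothetical stand-ins for your own data. The optional categories argument is an assumption about what may help when the prediction data does not contain every category seen in training.

```python
import pandas as pd


def cast_to_category(X, columns, categories=None):
    """Return a copy of X in which `columns` have pandas 'category' dtype.

    `categories` is an optional dict mapping a column name to the list of
    category values seen at training time (hypothetical, supplied by you).
    """
    X = X.copy()
    for col in columns:
        if categories is not None and col in categories:
            # Reuse the training-time category list so the codes line up.
            X[col] = pd.Categorical(X[col], categories=categories[col])
        else:
            X[col] = X[col].astype("category")
    return X


# Hypothetical frame standing in for the data passed to predict().
X = pd.DataFrame({
    "city": ["a", "b", "a"],
    "device": ["x", "y", "y"],
    "age": [31, 45, 27],
})
listB = ["city", "device"]  # assumed: the categorical columns used in training

X_pred = cast_to_category(X, listB)

# Columns that now carry the 'category' dtype.
cat_cols = X_pred.select_dtypes(include="category").columns.tolist()
print(cat_cols)

# The reloaded booster would then be called on the cast frame, e.g.:
# preds = booster.predict(X_pred)
```

Checking cat_cols before calling predict is a quick way to confirm that the set of category-typed columns matches what the model saw in training.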

If you're able to provide a minimal reproducible example we can provide further help.

jameslamb added the question label Oct 22, 2024

jmoralez added the awaiting response label Nov 12, 2024
