I created a model whose categorical features only take values greater than 0 (categories ranging from n to m, where n > 0 and m > 0). I wanted to plot the partial dependence for my model, but ran into a ValueError (recreated below). The problem is that generate_X_grid creates a matrix that looks like this:
[[0,0,0, ..., 0, i, 0, ..., 0,0,0],
[0,0,0, ..., 0, i, 0, ..., 0,0,0],
...,
[0,0,0, ..., 0, i, 0, ..., 0,0,0]]
For models trained with categorical features that do not include 0 as a category, passing this matrix to the partial dependence function raises an error, because the zero-filled columns fall outside the range of categories seen during training.
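To make the mismatch concrete, here is a minimal NumPy-only sketch of the kind of range check that fails. It is not pyGAM's actual code: `out_of_domain` is a hypothetical helper, the grid is built by hand, and the age values are made up for illustration. Only the bounds 2003 and 2009 come from the error message below.

```python
import numpy as np


def out_of_domain(column, lo, hi):
    """Return True if any value in `column` falls outside [lo, hi].

    Hypothetical helper that mimics a min/max domain check; it is not
    part of pyGAM.
    """
    column = np.asarray(column, dtype=float)
    return bool(column.min() < lo or column.max() > hi)


# Categorical feature 0 was trained on the categories 2003..2009
# (bounds taken from the error message).
train_lo, train_hi = 2003.0, 2009.0

# Hand-built grid for a different term: only column 1 varies,
# every other column is left at zero.
grid = np.zeros((5, 3))
grid[:, 1] = np.linspace(18, 80, 5)  # illustrative values only

print(out_of_domain(grid[:, 0], train_lo, train_hi))  # True
```

Column 0 of the grid spans [0.0, 0.0], which is the "found data on" range reported in the ValueError.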
Here is a recreation of the error using the Quick start example code:
Input:
from pygam.datasets import wage
X, y = wage()
from pygam import LinearGAM, s, f
# Use f(0) to make the 0th term categorical. The 0th feature contains no value equal to 0.
gam = LinearGAM(f(0) + s(1) + f(2)).fit(X, y)
import matplotlib.pyplot as plt
for i, term in enumerate(gam.terms):
    if term.isintercept:
        continue
    XX = gam.generate_X_grid(term=i)
    pdep, confi = gam.partial_dependence(term=i, X=XX, width=0.95)
    #plt.figure()
    plt.plot(XX[:, term.feature], pdep)
    plt.plot(XX[:, term.feature], confi, c='r', ls='--')
    plt.title(repr(term))
    plt.show()
Output:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-0e5df89ff530> in <module>()
7 XX = gam.generate_X_grid(term=i)
8 print(XX)
----> 9 pdep, confi = gam.partial_dependence(term=i, X=XX, width=0.95)
10
11 #plt.figure()
/Users/tatekeller/opt/anaconda3/envs/pbh/lib/python3.6/site-packages/pygam/pygam.py in partial_dependence(self, term, X, width, quantiles, meshgrid)
1542 features=self.feature, verbose=self.verbose)
1543
-> 1544 modelmat = self._modelmat(X, term=term)
1545 pdep = self._linear_predictor(modelmat=modelmat, term=term)
1546 out = [pdep]
/Users/tatekeller/opt/anaconda3/envs/pbh/lib/python3.6/site-packages/pygam/pygam.py in _modelmat(self, X, term)
455 X = check_X(X, n_feats=self.statistics_['m_features'],
456 edge_knots=self.edge_knots_, dtypes=self.dtype,
--> 457 features=self.feature, verbose=self.verbose)
458
459 return self.terms.build_columns(X, term=term)
/Users/tatekeller/opt/anaconda3/envs/pbh/lib/python3.6/site-packages/pygam/utils.py in check_X(X, n_feats, min_samples, edge_knots, dtypes, features, verbose)
301 'feature {}. Expected data on [{}, {}], '\
302 'but found data on [{}, {}]'\
--> 303 .format(i, min_, max_, x.min(), x.max()))
304
305 return X
ValueError: X data is out of domain for categorical feature 0. Expected data on [2003.0, 2009.0], but found data on [0.0, 0.0]
The versions that I used are:
pyGAM=0.8.0
Python=3.6.12
For now I will work around this by subtracting each categorical feature's minimum from its values, which shifts the category range from (n, m) to (n-n, m-n) == (0, m-n).
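A minimal sketch of that workaround, assuming the features are held in a 2-D NumPy array and the categorical column indices are known. `shift_categories_to_zero` is a hypothetical helper, not a pyGAM function, and the sample rows are made up.

```python
import numpy as np


def shift_categories_to_zero(X, categorical_columns):
    """Return a copy of X with each listed column shifted so its minimum is 0.

    Also returns the subtracted offsets, so the original category values
    can be restored later (for example, for axis labels).
    """
    X_shifted = np.array(X, dtype=float, copy=True)
    offsets = {}
    for col in categorical_columns:
        offsets[col] = X_shifted[:, col].min()
        X_shifted[:, col] -= offsets[col]
    return X_shifted, offsets


# Made-up rows: column 0 is a year-like category, column 2 another category.
X = np.array([[2003, 25, 1],
              [2006, 40, 2],
              [2009, 61, 4]])

X_shifted, offsets = shift_categories_to_zero(X, categorical_columns=[0, 2])
print(X_shifted[:, 0])  # [0. 3. 6.]
print(offsets)
```

The model would then be fit on `X_shifted` instead of `X`. Adding `offsets[col]` back to the grid values before plotting should recover the original category labels on the x-axis.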
Thanks in advance