First off, thanks to the devs for creating such an awesome and useful library. Just a suggestion: it would be great to add a few date transformers to this library. For example, pass in a list of date columns and, for each column, get back separate columns for year, month, weekday, hour, etc. Here is a rudimentary date differ transformer I use often.
import pandas as pd
import numpy as np
from sklearn.base import TransformerMixin


class DateDiffer(TransformerMixin):
    '''
    Takes the difference between consecutive date columns and returns
    the value in the requested unit (days by default).
    Please use DateFormatter() before using DateDiffer().

    How it works:
    If you specify 3 dates: [date1, date2, date3]
    Output will be 2 columns:
        date2 - date1
        date3 - date2

    The transformer takes the following parameter 'unit':
        Y: year
        M: month
        W: week
        D: day
        h: hour
        m: minute
        s: second
        ms: millisecond
        us: microsecond
        ns: nanosecond
        ps: picosecond
        fs: femtosecond
        as: attosecond

    Note: 'Y' and 'M' are not fixed-length units, and recent NumPy
    versions may refuse to divide by them.
    '''

    def __init__(self, unit='D'):
        self.unit = unit

    def fit(self, X, y=None):
        # stateless transformer
        return self

    def transform(self, X):
        # assumes X is a DataFrame whose columns are all datetime64
        beg_cols = X.columns[:-1]
        end_cols = X.columns[1:]
        # .values replaces as_matrix(), which newer pandas no longer provides
        Xbeg = X[beg_cols].values
        Xend = X[end_cols].values
        # dividing a timedelta array by one unit gives floats in that unit
        Xd = (Xend - Xbeg) / np.timedelta64(1, self.unit)
        diff_cols = ['->'.join(pair) for pair in zip(beg_cols, end_cols)]
        Xdiff = pd.DataFrame(Xd, index=X.index, columns=diff_cols)
        return Xdiff
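To make the date-part suggestion concrete, here is a rough sketch of what such a transformer might look like. The class name `DatePartExtractor` and its `columns` and `parts` parameters are made up for illustration and are not part of this library; the sketch assumes the selected columns are, or can be parsed into, datetimes.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class DatePartExtractor(BaseEstimator, TransformerMixin):
    """Hypothetical helper: expand datetime columns into date parts."""

    def __init__(self, columns=None,
                 parts=('year', 'month', 'weekday', 'hour')):
        self.columns = columns  # None means "use every column of X"
        self.parts = parts      # names of pandas .dt accessor attributes

    def fit(self, X, y=None):
        # stateless transformer
        return self

    def transform(self, X):
        cols = list(X.columns) if self.columns is None else list(self.columns)
        out = {}
        for col in cols:
            values = pd.to_datetime(X[col])
            for part in self.parts:
                # e.g. values.dt.year, values.dt.weekday (Monday == 0)
                out['{}_{}'.format(col, part)] = getattr(values.dt, part)
        return pd.DataFrame(out, index=X.index)
```

Each input column `d` yields `d_year`, `d_month`, `d_weekday` and `d_hour`; any other attribute exposed by the pandas `.dt` accessor could be listed in `parts` as well.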
My Python-fu is limited. For example, I am unable to generalize the DateDiffer() transformer to an entire dataframe, or, say, pass it a list of columns and do a fit_transform().
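One possible way to get there is to give the transformer a `columns` argument, so it can be fitted on the whole dataframe and only look at the listed date columns. The sketch below is an untested-in-production idea, not an API of this library: `ColumnDateDiffer` and its `columns` parameter are hypothetical names.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ColumnDateDiffer(BaseEstimator, TransformerMixin):
    """Hypothetical variant of DateDiffer that accepts a list of columns."""

    def __init__(self, columns=None, unit='D'):
        self.columns = columns  # None means "use every column of X"
        self.unit = unit

    def fit(self, X, y=None):
        # stateless transformer
        return self

    def transform(self, X):
        cols = list(X.columns) if self.columns is None else list(self.columns)
        # parse each selected column; non-date columns of X are ignored
        dates = X[cols].apply(pd.to_datetime)
        out = {}
        for beg, end in zip(cols[:-1], cols[1:]):
            delta = dates[end] - dates[beg]
            # timedelta Series / one unit -> float Series in that unit
            out['{}->{}'.format(beg, end)] = delta / np.timedelta64(1, self.unit)
        return pd.DataFrame(out, index=X.index)
```

With this shape, `ColumnDateDiffer(columns=['date1', 'date2', 'date3']).fit_transform(df)` would return the two columns `date1->date2` and `date2->date3`, leaving the rest of `df` untouched.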
Finally, is there a way to pass two numeric columns to a transformer and obtain the column differences? I know I can create interaction variables with the sklearn polynomial transformer, but not df['x1'] + df['x2'], for instance.
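As a starting point for discussion, something along these lines might do it. This is only a sketch: `ColumnCombiner` and its `left`, `right` and `op` parameters are invented here for illustration and do not exist in sklearn or in this library.

```python
import operator

import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ColumnCombiner(BaseEstimator, TransformerMixin):
    """Hypothetical helper: combine two numeric columns element-wise."""

    # supported operations; the keys are arbitrary labels chosen here
    _OPS = {'add': operator.add, 'sub': operator.sub}

    def __init__(self, left, right, op='sub'):
        self.left = left
        self.right = right
        self.op = op

    def fit(self, X, y=None):
        # stateless transformer
        return self

    def transform(self, X):
        func = self._OPS[self.op]
        result = func(X[self.left], X[self.right])
        name = '{}_{}_{}'.format(self.left, self.op, self.right)
        return pd.DataFrame({name: result}, index=X.index)
```

For example, `ColumnCombiner('x1', 'x2', op='add').fit_transform(df)` would return a single column named `x1_add_x2` holding `df['x1'] + df['x2']`.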
Thanks guys - and I might add that some of your classes are already solving pain points a lot of us have, e.g. safelabelencoder encodes unseen values. I referenced your work in this stackoverflow thread