Request for Electricity Data Set or Preprocessing Script #15
Comments
https://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014
Hello, I tried training on this electricity dataset, but the Dataset_ECG in the original code cannot be used with it. The dataset uses ";" as the separator, and there appear to be no decimal points, only ","; the description says the values are floats, though, so I replaced "," with a decimal point and modified Dataset_ECG accordingly. The results turned out to be absurd. Have you reproduced it? Is there something I overlooked? My Dataset_Electricity is shown below:

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import Dataset


class Dataset_Electricity(Dataset):
    def __init__(self, root_path, flag, seq_len, pre_len, type, train_ratio, val_ratio):
        assert flag in ['train', 'test', 'val']
        self.path = root_path
        self.flag = flag
        self.seq_len = seq_len
        self.pre_len = pre_len
        self.train_ratio = train_ratio
        self.val_ratio = val_ratio
        # my change begin
        df = pd.read_csv(root_path, sep=';', dtype=str)  # read the CSV as strings with ';' as the separator (the original .txt was renamed to .csv)
        df = df.applymap(lambda x: x.replace(',', '.'))  # replace ',' in every cell with a decimal point
        data = df.apply(pd.to_numeric, errors='coerce').values  # convert the strings to floats
        data = data[:, 1:-1]  # drop the date in the first column
        # my change end
        if type == '1':
            mms = MinMaxScaler(feature_range=(0, 1))
            training_end = int(len(data) * self.train_ratio)
            mms.fit(data[:training_end])
            data = mms.transform(data)
        data = np.array(data)
        if self.flag == 'train':
            begin = 0
            end = int(len(data) * self.train_ratio)
            self.trainData = data[begin:end]
        if self.flag == 'val':
            begin = int(len(data) * self.train_ratio)
            end = int(len(data) * (self.val_ratio + self.train_ratio))
            self.valData = data[begin:end]
        if self.flag == 'test':
            begin = int(len(data) * (self.val_ratio + self.train_ratio))
            end = len(data)
            self.testData = data[begin:end]

    def __getitem__(self, index):
        begin = index
        end = index + self.seq_len
        next_begin = end
        next_end = next_begin + self.pre_len
        if self.flag == 'train':
            data = self.trainData[begin:end]
            next_data = self.trainData[next_begin:next_end]
        elif self.flag == 'val':
            data = self.valData[begin:end]
            next_data = self.valData[next_begin:next_end]
        else:
            data = self.testData[begin:end]
            next_data = self.testData[next_begin:next_end]
        return data, next_data

    def __len__(self):
        # minus the label length
        if self.flag == 'train':
            return len(self.trainData) - self.seq_len - self.pre_len
        elif self.flag == 'val':
            return len(self.valData) - self.seq_len - self.pre_len
        else:
            return len(self.testData) - self.seq_len - self.pre_len
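
For reference, the loading step described in the comment above (";" separator, "," as the decimal mark, date in the first column) might be sketched as follows. This is only an illustration, not the repository's preprocessing: the helper name `load_electricity` and the sample values are made up, and the layout of the raw file is assumed to match what the comment describes.

```python
import io

import numpy as np
import pandas as pd


def load_electricity(source):
    """Read a ';'-separated file with decimal commas into a float array.

    Hypothetical helper. Returns an array of shape (num_timestamps, num_clients).
    """
    df = pd.read_csv(
        source,
        sep=";",      # fields are separated by semicolons
        decimal=",",  # "2,5" is parsed as 2.5, so no string replacement is needed
        index_col=0,  # the first column holds the timestamp and becomes the index
    )
    return df.to_numpy(dtype=np.float32)


# Tiny in-memory sample in the assumed layout (values are invented).
sample = io.StringIO(
    '"";"MT_001";"MT_002";"MT_003"\n'
    '"2011-01-01 00:15:00";0;2,5;10\n'
    '"2011-01-01 00:30:00";1,25;3;20\n'
)
data = load_electricity(sample)
print(data.shape)  # (2, 3): every client column is kept
```

One detail worth checking in the class above: the slice `data[:, 1:-1]` removes the last column as well as the first, so one client series is dropped along with the date; `data[:, 1:]` would remove only the date column.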
Hello,
I am currently working on time series forecasting of electricity data. After searching through your repository, I couldn't find the file electricity.csv. Additionally, I couldn't locate it in any public datasets online.
I assume that some preprocessing might have been done on the original data. Could you please provide me with the electricity data set or the preprocessing script used?
Thank you very much for your assistance.
Have a nice day!
Best regards,