Request for Electricity Data Set or Preprocessing Script #15

Open
abramed opened this issue May 30, 2024 · 2 comments


abramed commented May 30, 2024

Hello,

I am currently working on time series forecasting of electricity data. After searching through your repository, I couldn't find the file electricity.csv. Additionally, I couldn't locate it in any public datasets online.

I assume that some preprocessing might have been done on the original data. Could you please provide me with the electricity data set or the preprocessing script used?

Thank you very much for your assistance.

Have a nice day!

Best regards,

@Solaaaaa

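https://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014
This is the URL of the electricity dataset, but it is in txt format. You may need to convert it to a CSV file, or write a Dataset_Electricity class in data_loader.py; in that class you need to use np.loadtxt() to load the data.
Hope this helps!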
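
For the "convert it to a CSV file" route, here is a minimal sketch using pandas instead of np.loadtxt(). It assumes the layout reported later in this thread (';' as the separator, ',' as the decimal mark, timestamps in the first column). The function name `convert_uci_txt_to_csv`, the index name "date" and the sample values are made up for illustration; this is not the repository's own preprocessing script.

```python
import io

import pandas as pd


def convert_uci_txt_to_csv(src, dst):
    """Read a ';'-separated, comma-decimal text file and write a plain CSV.

    `src` and `dst` may be file paths or file-like objects.
    """
    # decimal="," lets pandas parse values such as "2,25" as 2.25 directly.
    df = pd.read_csv(src, sep=";", decimal=",", index_col=0)
    df.index.name = "date"  # hypothetical name for the timestamp column
    df.to_csv(dst)
    return df


# Tiny in-memory sample in the assumed layout (values are invented).
sample = io.StringIO(
    '"";"MT_001";"MT_002"\n'
    '"2011-01-01 00:15:00";0;1,5\n'
    '"2011-01-01 00:30:00";2,25;3\n'
)
out = io.StringIO()
df = convert_uci_txt_to_csv(sample, out)
csv_text = out.getvalue()
print(csv_text)
```

With the real download, `src` would be the path to the extracted txt file and `dst` a path such as electricity.csv. Whether the resulting file matches the electricity.csv the repository expects is not confirmed anywhere in this thread.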


saber1360 commented Nov 3, 2024

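> https://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014 This is the URL of the electricity dataset, but it is in txt format. You may need to convert it to a CSV file, or write a Dataset_Electricity class in data_loader.py; in that class you need to use np.loadtxt() to load the data. Hope this helps!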

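Hello, I tried training on this electricity dataset, but the Dataset_ECG class in the original code cannot be used with it as-is. The file uses ";" as the separator, and the values seem to contain no decimal points, only ","; yet the description says they are floating-point numbers. So I replaced "," with a decimal point and adapted Dataset_ECG accordingly, but the results are far off. Have you managed to reproduce the results? Is there something I have overlooked?
My results are shown below.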
[image: results screenshot]
[image: results screenshot]
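
The format described above (";" as the separator, "," as the decimal mark) can be checked on a small sample before training. The sketch below uses invented values in that layout and compares two ways of parsing it: replacing "," by hand and then calling pd.to_numeric, and letting pd.read_csv handle it through its `decimal` parameter. It also counts NaNs, because `errors='coerce'` turns any cell that fails to parse into NaN without raising.

```python
import io

import numpy as np
import pandas as pd

# Invented sample in the layout described above.
sample = (
    '"";"MT_001";"MT_002"\n'
    '"2011-01-01 00:15:00";0;1,5\n'
    '"2011-01-01 00:30:00";2,25;3\n'
)

# Route 1: read every cell as text, swap ',' for '.', then convert to numbers.
as_text = pd.read_csv(io.StringIO(sample), sep=";", dtype=str, index_col=0)
manual = as_text.apply(
    lambda col: pd.to_numeric(col.str.replace(",", ".", regex=False), errors="coerce")
)

# Route 2: let pandas interpret ',' as the decimal mark while reading.
direct = pd.read_csv(io.StringIO(sample), sep=";", decimal=",", index_col=0)

same_values = np.allclose(manual.to_numpy(), direct.to_numpy())
nan_count = int(manual.isna().sum().sum())
print(same_values, nan_count)
```

If both routes agree and no NaNs appear on the real file, the parsing step is unlikely to be what makes the results look wrong.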

My Dataset_Electricity is shown below:

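# Imports this class relies on (Dataset is assumed to be torch.utils.data.Dataset)
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import Dataset

class Dataset_Electricity(Dataset):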
    def __init__(self, root_path, flag, seq_len, pre_len, type, train_ratio, val_ratio):
        assert flag in ['train', 'test', 'val']
        self.path = root_path
        self.flag = flag
        self.seq_len = seq_len
        self.pre_len = pre_len
        self.train_ratio = train_ratio
        self.val_ratio = val_ratio
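        # my change begin
        df = pd.read_csv(root_path,sep=';', dtype=str)  # read every cell as a string, with ';' as the separator (the original txt was renamed to csv)
        df = df.applymap(lambda x: x.replace(',', '.')) # replace ',' with a decimal point in every cell
        data = df.apply(pd.to_numeric, errors='coerce').values  # convert the strings to floats
        data = data[:,1:-1] # drop the first column (dates); note that 1:-1 also drops the last column
        # my change end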

        if type == '1':
            mms = MinMaxScaler(feature_range=(0, 1))
            training_end = int(len(data) * self.train_ratio)
            mms.fit(data[:training_end])
            data = mms.transform(data)
        data = np.array(data)
        if self.flag == 'train':
            begin = 0
            end = int(len(data)*self.train_ratio)
            self.trainData = data[begin:end]
        if self.flag == 'val':
            begin = int(len(data)*self.train_ratio)
            end = int(len(data)*(self.val_ratio+self.train_ratio))
            self.valData = data[begin:end]
        if self.flag == 'test':
            begin = int(len(data)*(self.val_ratio+self.train_ratio))
            end = len(data)
            self.testData = data[begin:end]

    def __getitem__(self, index):
        begin = index
        end = index + self.seq_len
        next_begin = end
        next_end = next_begin + self.pre_len
        if self.flag == 'train':
            data = self.trainData[begin:end]
            next_data = self.trainData[next_begin:next_end]
        elif self.flag == 'val':
            data = self.valData[begin:end]
            next_data = self.valData[next_begin:next_end]
        else:
            data = self.testData[begin:end]
            next_data = self.testData[next_begin:next_end]
        return data, next_data

    def __len__(self):
        # minus the label length
        if self.flag == 'train':
            return len(self.trainData)-self.seq_len-self.pre_len
        elif self.flag == 'val':
            return len(self.valData)-self.seq_len-self.pre_len
        else:
            return len(self.testData)-self.seq_len-self.pre_len
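
The index arithmetic in `__getitem__` and `__len__` can be checked in isolation on a toy array. The sketch below copies the same slicing into a standalone helper; the name `make_window` and the toy sizes are made up for illustration. It suggests that `len(data) - seq_len - pre_len` stops one window short of the last complete one. That is harmless, but worth knowing when counting samples.

```python
import numpy as np


def make_window(series, index, seq_len, pre_len):
    """Return (input window, target window) using the same slicing as __getitem__."""
    begin, end = index, index + seq_len
    return series[begin:end], series[end:end + pre_len]


# Toy data: 10 time steps, 2 channels, values 0..19.
series = np.arange(20, dtype=float).reshape(10, 2)
seq_len, pre_len = 4, 2

# Same formula as __len__ in the class above.
n_windows = len(series) - seq_len - pre_len

# Last index the dataset would serve.
x, y = make_window(series, n_windows - 1, seq_len, pre_len)

# One index further still yields a complete window.
x_extra, y_extra = make_window(series, n_windows, seq_len, pre_len)

print(n_windows, x.shape, y.shape, y_extra.shape)
```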
