
Issues & feedback #32

Open
ad48hp opened this issue Apr 15, 2024 · 10 comments

ad48hp commented Apr 15, 2024

Hello,
I used `pip install torch===1.6.0 torchvision===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html`
instead of
`pip install torch===1.6.0+cpu torchvision===0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html`

I've read that the program can be installed without CUDA, so I guess both options are valid?

In #21 I read that a GPU is needed, but the installation guide shows that CUDA is optional:
https://github.com/ProGamerGov/dream-creator/blob/master/INSTALL.md
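For what it's worth, a quick way to check which build actually got installed and whether a GPU is visible:

```python
import torch

print(torch.__version__)          # e.g. "1.6.0+cpu" for the CPU-only build
print(torch.cuda.is_available())  # False on the CPU-only build or without a working GPU
```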

ad48hp changed the title from "Calculate the mean and standard deviation of your dataset" to "Issues & feedback" on Apr 15, 2024

ad48hp commented Apr 15, 2024

I tried the version with CUDA, and so far it works.
However, when I create images, they resemble the original GoogleNet more than my dataset.
How large should the dataset be?
I tried 200 and 2,000 images (2 classes), and the above happens.
[Images: out_new_10epochs, out_new_20epochs, out_old_50epochs]

```
2,000 image dataset; 2 classes - Test
Running optimization with ADAM
Iteration 25, Loss -4057.275146484375
Iteration 50, Loss -6080.12939453125
Iteration 75, Loss -7767.14990234375
Iteration 100, Loss -8851.4541015625

200 image dataset; 2 classes
Epoch 50/120
train Loss: 0.0021 Acc: 1.0000
Time Elapsed 15m 12s
val Loss: 0.0000 Acc: 1.0000
Time Elapsed 15m 12s

2,000 image dataset; 2 classes
Epoch 20/120
train Loss: 0.0103 Acc: 0.9988
Time Elapsed 13m 22s
val Loss: 0.0002 Acc: 1.0000
Time Elapsed 13m 28s
```


ad48hp commented Apr 15, 2024

I tried changing the layer freezing so that conv1 is frozen, and the result is not very different.
[Image: out]

I've tried a 10-class dataset comprising 1,000 images in total (100 per class), and it looks similar.

```
Epoch 10/120
train Loss: 0.6038 Acc: 0.8925
Time Elapsed 2m 26s
val Loss: 0.1755 Acc: 0.9500
Time Elapsed 2m 28s
```
[Image: out]


ad48hp commented Apr 17, 2024

Is the model trained starting from a pretrained GoogleNet?
If so, can you add an option to train the model from scratch?

I know this is an old project, but it has amazing potential.


ProGamerGov commented Apr 20, 2024

@ad48hp The base models were trained on millions of images, so I'm not sure how well training from scratch would work. All you would have to do is set all the layers to be trainable and zero out all the model values (or use some other initialization).

Also, as I discovered when I made this project, some neurons/channels are going to stay the same while finetuning, whereas others will change to match the new content.
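A minimal sketch of that idea, assuming `model` is the loaded network (`make_trainable_and_zero` is a hypothetical helper name; literal zeroing is shown here, though a proper random init is discussed below):

```python
import torch
import torch.nn as nn

def make_trainable_and_zero(model: nn.Module):
    # Make every parameter trainable again
    for param in model.parameters():
        param.requires_grad = True
    # Zero out all learned values (crude; a random init usually works better)
    with torch.no_grad():
        for param in model.parameters():
            param.zero_()
```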


ad48hp commented Apr 22, 2024

Can you upload a PyTorch script that sets all the layers to be trainable and zeros all the model values?
I'm not sure how to do this myself.


ProGamerGov commented Apr 29, 2024

@ad48hp Apologies for the late reply, but zeroing models is actually a bit of a bad idea. You generally want a range of values to initialize a brand-new model for training.

There are multiple options for initialization, and I am unsure which ones are best for training Inception v1 models: https://pytorch.org/docs/stable/nn.init.html

Here's a quick function that ChatGPT assisted me with, which should initialize the model so that you can train from scratch. Model layers generally have weight and bias components that you need to initialize, and you might need to do a bit of testing/research to see which initialization options are best.

```python
import torch.nn as nn

def initialize_model_weights(model):
    # Walk every submodule and re-initialize the learnable layers
    for module in model.modules():
        if isinstance(module, (nn.Conv1d, nn.Conv2d, nn.Conv3d, nn.ConvTranspose2d)):
            # Xavier/Glorot uniform init for convolution weights, zero biases
            nn.init.xavier_uniform_(module.weight)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)
        elif isinstance(module, nn.Linear):
            # Kaiming/He uniform init for fully connected weights, zero biases
            nn.init.kaiming_uniform_(module.weight)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)
```

This code should remove all learned knowledge from the model.
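For reference, a hypothetical usage sketch: call it once on the constructed model, before the training loop starts (`model` stands for however the network is built, e.g. the project's InceptionV1_Caffe):

```python
import torch.nn as nn

# Re-initialize the already-built model so training starts from scratch
initialize_model_weights(model)

# Sanity check: conv biases should now be all zeros
first_conv = next(m for m in model.modules() if isinstance(m, nn.Conv2d))
if first_conv.bias is not None:
    print(first_conv.bias.abs().sum())  # tensor(0.) after initialization
```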

ProGamerGov commented

Also, if you aren't using a GPU then it's going to be painfully slow.
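A hypothetical fallback for machines without a GPU, matching the script's -use_device flag:

```python
import torch

# Pick the GPU when present, otherwise fall back to the CPU
use_device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
```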


ad48hp commented May 19, 2024

Can you write complete code for the init?
I don't really know where to put this code.


ad48hp commented Jul 25, 2024

I tried to add the code like the following (sorry for the incorrect formatting):

```python
import argparse
import torch
import torch.optim as optim

from utils.training_utils import save_model, load_dataset, reset_weights, set_seed, load_checkpoint, setup_model
from utils.inceptionv1_caffe import InceptionV1_Caffe
from utils.train_model import train_model

import torch.nn as nn

def initialize_model_weights(model):
    for module in model.modules():
        if isinstance(module, (nn.Conv1d, nn.Conv2d, nn.Conv3d, nn.ConvTranspose2d)):
            nn.init.xavier_uniform_(module.weight)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)
        elif isinstance(module, nn.Linear):
            nn.init.kaiming_uniform_(module.weight)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)

def main():
    parser = argparse.ArgumentParser()
    # Input options
    parser.add_argument("-data_path", help="Path to your dataset", type=str, default='')
    parser.add_argument("-model_file", type=str, default='models/pt_bvlc.pth')
    parser.add_argument("-data_mean", type=str, default='')
    parser.add_argument("-data_sd", type=str, default='')
    parser.add_argument("-base_model", choices=['bvlc', 'p365', '5h'], default='bvlc')

    # Training options
    parser.add_argument("-num_epochs", type=int, default=120)
    parser.add_argument("-batch_size", type=int, default=32)
    parser.add_argument("-lr", "-learning_rate", type=float, default=1e-2)
    parser.add_argument("-optimizer", choices=['sgd', 'adam'], default='sgd')
    parser.add_argument("-train_workers", type=int, default=0)
    parser.add_argument("-val_workers", type=int, default=0)
    parser.add_argument("-balance_classes", action='store_true')

    # Output options
    parser.add_argument("-save_epoch", type=int, default=5)
    parser.add_argument("-output_name", type=str, default='bvlc_out.pth')
    parser.add_argument("-individual_acc", action='store_true')
    parser.add_argument("-save_csv", action='store_true')
    parser.add_argument("-csv_dir", type=str, default='')

    # Other options
    parser.add_argument("-not_caffe", action='store_true')
    parser.add_argument("-use_device", type=str, default='cuda:0')
    parser.add_argument("-seed", type=int, default=-1)

    # Dataset options
    parser.add_argument("-val_percent", type=float, default=0.2)

    # Model options
    parser.add_argument("-reset_weights", action='store_true')
    parser.add_argument("-delete_branches", action='store_true')
    parser.add_argument("-freeze_aux1_to", choices=['none', 'loss_conv', 'loss_fc', 'loss_classifier'], default='none')
    parser.add_argument("-freeze_aux2_to", choices=['none', 'loss_conv', 'loss_fc', 'loss_classifier'], default='none')
    parser.add_argument("-freeze_to", choices=['none', 'conv1', 'conv2', 'conv3', 'mixed3a', 'mixed3b', 'mixed4a', 'mixed4b', 'mixed4c', 'mixed4d', 'mixed4e', 'mixed5a', 'mixed5b'], default='mixed3b')
    parser.add_argument("-toggle_layers", type=str, default='none')
    params = parser.parse_args()
    main_func(params)

def main_func(params):
    assert params.data_mean != '', "-data_mean is required"
    assert params.data_sd != '', "-data_sd is required"
    params.data_mean = [float(m) for m in params.data_mean.split(',')]
    params.data_sd = [float(s) for s in params.data_sd.split(',')]

    if params.seed > -1:
        set_seed(params.seed)
    rnd_generator = torch.Generator(device='cpu') if params.seed > -1 else None

    # Setup image training data
    training_data, num_classes, class_weights = load_dataset(data_path=params.data_path, val_percent=params.val_percent, batch_size=params.batch_size, \
                                                             input_mean=params.data_mean, input_sd=params.data_sd, use_caffe=not params.not_caffe, \
                                                             train_workers=params.train_workers, val_workers=params.val_workers, balance_weights=params.balance_classes, \
                                                             rnd_generator=rnd_generator)

    # Setup model definition
    cnn, is_start_model, base_model = setup_model(params.model_file, num_classes=num_classes, base_model=params.base_model, pretrained=not params.reset_weights)

    if params.optimizer == 'sgd':
        optimizer = optim.SGD(cnn.parameters(), lr=params.lr, momentum=0.9)
    elif params.optimizer == 'adam':
        optimizer = optim.Adam(cnn.parameters(), lr=params.lr)

    lrscheduler = optim.lr_scheduler.StepLR(optimizer, step_size=8, gamma=0.96)

    if params.balance_classes:
        criterion = torch.nn.CrossEntropyLoss(weight=class_weights.to(params.use_device))
    else:
        criterion = torch.nn.CrossEntropyLoss()

    # Maybe delete branches
    if params.delete_branches and not is_start_model:
        try:
            cnn.remove_branches()
            has_branches = False
        except:
            has_branches = True
    else:
        has_branches = True

    # Load pretrained model weights
    start_epoch = 1
    if not params.reset_weights:
        cnn, optimizer, lrscheduler, start_epoch = load_checkpoint(cnn, params.model_file, optimizer, lrscheduler, num_classes, is_start_model=is_start_model)

    if params.delete_branches and is_start_model:
        try:
            cnn.remove_branches()
            has_branches = False
        except:
            has_branches = True
    else:
        has_branches = True

    # Maybe freeze some model layers
    main_layer_list = ['conv1', 'conv2', 'conv3', 'mixed3a', 'mixed3b', 'mixed4a', 'mixed4b', 'mixed4c', 'mixed4d', 'mixed4e', 'mixed5a', 'mixed5b']
    if params.freeze_to != 'none':
        for layer in main_layer_list:
            if params.freeze_to == layer:
                break
            for param in getattr(cnn, layer).parameters():
                param.requires_grad = False
    branch_layer_list = ['loss_conv', 'loss_fc', 'loss_classifier']
    if params.freeze_aux1_to != 'none' and has_branches:
        for layer in branch_layer_list:
            if params.freeze_aux1_to == layer:
                break
            for param in getattr(getattr(cnn, 'aux1'), layer).parameters():
                param.requires_grad = False
    if params.freeze_aux2_to != 'none' and has_branches:
        for layer in branch_layer_list:
            if params.freeze_aux2_to == layer:
                break
            for param in getattr(getattr(cnn, 'aux2'), layer).parameters():
                param.requires_grad = False

    # Optionally freeze/unfreeze specific layers and sub layers
    if params.toggle_layers != 'none':
        toggle_layers = [l.replace('\\', '/').replace('.', '/').split('/') for l in params.toggle_layers.split(',')]
        for layer in toggle_layers:
            if len(layer) == 2:
                for param in getattr(getattr(cnn, layer[0]), layer[1]).parameters():
                    param.requires_grad = not param.requires_grad  # toggle (the pasted version always set False)
            else:
                for param in getattr(cnn, layer[0]).parameters():
                    param.requires_grad = not param.requires_grad  # toggle (the pasted version always set False)

    n_learnable_params = sum(param.numel() for param in cnn.parameters() if param.requires_grad)
    print('Model has ' + "{:,}".format(n_learnable_params) + ' learnable parameters\n')

    cnn = cnn.to(params.use_device)
    if 'cuda' in params.use_device:
        if params.seed > -1:
            torch.backends.cudnn.benchmark = True
        torch.backends.cudnn.enabled = True

    # Re-initialize all weights, wiping anything loaded above
    initialize_model_weights(cnn)
    save_info = [[params.data_mean, params.data_sd, 'BGR'], num_classes, has_branches, base_model]

    # Train model
    train_model(model=cnn, dataloaders=training_data, criterion=criterion, optimizer=optimizer, lrscheduler=lrscheduler, \
                num_epochs=params.num_epochs, start_epoch=start_epoch, save_epoch=params.save_epoch, output_name=params.output_name, \
                device=params.use_device, has_branches=has_branches, fc_only=False, num_classes=num_classes, individual_acc=params.individual_acc, \
                should_save_csv=params.save_csv, csv_path=params.csv_dir, save_info=save_info)

if __name__ == "__main__":
    main()
```

But the images look noisy.

[Images: fc_c0000_e005, fc_c0009_e005]

Any clue why?

ProGamerGov commented

@ad48hp Not sure if there's an issue, but that's often what it looks like at first when training from scratch. How many images are you using, and how many steps have you trained for?

The model architecture itself is also over 10 years old at this point (https://arxiv.org/abs/1409.4842), so there could be unforeseen problems.
