feat: dp
LutingWang committed Sep 9, 2024
1 parent f981bc0 commit 7da13c4
Showing 20 changed files with 127 additions and 106 deletions.
22 changes: 13 additions & 9 deletions README.md
@@ -54,7 +54,7 @@ OADP/data
```shell
mkdir pretrained
python -c "import torchvision; _ = torchvision.models.ResNet50_Weights.IMAGENET1K_V1.get_state_dict(True)"
ln -s ~/.cache/torch/hub/checkpoints/ pretrained/torchvision
ln -s ~/.cache/torch/hub/checkpoints/ pretrained/torch
```

## Prompts
@@ -82,10 +82,15 @@ The following scripts extract features with CLIP, which can be very time-consumi
Extract globals and blocks features, which can be used for both coco and lvis

```bash
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco_globals_cuda configs/oake/coco_globals_cuda.py [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco_blocks_cuda configs/oake/coco_blocks_cuda.py [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco_objects_cuda configs/oake/coco_objects_cuda.py [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/lvis_objects_cuda configs/oake/lvis_objects_cuda.py [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco/clip_globals_cuda configs/oake/clip_globals_cuda.py --config-options dataset::COCO [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco/clip_blocks_cuda configs/oake/clip_blocks_cuda.py --config-options dataset::COCO [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco/clip_objects_cuda configs/oake/clip_objects_cuda.py --config-options dataset::COCO [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/lvis/clip_objects_cuda configs/oake/clip_objects_cuda.py --config-options dataset::LVIS [--auto-fix]

[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco/dino_globals_cuda configs/oake/dino_globals_cuda.py --config-options dataset::COCO [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco/dino_blocks_cuda configs/oake/dino_blocks_cuda.py --config-options dataset::COCO [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/coco/dino_objects_cuda configs/oake/dino_objects_cuda.py --config-options dataset::COCO [--auto-fix]
[DRY_RUN=True] bash tools/torchrun.sh -m oadp.oake.val oake/lvis/dino_objects_cuda configs/oake/dino_objects_cuda.py --config-options dataset::LVIS [--auto-fix]
```

The number of files generated by OAKE-objects may be less than the number of images in the dataset.
@@ -94,22 +99,21 @@ Images without objects are skipped.
```bash
bash tools/torchrun.sh tools/generate_sample_images.py coco
python tools/encode_sample_images.py coco
python tools/sample_visual_category_embeddings.py coco
python tools/sample_visual_category_embeddings.py coco clip
```

## DP

To conduct training for coco

```bash
[DRY_RUN=True] [TRAIN_WITH_VAL_DATASET=True] bash tools/torchrun.sh -m oadp.dp.train vild_ov_coco configs/dp/vild_ov_coco.py [--override .validator.dataloader.dataset.ann_file::data/coco/annotations/instances_val2017.48.json]
[DRY_RUN=True] [TRAIN_WITH_VAL_DATASET=True] bash tools/torchrun.sh -m oadp.dp.train configs/dp/oadp_ov_coco.py --work-dir work_dirs/oadp_ov_coco [--override .validator.dataloader.dataset.ann_file::data/coco/annotations/instances_val2017.48.json]
[DRY_RUN=True] [TRAIN_WITH_VAL_DATASET=True] bash tools/torchrun.sh -m oadp.dp.train ov_coco configs/dp/ov_coco.py [--override .validator.dataloader.dataset.ann_file::data/coco/annotations/instances_val2017.48.json]
```

To conduct training for lvis

```bash
[DRY_RUN=True] [TRAIN_WITH_VAL_DATASET=True] bash tools/torchrun.sh -m oadp.dp.train oadp_ov_lvis configs/dp/oadp_ov_lvis.py
[DRY_RUN=True] [TRAIN_WITH_VAL_DATASET=True] bash tools/torchrun.sh -m oadp.dp.train ov_lvis configs/dp/ov_lvis.py
```

To test a specific checkpoint
2 changes: 1 addition & 1 deletion configs/dp/datasets/coco.py
@@ -5,7 +5,7 @@
train_pipeline = [
dict(type='LoadImageFromFile', backend_args=None),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='LoadOAKE_COCO'),
dict(type='LoadOAKE_COCO', model='clip'),
dict(
type='RandomResize',
scale=[(1330, 640), (1333, 800)],
2 changes: 1 addition & 1 deletion configs/dp/datasets/lvis.py
@@ -5,7 +5,7 @@
train_pipeline = [
dict(type='LoadImageFromFile', backend_args=None),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='LoadOAKE_LVIS'),
dict(type='LoadOAKE_LVIS', model='clip'),
dict(
type='RandomResize',
scale=[(1330, 640), (1333, 800)],
2 changes: 1 addition & 1 deletion configs/dp/datasets/objects365.py
@@ -9,7 +9,7 @@
train_pipeline = [
dict(type='LoadImageFromFile', backend_args=None),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='LoadOAKE_Objects365'),
dict(type='LoadOAKE_Objects365', model='clip'),
dict(
type='RandomResize',
scale=[(1330, 640), (1333, 800)],
1 change: 0 additions & 1 deletion configs/oake/clip_blocks_cuda.py
@@ -5,7 +5,6 @@
_kwargs_: dict[str, Any]
_kwargs_ = dict(_kwargs_)

_kwargs_.setdefault('dataset', 'COCO')
_kwargs_.setdefault('branch', 'Block')
_kwargs_.setdefault('strategy', 'cuda')

1 change: 0 additions & 1 deletion configs/oake/clip_globals_cuda.py
@@ -5,7 +5,6 @@
_kwargs_: dict[str, Any]
_kwargs_ = dict(_kwargs_)

_kwargs_.setdefault('dataset', 'COCO')
_kwargs_.setdefault('branch', 'Global')
_kwargs_.setdefault('strategy', 'cuda')

3 changes: 1 addition & 2 deletions configs/oake/clip_objects_cuda.py
@@ -5,15 +5,14 @@
_kwargs_: dict[str, Any]
_kwargs_ = dict(_kwargs_)

_kwargs_.setdefault('dataset', 'COCO')
_kwargs_.setdefault('branch', 'Object')
_kwargs_.setdefault('strategy', 'cuda')

_base_ = [
PyConfig.load('configs/oake/interface.py', **_kwargs_),
]

runner = dict(model=dict(type='clip_vit', expand_mask_size=28, adaptive=False))
runner = dict(model=dict(type='clip_vit', expand_mask_size=14, adaptive=False))
custom_imports = [
'oadp.oake.objects',
]
1 change: 0 additions & 1 deletion configs/oake/dino_blocks_cuda.py
@@ -5,7 +5,6 @@
_kwargs_: dict[str, Any]
_kwargs_ = dict(_kwargs_)

_kwargs_.setdefault('dataset', 'COCO')
_kwargs_.setdefault('branch', 'Block')
_kwargs_.setdefault('strategy', 'cuda')

1 change: 0 additions & 1 deletion configs/oake/dino_globals_cuda.py
@@ -5,7 +5,6 @@
_kwargs_: dict[str, Any]
_kwargs_ = dict(_kwargs_)

_kwargs_.setdefault('dataset', 'COCO')
_kwargs_.setdefault('branch', 'Global')
_kwargs_.setdefault('strategy', 'cuda')

1 change: 0 additions & 1 deletion configs/oake/dino_objects_cuda.py
@@ -5,7 +5,6 @@
_kwargs_: dict[str, Any]
_kwargs_ = dict(_kwargs_)

_kwargs_.setdefault('dataset', 'COCO')
_kwargs_.setdefault('branch', 'Object')
_kwargs_.setdefault('strategy', 'cuda')
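
Reviewer note: with the `dataset` default removed from the six OAKE configs, the dataset presumably has to be supplied by the caller, as the README commands now do with `--config-options dataset::COCO`. A minimal sketch of that behaviour, assuming `key::value` options are split into a plain dict before the config runs. `parse_options` and `oake_kwargs` are hypothetical names, not part of the repository or of its config library.

```python
def parse_options(options: list[str]) -> dict[str, str]:
    """Split hypothetical 'key::value' strings into a dict."""
    parsed: dict[str, str] = {}
    for option in options:
        key, _, value = option.partition('::')
        parsed[key] = value
    return parsed


def oake_kwargs(cli_kwargs: dict[str, str], branch: str) -> dict[str, str]:
    """Mirror the setdefault calls that remain in the config after this commit."""
    kwargs = dict(cli_kwargs)
    kwargs.setdefault('branch', branch)
    kwargs.setdefault('strategy', 'cuda')
    # No default for 'dataset' any more: it is present only if the caller passed it.
    return kwargs


kwargs = oake_kwargs(parse_options(['dataset::COCO']), branch='Object')
missing = oake_kwargs({}, branch='Object')
```

Under these assumptions `kwargs` carries the dataset from the command line, while `missing` has no `dataset` key at all, so a config that reads it would fail unless the option is passed.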

5 changes: 3 additions & 2 deletions oadp/categories/embeddings/visual.py
@@ -10,9 +10,10 @@

class VisualCategoryEmbedding(BaseCategoryEmbedding):

def __init__(self, *args, **kwargs) -> None:
def __init__(self, *args, model: str, **kwargs) -> None:
embeddings: dict[str, torch.Tensor] = torch.load(
f'work_dirs/visual_category_embeddings/{Globals.categories.name}.pth',
'work_dirs/visual_category_embeddings/'
f'{Globals.categories.name}_{model}.pth',
'cpu',
)
super().__init__(*args, embeddings=embeddings, **kwargs)
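
Reviewer note: the checkpoint file name now depends on the feature extractor as well as the category set. A small sketch of the new naming scheme, using a hypothetical helper `visual_embedding_path` that does not exist in the commit:

```python
def visual_embedding_path(categories_name: str, model: str) -> str:
    """Build the path the new constructor loads from (helper name is illustrative)."""
    return (
        'work_dirs/visual_category_embeddings/'
        f'{categories_name}_{model}.pth'
    )


path = visual_embedding_path('coco', 'clip')
```

A file saved under the old `{name}.pth` pattern would not be found under the new pattern without being renamed or regenerated.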
14 changes: 10 additions & 4 deletions oadp/dp/classifiers.py
@@ -184,21 +184,27 @@ def __init__(
self._linear = nn.Linear(in_features, embedding_dim)

if out_features == Globals.categories.num_all + 1:
background_embedding = nn.Parameter(torch.zeros(1, embedding_dim))
nn.init.xavier_uniform_(background_embedding)
bg_embedding = nn.Parameter(torch.zeros(1, embedding_dim))
nn.init.xavier_uniform_(bg_embedding)
elif out_features == Globals.categories.num_all:
background_embedding = None
bg_embedding = None
else:
raise RuntimeError(
f"Unexpected {out_features=} given "
f"{Globals.categories.num_all=}",
)
self._background_embedding = background_embedding
self._bg_embedding = bg_embedding

def forward(self, x: torch.Tensor) -> torch.Tensor:
x = self._linear(x)
x = F.normalize(x)

embeddings: torch.Tensor = self._textual_category_embedding()
if self._bg_embedding is not None:
# TODO: is normalization needed for t5?
bg_embedding = F.normalize(self._bg_embedding)
embeddings = torch.cat([embeddings, bg_embedding])

logits = x @ embeddings.T

if Globals.training:
77 changes: 41 additions & 36 deletions oadp/dp/datasets/access_layers.py
@@ -18,31 +18,15 @@
T = TypeVar('T')


class BaseMixin(PthAccessLayer[T], ABC):
TASK_NAME: str

def __init__(self, *args, **kwargs) -> None:
super().__init__(*args, task_name=self.TASK_NAME, **kwargs)


class BaseGlobalAccessLayer(
BaseMixin[torch.Tensor],
PthAccessLayer[torch.Tensor],
):
class BaseGlobalAccessLayer(PthAccessLayer[torch.Tensor]):
pass


class BaseBlockAccessLayer(
BaseMixin[BlockOutput],
PthAccessLayer[BlockOutput],
):
class BaseBlockAccessLayer(PthAccessLayer[BlockOutput]):
pass


class BaseObjectAccessLayer(
BaseMixin[ObjectOutput],
PthAccessLayer[ObjectOutput],
):
class BaseObjectAccessLayer(PthAccessLayer[ObjectOutput]):

def __getitem__(self, key: str) -> ObjectOutput:
item = super().__getitem__(key)
@@ -101,26 +85,44 @@ def touch(self) -> Never:


class COCOGlobalAccessLayer(BaseGlobalAccessLayer):
TASK_NAME = 'coco_globals_cuda_train/output'

def __init__(self, *args, model: str, **kwargs) -> None:
super().__init__(
*args,
task_name=f'coco/{model}_globals_cuda_train/output',
**kwargs,
)


class COCOBlockAccessLayer(BaseBlockAccessLayer):
TASK_NAME = 'coco_blocks_cuda_train/output'

def __init__(self, *args, model: str, **kwargs) -> None:
super().__init__(
*args,
task_name=f'coco/{model}_blocks_cuda_train/output',
**kwargs,
)


class COCOObjectAccessLayer(BaseObjectAccessLayer):
TASK_NAME = 'coco_objects_cuda_train/output'

def __init__(self, *args, model: str, **kwargs) -> None:
super().__init__(
*args,
task_name=f'coco/{model}_objects_cuda_train/output',
**kwargs,
)


@DPAccessLayerRegistry.register_()
class COCOAccessLayer(AccessLayer[int]):

def __init__(self, *args, **kwargs) -> None:
def __init__(self, *args, model: str, **kwargs) -> None:
super().__init__(
*args,
global_=COCOGlobalAccessLayer(self.DATA_ROOT),
block=COCOBlockAccessLayer(self.DATA_ROOT),
object_=COCOObjectAccessLayer(self.DATA_ROOT),
global_=COCOGlobalAccessLayer(self.DATA_ROOT, model=model),
block=COCOBlockAccessLayer(self.DATA_ROOT, model=model),
object_=COCOObjectAccessLayer(self.DATA_ROOT, model=model),
**kwargs,
)

@@ -130,6 +132,10 @@ def __getitem__(self, key: int) -> T:

class LVISMixin(PthAccessLayer[T], ABC):

def __init__(self, *args, model: str, **kwargs) -> None:
super().__init__(*args, **kwargs)
self._model = model

@abstractmethod
def get_key(self, split: Literal['train', 'val'], key: str) -> str:
pass
@@ -141,35 +147,34 @@ def __getitem__(self, key: str) -> T:


class LVISGlobalAccessLayer(LVISMixin, BaseGlobalAccessLayer):
TASK_NAME = ''

def get_key(self, split: Literal['train', 'val'], key: str) -> str:
return f'coco_globals_cuda_{split}/output/{key}'
return f'coco/{self._model}_globals_cuda_{split}/output/{key}'


class LVISBlockAccessLayer(LVISMixin, BaseBlockAccessLayer):
TASK_NAME = ''

def get_key(self, split: Literal['train', 'val'], key: str) -> str:
return f'coco_blocks_cuda_{split}/output/{key}'
return f'coco/{self._model}_blocks_cuda_{split}/output/{key}'


class LVISObjectAccessLayer(LVISMixin, BaseObjectAccessLayer):
TASK_NAME = 'lvis_objects_cuda_train/output'

def get_key(self, split: Literal['train', 'val'], key: str) -> str:
return f'{split}2017_{key}'
return (
f'lvis/{self._model}_objects_cuda_train/output/{split}2017_{key}'
)


@DPAccessLayerRegistry.register_()
class LVISAccessLayer(AccessLayer[str]):

def __init__(self, *args, **kwargs) -> None:
def __init__(self, *args, model: str, **kwargs) -> None:
super().__init__(
*args,
global_=LVISGlobalAccessLayer(self.DATA_ROOT),
block=LVISBlockAccessLayer(self.DATA_ROOT),
object_=LVISObjectAccessLayer(self.DATA_ROOT),
global_=LVISGlobalAccessLayer(self.DATA_ROOT, model=model),
block=LVISBlockAccessLayer(self.DATA_ROOT, model=model),
object_=LVISObjectAccessLayer(self.DATA_ROOT, model=model),
**kwargs,
)
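
Reviewer note: the access layers now derive their storage locations from the `model` argument instead of class-level `TASK_NAME` constants. The sketch below restates the patterns visible in this diff as two free functions. `coco_task_name` and `lvis_key` are hypothetical names used only for illustration.

```python
def coco_task_name(model: str, branch: str) -> str:
    """Task name used by the COCO access layers; branch is 'globals', 'blocks' or 'objects'."""
    return f'coco/{model}_{branch}_cuda_train/output'


def lvis_key(model: str, branch: str, split: str, key: str) -> str:
    """Key used by the LVIS access layers.

    Globals and blocks are read from the COCO feature directories,
    while objects are read from the LVIS train output.
    """
    if branch == 'objects':
        return f'lvis/{model}_objects_cuda_train/output/{split}2017_{key}'
    return f'coco/{model}_{branch}_cuda_{split}/output/{key}'


coco_globals = coco_task_name('clip', 'globals')
lvis_objects = lvis_key('clip', 'objects', 'val', '000000000139')
lvis_blocks = lvis_key('clip', 'blocks', 'val', '000000000139')
```

The image id in the usage lines is a placeholder. The point is that LVIS shares global and block features with COCO, so only the object branch needs a separate extraction run.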

4 changes: 2 additions & 2 deletions oadp/dp/datasets/base.py
@@ -63,7 +63,7 @@ def filter_data(self) -> list[dict[str, Any]]:
valid_keys = {
k.removesuffix('.pth')
for k in
os.listdir('work_dirs/oake/coco_objects_cuda_train/output')
os.listdir('work_dirs/oake/coco/clip_objects_cuda_train/output')
if k.endswith('.pth')
}

@@ -94,7 +94,7 @@ def filter_data(self) -> list[dict[str, Any]]:
valid_keys = {
k.removesuffix('.pth')
for k in
os.listdir('work_dirs/oake/lvis_objects_cuda_train/output')
os.listdir('work_dirs/oake/lvis/clip_objects_cuda_train/output')
if k.endswith('.pth')
}
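
Reviewer note: both `filter_data` hunks build the set of usable image keys from the `.pth` files in an output directory. A self-contained sketch of that listing step, run against a temporary directory rather than `work_dirs`. `list_valid_keys` is a hypothetical helper and the file names are made up.

```python
import os
import tempfile


def list_valid_keys(output_dir: str) -> set[str]:
    """Return file stems of all .pth files in output_dir, ignoring other files."""
    return {
        k.removesuffix('.pth')
        for k in os.listdir(output_dir)
        if k.endswith('.pth')
    }


with tempfile.TemporaryDirectory() as tmp:
    for name in ('000000000001.pth', '000000000002.pth', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    keys = list_valid_keys(tmp)
```

Both hunks still point at the `clip_objects` directory, so this filter appears to be independent of the `model` argument introduced elsewhere in the commit.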

8 changes: 4 additions & 4 deletions oadp/dp/datasets/transforms.py
@@ -126,8 +126,8 @@ def _load(self, results: dict[str, Any], key: str) -> dict[str, Any]:
@TRANSFORMS.register_module()
class LoadOAKE_COCO(LoadOAKE): # noqa: N801 pylint: disable=invalid-name

def __init__(self) -> None:
super().__init__(COCOAccessLayer())
def __init__(self, model: str) -> None:
super().__init__(COCOAccessLayer(model=model))

def __call__(self, results: dict[str, Any]) -> dict[str, Any]:
return self._load(results, results['img_id'])
@@ -136,8 +136,8 @@ def __call__(self, results: dict[str, Any]) -> dict[str, Any]:
@TRANSFORMS.register_module()
class LoadOAKE_LVIS(LoadOAKE): # noqa: N801 pylint: disable=invalid-name

def __init__(self) -> None:
super().__init__(LVISAccessLayer())
def __init__(self, model: str) -> None:
super().__init__(LVISAccessLayer(model=model))

def __call__(self, results: dict[str, Any]) -> dict[str, Any]:
return self._load(results, results['img_path'])
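
Reviewer note: since `LoadOAKE_COCO` and `LoadOAKE_LVIS` now require `model`, every pipeline entry has to name the feature source, as the dataset configs in this commit do with `model='clip'`. A sketch of switching a pipeline to another extractor, assuming `'dino'` is an accepted value (inferred from the README config names, not confirmed for DP training). `with_oake_model` is a hypothetical helper.

```python
from typing import Any


def with_oake_model(
    pipeline: list[dict[str, Any]],
    model: str,
) -> list[dict[str, Any]]:
    """Return a copy of the pipeline with `model` set on every LoadOAKE_* transform."""
    updated = []
    for transform in pipeline:
        transform = dict(transform)  # shallow copy so the input is not mutated
        if str(transform.get('type', '')).startswith('LoadOAKE_'):
            transform['model'] = model
        updated.append(transform)
    return updated


train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=None),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='LoadOAKE_COCO', model='clip'),
]
dino_pipeline = with_oake_model(train_pipeline, 'dino')
```

The original `train_pipeline` is left unchanged, so both variants can coexist in one config.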