This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Adding integration with Label Studio #554

Merged on Sep 28, 2021 (64 commits)
Commits
6a887e9
Adding label studio integration
KonstantinKorotaev Jul 7, 2021
f9987d1
Import fixes
KonstantinKorotaev Jul 7, 2021
81d2de3
Moving project.json to test data in S3
KonstantinKorotaev Jul 7, 2021
ae1cbe7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 7, 2021
c51dd64
Moving from dataset to datasource
KonstantinKorotaev Aug 13, 2021
79f3d70
Merge fixes
KonstantinKorotaev Aug 13, 2021
718b152
Add video example
KonstantinKorotaev Aug 17, 2021
02c4cea
Fix merge conflict and Analysis checks
KonstantinKorotaev Aug 18, 2021
0c3f2fb
Updating imports to solve conflict
KonstantinKorotaev Aug 18, 2021
92672ef
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 18, 2021
6049fae
Moving to separate classes for image, video and text
KonstantinKorotaev Aug 23, 2021
e995d8c
Fixing merge conflict
KonstantinKorotaev Aug 23, 2021
3e8056d
Merging changes from PyTorchLightning master
KonstantinKorotaev Aug 23, 2021
33958d0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 23, 2021
9a0d919
Merge pull request #2 from KonstantinKorotaev/PyTorchLightning-master
KonstantinKorotaev Aug 23, 2021
0a3c906
Merge pull request #3 from KonstantinKorotaev/PyTorchLightning-master
KonstantinKorotaev Aug 23, 2021
34e7a75
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 23, 2021
595f266
Fixing DeepSource analysis issues
KonstantinKorotaev Aug 23, 2021
8ba9f4c
Fix last DeepSource analysis issues
KonstantinKorotaev Aug 23, 2021
c9af6f6
Merge pull request #4 from KonstantinKorotaev/PyTorchLightning-master
KonstantinKorotaev Aug 23, 2021
fb3103f
Delete useless init
KonstantinKorotaev Aug 23, 2021
bf82fd9
Move label studio datasource to separate file
KonstantinKorotaev Aug 27, 2021
7f67ce1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 27, 2021
5ecd88e
Add test for LabelStudioDataSource._load_json_data
KonstantinKorotaev Aug 30, 2021
ba560c7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 30, 2021
19fe4a8
Fix DeepSource analysis issues
KonstantinKorotaev Aug 30, 2021
7b44ea1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 30, 2021
baf4fa1
Merge branch 'master' into master
tchaton Sep 3, 2021
3c117f0
Adding test for each Label Studio datasource
KonstantinKorotaev Sep 8, 2021
23b6d7e
Fixing typo and grouping import
KonstantinKorotaev Sep 8, 2021
d711724
Merge remote-tracking branch 'upstream/master'
KonstantinKorotaev Sep 8, 2021
4964a6c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2021
55b1345
Fix import for DefaultDataSources
KonstantinKorotaev Sep 8, 2021
e41151d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2021
c5c11d7
Fix import for ImageClassificationData
KonstantinKorotaev Sep 8, 2021
ea2e474
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2021
a940f6f
Fix tests conditions
KonstantinKorotaev Sep 9, 2021
0cc7bd9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
a3066d8
Fix data sources prerequisite
KonstantinKorotaev Sep 9, 2021
f59a1fa
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
119593e
Separating tests for Datamodule
KonstantinKorotaev Sep 9, 2021
20c0c48
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
28c51ee
Fixing link to file for image data sets
KonstantinKorotaev Sep 9, 2021
a46140b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
466bd69
Fix LabelStudioImageClassificationDataSource test
KonstantinKorotaev Sep 9, 2021
18ae793
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
77f1338
Adding App for visualizing predictions
KonstantinKorotaev Sep 9, 2021
5bf56b2
Fix import for label studio launch_app
KonstantinKorotaev Sep 9, 2021
50ebe0f
Fix strings in tests
KonstantinKorotaev Sep 9, 2021
c172b01
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
81fd7a7
Rename visualizer module
KonstantinKorotaev Sep 9, 2021
f512e0f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
5590a2a
Fix test for video and image multilabel
KonstantinKorotaev Sep 9, 2021
a19712f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
0d9dd95
Fixing docstring and test condition
KonstantinKorotaev Sep 9, 2021
54186b7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 9, 2021
593e51e
Fix docstring and CODEOWNERS
KonstantinKorotaev Sep 13, 2021
022e1ab
Adding converter to tasks
KonstantinKorotaev Sep 14, 2021
0e1080a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 14, 2021
85bf6b6
Merge branch 'master' into master
tchaton Sep 24, 2021
b81f53f
update
tchaton Sep 28, 2021
46b3f6b
Merge branch 'master' into master
tchaton Sep 28, 2021
b89d6bb
update changelog
tchaton Sep 28, 2021
5b5be24
Merge commit 'refs/pull/554/head' of https://github.com/PyTorchLightn…
tchaton Sep 28, 2021
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -26,3 +26,4 @@
/.github/*.md @edenlightning @ethanwharris @ananyahjha93
/.github/ISSUE_TEMPLATE/*.md @edenlightning @ethanwharris @ananyahjha93
/docs/source/conf.py @borda @ethanwharris @ananyahjha93
/flash/core/integrations/labelstudio @KonstantinKorotaev @niklub
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Added

- Added `LabelStudio` integration ([#554](https://github.com/PyTorchLightning/lightning-flash/pull/554))

- Added support `learn2learn` training_strategy for `ImageClassifier` ([#737](https://github.com/PyTorchLightning/lightning-flash/pull/737))

- Added `vissl` training_strategies for `ImageEmbedder` ([#682](https://github.com/PyTorchLightning/lightning-flash/pull/682))
125 changes: 125 additions & 0 deletions flash/core/data/data_module.py
@@ -1246,3 +1246,128 @@ def from_fiftyone(
            num_workers=num_workers,
            **preprocess_kwargs,
        )

    @classmethod
    def from_labelstudio(
        cls,
        export_json: str = None,
        train_export_json: str = None,
        val_export_json: str = None,
        test_export_json: str = None,
        predict_export_json: str = None,
        data_folder: str = None,
        train_data_folder: str = None,
        val_data_folder: str = None,
        test_data_folder: str = None,
        predict_data_folder: str = None,
        train_transform: Optional[Dict[str, Callable]] = None,
        val_transform: Optional[Dict[str, Callable]] = None,
        test_transform: Optional[Dict[str, Callable]] = None,
        predict_transform: Optional[Dict[str, Callable]] = None,
        data_fetcher: Optional[BaseDataFetcher] = None,
        preprocess: Optional[Preprocess] = None,
        val_split: Optional[float] = None,
        batch_size: int = 4,
        num_workers: Optional[int] = None,
        **preprocess_kwargs: Any,
    ) -> "DataModule":
        """Creates a :class:`~flash.core.data.data_module.DataModule` object
        from the given export file and data directory using the
        :class:`~flash.core.data.data_source.DataSource` of name
        :attr:`~flash.core.data.data_source.DefaultDataSources.LABELSTUDIO`
        from the passed or constructed :class:`~flash.core.data.process.Preprocess`.

        Args:
            export_json: Path to the Label Studio export file.
            train_export_json: Path to the Label Studio export file for the train set;
                overrides ``export_json`` if specified.
            val_export_json: Path to the Label Studio export file for the validation set.
            test_export_json: Path to the Label Studio export file for the test set.
            predict_export_json: Path to the Label Studio export file for the predict set.
            data_folder: Path to the Label Studio data folder.
            train_data_folder: Path to the Label Studio data folder for the train set;
                overrides ``data_folder`` if specified.
            val_data_folder: Path to the Label Studio data folder for the validation set.
            test_data_folder: Path to the Label Studio data folder for the test set.
            predict_data_folder: Path to the Label Studio data folder for the predict set.
            train_transform: The dictionary of transforms to use during training which maps
                :class:`~flash.core.data.process.Preprocess` hook names to callable transforms.
            val_transform: The dictionary of transforms to use during validation which maps
                :class:`~flash.core.data.process.Preprocess` hook names to callable transforms.
            test_transform: The dictionary of transforms to use during testing which maps
                :class:`~flash.core.data.process.Preprocess` hook names to callable transforms.
            predict_transform: The dictionary of transforms to use during predicting which maps
                :class:`~flash.core.data.process.Preprocess` hook names to callable transforms.
            data_fetcher: The :class:`~flash.core.data.callback.BaseDataFetcher` to pass to the
                :class:`~flash.core.data.data_module.DataModule`.
            preprocess: The :class:`~flash.core.data.process.Preprocess` to pass to the
                :class:`~flash.core.data.data_module.DataModule`. If ``None``, ``cls.preprocess_cls``
                will be constructed and used.
            val_split: The ``val_split`` argument to pass to the :class:`~flash.core.data.data_module.DataModule`.
            batch_size: The ``batch_size`` argument to pass to the :class:`~flash.core.data.data_module.DataModule`.
            num_workers: The ``num_workers`` argument to pass to the :class:`~flash.core.data.data_module.DataModule`.
            preprocess_kwargs: Additional keyword arguments to use when constructing the preprocess. Will only be used
                if ``preprocess = None``.

        Returns:
            The constructed data module.

        Examples::

            data_module = DataModule.from_labelstudio(
                export_json='project.json',
                data_folder='label-studio/media/upload',
                val_split=0.8,
            )
        """
        data = {
            "data_folder": data_folder,
            "export_json": export_json,
            "split": val_split,
            "multi_label": preprocess_kwargs.get("multi_label", False),
        }
        train_data = None
        val_data = None
        test_data = None
        predict_data = None
        if (train_data_folder or data_folder) and train_export_json:
Review comment (Contributor): Neat!
            train_data = {
                "data_folder": train_data_folder or data_folder,
                "export_json": train_export_json,
                "multi_label": preprocess_kwargs.get("multi_label", False),
            }
        if (val_data_folder or data_folder) and val_export_json:
            val_data = {
                "data_folder": val_data_folder or data_folder,
                "export_json": val_export_json,
                "multi_label": preprocess_kwargs.get("multi_label", False),
            }
        if (test_data_folder or data_folder) and test_export_json:
            test_data = {
                "data_folder": test_data_folder or data_folder,
                "export_json": test_export_json,
                "multi_label": preprocess_kwargs.get("multi_label", False),
            }
        if (predict_data_folder or data_folder) and predict_export_json:
            predict_data = {
                "data_folder": predict_data_folder or data_folder,
                "export_json": predict_export_json,
                "multi_label": preprocess_kwargs.get("multi_label", False),
            }
        return cls.from_data_source(
            DefaultDataSources.LABELSTUDIO,
            train_data=train_data if train_data else data,
            val_data=val_data,
            test_data=test_data,
            predict_data=predict_data,
            train_transform=train_transform,
            val_transform=val_transform,
            test_transform=test_transform,
            predict_transform=predict_transform,
            data_fetcher=data_fetcher,
            preprocess=preprocess,
            val_split=val_split,
            batch_size=batch_size,
            num_workers=num_workers,
            **preprocess_kwargs,
        )
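
For readers skimming the diff, the fallback rules in `from_labelstudio` can be reproduced outside Flash. The following is a minimal standalone sketch, not code from this PR: `build_payload` and `resolve_payloads` are hypothetical names introduced here for illustration, and the sketch only mirrors the dictionary-building branches shown above (it does not call `from_data_source` or load any data).

```python
from typing import Any, Dict, Optional


def build_payload(
    export_json: Optional[str],
    stage_folder: Optional[str],
    shared_folder: Optional[str],
    multi_label: bool = False,
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: mirrors one per-stage `if` branch of the diff."""
    # A stage-specific folder wins; otherwise fall back to the shared folder.
    folder = stage_folder or shared_folder
    # A payload is only built when both a folder and an export file are known.
    if not (folder and export_json):
        return None
    return {"data_folder": folder, "export_json": export_json, "multi_label": multi_label}


def resolve_payloads(
    export_json: Optional[str] = None,
    data_folder: Optional[str] = None,
    val_split: Optional[float] = None,
    multi_label: bool = False,
    **per_stage: Optional[str],
) -> Dict[str, Optional[Dict[str, Any]]]:
    """Hypothetical helper: returns the train/val/test/predict payloads."""
    payloads = {
        stage: build_payload(
            per_stage.get(f"{stage}_export_json"),
            per_stage.get(f"{stage}_data_folder"),
            data_folder,
            multi_label,
        )
        for stage in ("train", "val", "test", "predict")
    }
    # As in the diff, the shared payload (which carries "split") is used for
    # training only when no train-specific payload could be built.
    if payloads["train"] is None:
        payloads["train"] = {
            "data_folder": data_folder,
            "export_json": export_json,
            "split": val_split,
            "multi_label": multi_label,
        }
    return payloads


single = resolve_payloads(export_json="project.json", data_folder="media/upload", val_split=0.2)
separate = resolve_payloads(
    data_folder="media",
    train_export_json="train.json",
    val_export_json="val.json",
    val_data_folder="val_media",
)
```

Under these assumptions, `single["train"]` carries the `"split"` key and the other stages are `None`, while `separate` yields distinct train and validation payloads with no `"split"` entry, which matches the branches in the diff.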
1 change: 1 addition & 0 deletions flash/core/data/data_source.py
@@ -160,6 +160,7 @@ class DefaultDataSources(LightningEnum):
    JSON = "json"
    DATASETS = "datasets"
    FIFTYONE = "fiftyone"
    LABELSTUDIO = "labelstudio"

    # TODO: Create a FlashEnum class???
    def __hash__(self) -> int:
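
The new `LABELSTUDIO` member gives the data source a name that `from_data_source` can be called with. As a rough illustration of how a string-valued enum can key a registry, here is a sketch that uses the standard-library `enum.Enum` as a stand-in for `LightningEnum`; `DataSourceName`, `registry`, and `lookup` are hypothetical names, and the body of `__hash__` is an assumption, since the diff truncates the real one.

```python
from enum import Enum


class DataSourceName(str, Enum):
    """Stand-in for DefaultDataSources, built on stdlib Enum (an assumption)."""

    JSON = "json"
    DATASETS = "datasets"
    FIFTYONE = "fiftyone"
    LABELSTUDIO = "labelstudio"

    def __hash__(self) -> int:
        # Assumed body: hash by value so members are usable as dict keys.
        return hash(self.value)


# Hypothetical registry mapping enum members to data source labels.
registry = {DataSourceName.LABELSTUDIO: "LabelStudioDataSource"}


def lookup(name: str) -> str:
    """Resolve a plain string to its enum member, then to the registry entry."""
    return registry[DataSourceName(name)]


resolved = lookup("labelstudio")
```

Looking up an unknown name such as `DataSourceName("missing")` raises `ValueError`, so a typo in the data source name fails early rather than silently selecting nothing.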