Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate yapf as a commit hook #4

Closed
schrockn opened this issue Jun 1, 2018 · 0 comments
Closed

Integrate yapf as a commit hook #4

schrockn opened this issue Jun 1, 2018 · 0 comments

Comments

@schrockn
Copy link
Member

schrockn commented Jun 1, 2018

No nitpicking on formatting. Just use a tool and be done with it.

rexledesma added a commit that referenced this issue Nov 14, 2023
## Summary & Motivation
Following up on #17980, one
of the issues with memory usage in `dagster-dbt` is that if the
`manifest` is specified as a path, we potentially hold multiple copies
of the dictionary manifest in memory when creating our assets. This is
because we read from the path to create the dictionary manifest, and
then we hold a reference to the manifest dictionary in our asset
metadata. When the same path is read multiple times, multiple copies
ensure.

We can optimize this by implementing a cache when reading from the path,
so that the same dictionary manifest is returned even when a manifest
path is specified.

To further optimize this to hold no reference to the manifest
dictionary, we could instead hold a reference to the manifest path in
our asset metadata. However,

1. This only works if a manifest path is passed. If a manifest is
already specified as a dictionary, then we still have to hold it in
memory. This is the worst case.
2. (1) can be solved by preventing a manifest from being specified as a
dictionary, but that would be a breaking change.

## How I Tested These Changes
Use `tracemalloc`.

1. Create a `jaffle_dagster` project using `dagster-dbt project
scaffold` on `jaffle_shop`.
2. Update the code to read the manifest from a path in three separate
`@dbt_assets` definitions.
3. Run `PYTHONTRACEMALLOC=1 DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1 dagster
dev`

**Before Change**
```diff
Top 10 lines
#1: <frozen importlib._bootstrap_external>:672: 37114.4 KiB
#2: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/text_unidecode/__init__.py:6: 3675.4 KiB
    _replaces = pkgutil.get_data(__name__, 'data.bin').decode('utf8').split('\x00')
- #3: /Users/rexledesma/dagster-labs/dagster/python_modules/libraries/dagster-- dbt/dagster_dbt/dbt_manifest.py:17: 3262.9 KiB
-    manifest = cast(Mapping[str, Any], orjson.loads(manifest.read_bytes()))
#4: /Users/rexledesma/.pyenv/versions/3.10.13/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/abc.py:106: 1313.7 KiB
    cls = super().__new__(mcls, name, bases, namespace, **kwargs)
#5: /Users/rexledesma/.pyenv/versions/3.10.13/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py:481: 987.5 KiB
    result = type(typename, (tuple,), class_namespace)
#6: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/mashumaro/core/meta/code/builder.py:114: 978.7 KiB
    self.globals = globals().copy()
#7: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/mashumaro/core/meta/code/builder.py:263: 891.8 KiB
    exec(code, self.globals, self.__dict__)
#8: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/google/protobuf/internal/builder.py:85: 697.9 KiB
    message_class = _reflection.GeneratedProtocolMessageType(
#9: <frozen importlib._bootstrap_external>:128: 655.2 KiB
#10: <frozen importlib._bootstrap_external>:1616: 569.6 KiB
53203 other: 38030.8 KiB
Total allocated size: 88177.9 KiB
```

**After Change**
```diff
Top 10 lines
#1: <frozen importlib._bootstrap_external>:672: 37114.9 KiB
#2: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/text_unidecode/__init__.py:6: 3675.4 KiB
    _replaces = pkgutil.get_data(__name__, 'data.bin').decode('utf8').split('\x00')
#3: /Users/rexledesma/.pyenv/versions/3.10.13/Library/Frameworks/Python.framework/Versions/3.10 /lib/python3.10/abc.py:106: 1313.7 KiB
    cls = super().__new__(mcls, name, bases, namespace, **kwargs)
+ #4: /Users/rexledesma/dagster-labs/dagster/python_modules/libraries/dagster-dbt/dagster_dbt/dbt_manifest.py:13: 1057.4 KiB
+    return cast(Mapping[str, Any], orjson.loads(manifest_path.read_bytes()))
#5: /Users/rexledesma/.pyenv/versions/3.10.13/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py:481: 987.5 KiB
    result = type(typename, (tuple,), class_namespace)
#6: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/mashumaro/core/meta/code/builder.py:114: 978.7 KiB
    self.globals = globals().copy()
#7: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/mashumaro/core/meta/code/builder.py:263: 893.6 KiB
    exec(code, self.globals, self.__dict__)
#8: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/google/protobuf/internal/builder.py:85: 697.9 KiB
    message_class = _reflection.GeneratedProtocolMessageType(
#9: <frozen importlib._bootstrap_external>:128: 655.2 KiB
#10: <frozen importlib._bootstrap_external>:1616: 569.6 KiB
53195 other: 38013.7 KiB
Total allocated size: 85957.5 KiB
```

**Baseline (one `@dbt_assets` definition)**

```diff
Top 10 lines
#1: <frozen importlib._bootstrap_external>:672: 37114.2 KiB
#2: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/text_unidecode/__init__.py:6: 3675.4 KiB
    _replaces = pkgutil.get_data(__name__, 'data.bin').decode('utf8').split('\x00')
#3: /Users/rexledesma/.pyenv/versions/3.10.13/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/abc.py:106: 1313.7 KiB
    cls = super().__new__(mcls, name, bases, namespace, **kwargs)
+ #4: /Users/rexledesma/dagster-labs/dagster/python_modules/libraries/dagster-dbt/dagster_dbt/dbt_manifest.py:13: 1056.9 KiB
+    return cast(Mapping[str, Any], orjson.loads(manifest_path.read_bytes()))
#5: /Users/rexledesma/.pyenv/versions/3.10.13/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py:481: 987.5 KiB
    result = type(typename, (tuple,), class_namespace)
#6: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/mashumaro/core/meta/code/builder.py:114: 978.7 KiB
    self.globals = globals().copy()
#7: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/mashumaro/core/meta/code/builder.py:263: 893.7 KiB
    exec(code, self.globals, self.__dict__)
#8: /Users/rexledesma/.pyenv/versions/3.10.13/envs/dagster/lib/python3.10/site-packages/google/protobuf/internal/builder.py:85: 697.9 KiB
    message_class = _reflection.GeneratedProtocolMessageType(
#9: <frozen importlib._bootstrap_external>:128: 655.2 KiB
#10: <frozen importlib._bootstrap_external>:1616: 569.6 KiB
53175 other: 37976.1 KiB
Total allocated size: 85918.9 KiB
```
PedramNavid added a commit that referenced this issue Aug 20, 2024
# This is the 1st commit message:

[docs-beta] Add Vale linter and Sphinx templates

# This is the commit message #2:

fix tox

# This is the commit message #3:

add placeholder doc string

# This is the commit message #4:

Fix broken docs build (#23562)

## Summary & Motivation

## How I Tested These Changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants