Multiple queries failing in 2021-03-16 nightlies #195

Closed
beckernick opened this issue Mar 16, 2021 · 4 comments · Fixed by #198

beckernick commented Mar 16, 2021

Failing queries:

  • Q02
  • Q04
  • Q08
  • After these, a variety of other queries fail with memory errors or cluster connection issues

Tracebacks

Q02

Encountered Exception while running query
Traceback (most recent call last):
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 180, in raise_on_meta_error
    yield
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "queries/q02/gpu_bdb_query_02.py", line 55, in reduction_function
    df, keep_cols=["wcs_user_sk", "wcs_item_sk"], time_out=q02_session_timeout_inSec
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 92, in get_distinct_sessions
    df = get_sessions(df, keep_cols, time_out=3600)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 79, in get_sessions
    df["session_id"] = get_session_id(df, keep_cols, time_out)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 73, in get_session_id
    assert len(session_ids) == len(df)
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 280, in run_dask_cudf_query
    config=config,
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "queries/q02/gpu_bdb_query_02.py", line 133, in main
    grouped_df = f_wcs_df.map_partitions(reduction_function, q02_session_timeout_inSec)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 684, in map_partitions
    return map_partitions(func, self, *args, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5385, in map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 201, in raise_on_meta_error
    raise ValueError(msg) from e
ValueError: Metadata inference failed in `reduction_function`

Q04

Encountered Exception while running query
Traceback (most recent call last):
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 180, in raise_on_meta_error
    yield
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "queries/q04/gpu_bdb_query_04.py", line 102, in reduction_function
    df = get_sessions(df, keep_cols=keep_cols)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 79, in get_sessions
    df["session_id"] = get_session_id(df, keep_cols, time_out)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 73, in get_session_id
    assert len(session_ids) == len(df)
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 280, in run_dask_cudf_query
    config=config,
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "queries/q04/gpu_bdb_query_04.py", line 155, in main
    reduction_function, keep_cols, DYNAMIC_CAT_CODE, ORDER_CAT_CODE
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 684, in map_partitions
    return map_partitions(func, self, *args, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5385, in map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 201, in raise_on_meta_error
    raise ValueError(msg) from e
ValueError: Metadata inference failed in `reduction_function`.

Q08

Encountered Exception while running query
Traceback (most recent call last):
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 280, in run_dask_cudf_query
    config=config,
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "queries/q08/gpu_bdb_query_08.py", line 308, in main
    q08_reviewed_sales_sum.result(),
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/distributed/client.py", line 222, in result
    raise exc.with_traceback(tb)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/optimization.py", line 963, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/core.py", line 151, in get
    result = _execute_task(task, cache)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/utils.py", line 35, in apply
    return func(*args, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/dataframe/core.py", line 5487, in apply_and_enforce
    df = func(*args, **kwargs)
  File "queries/q08/gpu_bdb_query_08.py", line 209, in reduction_function
    df = get_sessions(df)
  File "queries/q08/gpu_bdb_query_08.py", line 138, in get_sessions
    df["session_id"] = get_session_id(df)
  File "queries/q08/gpu_bdb_query_08.py", line 130, in get_session_id
    assert len(session_ids) == len(df)
AssertionError

beckernick commented Mar 16, 2021

diff 20210315-env.txt 20210316-env.txt
46c46
< cudf=0.19.0a210315=cuda_10.2_py37_g325d5b800b_212
---
> cudf=0.19.0a210316=cuda_10.2_py37_g2f5901ffb4_216
48c48
< cuml=0.19.0a210315=cuda10.2_py37_gf5d86b957_106
---
> cuml=0.19.0a210316=cuda10.2_py37_g96eaf623e_109
57,58c57,58
< dask-cuda=0.19.0a210315=py37_41
< dask-cudf=0.19.0a210315=py37_g325d5b800b_212
---
> dask-cuda=0.19.0a210316=py37_42
> dask-cudf=0.19.0a210316=py37_g2f5901ffb4_216
102c102
< jupyter-server-proxy=3.0.0=pypi_0
---
> jupyter-server-proxy=3.0.1=pypi_0
116,117c116,117
< libcudf=0.19.0a210315=cuda10.2_g325d5b800b_212
< libcuml=0.19.0a210315=cuda10.2_gf5d86b957_106
---
> libcudf=0.19.0a210316=cuda10.2_g2f5901ffb4_216
> libcuml=0.19.0a210316=cuda10.2_g96eaf623e_109
139c139
< librmm=0.19.0a210315=cuda10.2_gcb81c80_40
---
> librmm=0.19.0a210316=cuda10.2_gdd718e2_41
219c219
< rmm=0.19.0a210315=cuda_10.2_py37_gcb81c80_40
---
> rmm=0.19.0a210316=cuda_10.2_py37_gdd718e2_41
254c254
< ucx-py=0.19.0a210315=py37_gcd9efd3_19
---
> ucx-py=0.19.0a210316=py37_gcd9efd3_20
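
The same comparison can be scripted when the environment files are long. This is a minimal sketch, assuming each file holds `name=version=build` lines like the ones above; `parse_env` and `changed_packages` are hypothetical helper names, and `example-pkg` is a made-up placeholder for an unchanged package.

```python
def parse_env(text):
    """Map package name -> 'version=build' for each name=version=build line."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        name, _, rest = line.partition("=")
        packages[name] = rest
    return packages


def changed_packages(old_text, new_text):
    """Return {name: (old, new)} for packages whose version or build differs."""
    old, new = parse_env(old_text), parse_env(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# A few lines taken from the diff above, plus one placeholder package
# (example-pkg is hypothetical) that is identical in both environments.
old_env = """\
cudf=0.19.0a210315=cuda_10.2_py37_g325d5b800b_212
rmm=0.19.0a210315=cuda_10.2_py37_gcb81c80_40
ucx-py=0.19.0a210315=py37_gcd9efd3_19
example-pkg=1.0=0
"""
new_env = """\
cudf=0.19.0a210316=cuda_10.2_py37_g2f5901ffb4_216
rmm=0.19.0a210316=cuda_10.2_py37_gdd718e2_41
ucx-py=0.19.0a210316=py37_gcd9efd3_20
example-pkg=1.0=0
"""

changed = changed_packages(old_env, new_env)
for name, (before, after) in changed.items():
    print(f"{name}: {before} -> {after}")
```

Keying on the package name rather than the line position keeps the output stable if a package is added or removed between nightlies.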

@beckernick

Likely due to rapidsai/cudf#7490

We will likely need to refactor the queries to account for the updated null handling

beckernick self-assigned this Mar 16, 2021
@beckernick

In the failing commit, we get the following metadata dataframe inside get_session_id after the following lines:

df["user_change_flag"] = df["wcs_user_sk"].diff(periods=1) != 0
df["time_delta"] = df["tstamp_inSec"].diff(periods=1)
df["session_timeout_flag"] = df["tstamp_inSec"].diff(periods=1) > time_out
df["session_change_flag"] = df["session_timeout_flag"] | df["user_change_flag"]

   wcs_user_sk  wcs_item_sk  tstamp_inSec  ... time_delta session_timeout_flag session_change_flag
0            0            0             0  ...       <NA>                 <NA>                <NA>
1            1            1             1  ...          1                False                True

With the prior, passing commit:

   wcs_user_sk  wcs_item_sk  tstamp_inSec  ...  time_delta session_timeout_flag  session_change_flag
0            0            0             0  ...        <NA>                False                 True
1            1            1             1  ...           1                False                 True
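
The difference between the two frames can be reproduced without a GPU. The sketch below is a pure-Python model, not cudf code: `None` stands in for `<NA>`, and the helpers `diff`, `compare`, `null_or`, and `fill_null` are hypothetical names written for illustration. It assumes that comparisons and the `|` now propagate nulls, which is what the failing frame shows. The explicit fill at the end is one possible workaround and is not necessarily what the eventual fix does; in cudf the analogous call would presumably be `Series.fillna`.

```python
def diff(values):
    """First difference; row 0 has no predecessor, so it is null (None)."""
    return [None] + [b - a for a, b in zip(values, values[1:])]


def compare(values, predicate):
    """Elementwise comparison in which a null input yields a null output."""
    return [None if v is None else predicate(v) for v in values]


def null_or(left, right):
    """Elementwise OR in which a null on either side yields null."""
    return [
        None if a is None or b is None else (a or b)
        for a, b in zip(left, right)
    ]


def fill_null(values, fill):
    """Replace nulls with an explicit value."""
    return [fill if v is None else v for v in values]


time_out = 3600
wcs_user_sk = [0, 1]   # same two rows as the metadata frames above
tstamp_inSec = [0, 1]

# Null-propagating behavior: row 0 stays null in every flag column.
user_change_flag = compare(diff(wcs_user_sk), lambda d: d != 0)
session_timeout_flag = compare(diff(tstamp_inSec), lambda d: d > time_out)
session_change_flag = null_or(session_timeout_flag, user_change_flag)
print(session_change_flag)        # [None, True]

# Possible workaround (assumption): treat the first row as a user change
# and as not timed out, which matches the frame from the passing commit.
fixed_user_change = fill_null(user_change_flag, True)
fixed_timeout = fill_null(session_timeout_flag, False)
fixed_session_change_flag = null_or(fixed_timeout, fixed_user_change)
print(fixed_session_change_flag)  # [True, True]
```

With the null left in place, row 0 is no longer flagged as a session boundary, which is consistent with the length assertion in get_session_id failing.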

@beckernick

Local fix passing correctness checks. Will push a PR soon
