Temporarily disable the deletion of the dask dataframe #3814

Merged
merged 39 commits into from
Sep 28, 2023
Changes from 31 commits
929edb5
add temporary workaround to copy the dask dataframe
Aug 22, 2023
49c1d55
update docstrings
Aug 22, 2023
7f392ea
temporarily avoid deleting the copy of the dataframe
Aug 22, 2023
6de48ca
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
Aug 22, 2023
1673c8d
update docstrings with accurate column names
Aug 22, 2023
8cb3022
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
Aug 22, 2023
e6a1d30
undo changes
Aug 22, 2023
89f3a0b
Enable temporarily disabled MG tests
Aug 30, 2023
ab59c9b
Enable temporarily disabled MG tests
Aug 30, 2023
208b1cc
Enable temporarily disabled MG tests
Aug 30, 2023
d31b296
Enable temporarily disabled MG tests
Aug 30, 2023
95e5249
Enable temporarily disabled MG tests
Aug 30, 2023
1016b77
Enable temporarily disabled MG tests
naimnv Aug 30, 2023
2aa7f25
Enable temporarily disabled MG tests
naimnv Aug 30, 2023
4d193da
Enable temporarily disabled MG tests
naimnv Aug 30, 2023
f016e8e
Enable temporarily disabled MG tests
naimnv Aug 30, 2023
c04aad8
Enable temporarily disabled MG tests
naimnv Aug 30, 2023
da917d6
Enable temporarily disabled MG tests
naimnv Aug 30, 2023
49f9d0e
Merge branch 'branch-23.10' of github.com:rapidsai/cugraph into enabl…
naimnv Sep 20, 2023
fc72f31
Use token in the map_partitions to possibly get around error in graph…
naimnv Sep 20, 2023
7b99d79
Use token in the map_partitions to possibly get around error in graph…
naimnv Sep 21, 2023
12c0675
Merge branch 'branch-23.10' of github.com:rapidsai/cugraph into enabl…
naimnv Sep 21, 2023
bd85881
Skip deleting copied dask dataframe to avoid crash
naimnv Sep 22, 2023
0d8ff97
Temporarily skip deleting copied dask dataframe to avoid crash
naimnv Sep 22, 2023
9bb51fe
Skip deleting copied dask dataframe to avoid crash while creating fro…
naimnv Sep 22, 2023
a407617
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
jnke2016 Sep 25, 2023
9ff7124
fix merge conflict
jnke2016 Sep 25, 2023
e5f45e7
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
jnke2016 Sep 25, 2023
67fd5c6
undo change
jnke2016 Sep 25, 2023
c777f87
undo changes
jnke2016 Sep 25, 2023
1485c4c
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
jnke2016 Sep 26, 2023
0262d34
update branch with the latest changes
jnke2016 Sep 26, 2023
384cf39
fix style
jnke2016 Sep 27, 2023
9b5635d
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
jnke2016 Sep 27, 2023
dde98b3
create unique token and delete ddf after compute call
jnke2016 Sep 27, 2023
4f588de
create unique token and delete ddf after compute call
jnke2016 Sep 27, 2023
d7f3e59
fix style
jnke2016 Sep 28, 2023
8221b16
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
jnke2016 Sep 28, 2023
557b5b3
Merge remote-tracking branch 'upstream/branch-23.10' into branch-23.1…
jnke2016 Sep 28, 2023
6 changes: 1 addition & 5 deletions ci/test_python.sh
@@ -63,10 +63,6 @@ pytest \
 tests
 popd

-# FIXME: TEMPORARILY disable single-GPU "MG" testing until
-# https://github.com/rapidsai/cugraph/issues/3790 is closed
-# When closed, replace -k "not _mg" with
-# -k "not test_property_graph_mg" \
 rapids-logger "pytest cugraph"
 pushd python/cugraph/cugraph
 export DASK_WORKER_DEVICES="0"
@@ -79,7 +75,7 @@ pytest \
 --cov=cugraph \
 --cov-report=xml:"${RAPIDS_COVERAGE_DIR}/cugraph-coverage.xml" \
 --cov-report=term \
--k "not _mg" \
+-k "not test_property_graph_mg" \
 tests
 popd

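The effect of narrowing the `-k` expression can be sketched as follows. This is a simplified model that assumes pytest's `-k` behaves as a case-insensitive substring match on test names only (real pytest also matches module and class names and other keywords). The test names and the helper `deselected_by_not` are hypothetical and not taken from the cugraph test suite.

```python
def deselected_by_not(substring, test_names):
    """Return the names that `-k "not <substring>"` would deselect.

    Simplified model: case-insensitive substring match on the test name only.
    """
    needle = substring.lower()
    return [name for name in test_names if needle in name.lower()]


# Hypothetical test names, for illustration only.
names = [
    "test_pagerank",
    "test_pagerank_mg",
    "test_property_graph_mg",
    "test_bfs_mg",
]

# Old filter: every test whose name contains "_mg" is skipped.
old_skipped = deselected_by_not("_mg", names)

# New filter: only the property-graph MG test is skipped.
new_skipped = deselected_by_not("test_property_graph_mg", names)

print("skipped before:", old_skipped)
print("skipped after: ", new_skipped)
```

Under this model, the change re-enables every MG test except the one whose name contains `test_property_graph_mg`.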
4 changes: 1 addition & 3 deletions ci/test_wheel.sh
@@ -18,7 +18,5 @@ arch=$(uname -m)
 if [[ "${arch}" == "aarch64" && ${RAPIDS_BUILD_TYPE} == "pull-request" ]]; then
 python ./ci/wheel_smoke_test_${package_name}.py
 else
-# FIXME: TEMPORARILY disable single-GPU "MG" testing until
-# https://github.com/rapidsai/cugraph/issues/3790 is closed
-RAPIDS_DATASET_ROOT_DIR=`pwd`/datasets python -m pytest -k "not _mg" ./python/${package_name}/${package_name}/tests
+RAPIDS_DATASET_ROOT_DIR=`pwd`/datasets python -m pytest ./python/${package_name}/${package_name}/tests
 fi

@@ -181,7 +181,6 @@ def __from_edgelist(
 workers = _client.scheduler_info()["workers"]
 # Repartition to 2 partitions per GPU for memory efficient process
 input_ddf = input_ddf.repartition(npartitions=len(workers) * 2)
-# FIXME: Make a copy of the input ddf before implicitly altering it.
 input_ddf = input_ddf.map_partitions(lambda df: df.copy())
 # The dataframe will be symmetrized iff the graph is undirected
 # otherwise, the initial dataframe will be returned
@@ -334,7 +333,7 @@ def __from_edgelist(
 )
 for w, edata in ddf.items()
 }
-del ddf
Member


Shouldn't this deletion come after the computation below is done anyway? If you are building the graph from ddf, you cannot release the keys of ddf until delayed_tasks_d has finished computing/persisting.

Contributor Author


I am not sure why the keys were deleted before delayed_tasks_d computes. My PR essentially removed that deletion and it worked. I am not sure if doing it after delayed_tasks_d computes will make a difference but let me try.

+# FIXME: For now, don't delete the copied dataframe to avoid crash
 self._plc_graph = {
 w: _client.compute(delayed_task, workers=w, allow_other_workers=False)
 for w, delayed_task in delayed_tasks_d.items()
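The ordering raised in the review thread (release the input only after the dependent tasks have finished) can be sketched as below. This is not cugraph or Dask code: it uses the standard library's `concurrent.futures` as a stand-in for the Dask client, and `build_graph_part` and the per-worker edge lists are hypothetical. Unlike keys on a Dask cluster, the Python references held by these tasks keep the data alive on their own, so the sketch only illustrates the ordering of the calls, not the crash.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def build_graph_part(edges):
    # Hypothetical stand-in for per-worker graph construction:
    # here it just counts the edges it was given.
    return len(edges)


# Hypothetical per-worker edge lists, playing the role of `ddf`.
ddf = {
    "worker-0": [(0, 1), (1, 2)],
    "worker-1": [(2, 3)],
}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Schedule one task per worker; nothing is released yet.
    futures = {w: pool.submit(build_graph_part, edata) for w, edata in ddf.items()}

    # Block until every task has finished before touching the input.
    wait(futures.values())
    plc_graph = {w: f.result() for w, f in futures.items()}

# Only now, with all results materialized, drop the input.
del ddf

print(plc_graph)
```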
@@ -1193,7 +1192,5 @@ def _get_column_from_ls_dfs(lst_df, col_name):
 if len_df == 0:
 return lst_df[0][col_name]
 output_col = cudf.concat([df[col_name] for df in lst_df], ignore_index=True)
-for df in lst_df:
-df.drop(columns=[col_name], inplace=True)
-gc.collect()
+# FIXME: For now, don't delete the copied dataframe to avoid crash
 return output_col