Publishing v3.0.84
SireInsectus committed Jun 26, 2023
1 parent 61cc44d commit 3f83506
Showing 8 changed files with 48 additions and 47 deletions.
2 changes: 1 addition & 1 deletion docs/dbacademy.dbhelper.dataset_manager_class.html
@@ -47,7 +47,7 @@
This&nbsp;ensures&nbsp;that&nbsp;data&nbsp;and&nbsp;compute&nbsp;are&nbsp;in&nbsp;the&nbsp;same&nbsp;region&nbsp;which&nbsp;subsequently&nbsp;mitigates&nbsp;performance&nbsp;issues<br>
when&nbsp;the&nbsp;storage&nbsp;and&nbsp;compute&nbsp;are,&nbsp;for&nbsp;example,&nbsp;on&nbsp;opposite&nbsp;sides&nbsp;of&nbsp;the&nbsp;world.</tt></dd></dl>

-<dl><dt><a name="DatasetManager-validate_datasets"><strong>validate_datasets</strong></a>(self, fail_fast: bool) -&gt; None</dt><dd><tt>Validates&nbsp;the&nbsp;"install"&nbsp;of&nbsp;the&nbsp;datasets&nbsp;by&nbsp;recursively&nbsp;listing&nbsp;all&nbsp;files&nbsp;in&nbsp;the&nbsp;remote&nbsp;data&nbsp;repository&nbsp;as&nbsp;well&nbsp;as&nbsp;the&nbsp;local&nbsp;data&nbsp;repository,&nbsp;validating&nbsp;that&nbsp;each&nbsp;file&nbsp;exists&nbsp;but&nbsp;DOES&nbsp;NOT&nbsp;validate&nbsp;file&nbsp;size&nbsp;or&nbsp;checksum.</tt></dd></dl>
+<dl><dt><a name="DatasetManager-validate_datasets"><strong>validate_datasets</strong></a>(self, fail_fast: bool) -&gt; None</dt></dl>

<hr>
Static methods defined here:<br>
2 changes: 1 addition & 1 deletion docs/dbacademy.dbrest.instance_pools.html
@@ -45,7 +45,7 @@
Methods defined here:<br>
<dl><dt><a name="InstancePoolsClient-__init__"><strong>__init__</strong></a>(self, client: dbacademy.dbrest.client.DBAcademyRestClient)</dt><dd><tt>Initialize&nbsp;self.&nbsp;&nbsp;See&nbsp;<a href="#InstancePoolsClient-help">help</a>(type(self))&nbsp;for&nbsp;accurate&nbsp;signature.</tt></dd></dl>

-<dl><dt><a name="InstancePoolsClient-create"><strong>create</strong></a>(self, name: str, definition: dict, tags: &lt;function InstancePoolsClient.list at 0x00000218B4B02B80&gt; = None)</dt></dl>
+<dl><dt><a name="InstancePoolsClient-create"><strong>create</strong></a>(self, name: str, definition: dict, tags: &lt;function InstancePoolsClient.list at 0x000001961AFF9EE0&gt; = None)</dt></dl>

<dl><dt><a name="InstancePoolsClient-create_or_update"><strong>create_or_update</strong></a>(self, instance_pool_name: str, idle_instance_autotermination_minutes: int, min_idle_instances: int = 0, max_capacity: int = None, node_type_id: str = None, preloaded_spark_version: str = None, tags: dict = None)</dt></dl>
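
Note on the hunk above: the only change is the memory address inside `<function InstancePoolsClient.list at 0x…>`. The most likely cause, assuming the class defines a method named `list`, is that the `tags: list` annotation resolves to that method instead of the builtin, so the generated docs embed the method's `repr`, and the address differs on every run. A minimal sketch of the effect; `PoolsClientDemo` and `create_fixed` are hypothetical names, not part of dbacademy:

```python
import builtins
from typing import Optional


class PoolsClientDemo:
    """Hypothetical stand-in; not the real InstancePoolsClient."""

    def list(self):
        return []

    # Inside the class body, `list` now refers to the method above,
    # so this annotation is a function object, not the builtin type.
    def create(self, name: str, tags: list = None):
        return name, tags

    # Assumed fix: name the builtin explicitly so the annotation is stable.
    def create_fixed(self, name: str, tags: Optional[builtins.list] = None):
        return name, tags


shadowed = PoolsClientDemo.create.__annotations__["tags"]
print(shadowed is builtins.list)         # False
print(shadowed is PoolsClientDemo.list)  # True
print(repr(shadowed))                    # <function PoolsClientDemo.list at 0x...>
```

With an annotation like the one in `create_fixed`, regenerating the docs would no longer produce a diff for an unchanged signature.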

8 changes: 4 additions & 4 deletions docs/dbacademy.dbrest.pipelines.html
@@ -53,11 +53,11 @@
Methods defined here:<br>
<dl><dt><a name="PipelinesClient-__init__"><strong>__init__</strong></a>(self, client: dbacademy.dbrest.client.DBAcademyRestClient)</dt><dd><tt>Initialize&nbsp;self.&nbsp;&nbsp;See&nbsp;<a href="#PipelinesClient-help">help</a>(type(self))&nbsp;for&nbsp;accurate&nbsp;signature.</tt></dd></dl>

-<dl><dt><a name="PipelinesClient-create"><strong>create</strong></a>(self, name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, libraries: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, clusters: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True)</dt></dl>
+<dl><dt><a name="PipelinesClient-create"><strong>create</strong></a>(self, name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, libraries: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, clusters: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True)</dt></dl>

<dl><dt><a name="PipelinesClient-create_from_dict"><strong>create_from_dict</strong></a>(self, params: dict)</dt></dl>

-<dl><dt><a name="PipelinesClient-create_or_update"><strong>create_or_update</strong></a>(self, name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, libraries: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, clusters: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True, pipeline_id: Optional[str] = None)</dt></dl>
+<dl><dt><a name="PipelinesClient-create_or_update"><strong>create_or_update</strong></a>(self, name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, libraries: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, clusters: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True, pipeline_id: Optional[str] = None)</dt></dl>

<dl><dt><a name="PipelinesClient-delete_by_id"><strong>delete_by_id</strong></a>(self, pipeline_id)</dt></dl>

@@ -75,15 +75,15 @@

<dl><dt><a name="PipelinesClient-start_by_name"><strong>start_by_name</strong></a>(self, name: str)</dt></dl>

-<dl><dt><a name="PipelinesClient-update"><strong>update</strong></a>(self, pipeline_id: str, name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, libraries: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, clusters: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True)</dt></dl>
+<dl><dt><a name="PipelinesClient-update"><strong>update</strong></a>(self, pipeline_id: str, name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, libraries: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, clusters: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True)</dt></dl>

<dl><dt><a name="PipelinesClient-update_from_dict"><strong>update_from_dict</strong></a>(self, pipeline_id: str, params: dict)</dt></dl>

<hr>
Static methods defined here:<br>
<dl><dt><a name="PipelinesClient-existing_to_create"><strong>existing_to_create</strong></a>(pipeline)</dt></dl>

-<dl><dt><a name="PipelinesClient-to_dict"><strong>to_dict</strong></a>(name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, libraries: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, clusters: &lt;function PipelinesClient.list at 0x00000218B4C403A0&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True)</dt></dl>
+<dl><dt><a name="PipelinesClient-to_dict"><strong>to_dict</strong></a>(name: str, storage: str, target: str, continuous: bool = False, development: bool = True, configuration: dict = None, notebooks: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, libraries: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, clusters: &lt;function PipelinesClient.list at 0x000001961B033280&gt; = None, min_workers: int = 0, max_workers: int = 0, photon: bool = True)</dt></dl>

<hr>
Methods inherited from <a href="dbacademy.rest.common.html#ApiContainer">dbacademy.rest.common.ApiContainer</a>:<br>
8 changes: 4 additions & 4 deletions docs/dbacademy.dbrest.sql.endpoints.html
@@ -51,7 +51,7 @@

<dl><dt><a name="SqlWarehousesClient-create_user_endpoint"><strong>create_user_endpoint</strong></a>(self, user, naming_template: str, naming_params: dict, cluster_size: str, enable_serverless_compute: bool, min_num_clusters: int, max_num_clusters: int, auto_stop_mins: int, enable_photon: bool, spot_instance_policy: str, channel: str, tags: dict)</dt></dl>

-<dl><dt><a name="SqlWarehousesClient-create_user_endpoints"><strong>create_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, cluster_size: str, enable_serverless_compute: bool, min_num_clusters: int = 1, max_num_clusters: int = 1, auto_stop_mins: int = 120, enable_photon: bool = True, spot_instance_policy: str = 'RELIABILITY_OPTIMIZED', channel: str = 'CHANNEL_NAME_CURRENT', tags: dict = None, users: &lt;function SqlWarehousesClient.list at 0x00000218B4C138B0&gt; = None)</dt><dd><tt>Creates&nbsp;one&nbsp;SQL&nbsp;endpoint&nbsp;per&nbsp;user&nbsp;in&nbsp;the&nbsp;current&nbsp;workspace.&nbsp;The&nbsp;list&nbsp;of&nbsp;users&nbsp;can&nbsp;be&nbsp;limited&nbsp;to&nbsp;a&nbsp;subset&nbsp;of&nbsp;users&nbsp;with&nbsp;the&nbsp;"users"&nbsp;parameter.<br>
+<dl><dt><a name="SqlWarehousesClient-create_user_endpoints"><strong>create_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, cluster_size: str, enable_serverless_compute: bool, min_num_clusters: int = 1, max_num_clusters: int = 1, auto_stop_mins: int = 120, enable_photon: bool = True, spot_instance_policy: str = 'RELIABILITY_OPTIMIZED', channel: str = 'CHANNEL_NAME_CURRENT', tags: dict = None, users: &lt;function SqlWarehousesClient.list at 0x000001961B044790&gt; = None)</dt><dd><tt>Creates&nbsp;one&nbsp;SQL&nbsp;endpoint&nbsp;per&nbsp;user&nbsp;in&nbsp;the&nbsp;current&nbsp;workspace.&nbsp;The&nbsp;list&nbsp;of&nbsp;users&nbsp;can&nbsp;be&nbsp;limited&nbsp;to&nbsp;a&nbsp;subset&nbsp;of&nbsp;users&nbsp;with&nbsp;the&nbsp;"users"&nbsp;parameter.<br>
Parameters:&nbsp;<br>
&nbsp;&nbsp;&nbsp;&nbsp;naming_template&nbsp;(str):&nbsp;The&nbsp;template&nbsp;used&nbsp;to&nbsp;name&nbsp;each&nbsp;user's&nbsp;endpoint.<br>
&nbsp;&nbsp;&nbsp;&nbsp;naming_params&nbsp;(str):&nbsp;The&nbsp;parameters&nbsp;used&nbsp;in&nbsp;completing&nbsp;the&nbsp;template.<br>
@@ -72,7 +72,7 @@

<dl><dt><a name="SqlWarehousesClient-delete_user_endpoint"><strong>delete_user_endpoint</strong></a>(self, user, naming_template: str, naming_params: dict)</dt></dl>

-<dl><dt><a name="SqlWarehousesClient-delete_user_endpoints"><strong>delete_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, users: &lt;function SqlWarehousesClient.list at 0x00000218B4C138B0&gt; = None)</dt></dl>
+<dl><dt><a name="SqlWarehousesClient-delete_user_endpoints"><strong>delete_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, users: &lt;function SqlWarehousesClient.list at 0x000001961B044790&gt; = None)</dt></dl>

<dl><dt><a name="SqlWarehousesClient-edit"><strong>edit</strong></a>(self, endpoint_id: str, name: str = None, cluster_size: str = None, enable_serverless_compute: bool = None, min_num_clusters: int = None, max_num_clusters: int = None, auto_stop_mins: int = None, enable_photon: bool = None, spot_instance_policy: str = None, channel: str = None, tags: dict = None)</dt></dl>

@@ -86,13 +86,13 @@

<dl><dt><a name="SqlWarehousesClient-start_user_endpoint"><strong>start_user_endpoint</strong></a>(self, user, naming_template: str, naming_params: dict)</dt></dl>

-<dl><dt><a name="SqlWarehousesClient-start_user_endpoints"><strong>start_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, users: &lt;function SqlWarehousesClient.list at 0x00000218B4C138B0&gt; = None)</dt></dl>
+<dl><dt><a name="SqlWarehousesClient-start_user_endpoints"><strong>start_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, users: &lt;function SqlWarehousesClient.list at 0x000001961B044790&gt; = None)</dt></dl>

<dl><dt><a name="SqlWarehousesClient-stop"><strong>stop</strong></a>(self, endpoint_id)</dt></dl>

<dl><dt><a name="SqlWarehousesClient-stop_user_endpoint"><strong>stop_user_endpoint</strong></a>(self, user, naming_template: str, naming_params: dict)</dt></dl>

-<dl><dt><a name="SqlWarehousesClient-stop_user_endpoints"><strong>stop_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, users: &lt;function SqlWarehousesClient.list at 0x00000218B4C138B0&gt; = None)</dt></dl>
+<dl><dt><a name="SqlWarehousesClient-stop_user_endpoints"><strong>stop_user_endpoints</strong></a>(self, naming_template: str, naming_params: dict, users: &lt;function SqlWarehousesClient.list at 0x000001961B044790&gt; = None)</dt></dl>

<dl><dt><a name="SqlWarehousesClient-update"><strong>update</strong></a>(self, endpoint_id: str, name: str = None, cluster_size: str = None, enable_serverless_compute: bool = None, min_num_clusters: int = None, max_num_clusters: int = None, auto_stop_mins: int = None, enable_photon: bool = None, spot_instance_policy: str = None, channel: str = None, tags: dict = None)</dt><dd><tt>#&nbsp;TODO&nbsp;[email protected]:&nbsp;Potential&nbsp;bugs.<br>
#&nbsp;noinspection&nbsp;PyUnresolvedReferences</tt></dd></dl>
4 changes: 2 additions & 2 deletions docs/dbacademy.dougrest.runs.html
@@ -47,11 +47,11 @@

<dl><dt><a name="Runs-cancel"><strong>cancel</strong></a>(self, run: Union[int, dict], *, if_not_exists: str = 'error') -&gt; dict</dt></dl>

-<dl><dt><a name="Runs-cancel_all"><strong>cancel_all</strong></a>(self, job_id: int = None) -&gt; &lt;function Runs.list at 0x00000218B4CA93A0&gt;</dt></dl>
+<dl><dt><a name="Runs-cancel_all"><strong>cancel_all</strong></a>(self, job_id: int = None) -&gt; &lt;function Runs.list at 0x000001961B119280&gt;</dt></dl>

<dl><dt><a name="Runs-delete"><strong>delete</strong></a>(self, run: Union[int, dict], *, if_not_exists: str = 'error') -&gt; dict</dt></dl>

-<dl><dt><a name="Runs-delete_all"><strong>delete_all</strong></a>(self, job_id: int = None) -&gt; &lt;function Runs.list at 0x00000218B4CA93A0&gt;</dt></dl>
+<dl><dt><a name="Runs-delete_all"><strong>delete_all</strong></a>(self, job_id: int = None) -&gt; &lt;function Runs.list at 0x000001961B119280&gt;</dt></dl>

<dl><dt><a name="Runs-get"><strong>get</strong></a>(self, run: Union[int, dict], *, if_not_exists: str = 'error') -&gt; dict</dt><dd><tt>#&nbsp;TODO&nbsp;Remove&nbsp;unused&nbsp;parameter<br>
#&nbsp;noinspection&nbsp;PyUnusedLocal</tt></dd></dl>
6 changes: 3 additions & 3 deletions docs/dbacademy.dougrest.workspace.html
@@ -69,13 +69,13 @@

<dl><dt><a name="Workspace-is_empty"><strong>is_empty</strong></a>(self, workspace_path)</dt></dl>

-<dl><dt><a name="Workspace-list"><strong>list</strong></a>(self, workspace_path, sort_key=&lt;function Workspace.&lt;lambda&gt; at 0x00000218B49FAE50&gt;)</dt></dl>
+<dl><dt><a name="Workspace-list"><strong>list</strong></a>(self, workspace_path, sort_key=&lt;function Workspace.&lt;lambda&gt; at 0x000001961AE60E50&gt;)</dt></dl>

-<dl><dt><a name="Workspace-list_names"><strong>list_names</strong></a>(self, workspace_path, sort_key=&lt;function Workspace.&lt;lambda&gt; at 0x00000218B49FAF70&gt;)</dt></dl>
+<dl><dt><a name="Workspace-list_names"><strong>list_names</strong></a>(self, workspace_path, sort_key=&lt;function Workspace.&lt;lambda&gt; at 0x000001961AE60F70&gt;)</dt></dl>

<dl><dt><a name="Workspace-mkdirs"><strong>mkdirs</strong></a>(self, workspace_path)</dt></dl>

-<dl><dt><a name="Workspace-walk"><strong>walk</strong></a>(self, workspace_path, sort_key=&lt;function Workspace.&lt;lambda&gt; at 0x00000218B4A040D0&gt;)</dt><dd><tt>Recursively&nbsp;list&nbsp;files&nbsp;into&nbsp;an&nbsp;iterator.&nbsp;&nbsp;Sorting&nbsp;within&nbsp;a&nbsp;directory&nbsp;is&nbsp;done&nbsp;by&nbsp;the&nbsp;provided&nbsp;sort_key.</tt></dd></dl>
+<dl><dt><a name="Workspace-walk"><strong>walk</strong></a>(self, workspace_path, sort_key=&lt;function Workspace.&lt;lambda&gt; at 0x000001961AE5E0D0&gt;)</dt><dd><tt>Recursively&nbsp;list&nbsp;files&nbsp;into&nbsp;an&nbsp;iterator.&nbsp;&nbsp;Sorting&nbsp;within&nbsp;a&nbsp;directory&nbsp;is&nbsp;done&nbsp;by&nbsp;the&nbsp;provided&nbsp;sort_key.</tt></dd></dl>

<hr>
Static methods defined here:<br>
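
Note on the hunk above: the `sort_key=<function Workspace.<lambda> at 0x…>` entries change for a related reason. A lambda used as a default argument is rendered through its `repr`, which includes a per-run memory address. A minimal sketch, assuming a `None` sentinel would be acceptable to callers; both functions and their sort keys are hypothetical stand-ins, not the real `Workspace` methods:

```python
import inspect


def list_names(workspace_path, sort_key=lambda name: name.lower()):
    """Hypothetical stand-in for a method with a lambda default."""
    return sorted(["b", "A", "c"], key=sort_key)


def list_names_stable(workspace_path, sort_key=None):
    """Same behaviour, but the default is resolved inside the function."""
    if sort_key is None:
        sort_key = str.lower
    return sorted(["b", "A", "c"], key=sort_key)


# The first signature embeds the lambda's repr (address varies per run);
# the second renders identically on every run.
print(inspect.signature(list_names))
print(inspect.signature(list_names_stable))  # (workspace_path, sort_key=None)
```

The trade-off is that the default ordering is no longer visible in the signature and would need to be stated in the docstring.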
2 changes: 1 addition & 1 deletion setup.py
@@ -2,7 +2,7 @@
 from setuptools import find_packages

 setuptools.setup(
-    version="v3.0.83",
+    version="v3.0.84",
     name="dbacademy",
     author="Databricks, Inc",
     maintainer="Databricks Academy",
63 changes: 32 additions & 31 deletions src/dbacademy/dbhelper/dataset_manager_class.py
@@ -104,37 +104,38 @@ def install_dataset(self, *, install_min_time: Optional[str], install_max_time:
         self.validate_datasets(fail_fast=False)

     def validate_datasets(self, fail_fast: bool) -> None:
-        """
-        Validates the "install" of the datasets by recursively listing all files in the remote data repository as well as the local data repository, validating that each file exists but DOES NOT validate file size or checksum.
-        """
-        from dbacademy import dbgems
-
-        validation_start = dbgems.clock_start()
-
-        if self.staging_source_uri == self.data_source_uri:
-            # When working with staging data, we need to enumerate what is in there
-            # and use it as a definitive source to the complete enumeration of our files
-            start = dbgems.clock_start()
-            print("\nEnumerating staged files for validation", end="...")
-            self.__remote_files = DatasetManager.list_r(self.staging_source_uri)
-            print(dbgems.clock_stopped(start))
-            print()
-
-        print(f"\nValidating the locally installed datasets:")
-
-        self.__validate_and_repair()
-
-        if self.fixes == 1:
-            print(f"| fixed 1 issue", end="...")
-        elif self.fixes > 0:
-            print(f"| fixed {self.fixes} issues", end="...")
-        else:
-            print(f"| validation completed", end="...")
-
-        print(dbgems.clock_stopped(validation_start, " total"))
-
-        if fail_fast:
-            assert self.fixes == 0, f"Unexpected modifications to source datasets."
+        pass
+        # """
+        # Validates the "install" of the datasets by recursively listing all files in the remote data repository as well as the local data repository, validating that each file exists but DOES NOT validate file size or checksum.
+        # """
+        # from dbacademy import dbgems
+        #
+        # validation_start = dbgems.clock_start()
+        #
+        # if self.staging_source_uri == self.data_source_uri:
+        # # When working with staging data, we need to enumerate what is in there
+        # # and use it as a definitive source to the complete enumeration of our files
+        # start = dbgems.clock_start()
+        # print("\nEnumerating staged files for validation", end="...")
+        # self.__remote_files = DatasetManager.list_r(self.staging_source_uri)
+        # print(dbgems.clock_stopped(start))
+        # print()
+        #
+        # print(f"\nValidating the locally installed datasets:")
+        #
+        # self.__validate_and_repair()
+        #
+        # if self.fixes == 1:
+        # print(f"| fixed 1 issue", end="...")
+        # elif self.fixes > 0:
+        # print(f"| fixed {self.fixes} issues", end="...")
+        # else:
+        # print(f"| validation completed", end="...")
+        #
+        # print(dbgems.clock_stopped(validation_start, " total"))
+        #
+        # if fail_fast:
+        # assert self.fixes == 0, f"Unexpected modifications to source datasets."

     def __validate_and_repair(self) -> None:
         from dbacademy import dbgems
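
Note on the hunk above: with the body replaced by `pass`, `validate_datasets` becomes a no-op and `fail_fast` no longer has any effect. The disabled docstring describes an existence-only check: list the remote and local repositories recursively and confirm each file is present, without comparing size or checksum. A minimal sketch of that kind of check, assuming both listings are already available as collections of relative paths; `missing_and_extra` is a hypothetical helper, not part of dbacademy and not how `__validate_and_repair` is necessarily implemented:

```python
from typing import Iterable, Set, Tuple


def missing_and_extra(remote_files: Iterable[str],
                      local_files: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Compare two file listings by path only (no size or checksum)."""
    remote, local = set(remote_files), set(local_files)
    missing = remote - local  # present remotely, absent locally
    extra = local - remote    # present locally, absent remotely
    return missing, extra


missing, extra = missing_and_extra(
    ["a.csv", "dir/b.parquet"],
    ["a.csv", "tmp/c.json"],
)
print(sorted(missing))  # ['dir/b.parquet']
print(sorted(extra))    # ['tmp/c.json']
```

A path-only comparison like this cannot detect truncated or corrupted files, which matches the limitation stated in the original docstring.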