Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Resource Design Proposal #97

Closed
tdstein opened this issue Mar 15, 2024 Discussed in #96 · 0 comments
Closed

Implement Resource Design Proposal #97

tdstein opened this issue Mar 15, 2024 Discussed in #96 · 0 comments

Comments

@tdstein
Copy link
Collaborator

tdstein commented Mar 15, 2024

Discussed in #96

Originally posted by tdstein March 15, 2024
This discussion is a continuation of #92.

tl;dr - inherit from dict and provide @property fields for stable access patterns.

class Resource(ABC, dict):
	pass

class User(Resource):
	@property
	def guid(self) -> str:
		return self["guid"]

Objective

Define a convention for Resource implementations that is intuitive to use and can provide consumers guarantees as the Connect API changes over time.

Proposal

  1. Define an ABC class named Resource class that inherits from dict.
  2. For each resource type, define a class that inherits from Resource (e.g., class User(Resource))
  3. Use a @property annotation for fields that deserve long-time-support (LTS).

Example Implementation:

class Resource(ABC, dict):
pass

class User(Resource):
	@property
	def guid(self) -> str:
		return self["guid"]

This provides two options for accessing the GUID value.

  1. Get the guid value from user with the key guid.

Example

user = User(guid="foobar")
guid = user["guid"]
  1. Get the guid value from user with the field guid.

Example

user = User(guid="foobar")
guid = user.guid

Backwards Compatibility

An implementation is backwards compatible when the implementation is written for a older version of Connect and continues operating correctly on a newer version of Connect which removes one or more fields.

This approach inherently supports backwards compatibility for the programmer under the following conditions:

  1. The program does not access removed items by key.
  2. The program accesses removed items by @property field.
  3. The program is using the latest "minor" version of posit-sdk.

For developers, these constraints are met when:

  1. The @property method implementation provides default behavior when the item is missing from the underlying dict.
class User(Resource):

	@property
	def old_field(self) -> str | None:
		return self.get("old_field", None)

or

  1. The @property method resolves to a new item when the original item is missing (i.e., a field name is changed).
class User(Resource):

	@property
	def old_field(self) -> str | None:
		return self.get("new_field", None)

To make these constraints clear to the programmer, the developer should utilize FutureWarning to notify the programmer of possible side-effects when performing key-based access patterns.

This can be accomplished by override the default behavior of all key-based dunder methods implemented by dict.

This includes the following methods:

  • __getitem__(self, key)
  • __setitem__(self, key, value)
  • __delitem__(self, key)
import warnings

class Resource(dict):

	def __getitem__(self, key):
		warnings.warn(f"__getitem__ for '{key}' does not support backwards compatibility. Consider using field based access instead: 'instance.{key}'", FutureWarning)
		super().__getitem__(key)

	def __setitem__(self, key, value):
		warnings.warn(f"__setitem__ for '{key}' does not support backwards compatibility. Consider using field based access instead: 'instance.{key} = {value}", FutureWarning)
		super().__getitem__(key, value)

	def __delitem__(self, key):
		warnings.warn(f"__delitem__ for '{key}' does not support backwards compatibility. Consider using field based access instead: 'del instance.{key}'", FutureWarning)
		super().__delitem__(key)

This behavior may be disable by ignoring FutureWarning messages at runtime.

import warnings

# Disable FutureWarning messages
warnings.filterwarnings("ignore", category=FutureWarning)

Caveats

The major caveat to this approach is that pandas does not inspect @property fields when creating a DataFrame.

Therefore, the default behavior when initializing a pandas.DataFrame with a list[Resource] argument will result in a DataFrame with all underlying dict items as cells.

This behavior can be bypassed by converting the list[Resource] before initializing the pandas.DataFrame.

For example, the following snippet provides a workaround to this behavior. This comprehension creates a new dict using the property fields as the keys. In the future, this method or could be added to the SDK to support backwards compatible support for DataFrames.

pandas.DataFrame([{k: getattr(o, k) for k in dir(o) if isinstance(getattr(o.__class__, prop, None), property)} for o in users])

Forwards Compatibility

An implementation is forwards compatible when the implementation is written for a older version of Connect and continues operating correctly on a newer version of Connect which adds one or more fields.

This approach inherently supports forwards compatibility since a Resource may be initialized with additional fields.

Alternatives

Data Classes

The dataclasses module provides a viable alternative, but introduces additional complexity that provides little value at this time.

The main benefit that dataclasses provides is the ability to define named types.

The other advantage to dataclasses is that conversion to a pandas.DataFrame is supported by pandas: https://github.com/pandas-dev/pandas/blob/v2.2.1/pandas/core/frame.py#L835

The drawback to dataclasses is that casting a dict to a dataclass an error will result if the dict contains keys not defined by the dataclass. This behavior can be modified by redefining init method and setting dataclass(init=False), but this implementation is non-obvious and will likely result in additional confusion for developers.

TypedDict

The TypedDict type suffers from the same drawbacks as @dataclass.

The default behavior of TypedDict does not allow additional kwargs to be passed on init. This behavior can be altered by setting total=False.

Additionally, TypedDict does not allow additional attributes. This forces backwards compatibility to be implemented outside of the Resource definition. This can be accomplished, but a successful implementation is tedious and error prone.

Overall, inheriting from TypedDict does not provide any major advantages over inheriting from dict.

@tdstein tdstein closed this as not planned Won't fix, can't repro, duplicate, stale Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant