feat: add new variant of the state object with a history #70

Draft
wants to merge 71 commits into
base: main
11d03a5
feat: add dataclass based historystate
hollandjg May 20, 2024
5ca28b1
refactor: move dataclass history to a new file
hollandjg May 20, 2024
b08288b
feat: add history state based on UserList
hollandjg May 21, 2024
21172ab
feat: add custom repr of history state
hollandjg May 21, 2024
25ac836
refactor: rename function to get
hollandjg May 21, 2024
0b60fd7
feat: add new extension to state with a simple history
hollandjg May 21, 2024
d1a5b40
feat: add initialization of the history based on the initial argument…
hollandjg May 21, 2024
d17ac7e
refactor: remove warnings on unused fields
hollandjg May 21, 2024
1288c93
refactor: simplify tests
hollandjg May 21, 2024
43bb491
feat: add StandardStateHistory which combines both the standaard stat…
hollandjg May 21, 2024
33d84a1
test: fix tests – missing import
hollandjg May 21, 2024
73d613e
feat: add an alternative state history which uses a different set of …
hollandjg May 21, 2024
6ae0504
feat: add some special history methods to the state
hollandjg May 21, 2024
af64987
chore: remove outdated dataclass implementation
hollandjg May 21, 2024
2ce8b1e
chore: remove outdated list implementation
hollandjg May 21, 2024
3b2d7ef
refactor: split standard implementation into its own file
hollandjg May 21, 2024
d9f335d
test: add delta import to doctest
hollandjg May 21, 2024
e0878fb
refactor: StateHistory to DeltaHistory
hollandjg May 22, 2024
88425f2
feat: add a new state with history which stores the full state object
hollandjg May 22, 2024
1893502
feat: update history to be a recursive thing which uses the parents o…
hollandjg May 22, 2024
807884d
feat: add a new stub with a linear history
hollandjg May 22, 2024
9cefaa9
feat: include the initial version of the state in the history
hollandjg May 22, 2024
03245cf
test: update doctests
hollandjg May 22, 2024
35ea519
test: update doctests
hollandjg May 22, 2024
09516c0
feat: add a new function as_of_last
hollandjg May 22, 2024
2ea7129
test: update doctests
hollandjg May 22, 2024
b23ccdc
refactor: move post_init up to the top of the declarations
hollandjg May 22, 2024
b18acc2
refactor: move methods into separate functions
hollandjg May 22, 2024
84960c1
refactor: update state_history_delta to use the new functions
hollandjg May 22, 2024
48f7d9b
chore: delete state_history_linear
hollandjg May 22, 2024
d44b09d
chore: delete state_history_recursive
hollandjg May 22, 2024
8a61167
refactor: move history functions out of the object and split into a f…
hollandjg May 22, 2024
9441b6a
test: fix broken test
hollandjg May 22, 2024
7288f22
test: fix broken test
hollandjg May 22, 2024
65d5d69
docs: add documentation and testing for history_where
hollandjg May 23, 2024
910b2a9
docs: add documentation and testing for history_contains
hollandjg May 23, 2024
36cb45d
docs: update docstring of history_of
hollandjg May 23, 2024
1f6ba34
chore: delete unused history_filter
hollandjg May 23, 2024
62c00a5
chore: update type argument
hollandjg May 23, 2024
a6281b0
docs: update docs for filter_to_last
hollandjg May 23, 2024
8560389
refactor: update history_up_to_last
hollandjg May 23, 2024
6a15611
refactor: dont' throw an error infilter_to_last if index isn't found
hollandjg May 23, 2024
96653d4
refactor: use mapping rather than mutablemapping
hollandjg May 23, 2024
aaffc08
refactor: update history function
hollandjg May 23, 2024
519a5d4
feat: add new history_of_key_where function
hollandjg May 23, 2024
17efd9e
feat: add a new function which can aggregate histories
hollandjg May 24, 2024
35fd326
Revert "feat: add a new function which can aggregate histories"
hollandjg May 24, 2024
16b06bb
docs: document some recipes for handling histories
hollandjg May 24, 2024
7997f15
feat: add a new History class
hollandjg May 24, 2024
d2e83dd
refactor: make helper functions private and updated examples
hollandjg May 24, 2024
225d691
refactor: use new HistoryObject in DeltaHistory
hollandjg May 24, 2024
a8167e8
docs: add type hints
hollandjg May 24, 2024
40fdd99
docs: simplify doctests
hollandjg May 24, 2024
69d6143
docs: simplify doctests
hollandjg May 24, 2024
59d00a9
docs: simplify doctests
hollandjg May 24, 2024
f30170d
docs: simplify queries using history object
hollandjg May 24, 2024
1e79b7b
refactor: remove unneeded special function
hollandjg May 24, 2024
8a1d78f
refactor: rename shadowed function
hollandjg May 24, 2024
785a91f
feat: add arbitrary history filter to last function
hollandjg May 24, 2024
c5ade53
refactor: simplify code to get values in condition
hollandjg May 24, 2024
79b1713
test: update error string
hollandjg May 24, 2024
8fd9b52
refactor: add reconstruct method on the history
hollandjg May 24, 2024
8638190
refactor: make the _history_up_to_last also accept single field names…
hollandjg May 24, 2024
b54f1b3
docs: update documentation for AlternateDeltaHistory using the new hi…
hollandjg May 24, 2024
9b5079e
test: fix doctest for python 3.9 and lower
hollandjg May 24, 2024
cfb3275
feat: add starting point of delta_to_state_transformed
hollandjg May 24, 2024
c738808
test: update formatting of tests
hollandjg May 24, 2024
faf554b
docs: add starting notebook for States with History tutorial.
hollandjg May 24, 2024
44ef8fc
refactor: move dataclass history to a new file
hollandjg May 20, 2024
d6f1c36
feat: add history state based on UserList
hollandjg May 21, 2024
003f2ad
type: fix type for add function
hollandjg Jun 14, 2024
364 changes: 364 additions & 0 deletions docs/States with History.ipynb
Member Author


Starting notebook on examples.
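
For readers who don't have the branch checked out, here is a minimal sketch of the idea the notebook demonstrates: the history stores the initial state followed by each Delta, and a state can be rebuilt by replaying a slice of that list. `SketchState` and the free function `reconstruct` are hypothetical stand-ins written for this comment, not the classes in this PR, and the sketch assumes every field uses "replace" behaviour.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional, Sequence


@dataclass(frozen=True)
class SketchState:
    """Hypothetical stand-in for a state whose fields are replaced by each delta."""

    conditions: Optional[Sequence[int]] = None
    model: Optional[Any] = None


def reconstruct(history: Sequence[Any]) -> SketchState:
    """Replay a history: the first entry is the initial state, the rest are deltas."""
    state = history[0]
    for delta in history[1:]:
        # Each delta is a plain dict of field name -> new value.
        state = replace(state, **delta)
    return state


history = [
    SketchState(),                       # initial (empty) state
    {"conditions": list(range(5))},      # first delta
    {"conditions": list(range(5, 10))},  # second delta
]

latest = reconstruct(history)
after_first_delta = reconstruct(history[:2])
print(latest.conditions)             # [5, 6, 7, 8, 9]
print(after_first_delta.conditions)  # [0, 1, 2, 3, 4]
```

Slicing with `history[:2]` keeps the initial state and the first delta only, which mirrors the `s.history[:2].reconstruct()` call in the notebook.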

@@ -0,0 +1,364 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "3a5b711036c1580c",
"metadata": {},
"source": [
"# State Objects with History"
]
},
{
"cell_type": "markdown",
"id": "4f3b8109e7e35d3a",
"metadata": {},
"source": [
"Some state objects use a history, rather than the fields directly in the state, to record changes in an experiment's \n",
"data over time."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "initial_id",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AlternateDeltaHistory(history=[...], variables=None, conditions=None, experiment_data=None, model=None)"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import random\n",
"from pprint import pp\n",
"\n",
"from autora.state import Delta\n",
"from autora.state_history_delta_alternative import AlternateDeltaHistory\n",
"\n",
"AlternateDeltaHistory()"
]
},
{
"cell_type": "markdown",
"id": "f6a614d39f993a06",
"metadata": {},
"source": [
"We create an empty state and add initial data using Deltas. Each time this state is updated, the values in the fields `variables`, `conditions` etc. are replaced:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1af8d93494c5256d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>x</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>9</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" x\n",
"0 5\n",
"1 6\n",
"2 7\n",
"3 8\n",
"4 9"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"s = AlternateDeltaHistory() + Delta(conditions=dict(x=range(5))) + Delta(conditions=dict(x=range(5, 10)))\n",
"s.conditions"
]
},
{
"cell_type": "markdown",
"id": "3aed485655696cc8",
"metadata": {},
"source": [
"But the history keeps a record of the changes: the initial state (empty), and then the two Deltas:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4f746a8af05c195",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"History([AlternateDeltaHistory(history=[...], variables=None, conditions=None, experiment_data=None, model=None),\n",
" {'conditions': {'x': range(0, 5)}},\n",
" {'conditions': {'x': range(5, 10)}}])"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s.history"
]
},
{
"cell_type": "markdown",
"id": "bef0879f430ea121",
"metadata": {},
"source": [
"We can reconstruct the state at any point by slicing the history and using the `.reconstruct` method, here after the \n",
"first Delta but before the second:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc7f9c6532f28f8d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>x</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" x\n",
"0 0\n",
"1 1\n",
"2 2\n",
"3 3\n",
"4 4"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s.history[:2].reconstruct().conditions"
]
},
{
"cell_type": "markdown",
"id": "e72803574116a22f",
"metadata": {},
"source": [
"By adding additional metadata to the Deltas, we can make it easier to find particular states. This might be useful in\n",
" a complex AutoRA cycle where different steps need different versions of the same data. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b6d6e76dad8beff",
"metadata": {},
"outputs": [],
"source": [
"def apply_transformation_to_input_state(function):\n",
" \"\"\"Decorator which applies a transformation to the input state\"\"\"\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d187204c2e7a82a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Empty DataFrame\n",
"Columns: [x]\n",
"Index: []\n",
"[AlternateDeltaHistory(history=[...], variables=None, conditions=Empty DataFrame\n",
"Columns: [x]\n",
"Index: [], experiment_data=None, model=None), {'conditions': {'x': [0]}}]\n",
" x\n",
"0 0\n",
"[AlternateDeltaHistory(history=[...], variables=None, conditions=Empty DataFrame\n",
"Columns: [x]\n",
"Index: [], experiment_data=None, model=None), {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}]\n",
" x\n",
"0 1\n",
"[AlternateDeltaHistory(history=[...], variables=None, conditions=Empty DataFrame\n",
"Columns: [x]\n",
"Index: [], experiment_data=None, model=None), {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}, {'conditions': {'x': [0]}}]\n",
" x\n",
"0 0\n",
"[AlternateDeltaHistory(history=[...], variables=None, conditions=Empty DataFrame\n",
"Columns: [x]\n",
"Index: [], experiment_data=None, model=None), {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}, {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}]\n",
" x\n",
"0 1\n",
"[AlternateDeltaHistory(history=[...], variables=None, conditions=Empty DataFrame\n",
"Columns: [x]\n",
"Index: [], experiment_data=None, model=None), {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}, {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}, {'conditions': {'x': [0]}}]\n",
" x\n",
"0 0\n",
"[AlternateDeltaHistory(history=[...], variables=None, conditions=Empty DataFrame\n",
"Columns: [x]\n",
"Index: [], experiment_data=None, model=None), {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}, {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}, {'conditions': {'x': [0]}}, {'conditions': {'x': [1]}}]\n"
]
}
],
"source": [
"from autora.state_history_delta import delta_to_state_transformed\n",
"from autora.state import inputs_from_state\n",
"\n",
"t = AlternateDeltaHistory(conditions=pd.DataFrame({\"x\": []}))\n",
"\n",
"# TODO: At this step we need a function which we can use to concatenate all of the \n",
"# TODO: raw condition data but return them as part of the state. We don't have the \n",
"# TODO: necessary helper functions on the History yet.\n",
"@delta_to_state_transformed(lambda s: s.where(meta=\"raw\")) \n",
"@inputs_from_state\n",
"def experimentalist(conditions):\n",
" possible_conditions = set(range(10))\n",
" print(conditions)\n",
" already_seen_conditions = set(conditions[\"x\"]) \n",
" allowed_conditions = possible_conditions - already_seen_conditions\n",
" conditions_out = min(allowed_conditions)\n",
" return Delta(conditions=dict(x=[conditions_out]))\n",
"\n",
"for i in range(6):\n",
" t = experimentalist(t)\n",
" print(t.history)\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e42f5f53024068ff",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
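
A note on the last code cell of the notebook: in the recorded output the experimentalist alternates between 0 and 1, which appears to be because each Delta replaces `conditions`, so the function only ever sees the most recent value. The TODO asks for a helper which concatenates all of the raw condition data. The sketch below shows that idea on a plain list; `all_conditions_seen` and `next_condition` are hypothetical names and none of this is the History API from this PR.

```python
from typing import Any, List, Mapping, Sequence


def all_conditions_seen(history: Sequence[Any], field: str = "x") -> List[int]:
    """Collect every value of `field` from the condition deltas in a history.

    Entries which are not mappings (such as the initial state object) are skipped.
    """
    seen: List[int] = []
    for entry in history:
        if isinstance(entry, Mapping) and "conditions" in entry:
            seen.extend(entry["conditions"][field])
    return seen


def next_condition(history: Sequence[Any], n_possible: int = 10) -> int:
    """Pick the smallest condition which has not appeared anywhere in the history."""
    allowed = set(range(n_possible)) - set(all_conditions_seen(history))
    return min(allowed)


# The first entry stands in for the initial state object.
history: List[Any] = ["initial state placeholder"]
for _ in range(6):
    history.append({"conditions": {"x": [next_condition(history)]}})

picked = all_conditions_seen(history)
print(picked)  # [0, 1, 2, 3, 4, 5]
```

Because the helper reads the whole history rather than the current field value, each cycle picks a new condition instead of repeating an earlier one.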
8 changes: 4 additions & 4 deletions src/autora/state.py
Member Author


These changes to the basic state add a `warn_on_unused_fields` option, so that we don't have to emit a warning when a Delta contains a field which the state can't use, which may often be the case for metadata fields.
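
One thing to keep in mind: the `+` operator only ever passes the right-hand operand to `__add__`, so the new keyword can only be set by calling `__add__` directly. The sketch below illustrates this with a hypothetical stand-in class (`SketchState` is not the real `State`, and the `meta` field is an assumed example of metadata):

```python
import warnings
from dataclasses import dataclass, fields, replace
from typing import Mapping, Optional


@dataclass(frozen=True)
class SketchState:
    """Hypothetical stand-in, not the real autora State."""

    conditions: Optional[list] = None

    def __add__(self, other: Mapping, warn_on_unused_fields: bool = True):
        known = {f.name for f in fields(self)}
        updates = {k: v for k, v in other.items() if k in known}
        unused = [k for k in other if k not in known]
        if warn_on_unused_fields and len(unused) > 0:
            warnings.warn("These fields: %s could not be used" % unused)
        return replace(self, **updates)


# "meta" is not a field of SketchState, so it counts as unused.
delta = {"conditions": [1, 2, 3], "meta": "raw"}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The operator form always uses the default, so this warns.
    loud = SketchState() + delta
    # Calling the method directly lets us switch the warning off.
    quiet = SketchState().__add__(delta, warn_on_unused_fields=False)

print(len(caught))  # 1
```

Both calls produce the same resulting state; only the warning differs.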

@@ -214,7 +214,7 @@ class State:

     """

-    def __add__(self, other: Union[Delta, Mapping]):
+    def __add__(self, other: Union[Delta, Mapping], warn_on_unused_fields=True):
         updates = dict()
         other_fields_unused = list(other.keys())
         for self_field in fields(self):
@@ -245,7 +245,7 @@ def __add__(self, other: Union[Delta, Mapping]):
                     "delta_behaviour=`%s` not implemented" % delta_behavior
                 )

-        if len(other_fields_unused) > 0:
+        if warn_on_unused_fields and len(other_fields_unused) > 0:
             warnings.warn(
                 "These fields: %s could not be used to update %s, "
                 "which has these fields & aliases: %s"
@@ -1104,8 +1104,8 @@ class StandardState(State):
     1 7

     Datatypes which are incompatible with a pd.DataFrame will throw an error:
-    >>> s + Delta(conditions="not compatible with pd.DataFrame") \
-    # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
+    >>> s + Delta(conditions="not compatible with pd.DataFrame")
+    ... # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
     Traceback (most recent call last):
     ...
     ValueError: ...