
Commit

add example
younesStrittmatter committed Jul 9, 2024
1 parent 77461bc commit 249c571
Showing 3 changed files with 380 additions and 39 deletions.
329 changes: 313 additions & 16 deletions docs/Basic Usage.ipynb
@@ -7,45 +7,342 @@
"collapsed": false
},
"source": [
"# Basic Usage\n"
"# Basic Usage\n",
"Here, we show how to randomly sample a sequence of rewards which can be used in a bandit task.\n",
"A bandit task provides the participant with multiple options (number of arms). Each arm has a reward probability.\n",
"Here, we show how to create a sequence of reward probabilities and rewards for a 2-arm bandit task."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from autora.experimentalist.bandit_random import Example"
"import pandas as pd\n",
"\n",
"from autora.experimentalist.bandit_random import bandit_random_pool_proba, \\\n",
" bandit_random_pool_from_proba, bandit_random_pool"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Include inline mathematics like this: $4 < 5$\n",
"This package provides functions to randomly sample a list of\n",
"- probability sequences\n",
"- reward sequences\n",
"\n",
"Include block mathematics like this (don't forget the empty lines above and below the block):\n",
"# Pool_proba\n",
"First, we can use default values, to create a sequence, where the reward probability is .5 for each arm.\n",
"We need to pass in the number of arms and the length of the sequence that we want to generate:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"outputs": [
{
"data": {
"text/plain": "[[[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]]"
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"default_probability_sequences = bandit_random_pool_proba(num_probabilities=2, sequence_length=4)\n",
"default_probability_sequences"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"We also can set initial values:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 29,
"outputs": [
{
"data": {
"text/plain": "[[[0.1, 0.9], [0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]]"
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"constant_probability_sequence = bandit_random_pool_proba(num_probabilities=2, sequence_length=4, initial_probabilities=[.1, .9])\n",
"constant_probability_sequence"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"We can do the same for drift rates:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 30,
"outputs": [
{
"data": {
"text/plain": "[[[0.1, 0.9],\n [0.2, 0.8],\n [0.30000000000000004, 0.7000000000000001],\n [0.4, 0.6000000000000001]]]"
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"changing_probability_sequence = bandit_random_pool_proba(num_probabilities=2, sequence_length=4, initial_probabilities=[0.1, .9], drift_rates=[.1, -.1])\n",
"changing_probability_sequence"
],
"metadata": {
"collapsed": false
}
},
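{
"cell_type": "markdown",
"source": [
"The drift is additive: on each trial, every arm's probability moves by its drift rate. As a quick illustration (a minimal sketch based on the output above, not a package function), we can recompute the drifted sequence by hand:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# Minimal sketch (assumption: each probability advances by its drift rate per trial,\n",
"# matching the output above); this does not call the package.\n",
"initial = [0.1, 0.9]\n",
"drifts = [0.1, -0.1]\n",
"manual_sequence = [[p + t * d for p, d in zip(initial, drifts)] for t in range(4)]\n",
"manual_sequence"
],
"metadata": {
"collapsed": false
}
},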
{
"cell_type": "markdown",
"source": [
"Instead of having a fixed initial value and drift rate, we can also sample them from a range:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 31,
"outputs": [
{
"data": {
"text/plain": "[[[0.015097359670462374, 0.975340226214809],\n [0.08820028316160722, 0.954897827401469],\n [0.16130320665275205, 0.934455428588129],\n [0.23440613014389688, 0.914013029774789]]]"
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"random_probability_sequence = bandit_random_pool_proba(num_probabilities=2, sequence_length=4, initial_probabilities=[[0.,.1], [.8, 1.]], drift_rates=[[0,.1],[-.1,0]])\n",
"random_probability_sequence"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"We pass in the number of sequence to generate as `num_samples`"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 32,
"outputs": [
{
"data": {
"text/plain": "[[[0.05277517161024149, 0.9157516813620797],\n [0.06601093773147637, 0.8512254219581823],\n [0.07924670385271125, 0.786699162554285],\n [0.09248246997394613, 0.7221729031503876]],\n [[0.03308990634055732, 0.8608567922155729],\n [0.05423027527564794, 0.8348824396142384],\n [0.07537064421073857, 0.8089080870129038],\n [0.09651101314582919, 0.7829337344115693]],\n [[0.05228116419768012, 0.9571430988304549],\n [0.10872837330001228, 0.922489870191641],\n [0.16517558240234442, 0.887836641552827],\n [0.22162279150467656, 0.8531834129140131]],\n [[0.017985053533171515, 0.9696895439983294],\n [0.07759069582130446, 0.9603867806583171],\n [0.13719633810943738, 0.9510840173183047],\n [0.1968019803975703, 0.9417812539782924]]]"
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"random_probability_sequence = bandit_random_pool_proba(num_probabilities=2, sequence_length=4, initial_probabilities=[[0.,.1], [.8, 1.]], drift_rates=[[0,.1],[-.1,0]], num_samples=4)\n",
"random_probability_sequence"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"We can use the created probability sequences to create reward sequences:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 33,
"outputs": [
{
"data": {
"text/plain": "[[[0, 1], [0, 1], [0, 1], [0, 1]],\n [[0, 1], [0, 1], [0, 1], [0, 0]],\n [[0, 1], [0, 1], [0, 1], [0, 0]],\n [[0, 1], [0, 0], [1, 1], [0, 1]]]"
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reward_sequences = bandit_random_pool_from_proba(random_probability_sequence)\n",
"reward_sequences"
],
"metadata": {
"collapsed": false
}
},
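{
"cell_type": "markdown",
"source": [
"As a rough sanity check (a sketch using NumPy, not part of the package), we can compare each arm's empirical reward rate with the probabilities it was sampled from. With only four trials per sequence the estimates are coarse, but longer sequences converge to the generating probabilities:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Sketch: average reward per arm (columns) across trials (rows) for each sampled sequence.\n",
"empirical_rates = [np.mean(sequence, axis=0) for sequence in reward_sequences]\n",
"empirical_rates"
],
"metadata": {
"collapsed": false
}
},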
{
"cell_type": "markdown",
"source": [
"Or, we can use `bandit_random_pool` with the same arguments as in the `bandit_random_pool_proba` to generate reward sequences directly:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 34,
"outputs": [
{
"data": {
"text/plain": "[[[0, 0], [0, 1], [0, 1], [1, 1]],\n [[0, 1], [0, 1], [0, 1], [1, 1]],\n [[0, 1], [0, 1], [0, 1], [1, 1]],\n [[0, 1], [0, 0], [0, 1], [0, 1]]]"
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reward_sequences = bandit_random_pool(num_rewards=2, sequence_length=4, initial_probabilities=[[0.,.1], [.8, 1.]], drift_rates=[[0,.1],[-.1,0]], num_samples=4)\n",
"reward_sequences"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Use in State\n",
"\n",
"!!!Warning If you want to use this in the AutoRa `StandardState` you need to convert the return value into a `pd.DataFrame`:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 41,
"outputs": [
{
"data": {
"text/plain": " reward-trajectory\n0 [[0, 1], [0, 1], [1, 0], [0, 0], [1, 1], [0, 1...",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>reward-trajectory</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>[[0, 1], [0, 1], [1, 0], [0, 0], [1, 1], [0, 1...</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# First, we define the variables:\n",
"from autora.variable import VariableCollection, Variable\n",
"\n",
"$$ \n",
"y + 1 = 4 \n",
"$$\n",
"variables = VariableCollection(\n",
" independent_variables=[Variable(name=\"reward-trajectory\")],\n",
" dependent_variables=[Variable(name=\"choice-trajectory\")]\n",
")\n",
"\n",
"... or this:\n",
"# With these variables, we initialize a StandardState\n",
"from autora.state import StandardState\n",
"\n",
"\\begin{align}\n",
" p(v_i=1|\\mathbf{h}) & = \\sigma\\left(\\sum_j w_{ij}h_j + b_i\\right) \\\\\n",
" p(h_j=1|\\mathbf{v}) & = \\sigma\\left(\\sum_i w_{ij}v_i + c_j\\right)\n",
"\\end{align}"
]
"state = StandardState()\n",
"\n",
"# Here, we want to create a random reward-sequences directly as on state function\n",
"from autora.state import Delta, on_state\n",
"\n",
"\n",
"@on_state()\n",
"def pool_on_state(num_rewards=2, sequence_length=10, num_samples=1, initial_probabilities=None,\n",
" drift_rates=None):\n",
" sequence_as_list = bandit_random_pool(\n",
" num_rewards=num_rewards, sequence_length=sequence_length, num_samples=num_samples,\n",
" initial_probabilities=initial_probabilities, drift_rates=drift_rates)\n",
" # the condition of the state expect a pandas DataFrame,\n",
" sequence_as_df = pd.DataFrame({\"reward-trajectory\": sequence_as_list})\n",
" return Delta(conditions=sequence_as_df)\n",
"\n",
"\n",
"# now we can use the pool_on_state on the state to create conditions:\n",
"state = pool_on_state(state)\n",
"state.conditions"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
"source": [
"We can pass in keyword arguments into the on_state function as well. Here, we create 3 sequences with initial values for the first arm between 0 and .3 and for the second arm between .7 and 1. And drift rates are sampled between 0 and .1, or -.1 and 0, respectively:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 42,
"outputs": [
{
"data": {
"text/plain": " reward-trajectory\n0 [[1, 1], [1, 1], [0, 0], [1, 0], [1, 0], [0, 0...\n1 [[1, 0], [0, 1], [1, 1], [1, 1], [0, 0], [0, 0...\n2 [[0, 1], [0, 0], [0, 0], [1, 0], [0, 1], [0, 1...",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>reward-trajectory</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>[[1, 1], [1, 1], [0, 0], [1, 0], [1, 0], [0, 0...</td>\n </tr>\n <tr>\n <th>1</th>\n <td>[[1, 0], [0, 1], [1, 1], [1, 1], [0, 0], [0, 0...</td>\n </tr>\n <tr>\n <th>2</th>\n <td>[[0, 1], [0, 0], [0, 0], [1, 0], [0, 1], [0, 1...</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"state = pool_on_state(state, num_samples=3, initial_probabilities=[[0, .3], [.7, 1.]], drift_rates=[[0, .1], [-.1,0]])\n",
"state.conditions"
],
"metadata": {
"collapsed": false
}
},
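{
"cell_type": "markdown",
"source": [
"As a final check (a sketch using standard pandas, not a package API), we can confirm that each row of the conditions holds a full reward trajectory of the expected length (here, the default `sequence_length` of 10 used by `pool_on_state`):"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: each entry in the 'reward-trajectory' column is a list of\n",
"# [reward_arm_1, reward_arm_2] pairs; here we just check its length.\n",
"state.conditions[\"reward-trajectory\"].apply(len)"
],
"metadata": {
"collapsed": false
}
},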
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [],
"metadata": {
"collapsed": false
}
}
],
"metadata": {