Commit: Add jest replays

Nicole White committed Oct 27, 2023
1 parent ad0bcba commit 8456788

Showing 6 changed files with 3,697 additions and 303 deletions.
31 changes: 31 additions & 0 deletions .github/workflows/jest-replays.yml
@@ -0,0 +1,31 @@
name: Autoblocks Replays

on:
  push:
    paths:
      - 'JavaScript/jest-replays/**'

jobs:
  autoblocks-replays:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Create .env file
        run: |
          touch .env
          echo "OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> .env
          echo "AUTOBLOCKS_INGESTION_KEY=${{ secrets.AUTOBLOCKS_REPLAY_INGESTION_KEY }}" >> .env

      - name: Setup Node
        uses: actions/setup-node@v3
        with:
          node-version: 20

      - name: Install dependencies
        run: npm ci

      - name: Run script
        run: npm run start
88 changes: 87 additions & 1 deletion JavaScript/jest-replays/README.md
@@ -35,4 +35,90 @@ AUTOBLOCKS_INGESTION_KEY=<your-ingestion-key>

<!-- getting started end -->

## Replays

This project shows how you can run Autoblocks Replays via your [Jest](https://jestjs.io/) test suite. Follow the steps below to get started.

### 1. Use your replay key

Replace the value for `AUTOBLOCKS_INGESTION_KEY` in the `.env` file with your replay key. Your replay key is in the same place as your
ingestion key: https://app.autoblocks.ai/settings/api-keys

> **_NOTE:_** This means you need to make very few code changes to your production code to get started with Autoblocks Replays. You simply need to swap out an environment variable.
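
With the replay key swapped in, the `.env` file ends up looking something like this (placeholder values shown, matching the format from the getting-started section):

```
OPENAI_API_KEY=<your-openai-api-key>
AUTOBLOCKS_INGESTION_KEY=<your-replay-key>
```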
### 2. Set an `AUTOBLOCKS_REPLAY_ID`

This is already set up in this example via the `start` script in [`package.json`](./package.json):

```json
"scripts": {
"start": "AUTOBLOCKS_REPLAY_ID=$(date +%Y%m%d-%H%M%S) dotenv -e .env -- jest"
},
```
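The `date +%Y%m%d-%H%M%S` invocation gives every run a unique, timestamp-based replay ID. As an illustration only (the script itself uses the shell `date` command, not JavaScript), here is a sketch of the ID format it produces; `formatReplayId` is a hypothetical helper, not part of the example:

```javascript
// Sketch of the replay ID format generated by `date +%Y%m%d-%H%M%S`:
// a compact timestamp such as 20231027-112722 (YYYYMMDD-HHMMSS).
function formatReplayId(d) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    `${d.getFullYear()}${pad(d.getMonth() + 1)}${pad(d.getDate())}` +
    `-${pad(d.getHours())}${pad(d.getMinutes())}${pad(d.getSeconds())}`
  );
}

console.log(formatReplayId(new Date()));
```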

### 3. Run the tests

First install the dependencies:

```
npm install
```

Then run `npm start` (which runs the Jest test suite):

```
npm start
```

Within the test suite, you should see a link printed to the console that will take you to the replay in the Autoblocks UI:

```
> [email protected] start
> AUTOBLOCKS_REPLAY_ID=$(date +%Y%m%d-%H%M%S) dotenv -e .env -- jest

  console.log
    View your replay at https://app.autoblocks.ai/replays/local/20231027-112722

      at Object.log (test/index.spec.js:13:13)

 PASS  test/index.spec.js (13.689 s)
  run
    ✓ should return a response for "How do I sign up?" (4344 ms)
    ✓ should return a response for "How many pricing plans do you have?" (6913 ms)
    ✓ should return a response for "What is your refund policy?" (2237 ms)

Test Suites: 1 passed, 1 total
Tests:       3 passed, 3 total
Snapshots:   0 total
Time:        13.71 s, estimated 20 s
Ran all test suites.
```

Run the tests a few times so that you generate multiple replays (your first replay won't have any baseline to compare against!).
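
The replay link printed above follows a predictable shape. As a rough sketch of how such a link could be assembled from the `AUTOBLOCKS_REPLAY_ID` environment variable (the `replayUrl` helper is an assumption for illustration, not the Autoblocks SDK's API):

```javascript
// Hypothetical helper (not the Autoblocks SDK): build the local replay URL
// that the test suite prints, from the AUTOBLOCKS_REPLAY_ID env var.
function replayUrl(replayId) {
  return `https://app.autoblocks.ai/replays/local/${replayId}`;
}

// Fall back to the ID from the sample output above when the env var is unset.
const id = process.env.AUTOBLOCKS_REPLAY_ID || "20231027-112722";
console.log(`View your replay at ${replayUrl(id)}`);
```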

### 4. View the replays in the Autoblocks UI

The link will take you to the replay UI, where you can see at-a-glance differences between replay runs across the three test cases. There are four main columns:

- **Message**: The name of the Autoblocks event sent
  - a gray icon indicates no changes
  - a yellow icon indicates changes
  - a red icon indicates the event was present in the baseline run but is missing from this run
  - a green icon indicates the event was absent from the baseline run but is present in this run
- **Changes**: The number of word-level changes between the event properties of the replay run and the baseline run
- **Difference Scores**: For properties detected to be LLM outputs, a difference score between the value from the baseline run and the current run
- **Evals**: The results of your [Autoblocks Evaluators](https://docs.autoblocks.ai/features/evaluators)

In one of my runs, I could see that the difference score was pretty high for the `"What is your refund policy?"` test case:

![replay-summary](https://github.com/autoblocksai/autoblocks-examples/assets/7498009/cb99858a-8f94-4bd9-b8b4-893e32097642)

Clicking into **View Differences**, I could see that the response now included an apology about not being able to answer questions about refunds, even though it did previously:

![replay-differences](https://github.com/autoblocksai/autoblocks-examples/assets/7498009/53b33ed5-fe8e-44cf-ac07-c2f315ecb4b9)

Running this kind of snapshot/stability testing over LLM outputs on every pull request lets you catch regressions before they reach production.

### 5. Run the tests in CI

See the [GitHub Action](/.github/workflows/jest-replays.yml) workflow associated with this project; it runs replays on every push that changes files under `JavaScript/jest-replays/`.