Logging Instrumentation | Context & Prompt Logging Infra For Enhanced Understanding of Context Composition #196
Comments
I would care substantially about where these logs would be stored and where they can be accessed, and am extremely interested in agents being able to access all logs. |
@monilpat What branch should this target? |
The target should always be sif-dev and a new branch per issue makes sense
…On Jan 10, 2025 at 09:40 -0800, Arsalon ***@***.***>, wrote:
> @monilpat What branch this should target?
Each issue should have a new branch created; after it's completed, you can make a pull request to the next env (dev ENV?) and after that to the main branch?
@monilpat what are your thoughts ^^
|
Adding notes from our call 1/13/24 w/ Jure and Monil
Define the DB architecture
Run (Definition)
Query and UI
|
After some initial research, the approach that @monilpat suggested appears to have several drawbacks. In particular, if we use logging with existing PostgreSQL adapter:
|
@jzvikart @monilpat @jkbrooks I created this ADR for the feature - we may have jumped in quickly and skipped this technical scoping step. Let's fill this out, take a step back and ensure we're all on the same page with the ADR (architectural decision record) - https://docs.google.com/document/d/11CB3FyorvSxPxqbO4P35wTNuHJ-EDD2rKBEUiyO-ngc/edit?usp=sharing |
What is the scenario that we want to instrument? |
Implementation is now working, the recommended next steps are:
Collection and refinement of trace data should be done selectively and iteratively. Capturing and analyzing "everything" is not realistic. To kick this off I recommend doing a demo, or a pair coding session. |
@jzvikart thanks for the update. A few PM comments -
@monilpat can you review this PR and give feedback |
I did not create a PR yet since it would make sense to answer some of the questions first.
I cannot make a video, but I am happy to pair up and discuss whatever the person who will use this wants to know. Testing/QA does not make sense here.
Yes, as Monil suggested, although it is now clear that it would be better to keep the trace data in a separate DB instance. We can still change that though.
No, this is currently only on a development machine due to the reasons and limitations that I mentioned, and we should make a plan for how/where to deploy it. See above.
No, the current interface is SQL, any additional tools need to be discussed and developed. |
One more thing: running and building is still failing non-deterministically. I've tried 3 different branches already and verified that the problem exists in versions prior to my changes. We should address this. I've been in contact with Caner, but so far there is no known cause or fix. |
Hey, thanks so much for doing this. In terms of the build issues, that's something the V2 separation into community plug-ins is going to solve, so it's a separate repository. Note that with the way it currently works, you will need to run it multiple times for it to build successfully, and if it still fails, you'll need to comment out the blame plug-ins. We need to address this, but as long as your plug-in is being built you are not blocked by it; if you read the logs, you can see whether your plug-in has been built or not. |
Yeah, that makes sense regarding the PR note. We prefer to have draft PRs when possible for review. I think Arsalon would probably be the best person to pair up with at this point, and I'm happy to hop in as needed. Separating it can be a fast follow-up, as that makes sense, but right now getting something working is very important. Ars and I were talking about a simple UI, part of the Eliza chat UI, that shows the runs for a conversation; when you click on a run or select it from a drop-down, it shows you all the associated logs. |
@monilpat Thanks for the explanation, that's exactly what I've been doing. If it's a known issue that's being worked on, that's enough for me. |
OK, I'll create a draft PR so that we can continue the discussion there. As for tools - everything is possible, but we need to decide on the right approach first, considering the tradeoffs and skills of the person who will be doing this. I think more than UI/dropdowns we will need some data analysis tools, scripting, etc. And if we do go into UI, it should definitely be separate from Eliza main UI. |
Happy to review the PR when it's available, thanks for flagging!
…On Jan 20, 2025 at 20:49 -0800, Arsalon ***@***.***>, wrote:
@jzvikart thanks for the update. A few PM comments -
1. > I don't see a pull request for this feature. Is this PR in review now?
2. > Can you comment on the solution approach or attach a Loom video so that our QA team can understand / begin manual testing scenario development
• did you utilize the adapter-postgres and create additional table in the DB?
• is this live in the test ENV running on an instance of PROSPER?
• is there a simple UI client (did you create a new React client UI to visualize the run data, a simple table or a filter by RUN ID?)
@monilpat can you review this PR and give feedback
|
@TimKozak Tracing framework is implemented and works. To meaningfully continue this ticket, we would need a "customer" who will be analysing the data/prompts and one or more use cases. When we know who the "customer" is I can provide engineering support and everything that's needed. See my comments above. |
Next steps that we discussed so far:
|
Yes, I spoke to Juri. We just finished up a meeting. I effectively
unblocked him for what we need to get done for the scope of this ticket.
There's a broader question that Juri has about how we're going to use this, but
I think he is now unblocked. I'm hoping he's able to implement the four
calls to the trace method for each of the actions under the GitHub and
Coinbase plug-ins.
…On Thu, Jan 23, 2025 at 10:43 AM jzvikart ***@***.***> wrote:
Next steps that we discussed so far:
- Record a video
- Trace a random scenario with unknown tracing criteria
|
I'm not sure what his bandwidth is, but I imagine he should be able to do
it within the next day or two, as it's pretty much copy-paste once he gets
one working, and I explained to him how to test it.
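For anyone picking this up, here is a minimal sketch of what wrapping one action with four trace calls could look like. Everything in it is an assumption for illustration: the thread does not show the real trace API, so `InMemoryTracer`, `runTracedAction`, and the four steps chosen here (start, prompt, result or error, end) are hypothetical. Handlers are synchronous only to keep the sketch short.

```typescript
// Hypothetical trace event; the real schema is not shown in this thread.
interface TraceEvent {
  runId: string;
  action: string;
  step: string;
  data: unknown;
  at: number; // epoch milliseconds
}

// Hypothetical tracer that keeps events in memory instead of a database.
class InMemoryTracer {
  readonly events: TraceEvent[] = [];

  trace(runId: string, action: string, step: string, data: unknown): void {
    this.events.push({ runId, action, step, data, at: Date.now() });
  }
}

// Wraps one action with four trace calls: start, prompt, result (or error), end.
function runTracedAction<I, O>(
  tracer: InMemoryTracer,
  runId: string,
  actionName: string,
  input: I,
  buildPrompt: (input: I) => string,
  execute: (prompt: string) => O,
): O {
  tracer.trace(runId, actionName, "start", input);
  const prompt = buildPrompt(input);
  tracer.trace(runId, actionName, "prompt", prompt);
  try {
    const result = execute(prompt);
    tracer.trace(runId, actionName, "result", result);
    return result;
  } catch (err) {
    // On failure the "result" step is replaced by an "error" step.
    tracer.trace(runId, actionName, "error", String(err));
    throw err;
  } finally {
    tracer.trace(runId, actionName, "end", null);
  }
}
```

Because the wrapper is generic over the input and output types, the same four calls can be reused for each action rather than copied by hand.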
|
Note to self:
This would wrap up this ticket. |
@jzvikart the above sounds good. The ultimate use case I want is this: an API endpoint from which I can GET the traces for runs, with pagination and optional query params such as run ID, agent name, and date range. The endpoint will be consumed by a frontend (Swagger UI is fine) and displayed. If we don't have Swagger implemented in the codebase, please add the config and host it on a public (non-local) URL we can all access. With the Swagger UI I can send in params like agent name, date range, etc. and get back the runs as a response. I can then look into the response for details such as what the prompt was, the action, etc. It will also help me see and understand the implementation so I can give feedback on additional things we want to include. If you can get the following done, we can consider this completed:
I think a Swagger UI in which I can play around with this endpoint will be good enough here. |
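To make the request above concrete, here is a rough sketch of the filtering and pagination logic such a GET endpoint might apply. It runs over an in-memory array rather than the real trace table, and all names (`TraceRecord`, `TraceQuery`, `queryTraces`, and the field names) are assumptions, not the implemented schema.

```typescript
// Hypothetical shape of one stored trace row; field names are assumptions.
interface TraceRecord {
  runId: string;
  agentName: string;
  createdAt: string; // ISO 8601 timestamp
  prompt: string;
  action: string;
}

// Optional filters mirroring the query params mentioned in the comment.
interface TraceQuery {
  runId?: string;
  agentName?: string;
  from?: string; // inclusive lower bound, ISO 8601
  to?: string; // inclusive upper bound, ISO 8601
  page?: number; // 1-based, defaults to 1
  pageSize?: number; // defaults to 20
}

interface TracePage {
  total: number; // number of matches before pagination
  page: number;
  pageSize: number;
  items: TraceRecord[];
}

// Filters by the optional params, sorts newest first, then slices one page.
function queryTraces(all: TraceRecord[], q: TraceQuery = {}): TracePage {
  const page = Math.max(1, q.page ?? 1);
  const pageSize = Math.max(1, q.pageSize ?? 20);
  const fromMs = q.from ? Date.parse(q.from) : -Infinity;
  const toMs = q.to ? Date.parse(q.to) : Infinity;

  const matched = all
    .filter((t) => {
      const ts = Date.parse(t.createdAt);
      return (
        (q.runId === undefined || t.runId === q.runId) &&
        (q.agentName === undefined || t.agentName === q.agentName) &&
        ts >= fromMs &&
        ts <= toMs
      );
    })
    .sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt));

  const start = (page - 1) * pageSize;
  return {
    total: matched.length,
    page,
    pageSize,
    items: matched.slice(start, start + pageSize),
  };
}
```

Against a real database the same filters would more likely become a SQL `WHERE` clause with `LIMIT`/`OFFSET`; returning `total` alongside the page lets a client such as Swagger UI see how many pages remain.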
@jzvikart once the PR is merged, I will close this out. |
As the RSP team, we want deeper visibility into context construction via providers, so that we can understand how key details (Recent Messages, User Context, Relevant Facts) are constructed, for debugging and for optimizing context construction.
Acceptance Criteria:
We want to log and review these as they relate to constructing context
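As an illustration of the kind of visibility described here, the sketch below composes a context from several providers and records what each one contributed. The `ContextProvider` shape and `composeContextWithLog` are hypothetical and are not Eliza's actual provider interface; this only shows the logging idea under those assumptions.

```typescript
// Hypothetical provider shape: a named function returning a context fragment.
interface ContextProvider {
  name: string;
  get: (message: string) => string;
}

// One log entry per provider call, so each fragment can be reviewed later.
interface ProviderLogEntry {
  provider: string;
  chars: number; // length of the fragment this provider contributed
  output: string;
}

// Calls each provider in order, logs its output, and joins the fragments.
function composeContextWithLog(
  providers: ContextProvider[],
  message: string,
): { context: string; log: ProviderLogEntry[] } {
  const log: ProviderLogEntry[] = [];
  const parts: string[] = [];
  for (const p of providers) {
    const output = p.get(message);
    log.push({ provider: p.name, chars: output.length, output });
    parts.push(`# ${p.name}\n${output}`);
  }
  return { context: parts.join("\n\n"), log };
}
```

Logging per provider, rather than only the final prompt, makes it possible to see which of Recent Messages, User Context, or Relevant Facts accounts for a given part of the composed context.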