-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return cursor of the first host logs entry via headers #5333
Conversation
Return first entry's cursor via custom `X-First-Cursor` header that can be consumed by the client and used for continual requesting of the historic logs. Once the first fetch returns data, the cursor can be supplied as the first argument to the Range header in another call, fetching accurate slice of the journal with the previous log entries using the `Range: entries=cursor[[:num_skip]:num_entries]` syntax. Let's say we fetch logs with the Range header `entries=:-19:20` (to fetch very last 20 lines of the logs, see below why not `entries:-20:20`) and we get `cursor50` as the reply (the actual value will be much more complex and with no guaranteed format). To fetch previous slice of the logs, we use `entries=cursor50:-20:20`, which would return 20 lines previous to `cursor50` and `cursor30` in the cursor header. This way we can go all the way back to the history. One problem with the cursor is that it's not possible to determine when the negative num_skip points beyond the first log entry. In that case the client either needs to know what the first entry is (via `entries=:0:1`) or can iterate naively and stop once two subsequent requests return the same first cursor. Another caveat, even though it's unlikely it will be hit in real usage, is that it's not possible to fetch the last line only - if no cursor is provided, negative num_skip argument is needed, and in that case we're pointing one record back from the current cursor, which is the previous record. The least we can return without knowing any cursor is thus `entries=:-1:2` (where the `2` can be omitted, however with `entries=:-1:1` we would lose the last line). This also explains why different `num_skip` and `num_entries` must be used for the first fetch.
📝 Walkthrough📝 WalkthroughWalkthroughThe changes primarily involve enhancements to the logging functionality within the Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant APIHost
participant JournalReader
Client->>APIHost: Request advanced logs
APIHost->>JournalReader: Call journal_logs_reader
JournalReader->>APIHost: Yield (cursor, log entry)
APIHost->>APIHost: Prepare response with headers
APIHost->>Client: Return prepared response
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (10)
tests/api/test_host.py (5)
265-266
: LGTM! Consider adding a comment for clarity.The addition of the third
ANY
parameter in the assertion correctly reflects the updated method signature ofjournal_logs_reader
. This change aligns with the new cursor functionality mentioned in the PR objectives.Consider adding a brief comment explaining the purpose of the third parameter for better code readability:
# Third parameter represents the cursor callback journal_logs_reader.assert_called_with(ANY, LogFormatter.VERBOSE, ANY)
283-284
: LGTM! Consider adding a comment for consistency.The addition of the third
ANY
parameter in the assertion is correct and consistent with the updated method signature ofjournal_logs_reader
.For consistency with the previous suggestion, consider adding a brief comment explaining the purpose of the third parameter:
# Third parameter represents the cursor callback journal_logs_reader.assert_called_with(ANY, LogFormatter.VERBOSE, ANY)
336-337
: LGTM! Consider adding a comment for consistency.The addition of the third
ANY
parameter in the assertion is correct and consistent with the updated method signature ofjournal_logs_reader
.For consistency with the previous suggestions, consider adding a brief comment explaining the purpose of the third parameter:
# Third parameter represents the cursor callback journal_logs_reader.assert_called_once_with(ANY, LogFormatter.VERBOSE, ANY)
342-343
: LGTM! Consider adding comments for consistency and clarity.The addition of the third
ANY
parameter in all assertions is correct and consistent with the updated method signature ofjournal_logs_reader
. These test cases appropriately cover various scenarios for log formatting and header handling.For consistency and clarity, consider adding brief comments explaining the purpose of the third parameter in each assertion:
# Third parameter represents the cursor callback journal_logs_reader.assert_called_once_with(ANY, LogFormatter.VERBOSE, ANY) # Third parameter represents the cursor callback journal_logs_reader.assert_called_once_with(ANY, LogFormatter.PLAIN, ANY) # Third parameter represents the cursor callback journal_logs_reader.assert_called_once_with(ANY, LogFormatter.VERBOSE, ANY)Also applies to: 347-348, 353-354
Line range hint
1-400
: Overall assessment: Changes are consistent and align with PR objectives.The updates to the test cases in this file consistently reflect the changes in the
journal_logs_reader
method signature, which aligns with the PR objective of introducing a cursor for log retrieval. The test coverage appears comprehensive, addressing various scenarios for log formatting and header handling.To further improve the code:
- Consider adding brief comments explaining the purpose of the third parameter in each
journal_logs_reader
assertion for better clarity and maintainability.- It might be beneficial to add a test case specifically for the new cursor functionality, ensuring that the
X-First-Cursor
header is correctly handled and returned.supervisor/utils/systemd_journal.py (2)
65-65
: Consider specifying the Coroutine return type in the type hint.The type hint for
first_cursor_callback
could be more precise. Currently, it's specified asCallable[[str], Coroutine] | None = None
. Consider specifying the return type of the coroutine for clarity:-first_cursor_callback: Callable[[str], Coroutine] | None = None, +from typing import Any +first_cursor_callback: Callable[[str], Coroutine[Any, Any, None]] | None = None,This indicates that the coroutine returns
None
, enhancing readability and type checking.
67-71
: Enhance the docstring for clarity and compliance with PEP 257.Consider rephrasing the docstring to improve readability and adhere to style guidelines:
-"""Read logs from systemd journal line by line, formatted using the given formatter. Optionally takes a first_cursor_callback which is an async function that is called with the journal cursor value found in the first log entry and awaited. """ +"""Read logs from the systemd journal line by line, formatted using the given formatter. Optionally, a first_cursor_callback can be provided, which is an async function that is called with the journal cursor value from the first log entry and awaited. """This modification improves the docstring's clarity and follows PEP 257 conventions.
tests/utils/test_systemd_journal.py (2)
161-161
: Update the docstring to accurately reflect the test purposeThe docstring currently states "Test reading multiple messages," but this test specifically verifies that the
cursor_callback
receives the first cursor from the log stream. Consider updating the docstring to more precisely describe the test.Apply this diff to update the docstring:
- """Test reading multiple messages.""" + """Test that cursor_callback receives the first cursor from the log stream."""
183-183
: Update the docstring to accurately reflect the test purposeThe docstring says "Test reading multiple messages," but this test checks that the
cursor_callback
is called withNone
when no cursor is present in the log stream. Please update the docstring to accurately describe the test.Apply this diff to update the docstring:
- """Test reading multiple messages.""" + """Test that cursor_callback receives None when no cursor is present in the log stream."""supervisor/api/host.py (1)
244-248
: Add unit tests for the newX-First-Cursor
headerConsider adding unit tests to verify that the
X-First-Cursor
header is correctly set when retrieving logs. This will ensure the new functionality works as intended and helps prevent future regressions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- supervisor/api/host.py (1 hunks)
- supervisor/utils/systemd_journal.py (5 hunks)
- tests/api/test_host.py (3 hunks)
- tests/utils/test_systemd_journal.py (2 hunks)
🧰 Additional context used
🔇 Additional comments (4)
supervisor/utils/systemd_journal.py (2)
3-3
: LGTM!The import statements are correct and appropriate.
25-25
: LGTM!Adding
"__CURSOR"
torequired_fields
ensures that this field is included during journal parsing.tests/utils/test_systemd_journal.py (2)
160-180
: New testtest_cursor_callback
is well-implementedThe test effectively verifies that the
first_cursor_callback
receives the correct cursor when processing log entries that include cursors.
182-200
: New testtest_cursor_callback_no_cursor
is well-implementedThe test correctly ensures that the
first_cursor_callback
is called withNone
when the log entries do not include any cursors.
supervisor/api/host.py
Outdated
async def finish_prepare(cursor: str): | ||
if cursor: | ||
response.headers["X-First-Cursor"] = cursor | ||
await response.prepare(request) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure response.prepare
is called only once
In the finish_prepare
function, response.prepare(request)
is called. If finish_prepare
is invoked multiple times, this could result in multiple calls to response.prepare(request)
, which may raise an exception since prepare
should only be called once per response.
Consider adding a check to prevent multiple calls to prepare
.
Apply this diff to fix the issue:
async def finish_prepare(cursor: str):
if cursor:
response.headers["X-First-Cursor"] = cursor
- await response.prepare(request)
+ if not response.prepared:
+ await response.prepare(request)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def finish_prepare(cursor: str): | |
if cursor: | |
response.headers["X-First-Cursor"] = cursor | |
await response.prepare(request) | |
async def finish_prepare(cursor: str): | |
if cursor: | |
response.headers["X-First-Cursor"] = cursor | |
if not response.prepared: | |
await response.prepare(request) |
supervisor/api/host.py
Outdated
async for line in journal_logs_reader( | ||
resp, log_formatter, finish_prepare | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit weird that we hand-off control over when the response.prepare
gets executed to the journal_logs_reader. Can the journal_logs_reader maybe return besides the generator something else? A bit funky too but like so:
generator, first_cursor = journal_logs_reader(resp, log_formatter)
if first_cursor:
response.headers["X-First-Cursor"] = cursor
await response.prepare(request)
async for line in generator:
...
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrote it to return the cursor in a tuple on every iteration, it's closest to what you propose. To get even closer, you'll have to perform the first anext on the generator to get the first cursor, so it'll end up like this:
generator = journal_logs_reader(resp, log_formatter)
first_cursor, first_line = await anext(generator)
async def wrapped_generator(first):
yield first
async for _, x in generator:
yield x
if first_cursor:
response.headers["X-First-Cursor"] = first_cursor
await response.prepare(request)
async for line in wrapped_generator(first_line):
await response.write(line.encode("utf-8") + b"\n")
This is maybe slightly better than the conditional return in the loop but let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, yeah I think that rewrite is better.
This is maybe slightly better than the conditional return in the loop but let me know what you think.
anext and generator is quite a bit less readable, let's keep the condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (11)
tests/host/test_logs.py (3)
43-49
: LGTM! Consider adding a constant for the expected cursor value.The changes effectively test the new cursor functionality, aligning with the PR objectives. The assertion for the cursor value ensures that the correct cursor is being returned alongside the log line.
To improve maintainability, consider defining the expected cursor value as a constant at the top of the file. This would make it easier to update if needed and improve readability. For example:
EXPECTED_CURSOR = "s=83fee99ca0c3466db5fc120d52ca7dd8;i=203f2ce;b=f5a5c442fa6548cf97474d2d57c920b3;m=3191a3c620;t=612ccd299e7af;x=8675b540119d10bb" # Then in the test: assert cursor == EXPECTED_CURSOR
71-75
: LGTM! Consider consistent cursor handling across tests.The changes in this test function mirror those in
test_logs
, providing consistent testing of the new cursor functionality for colored logs.As suggested for
test_logs
, consider defining the expected cursor value as a constant. Additionally, to ensure consistency across tests, you might want to create a helper function for cursor assertions. For example:def assert_cursor(actual_cursor: str, expected_cursor: str): assert actual_cursor == expected_cursor, f"Expected cursor {expected_cursor}, but got {actual_cursor}" # Then in both tests: assert_cursor(cursor, EXPECTED_CURSOR_LOGS) # In test_logs assert_cursor(cursor, EXPECTED_CURSOR_COLORED_LOGS) # In test_logs_colouredThis approach would make the tests more maintainable and ensure consistent cursor checking across different test cases.
Line range hint
1-138
: Consider adding more cursor-specific tests.The changes effectively test the new cursor functionality alongside existing log content checks. However, to ensure comprehensive coverage of the new feature, consider adding more cursor-specific tests.
Here are some suggestions for additional tests:
- Test retrieving logs using a cursor from a previous request.
- Test the behavior when an invalid cursor is provided.
- Test the scenario where the same cursor is received twice, as mentioned in the PR objectives.
- Test the limitation of fetching only the last log entry without a cursor.
These additional tests would help ensure that all aspects of the new cursor functionality are thoroughly validated.
tests/utils/test_systemd_journal.py (8)
92-92
: Ensure Consistent Handling of Cursor in Test AssertionsIn this test, the cursor returned by
journal_logs_reader
is ignored using_, line = await anext(...)
. In other tests, assertions are made directly on the tuple(cursor, line)
usingassert await anext(...) == (ANY, expected_line)
. For consistency and clarity across your test suite, consider adopting a uniform approach to handling the cursor in test assertions.
106-106
: Consistent Cursor Handling in Verbose Test AssertionsSimilar to the previous comment, you're unpacking the cursor with
_, line = await anext(...)
and asserting only online
. To maintain consistency, consider aligning this test with others by either consistently ignoring the cursor or always asserting on the full(cursor, line)
tuple.
121-121
: Uniform Approach to Cursor in Test AssertionsYou're ignoring the cursor here by unpacking it with
_, line = await anext(...)
. For consistency with other tests where the full tuple is asserted, consider adopting a consistent method of handling the cursor across all tests.
154-157
: Consistent Assertion of Cursor and Message TuplesHere, you're asserting the entire tuple
(cursor, line)
withANY
for the cursor. Review whether this aligns with your approach in other tests and consider making cursor handling consistent throughout your test suite.
161-161
: Update Docstring to Reflect Test PurposeThe docstring currently states:
"""Test reading multiple messages."""
. Since this test is specifically verifying the handling of cursors when reading messages, consider updating the docstring to something like"""Test that cursor values are correctly returned with messages."""
for better clarity.
184-184
: Clarify Docstring in Cursor Absence TestThe docstring reads:
"""Test reading multiple messages.""""
. Given that this test checks the behavior when no cursors are present, consider updating it to"""Test handling of messages when no cursor is provided.""""
to accurately describe the test's intent.
218-218
: Consistent Cursor Handling in Host Logs TestYou are ignoring the cursor in this test by unpacking it with
_, line = await anext(...)
. For consistency with other tests, consider whether the cursor value should be asserted or if ignoring it is appropriate in this context.
226-226
: Uniformity in Handling Cursor and Message in AssertionsSimilar to previous instances, you're unpacking and ignoring the cursor. Review whether this aligns with the cursor handling strategy you've adopted in other tests. Consistent handling improves readability and maintainability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- supervisor/api/host.py (1 hunks)
- supervisor/utils/systemd_journal.py (3 hunks)
- tests/host/test_logs.py (2 hunks)
- tests/utils/test_systemd_journal.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- supervisor/api/host.py
- supervisor/utils/systemd_journal.py
🧰 Additional context used
🔇 Additional comments (2)
tests/utils/test_systemd_journal.py (2)
4-4
: Appropriate Import ofANY
andMagicMock
The addition of
ANY
andMagicMock
fromunittest.mock
is necessary for the updated tests that utilizeANY
in assertions andMagicMock
for mocking objects.
138-139
: Clarify the Use ofANY
in AssertionsIn these assertions,
ANY
is used for the cursor value, which is acceptable when the cursor is not the focus of the test. Ensure that this intentional use ofANY
appropriately reflects the test's purpose and that the cursor value doesn't need to be explicitly checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
tests/utils/test_systemd_journal.py (1)
160-178
: LGTM: New test function for cursor parsingThe new
test_cursor_parsing
function effectively tests the cursor extraction functionality, covering scenarios with and without cursors. This aligns well with the PR objectives.Consider adding a comment explaining the significance of the
None
cursor in the last assertion:# Verify that messages without a cursor return None assert await anext(reader) == (None, "No cursor")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- tests/utils/test_systemd_journal.py (7 hunks)
🧰 Additional context used
🔇 Additional comments (9)
tests/utils/test_systemd_journal.py (9)
4-4
: LGTM: Import ofANY
addedThe addition of
ANY
fromunittest.mock
is appropriate for the new assertion style used in the updated tests.
92-92
: LGTM: Updated assertion to handle new return value structureThe change correctly adapts the test to the new
journal_logs_reader
return value, which now includes a cursor. UsingANY
for the cursor allows the test to focus on the log message content while being flexible about the cursor value.
106-108
: LGTM: Assertion updated for new return value structureThe change correctly adapts the test to the new
journal_logs_reader
return value structure, usingANY
for the cursor and focusing on the formatted log message.
121-121
: LGTM: Assertion updated for new return value structureThe change correctly adapts the test to the new
journal_logs_reader
return value structure, maintaining consistency with the other updated tests.
138-139
: LGTM: Assertions updated for multiple log entriesThe changes correctly adapt the test to handle multiple log entries with the new return value structure, using
ANY
for cursor values consistently.
154-157
: LGTM: Assertions updated and error handling improvedThe changes correctly adapt the test to the new return value structure. The addition of the
StopAsyncIteration
check improves the test by ensuring the reader behaves correctly when all messages are consumed.
200-200
: LGTM: Assertion updated for new return value structureThe change correctly adapts the test to the new
journal_logs_reader
return value structure, maintaining consistency with the other updated tests.
208-208
: LGTM: Assertion updated for new return value structureThe change correctly adapts the test to the new
journal_logs_reader
return value structure, maintaining consistency with the other updated tests.
Line range hint
1-211
: Overall: Excellent updates to test suite for cursor functionalityThe changes in this file effectively update the test suite to accommodate the new cursor functionality introduced in the PR. Key points:
- Consistent updates across all test functions to handle the new return value structure from
journal_logs_reader
.- Addition of a new test function
test_cursor_parsing
that specifically tests cursor extraction scenarios.- Maintained existing test coverage while adapting to the new functionality.
These changes align well with the PR objectives and provide good test coverage for the new feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Proposed change
Return first entry's cursor via custom
X-First-Cursor
header that can be consumed by the client and used for continual requesting of the historic logs. Once the first fetch returns data, the cursor can be supplied as the first argument to the Range header in another call, fetching accurate slice of the journal with the previous log entries using theRange: entries=cursor[[:num_skip]:num_entries]
syntax.Let's say we fetch logs with the Range header
entries=:-19:20
(to fetch very last 20 lines of the logs, see below why notentries:-20:20
) and we getcursor50
as the reply (the actual value will be much more complex and with no guaranteed format). To fetch previous slice of the logs, we useentries=cursor50:-20:20
, which would return 20 lines previous tocursor50
andcursor30
in the cursor header. This way we can go all the way back to the history.One problem with the cursor is that it's not possible to determine when the negative num_skip points beyond the first log entry. In that case the client either needs to know what the first entry is (via
entries=:0:1
) or can iterate naively and stop once two subsequent requests return the same first cursor.Another caveat, even though it's unlikely it will be hit in real usage, is that it's not possible to fetch the last line only - if no cursor is provided, negative num_skip argument is needed, and in that case we're pointing one record back from the current cursor, which is the previous record. The least we can return without knowing any cursor is thus
entries=:-1:2
(where the2
can be omitted, however withentries=:-1:1
we would lose the last line). This also explains why differentnum_skip
andnum_entries
must be used for the first fetch.Type of change
Additional information
Checklist
ruff format supervisor tests
)If API endpoints or add-on configuration are added/changed:
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests