Change md5 to hash and hash_algorithm, fix incompatibility #1367

Merged
merged 16 commits into from
Nov 24, 2023
18 changes: 18 additions & 0 deletions .github/workflows/downstream.yml
Expand Up @@ -107,6 +107,23 @@ jobs:
test_command: pip install pytest-jupyter[server] && pytest -vv -raXxs -W default --durations 10 --color=yes
package_name: jupyter_server_terminals

jupytext:
runs-on: ubuntu-latest
timeout-minutes: 10

steps:
- name: Checkout
uses: actions/checkout@v4

- name: Base Setup
uses: jupyterlab/maintainer-tools/.github/actions/base-setup@v1

- name: Test jupytext
uses: jupyterlab/maintainer-tools/.github/actions/downstream-test@v1
with:
package_name: jupytext
test_command: pip install pytest-jupyter[server] gitpython pre-commit && python -m ipykernel install --name jupytext-dev --user && pytest -vv -raXxs -W default --durations 10 --color=yes --ignore=tests/test_doc_files_are_notebooks.py --ignore=tests/test_changelog.py
Review comment from the contributor (PR author):
test_doc_files_are_notebooks.py and test_changelog.py need the docs folder, which is not included in the distribution.


downstream_check: # This job does nothing and is only used for the branch protection
if: always()
needs:
Expand All @@ -115,6 +132,7 @@ jobs:
- jupyterlab_server
- notebook
- nbclassic
- jupytext
runs-on: ubuntu-latest
steps:
- name: Decide whether the needed jobs succeeded or failed
Expand Down
92 changes: 52 additions & 40 deletions docs/source/developers/contents.rst
Expand Up @@ -33,40 +33,48 @@ which we refer to as **models**.

Models may contain the following entries:

+--------------------+-----------+------------------------------+
| Key | Type |Info |
+====================+===========+==============================+
|**name** |unicode |Basename of the entity. |
+--------------------+-----------+------------------------------+
|**path** |unicode |Full |
| | |(:ref:`API-style<apipaths>`) |
| | |path to the entity. |
+--------------------+-----------+------------------------------+
|**type** |unicode |The entity type. One of |
| | |``"notebook"``, ``"file"`` or |
| | |``"directory"``. |
+--------------------+-----------+------------------------------+
|**created** |datetime |Creation date of the entity. |
+--------------------+-----------+------------------------------+
|**last_modified** |datetime |Last modified date of the |
| | |entity. |
+--------------------+-----------+------------------------------+
|**content** |variable |The "content" of the entity. |
| | |(:ref:`See |
| | |Below<modelcontent>`) |
+--------------------+-----------+------------------------------+
|**mimetype** |unicode or |The mimetype of ``content``, |
| |``None`` |if any. (:ref:`See |
| | |Below<modelcontent>`) |
+--------------------+-----------+------------------------------+
|**format** |unicode or |The format of ``content``, |
| |``None`` |if any. (:ref:`See |
| | |Below<modelcontent>`) |
+--------------------+-----------+------------------------------+
|**md5** |unicode or |The md5 of the contents. |
| |``None`` | |
| | | |
+--------------------+-----------+------------------------------+
+--------------------+------------+-------------------------------+
| Key | Type | Info |
+====================+============+===============================+
| **name** | unicode | Basename of the entity. |
+--------------------+------------+-------------------------------+
| **path** | unicode | Full |
| | | (:ref:`API-style<apipaths>`) |
| | | path to the entity. |
+--------------------+------------+-------------------------------+
| **type** | unicode | The entity type. One of |
| | | ``"notebook"``, ``"file"`` or |
| | | ``"directory"``. |
+--------------------+------------+-------------------------------+
| **created** | datetime | Creation date of the entity. |
+--------------------+------------+-------------------------------+
| **last_modified** | datetime | Last modified date of the |
| | | entity. |
+--------------------+------------+-------------------------------+
| **content** | variable | The "content" of the entity. |
| | | (:ref:`See |
| | | Below<modelcontent>`) |
+--------------------+------------+-------------------------------+
| **mimetype** | unicode or | The mimetype of ``content``, |
| | ``None`` | if any. (:ref:`See |
| | | Below<modelcontent>`) |
+--------------------+------------+-------------------------------+
| **format** | unicode or | The format of ``content``, |
| | ``None`` | if any. (:ref:`See |
| | | Below<modelcontent>`) |
+--------------------+------------+-------------------------------+
| [optional] | | |
| **hash** | unicode or | The hash of the contents. |
| | ``None`` | It cannot be null if |
| | | ``hash_algorithm`` is |
| | | defined. |
+--------------------+------------+-------------------------------+
| [optional] | | |
| **hash_algorithm** | unicode or | The algorithm used to compute |
| | ``None`` | hash value. |
| | | It cannot be null |
| | | if ``hash`` is defined. |
+--------------------+------------+-------------------------------+

.. _modelcontent:

Expand All @@ -80,8 +88,9 @@ model. There are three model types: **notebook**, **file**, and **directory**.
:class:`nbformat.notebooknode.NotebookNode` representing the .ipynb file
represented by the model. See the `NBFormat`_ documentation for a full
description.
- The ``md5`` field a hexdigest string of the md5 value of the notebook
file.
- The ``hash`` field is a hexdigest string of the hash value of the file.
If ``ContentManager.get`` does not support hashing, it should always be ``None``.
- ``hash_algorithm`` is the algorithm used to compute the hash value.

- ``file`` models
- The ``format`` field is either ``"text"`` or ``"base64"``.
Expand All @@ -91,14 +100,16 @@ model. There are three model types: **notebook**, **file**, and **directory**.
file models, ``content`` simply contains the file's bytes after decoding
as UTF-8. Non-text (``base64``) files are read as bytes, base64 encoded,
and then decoded as UTF-8.
- The ``md5`` field a hexdigest string of the md5 value of the file.
- The ``hash`` field is a hexdigest string of the hash value of the file.
If ``ContentManager.get`` does not support hashing, it should always be ``None``.
- ``hash_algorithm`` is the algorithm used to compute the hash value.

- ``directory`` models
- The ``format`` field is always ``"json"``.
- The ``mimetype`` field is always ``None``.
- The ``content`` field contains a list of :ref:`content-free<contentfree>`
models representing the entities in the directory.
- The ``md5`` field is always ``None``.
- The ``hash`` field is always ``None``.

.. note::

Expand All @@ -115,7 +126,7 @@ model. There are three model types: **notebook**, **file**, and **directory**.

.. code-block:: python

# Notebook Model with Content
# Notebook Model with Content and Hash
{
"content": {
"metadata": {},
Expand All @@ -137,7 +148,8 @@ model. There are three model types: **notebook**, **file**, and **directory**.
"path": "foo/a.ipynb",
"type": "notebook",
"writable": True,
"md5": "7e47382b370c05a1b14706a2a8aff91a",
"hash": "f5e43a0b1c2e7836ab3b4d6b1c35c19e2558688de15a6a14e137a59e4715d34b",
"hash_algorithm": "sha256",
}

# Notebook Model without Content
Expand Down
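The documentation change above says that ``hash`` is a hexdigest string and that ``hash`` and ``hash_algorithm`` are set together. A minimal sketch of that idea, using only the standard library, might look like the following. The helper name `add_hash` is hypothetical and is not part of jupyter_server; how the server computes the hash internally is not shown in this diff.

```python
import hashlib


def add_hash(model: dict, raw: bytes, algorithm: str = "sha256") -> dict:
    """Return a copy of ``model`` with ``hash`` and ``hash_algorithm`` set together.

    Hypothetical helper for illustration only; not a jupyter_server API.
    """
    # hashlib.new accepts the algorithm name and the initial data to digest.
    digest = hashlib.new(algorithm, raw).hexdigest()
    # Setting both keys in one step keeps the documented pairing intact:
    # neither field is populated without the other.
    return {**model, "hash": digest, "hash_algorithm": algorithm}


file_model = add_hash(
    {"name": "a.txt", "path": "foo/a.txt", "type": "file"},
    b"hello",
)
```

Passing the algorithm name through to the model lets a client recompute the digest with the same algorithm instead of assuming a fixed one, which is the motivation the PR title gives for replacing ``md5``.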
14 changes: 8 additions & 6 deletions jupyter_server/services/api/api.yaml
Expand Up @@ -106,9 +106,9 @@ paths:
in: query
description: "Return content (0 for no content, 1 for return content)"
type: integer
- name: md5
- name: hash
in: query
description: "Return md5 hexdigest string of content (0 for no md5, 1 for return md5)"
description: "May return hash hexdigest string of content and the hash algorithm (0 for no hash - default, 1 for return hash). It may be ignored by the content manager."
type: integer
responses:
404:
Expand Down Expand Up @@ -889,7 +889,7 @@ definitions:
kernel:
$ref: "#/definitions/Kernel"
Contents:
description: "A contents object. The content and format keys may be null if content is not contained. The md5 maybe null if md5 is not contained. If type is 'file', then the mimetype will be null."
description: "A contents object. The content and format keys may be null if content is not contained. The hash may be null if hash is not required. If type is 'file', then the mimetype will be null."
type: object
required:
- type
Expand All @@ -901,7 +901,6 @@ definitions:
- mimetype
- format
- content
- md5
properties:
name:
type: string
Expand Down Expand Up @@ -939,9 +938,12 @@ definitions:
format:
type: string
description: Format of content (one of null, 'text', 'base64', 'json')
md5:
hash:
type: string
description: "The md5 hexdigest string of content, if requested (otherwise null)."
description: "[optional] The hexdigest hash string of content, if requested (otherwise null). It cannot be null if hash_algorithm is defined."
hash_algorithm:
type: string
description: "[optional] The algorithm used to produce the hash, if requested (otherwise null). It cannot be null if hash is defined."
Checkpoints:
description: A checkpoint object.
type: object
Expand Down
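From the client side, the schema change above means passing ``hash=1`` as a query parameter and tolerating a response where the hash is absent, since the content manager may ignore the request. The sketch below builds such a URL and checks the pairing rule stated in the ``hash`` and ``hash_algorithm`` descriptions. The function names and the base URL are assumptions made for illustration; only the ``content`` and ``hash`` query parameters come from the diff.

```python
from urllib.parse import quote, urlencode


def contents_url(base_url: str, path: str, *, content: bool = True,
                 want_hash: bool = True) -> str:
    """Build a contents API URL that asks for the hash (hypothetical helper)."""
    # The API takes integer flags: 0 for "do not return", 1 for "return".
    query = urlencode({"content": int(content), "hash": int(want_hash)})
    return f"{base_url.rstrip('/')}/api/contents/{quote(path)}?{query}"


def hash_fields_consistent(model: dict) -> bool:
    """Check that ``hash`` and ``hash_algorithm`` are both set or both null."""
    # Missing keys are treated like null, because both fields are optional.
    return (model.get("hash") is None) == (model.get("hash_algorithm") is None)


url = contents_url("http://localhost:8888/", "foo/a.ipynb")
```

A client would typically call `hash_fields_consistent` on the decoded JSON response before trusting the digest, and fall back to treating the file as unhashed when both fields are null.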
4 changes: 2 additions & 2 deletions jupyter_server/services/contents/filecheckpoints.py
Expand Up @@ -252,7 +252,7 @@ def get_file_checkpoint(self, checkpoint_id, path):
if not os.path.isfile(os_checkpoint_path):
self.no_such_checkpoint(path, checkpoint_id)

content, format = self._read_file(os_checkpoint_path, format=None)
content, format = self._read_file(os_checkpoint_path, format=None) # type: ignore[misc]
return {
"type": "file",
"content": content,
Expand Down Expand Up @@ -318,7 +318,7 @@ async def get_file_checkpoint(self, checkpoint_id, path):
if not os.path.isfile(os_checkpoint_path):
self.no_such_checkpoint(path, checkpoint_id)

content, format = await self._read_file(os_checkpoint_path, format=None)
content, format = await self._read_file(os_checkpoint_path, format=None) # type: ignore[misc]
return {
"type": "file",
"content": content,
Expand Down