IT subtests manifest and repository_files fail due to lack of files #5168

Closed
14 tasks
hannes-ucsc opened this issue Apr 28, 2023 · 16 comments
Assignees
Labels
+ [priority] High bug [type] A defect preventing use of the system as specified no demo [process] Not to be demonstrated at the end of the sprint orange [process] Done by the Azul team spike:5 [process] Spike estimate of five points test [subject] Unit and integration test code

Comments

@hannes-ucsc
Member

hannes-ucsc commented Apr 28, 2023

https://gitlab.prod.anvil.gi.ucsc.edu/ucsc/azul/-/jobs/2673

======================================================================
FAIL: test_indexing (integration_test.IndexingIntegrationTest) [manifest] (catalog='anvil-it', format=None)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/ucsc/azul/test/integration_test.py", line 378, in subTest
    yield
  File "/builds/ucsc/azul/test/integration_test.py", line 559, in _test_manifest
    validator(catalog, response)
  File "/builds/ucsc/azul/test/integration_test.py", line 684, in _check_manifest
    self.__check_manifest(BytesIO(response), 'bundle_uuid')
  File "/builds/ucsc/azul/test/integration_test.py", line 760, in __check_manifest
    self.assertGreater(len(rows), 0)
AssertionError: 0 not greater than 0
======================================================================
FAIL: test_indexing (integration_test.IndexingIntegrationTest) [manifest] (catalog='anvil-it', format=<ManifestFormat.compact: 'compact'>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/ucsc/azul/test/integration_test.py", line 378, in subTest
    yield
  File "/builds/ucsc/azul/test/integration_test.py", line 559, in _test_manifest
    validator(catalog, response)
  File "/builds/ucsc/azul/test/integration_test.py", line 684, in _check_manifest
    self.__check_manifest(BytesIO(response), 'bundle_uuid')
  File "/builds/ucsc/azul/test/integration_test.py", line 760, in __check_manifest
    self.assertGreater(len(rows), 0)
AssertionError: 0 not greater than 0
======================================================================
FAIL: test_indexing (integration_test.IndexingIntegrationTest) [repository_files] (catalog='anvil-it')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/ucsc/azul/test/integration_test.py", line 378, in subTest
    yield
  File "/builds/ucsc/azul/test/integration_test.py", line 788, in _test_repository_files
    file = self._get_one_file(catalog)
  File "/builds/ucsc/azul/test/integration_test.py", line 581, in _get_one_file
    self.fail('No files found')
AssertionError: No files found

Many bundles in the 1000G snapshot seem to have no files at all, so the integration subtests that rely on the presence of files in the index fail frequently, depending on the random seed chosen.
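
To get a feel for how often such a failure would be expected, here is a rough sketch, not taken from the Azul code base. It assumes the IT indexes a uniform random sample of bundles drawn without replacement, which may not match how the test actually selects them. The function name and all numbers are hypothetical.

```python
from math import comb


def prob_no_files(total_bundles: int, empty_bundles: int, sample_size: int) -> float:
    """
    Probability that a uniform random sample of `sample_size` bundles, drawn
    without replacement, consists only of bundles that have no files.
    """
    if sample_size > total_bundles:
        raise ValueError('Sample cannot be larger than the population')
    # Hypergeometric: ways to pick only empty bundles / ways to pick any bundles
    return comb(empty_bundles, sample_size) / comb(total_bundles, sample_size)


# Hypothetical numbers, for illustration only; not measured on the 1000G snapshot
p = prob_no_files(total_bundles=1000, empty_bundles=900, sample_size=16)
print(f'{p:.3f}')
```

If the real fraction of bundles without files is high and the sample is small, even a modest probability here would explain intermittent failures across seeds.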

  • Security design review completed; the resolution of this issue does not
    • … affect authentication; for example:
      • OAuth 2.0 with the application (API or Swagger UI)
      • Authentication of developers with Google Cloud APIs
      • Authentication of developers with AWS APIs
      • Authentication with a GitLab instance in the system
      • Password and 2FA authentication with GitHub
      • API access token authentication with GitHub
      • Authentication with
    • … affect the permissions of internal users like access to
      • Cloud resources on AWS and GCP
      • GitLab repositories, projects and groups, administration
      • an EC2 instance via SSH
      • GitHub issues, pull requests, commits, commit statuses, wikis, repositories, organizations
    • … affect the permissions of external users like access to
      • TDR snapshots
    • … affect permissions of service or bot accounts
      • Cloud resources on AWS and GCP
    • … affect audit logging in the system, like
      • adding, removing or changing a log message that represents an auditable event
      • changing the routing of log messages through the system
    • … affect monitoring of the system
    • … introduce a new software dependency like
      • Python packages on PYPI
      • Command-line utilities
      • Docker images
      • Terraform providers
    • … add an interface that exposes sensitive or confidential data at the security boundary
    • … affect the encryption of data at rest
    • … require persistence of sensitive or confidential data that might require encryption at rest
    • … require unencrypted transmission of data within the security boundary
    • … affect the network security layer; for example by
      • modifying, adding or removing firewall rules
      • modifying, adding or removing security groups
      • changing or adding a port a service, proxy or load balancer listens on
  • Documentation on any unchecked boxes is provided in comments below
@hannes-ucsc hannes-ucsc added the orange [process] Done by the Azul team label Apr 28, 2023
@hannes-ucsc
Member Author

hannes-ucsc commented Apr 28, 2023

This needs

  • better reproduction (a seed where the tests fail),
  • symptoms and
  • verification of the hypothesized cause (many bundles without files in 1000G).

Spike for these, but the assignee will need access to anvilprod.
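
One possible way to find a reproducing seed offline is sketched below. It assumes a per-bundle file count has already been collected, and that the bundle selection can be replayed from a seed. `sample_bundles` is a hypothetical stand-in for however the IT actually picks bundles, and `find_failing_seed` is likewise made up for illustration.

```python
import random
from typing import Mapping, Optional, Sequence


def sample_bundles(bundles: Sequence[str], seed: int, sample_size: int) -> list[str]:
    # Hypothetical stand-in for the IT's seed-dependent bundle selection
    return random.Random(seed).sample(list(bundles), sample_size)


def find_failing_seed(file_counts: Mapping[str, int],
                      sample_size: int,
                      max_seed: int = 10_000) -> Optional[int]:
    """
    Return the first seed whose sample contains no bundle with files, or None
    if no such seed is found below `max_seed`.
    """
    # Sort for a deterministic population order, independent of mapping order
    bundles = sorted(file_counts)
    for seed in range(max_seed):
        sample = sample_bundles(bundles, seed, sample_size)
        if all(file_counts[bundle] == 0 for bundle in sample):
            return seed
    return None


# Made-up bundle names and file counts
counts = {'bundle-a': 0, 'bundle-b': 0, 'bundle-c': 3}
print(find_failing_seed(counts, sample_size=2))
```

A seed found this way would only reproduce the failure if the stand-in selection matches the real one, so it would need to be confirmed with an actual IT run.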

@achave11-ucsc
Member

@hannes-ucsc: "Alternatively, the assignee could index the three snapshots from anvilprod in a personal deployment."

@achave11-ucsc achave11-ucsc added bug [type] A defect preventing use of the system as specified test [subject] Unit and integration test code spike:5 [process] Spike estimate of five points labels Apr 28, 2023
@dsotirho-ucsc dsotirho-ucsc self-assigned this May 4, 2023
hannes-ucsc added a commit that referenced this issue May 12, 2023
@dsotirho-ucsc
Contributor

@hannes-ucsc: "Let's hand over the spike to @nadove-ucsc, who will need to ask for the following permissions:

  • developer on anvilprod via email to Erich
  • access to anvil snapshots in TDR dev via Slack post to @hannes-ucsc"

@dsotirho-ucsc dsotirho-ucsc added the + [priority] High label May 12, 2023
@dsotirho-ucsc
Contributor

dsotirho-ucsc commented May 12, 2023

@hannes-ucsc: "#5207 might reduce the prevalence of this problem because it would add supplementary files in anvilprod from the 1000G snapshot."

@dsotirho-ucsc
Contributor

Hold off on spike until blocker has been reviewed.

@hannes-ucsc hannes-ucsc removed their assignment Jun 9, 2023
@hannes-ucsc hannes-ucsc added the no demo [process] Not to be demonstrated at the end of the sprint label Jun 13, 2023
@hannes-ucsc hannes-ucsc added this to the AnVIL Public Release milestone Jun 13, 2023
dsotirho-ucsc added a commit that referenced this issue Jun 16, 2023
…ck of files (#5168, PR #5305)"

This reverts commit 5f6eaf0, reversing
changes made to 80e840f.
@dsotirho-ucsc
Contributor

After being merged to develop, PR #5305 failed IT on anvilprod. However, IT was successful on both dev and anvildev.

The 1st attempt failed during _assert_catalog_complete:

======================================================================
FAIL: test_indexing (integration_test.IndexingIntegrationTest) [catalog_complete] (catalog='anvil-it')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/ucsc/azul/test/integration_test.py", line 386, in subTest
    yield
  File "/builds/ucsc/azul/test/integration_test.py", line 1031, in _assert_catalog_complete
    self.assertSetEqual(indexed_fqids, expected_fqids)
AssertionError: Items in the first set but not the second:
AnvilBundleFQID(uuid='2e25fa2f-8c74-a7fe-be19-a6079ea32572', version='2022-06-01T00:00:00.000000Z', source=SourceRef(id='cc1c98a4-bfc4-45f2-b8dc-e920e5ca634d', spec=TDRSourceSpec(prefix=Prefix(common='', partition=2), project='datarepo-dev-43738c90', name='ANVIL_1000G_2019_Dev_20230302_ANV5_202303032342', is_snapshot=True)), entity_type=<BundleEntityType.primary: 'biosample'>)
Items in the second set but not the first:
AnvilBundleFQID(uuid='2e25fa2f-8c74-a7fe-be19-a6079ea32572', version='2022-06-01T00:00:00.000000Z', source=SourceRef(id='cc1c98a4-bfc4-45f2-b8dc-e920e5ca634d', spec=TDRSourceSpec(prefix=Prefix(common='', partition=2), project='datarepo-dev-43738c90', name='ANVIL_1000G_2019_Dev_20230302_ANV5_202303032342', is_snapshot=True)), entity_type=<BundleEntityType.supplementary: 'file'>)
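
The two FQIDs in the failure above share the same UUID and differ only in `entity_type`, which is easy to miss in the raw set diff. A small sketch for surfacing that pattern follows. `FQID` is a simplified, hypothetical stand-in for `AnvilBundleFQID`, and `entity_type_mismatches` is not an existing helper.

```python
from collections import defaultdict
from typing import NamedTuple


class FQID(NamedTuple):
    # Simplified, hypothetical stand-in for AnvilBundleFQID
    uuid: str
    entity_type: str


def entity_type_mismatches(indexed: set[FQID],
                           expected: set[FQID]
                           ) -> dict[str, tuple[set[str], set[str]]]:
    """
    Map each UUID that occurs on both sides of the set difference to the
    entity types seen only in `indexed` and only in `expected`.
    """
    only_indexed, only_expected = defaultdict(set), defaultdict(set)
    for fqid in indexed - expected:
        only_indexed[fqid.uuid].add(fqid.entity_type)
    for fqid in expected - indexed:
        only_expected[fqid.uuid].add(fqid.entity_type)
    # Keep only UUIDs that are present on both sides with differing types
    return {
        uuid: (only_indexed[uuid], only_expected[uuid])
        for uuid in only_indexed.keys() & only_expected.keys()
    }


# UUID and entity types copied from the assertion message above
uuid = '2e25fa2f-8c74-a7fe-be19-a6079ea32572'
print(entity_type_mismatches({FQID(uuid, 'biosample')}, {FQID(uuid, 'file')}))
```

An entry in the result means the bundle was indexed, but under a different entity type than the one the test expected.
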

Attempts #2 to #5 failed during assert_manifest:

======================================================================
FAIL: test_indexing (integration_test.IndexingIntegrationTest) [managed_access_manifest]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/ucsc/azul/test/integration_test.py", line 386, in subTest
    yield
  File "/builds/ucsc/azul/test/integration_test.py", line 1111, in _test_managed_access
    self._test_managed_access_manifest(catalog,
  File "/builds/ucsc/azul/test/integration_test.py", line 1280, in _test_managed_access_manifest
    assert_manifest({public_bundle, *managed_access_bundles})
  File "/builds/ucsc/azul/test/integration_test.py", line 1275, in assert_manifest
    self.assertEqual(expected_bundles, all_found_bundles)
AssertionError: Items in the first set but not the second:
'291afe95-9dad-a52d-bcdb-85d23f878057'

@achave11-ucsc
Member

This happened again during an anvilbox IT run: https://gitlab.anvil.gi.ucsc.edu/ucsc/azul/-/jobs/18689#L1044

@bvizzier-ucsc bvizzier-ucsc modified the milestones: AnVIL Public Release, AnVIL Beta Release Oct 4, 2023
@dsotirho-ucsc dsotirho-ucsc removed this from the AnVIL Beta Release milestone Oct 10, 2023