Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extremely large database size when sync attempts are made but don’t complete #7270

Open
lyw07 opened this issue Jul 8, 2020 · 11 comments
Open
Assignees
Labels
APP: Device Re: Device App (content import/export, facility-syncing, user permissions, etc.) P0 - critical Priority: Release blocker or regression TODO: needs decisions Design or specifications are necessary

Comments

@lyw07
Copy link
Contributor

lyw07 commented Jul 8, 2020

Observed behavior

Reported by a user:

When sync attempts are made but don’t complete, the table morango_buffer in the database took 8.2 GB.

Expected behavior

According to Blaine, the table morango_buffer should be automatically cleared at the end of the sync

User-facing consequences

Users have an extremely large database

Errors and logs

Steps to reproduce

Context

Kolibri version: 0.14.0b8

@lyw07 lyw07 added this to the 0.14.0 milestone Jul 8, 2020
@intelliant01
Copy link

Additional information:

Sync is not working with the database shared with @lyw07 on Kolibri 0.13.3 or later despite several attempts.

I had cleared morango_store, morango_recordmaxcounter, morango_recordmaxcounterbuffer, morango_syncsession, morango_transfersession tables and got no error during Kolibri usage or its start.

But still the sync process never progresses when another server is trying to pull for over 8 hours. It always shows that connection has been successfully established and the sync has started.

Next reverted the database to the state where only morango_buffer is cleared and tried the sync. But the sync process is still just at Syncing has been initiated (this may take a while)...

@indirectlylit indirectlylit added the P0 - critical Priority: Release blocker or regression label Jul 10, 2020
@cyberorg
Copy link
Contributor

Next reverted the database to the state where only morango_buffer is cleared and tried the sync. But the sync process is still just at Syncing has been initiated (this may take a while)...

redis-cli FLUSHALL

Running this will get it going forward

@bjester bjester added the TODO: needs decisions Design or specifications are necessary label Jul 14, 2020
@bjester
Copy link
Member

bjester commented Jul 14, 2020

@intelliant01 The sync process can take a significant amount of time, as the message says. With changes coming in 0.14, we've added progress tracking that will provide better visibility on the sync. Using a smaller chunk size may help speed up the sync, depending on the machines and networks involved. You can adjust it with the --chunk-size parameter; the default is 500. Clearing the morango_buffers table on both the local and remote Kolibri instances, while no sync is running, is currently advisable if there have been several unsuccessful sync attempts.

@cyberorg
Copy link
Contributor

cyberorg commented Jul 16, 2020

Using beta11.

I was able to complete the sync only after clearing out morango_buffer as well as morango_transfersession, if morango_transfersession was not cleared then sync failed after downloading some data with error about already existing morango_transfersess_sync_session_id_0455b5bd_fk_morango_s(don't remember exact error).

After the successful sync and multiple reruns of sync, the morango_buffer is still full of data.

"kolibri manage flushsyncsessions" from #7313 clears out morango_buffer.

Not sure if this is a bug or something wrong in our database, https://pastebin.com/jXcSyNTx

lot of:

WARNING Error deserializing instance of XXXXXX with id XXXXX ['XXXXXXXX matching query does not exist.']

Want me to open another issue?

@bjester
Copy link
Member

bjester commented Jul 16, 2020

@cyberorg I don't see it in the pastebin, but was there a log entry at the very beginning of the warnings that said something like:

WARNING  Validation error for FacilityUser with id 6572fd4d5e34e2efcb482a1737e376f7: {'password': ['This field cannot be blank.']}

If so, I believe the next beta may resolve that issue for you, as some fixes from #7251 will be in it.

In that case, the Morango buffers still had data in it because Morango was unable to merge those records into your facility data, as shown by the many log messages. Re-running a sync after that will always attempt to merge buffered changes if possible. On the next beta, if the data is still in the buffers, it should be merged in then leaving the buffers empty.

@cyberorg
Copy link
Contributor

WARNING Validation error for FacilityUser with id....

No there is no such error, here is the beginning and ending:

INFO     Syncing has been initiated (this may take a while)...
INFO     Creating pull transfer session
  [----------------------------------------------------------------------------------------------------------------------------------------------------------------------]    0%INFO     Receiving data (0/1, 0B)
Receiving data (1/1, 12.74KB)  [#########################################################################################################################################]  100%INFO     Receiving data (1/1, 12.74KB)
  [##############################################################################------------------------------------------------------------------------------]   50%  00:00:01WARNING  Error deserializing instance of Lesson with id cfc5218d86c093eaf9a2cced25d683a5: ['FacilityUser matching query does not exist.']
WARNING  Error deserializing instance of LessonAssignment with id 50ddc8479bdca853d445f4023d961498: ['FacilityUser matching query does not exist.']
WARNING  Error deserializing instance of LessonAssignment with id 400142bf69af8a280b2dfb788127376a: ['FacilityUser matching query does not exist.']
---snipped---
WARNING  Error deserializing instance of AttemptLog with id 08c97d1c03926e9b67eb918718a4aa8e: ['ContentSessionLog matching query does not exist.']
WARNING  Error deserializing instance of Role with id 4bb130dbfde9fcc93d87eb50f44ae549: ['FacilityUser matching query does not exist.']
Locally integrating received data  [#####################################################################################################################################]  100%INFO     Completed pull transfer session
INFO     Syncing has been completed.

morango_buffer is now clean. morango_recordmaxcounter and morango_store has lot of rows though.

@cyberorg
Copy link
Contributor

If so, I believe the next beta may resolve that issue for you, as some fixes from #7251 will be in it.

It seems kolibri/core/auth/management/commands/deletefacility.py and kolibri/core/auth/management/utils.py changes are missing in beta 12.

  File "/usr/lib/python3/dist-packages/kolibri/core/auth/management/commands/flushsyncsessions.py", line 9, in <module>
    from kolibri.core.auth.management.utils import GroupDeletion
ImportError: cannot import name 'GroupDeletion'

@cyberorg
Copy link
Contributor

cyberorg commented Jul 17, 2020

Ran kolibri manage flushsyncsessions and redis-cli FLUSHALL then sync

Here is the full error:

INFO     Creating pull transfer session
Receiving data (15500/244295, 65.09MB)  [########--------------------------------------------------------------------------------------------------------------]    7%  01:21:16INFO     Receiving data (15500/244295, 65.09MB)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "morango_buffer_transfer_session_id_model_uuid_2a7288db_uniq"
DETAIL:  Key (transfer_session_id, model_uuid)=(e00b22fa-f3b3-445b-aa29-1d471e320cce, 00a5088d-1ae0-f79a-0caa-051cb1f28cba) already exists.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/kolibri", line 11, in <module>
    load_entry_point('kolibri==0.14.0b12', 'console_scripts', 'kolibri')()
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/kolibri/utils/cli.py", line 277, in invoke
    return super(KolibriDjangoCommand, self).invoke(ctx)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/utils/cli.py", line 727, in manage
    execute_from_command_line(["kolibri manage"] + ctx.args)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python3/dist-packages/kolibri/core/tasks/management/commands/base.py", line 113, in handle
    return self.handle_async(*args, **options)
  File "/usr/lib/python3/dist-packages/kolibri/core/auth/management/commands/sync.py", line 184, in handle_async
    self._handle_pull(
  File "/usr/lib/python3/dist-packages/kolibri/core/auth/management/commands/sync.py", line 269, in _handle_pull
    sync_client.run()
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/syncsession.py", line 692, in run
    self._pull_records(callback=status.in_progress.fire)
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/syncsession.py", line 757, in _pull_records
    validate_and_create_buffer_data(
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/utils.py", line 146, in validate_and_create_buffer_data
    Buffer.objects.bulk_create(buffer_list)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/models/query.py", line 443, in bulk_create
    ids = self._batched_insert(objs_without_pk, fields, batch_size)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/models/query.py", line 1096, in _batched_insert
    inserted_id = self._insert(item, fields=fields, using=self.db, return_id=True)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/models/query.py", line 1079, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/models/sql/compiler.py", line 1112, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/utils/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: duplicate key value violates unique constraint "morango_buffer_transfer_session_id_model_uuid_2a7288db_uniq"
DETAIL:  Key (transfer_session_id, model_uuid)=(e00b22fa-f3b3-445b-aa29-1d471e320cce, 00a5088d-1ae0-f79a-0caa-051cb1f28cba) already exists.

After truncate morango_transfersession cascade; at both ends, it failed with the same error, then clearing morango_transfersession again on client side and re-running sync continued.

Followed by long list of deserializing errors, ending with this:


WARNING  Error deserializing instance of AttemptLog with id 828f13760e9f0c0b271f62032a20da13: ['ContentSessionLog matching query does not exist.']
WARNING  Error deserializing instance of AttemptLog with id ac2b958c37046dfee253ed4dbb95a3be: ['ContentSessionLog matching query does not exist.']
WARNING  Error deserializing instance of Role with id 4bb130dbfde9fcc93d87eb50f44ae549: ['FacilityUser matching query does not exist.']
Locally integrating received data  [#####################################################################################################################################]  100%ERROR    HTTPError Reason: (no response)
Traceback (most recent call last):
  File "/usr/bin/kolibri", line 11, in <module>
    load_entry_point('kolibri==0.14.0b12', 'console_scripts', 'kolibri')()
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/kolibri/utils/cli.py", line 277, in invoke
    return super(KolibriDjangoCommand, self).invoke(ctx)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/dist/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/utils/cli.py", line 727, in manage
    execute_from_command_line(["kolibri manage"] + ctx.args)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python3/dist-packages/kolibri/dist/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python3/dist-packages/kolibri/core/tasks/management/commands/base.py", line 113, in handle
    return self.handle_async(*args, **options)
  File "/usr/lib/python3/dist-packages/kolibri/core/auth/management/commands/sync.py", line 184, in handle_async
    self._handle_pull(
  File "/usr/lib/python3/dist-packages/kolibri/core/auth/management/commands/sync.py", line 271, in _handle_pull
    sync_client.finalize()
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/syncsession.py", line 710, in finalize
    self._close_transfer_session(allow_server_timeout=allow_server_timeout)
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/syncsession.py", line 564, in _close_transfer_session
    self._close_server_transfer_session()
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/syncsession.py", line 527, in _close_server_transfer_session
    return self.sync_connection._close_transfer_session(
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/syncsession.py", line 365, in _close_transfer_session
    return self.session.delete(
  File "/usr/lib/python3/dist-packages/kolibri/dist/requests/sessions.py", line 615, in delete
    return self.request('DELETE', url, **kwargs)
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/session.py", line 62, in request
    raise req_err
  File "/usr/lib/python3/dist-packages/kolibri/dist/morango/sync/session.py", line 52, in request
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/kolibri/dist/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://prime.cyberorg.co.in/kolibri/api/morango/v1/transfersessions/a3b662d75c594c5a8850070443cd6ef9/

RE-running sync after above:

INFO     Syncing has been initiated (this may take a while)...
INFO     Creating pull transfer session
Remotely preparing data  [###############################################################################################################################################]  100%INFO     There are no records to transfer
INFO     Completed pull transfer session
INFO     Syncing has been completed.

Morango_buffer on client is clean, server still has lot of rows.

Sync also does not seem to be complete
synced-client
sync-server

@bjester
Copy link
Member

bjester commented Jul 17, 2020

If so, I believe the next beta may resolve that issue for you, as some fixes from #7251 will be in it.

It seems kolibri/core/auth/management/commands/deletefacility.py and kolibri/core/auth/management/utils.py changes are missing in beta 12.

  File "/usr/lib/python3/dist-packages/kolibri/core/auth/management/commands/flushsyncsessions.py", line 9, in <module>
    from kolibri.core.auth.management.utils import GroupDeletion
ImportError: cannot import name 'GroupDeletion'

Beta 12 does not have the flushsyncsessions.py command.

Regarding the other errors, I suggested only clearing the morango_buffer table. Deleting data from other Morango tables can lead to database corruption.

@bjester bjester modified the milestones: 0.14.0, upcoming patch Jul 23, 2020
@jonboiser jonboiser modified the milestones: upcoming patch, 0.15.0 Oct 19, 2020
@jonboiser jonboiser added the APP: Device Re: Device App (content import/export, facility-syncing, user permissions, etc.) label Feb 20, 2021
@nucleogenesis
Copy link
Member

@jamalex noted a larger project around this in Notion re: optimizing syncing - along with @rtibbles will need to generate an Epic and issues accordingly for a future sprint.

@sairina sairina removed this from the 0.15.0 milestone Nov 9, 2021
@bjester
Copy link
Member

bjester commented Nov 22, 2021

Morango and syncing enhancements have been made to address this on a targeted basis. To finalize this specifically, I propose we add a recurring scheduled task that's configurable, but is disabled by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
APP: Device Re: Device App (content import/export, facility-syncing, user permissions, etc.) P0 - critical Priority: Release blocker or regression TODO: needs decisions Design or specifications are necessary
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants