Fix backup/restore flow #36868

Closed
Tracked by #104
moroine opened this issue Jul 20, 2022 · 29 comments
@moroine

moroine commented Jul 20, 2022

Self-Hosted Version

22.7.0

CPU Architecture

x86_64

Docker Version

Docker version 20.10.11, build dea9396

Docker Compose Version

docker-compose version 1.29.2, build unknown

Steps to Reproduce

Following the docs at https://develop.sentry.dev/self-hosted/backup/:

  1. docker-compose run --rm -T -e SENTRY_LOG_LEVEL=CRITICAL web export > sentry/backup.json
  2. docker-compose run --rm -T web import /etc/sentry/backup.json
  3. The import fails

Expected Result

Valid backup.json file

Actual Result

First, backup.json starts with the following log lines, which are not valid JSON:

Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
sentry/requirements.txt is deprecated, use sentry/enhance-image.sh - see https://github.com/getsentry/self-hosted#enhance-sentry-image
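
A minimal sketch of stripping those leading log lines before attempting an import. It assumes the export is a single JSON document that runs to the end of the file; `strip_log_prefix` is a hypothetical helper, not part of Sentry.

```python
import json


def strip_log_prefix(raw: str) -> str:
    """Return the part of `raw` that parses as JSON, dropping leading log lines."""
    lines = raw.splitlines(keepends=True)
    for i, line in enumerate(lines):
        # Only lines that could open a JSON array or object are candidates.
        if not line.lstrip().startswith(("[", "{")):
            continue
        candidate = "".join(lines[i:])
        try:
            json.loads(candidate)
        except json.JSONDecodeError:
            # A log line that merely starts with a bracket; keep looking.
            continue
        return candidate
    raise ValueError("no JSON payload found in input")


# Sample input: log lines from this report followed by a tiny JSON payload.
sample = (
    "Updating certificates in /etc/ssl/certs...\n"
    "0 added, 0 removed; done.\n"
    "Running hooks in /etc/ca-certificates/update.d...\n"
    "done.\n"
    '[{"model": "sentry.option", "pk": 1}]\n'
)
cleaned = strip_log_prefix(sample)
entries = json.loads(cleaned)
```

Trying to parse each candidate, rather than cutting at the first bracket, guards against a log line that happens to start with `[`.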

After removing the log lines from backup.json, I still get the following error:

Creating sentry-self-hosted_web_run ... done
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
sentry/requirements.txt is deprecated, use sentry/enhance-image.sh - see https://github.com/getsentry/self-hosted#enhance-sentry-image
06:57:43 [INFO] sentry.plugins.github: apps-not-configured
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/core/serializers/python.py", line 143, in Deserializer
    data[field.name] = field.to_python(field_value)
  File "/usr/local/lib/python3.8/site-packages/sentry/db/models/fields/picklefield.py", line 54, in to_python
    return json.loads(value)
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/json.py", line 114, in loads
    return _default_decoder.decode(value)
  File "/usr/local/lib/python3.8/site-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/local/lib/python3.8/site-packages/simplejson/decoder.py", line 392, in raw_decode
    raise TypeError("Input string must be text, not bytes")
TypeError: Input string must be text, not bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/sentry", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/sentry/runner/__init__.py", line 187, in main
    func(**kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sentry/runner/decorators.py", line 29, in inner
    return ctx.invoke(f, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sentry/runner/commands/backup.py", line 19, in import_
    for obj in serializers.deserialize("json", src, stream=True, use_natural_keys=True):
  File "/usr/local/lib/python3.8/site-packages/django/core/serializers/json.py", line 69, in Deserializer
    yield from PythonDeserializer(objects, **options)
  File "/usr/local/lib/python3.8/site-packages/django/core/serializers/python.py", line 145, in Deserializer
    raise base.DeserializationError.WithData(e, d['model'], d.get('pk'), field_value)
django.core.serializers.base.DeserializationError: Input string must be text, not bytes: (sentry.option:pk=1) field_value was 'False'
ERROR: 1

Workaround:

The error is DeserializationError: Input string must be text, not bytes: (sentry.option:pk=1) field_value was 'False'. Searching backup.json for the sentry.option entry with pk 1 gives:

{
  "model": "sentry.option",
  "pk": 1,
  "fields": {
    "key": "auth.allow-registration",
    "value": false,
    "last_updated": "2019-09-06T06:05:23.633Z"
  }
}

I had to change false => "false" manually, as the error suggests. Then run the restore command again: it stops at the next DeserializationError, which you fix the same way, and so on until the import works.
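
A rough sketch of automating that manual edit, assuming the entries have already been loaded from a cleaned backup.json. `quote_option_values` is a hypothetical helper; it only touches values that are not already strings, and whether existing string values need further changes is not established here.

```python
import json


def quote_option_values(entries, models=("sentry.option",)):
    """JSON-encode non-string "value" fields in place, e.g. False -> "false"."""
    for entry in entries:
        if entry.get("model") not in models:
            continue
        fields = entry.get("fields", {})
        if "value" in fields and not isinstance(fields["value"], str):
            # json.dumps(False) gives the text "false", matching the manual fix.
            fields["value"] = json.dumps(fields["value"])
    return entries


backup = [
    {
        "model": "sentry.option",
        "pk": 1,
        "fields": {
            "key": "auth.allow-registration",
            "value": False,
            "last_updated": "2019-09-06T06:05:23.633Z",
        },
    }
]
fixed = quote_option_values(backup)
```

The result can be written back with `json.dump(fixed, f)` before running the import again.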

@emmatyping
Contributor

Thank you for reporting this! This looks like an issue in Sentry itself, so I am going to move this issue there for the proper team to look into.

@emmatyping emmatyping transferred this issue from getsentry/self-hosted Jul 20, 2022
@getsentry-release

Routing to @getsentry/app-backend for triage. ⏲️

@emmatyping
Contributor

Ah, it seems that in #22341 we changed things to write in text mode, perhaps we can/should read in text mode too? (I'm unsure of the invariants here, mostly thinking out loud).

@paulkitt

Facing the same issue in the middle of moving our Sentry instance. Any news here?

@github-actions
Contributor

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@sraillard

While trying to move a self-hosted Sentry 22.9.0 server, we hit the same issue, but on a different field:
django.core.serializers.base.DeserializationError: Input string must be text, not bytes: (sentry.option:pk=1) field_value was '1664883278.4451144'

@moroine
Author

moroine commented Oct 4, 2022

@sraillard you have to fix the file manually. Find the sentry.option entry whose primary key is 1 and change its value to a string. You'll have to resolve the errors one by one.
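
To avoid rerunning the import once per error, the offending entries could be listed up front. This is only a sketch under the assumption that every failure comes from a non-string "value" field; `find_non_text_values` is a hypothetical helper.

```python
def find_non_text_values(entries, models=("sentry.option",)):
    """List (model, pk, key) for entries whose "value" field is not a string."""
    offenders = []
    for entry in entries:
        if entry.get("model") not in models:
            continue
        fields = entry.get("fields", {})
        if "value" in fields and not isinstance(fields["value"], str):
            offenders.append((entry["model"], entry.get("pk"), fields.get("key")))
    return offenders


backup = [
    {
        "model": "sentry.option",
        "pk": 1,
        "fields": {"key": "sentry:last_worker_ping", "value": 1664883278.4451144},
    },
    {
        "model": "sentry.option",
        "pk": 3,
        "fields": {"key": "sentry:install-id", "value": "xxxx"},
    },
]
offenders = find_non_text_values(backup)
```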

@sraillard

sraillard commented Oct 4, 2022

I have tried adding double quotes to all the values, but there are other issues.

For example, for the "sentry:install-id" option, I have this JSON in my file:

{
  "model": "sentry.option",
  "pk": "3",
  "fields": {
    "key": "sentry:install-id",
    "value": "xxxx",
    "last_updated": "2021-01-22T12:40:35.710Z"
  }
},

And here is the content of the Postgres table:

postgres=# select * from "sentry_option" order by id;
 id |            key             |                                    value                                     |         last_updated
----+----------------------------+------------------------------------------------------------------------------+-------------------------------
  1 | sentry:last_worker_ping    | xxxx                                                                         | 2022-10-04 12:25:48.954083+00
  2 | sentry:last_worker_version | xxxx                                                                         | 2022-10-04 12:25:48.972749+00
  3 | system.admin-email         | xxxx                                                                         | 2022-09-21 14:16:57.734569+00
  4 | system.url-prefix          | xxxx                                                                         | 2022-09-21 14:16:57.752642+00
  5 | mail.port                  | xxxx                                                                         | 2022-09-21 14:16:57.762804+00
  6 | mail.username              | xxxx                                                                         | 2022-09-21 14:16:57.770782+00
  7 | mail.password              | xxxx                                                                         | 2022-09-21 14:16:57.778137+00
  8 | mail.use-tls               | xxxx                                                                         | 2022-09-21 14:16:57.786053+00
  9 | mail.use-ssl               | xxxx                                                                         | 2022-09-21 14:16:57.792262+00
 10 | auth.allow-registration    | xxxx                                                                         | 2022-09-21 14:16:57.801872+00
 11 | beacon.anonymous           | xxxx                                                                         | 2022-09-21 14:16:57.818217+00
 12 | sentry:version-configured  | xxxx                                                                         | 2022-09-21 14:16:57.828457+00
 13 | sentry:install-id          | xxxx                                                                         | 2022-09-21 15:09:56.150728+00
 14 | sentry:latest_version      | xxxx                                                                         | 2022-10-04 12:14:47.006184+00
(14 rows)

But I have this error when trying the import:

django.db.utils.IntegrityError: UniqueViolation('duplicate key value violates unique constraint "sentry_option_key_key"\nDETAIL: Key (key)=(sentry:install-id) already exists.\n') SQL: UPDATE "sentry_option" SET "key" = %s, "value" = %s, "last_updated" = %s WHERE "sentry_option"."id" = %s

The sentry:install-id key already exists in the database, as shown above. If the import script worked correctly, it should ignore or update the value (and the value is the same as in the JSON file).

Edit: the id isn't the same in the database and in the JSON file (13 vs. 3)

Edit#2: it's finally working but ... it isn't an easy path:

  • the export adds some extra lines at the top of the JSON file that must be removed (they make the file invalid JSON)
  • all the values for the keys "value" must be double quoted
  • the sentry_option ids must be the same in the current Postgres database where you are doing the import and in the JSON file to be imported (in my case, I had to correct the ids)
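
The id correction in the last point could be sketched as follows. `target_ids` is a hypothetical mapping from option key to id, built by hand from a query such as the SELECT above, and `align_option_pks` is a hypothetical helper. Entries whose key is missing from the mapping are left unchanged, which may still collide with existing rows.

```python
def align_option_pks(entries, target_ids):
    """Rewrite sentry.option pks in place to the ids used by the target database."""
    for entry in entries:
        if entry.get("model") != "sentry.option":
            continue
        key = entry.get("fields", {}).get("key")
        if key in target_ids:
            entry["pk"] = target_ids[key]
    return entries


# Key -> id as reported by the target database (13 for sentry:install-id above).
target_ids = {"sentry:install-id": 13}

backup = [
    {
        "model": "sentry.option",
        "pk": "3",
        "fields": {
            "key": "sentry:install-id",
            "value": "xxxx",
            "last_updated": "2021-01-22T12:40:35.710Z",
        },
    }
]
aligned = align_option_pks(backup, target_ids)
```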

Final words: good luck to everybody who tries to import/export Sentry settings... it may work, or not.

@moroine
Author

moroine commented Oct 4, 2022

Yeah, I did struggle a lot with this too

@UBA-NE

UBA-NE commented Oct 7, 2022

what a struggle to restore a backup, never seen anything like this before 🤦

@sraillard

Yes, even exporting/importing just the most important information isn't that easy (at least if you want to keep the users and the project settings with their DSNs).

The Sentry version should be the same on exporting and importing servers.

I started from a clean Sentry installation, did the basic setup, and then imported the backup.

Before importing the backup, I modified the JSON:

  • Removed the first lines (not JSON)
  • Removed all the "sentry.option", "sentry.relayusage", and "sentry.relay" nodes (as the new server already has its own configuration)
  • Added double quotes to all "value" values in "sentry.projectoption" nodes
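
The last two modifications might look like this in code, assuming the log lines have already been removed and the file loaded as a list of entries. `prepare_backup` is a hypothetical helper, and the sample project option key is made up for illustration.

```python
import json

# Models dropped because the new server already has its own configuration.
DROP_MODELS = {"sentry.option", "sentry.relayusage", "sentry.relay"}


def prepare_backup(entries):
    """Drop server-specific nodes and JSON-encode sentry.projectoption values."""
    kept = []
    for entry in entries:
        model = entry.get("model")
        if model in DROP_MODELS:
            continue
        if model == "sentry.projectoption":
            fields = entry.get("fields", {})
            if "value" in fields and not isinstance(fields["value"], str):
                fields["value"] = json.dumps(fields["value"])
        kept.append(entry)
    return kept


backup = [
    {"model": "sentry.option", "pk": 1, "fields": {"key": "mail.port", "value": 25}},
    {
        "model": "sentry.projectoption",
        "pk": 1,
        # Hypothetical key and value, for illustration only.
        "fields": {"project": 1, "key": "example:flag", "value": True},
    },
]
prepared = prepare_backup(backup)
```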

Hope it may help

@afaianswq

Why is this issue still not fixed after months? Backup/restore functionality is important and it's completely broken now, i.e. there is no way to restore a backup automatically without spending hours fixing the backup file manually. Can you please raise the issue priority?

@hubertdeng123
Member

@getsentry/app-backend would anyone be able to chime in here 🤔?

@hubertdeng123
Member

I'll attempt to reproduce and investigate this issue

@chadwhitacre
Member

Update on this, the @getsentry/open-source team will be taking over maintenance of the backup/restore functionality but we've got a few other projects on our plate before we can focus on this. Maybe next quarter?

@emmatyping
Contributor

We also should make sure to add tests around backup/restore to our end-to-end tests when we take this on.

@Cactiw

Cactiw commented Dec 17, 2022

While backup still doesn't work as expected, I wrote a quick script that auto-fixes the Sentry-generated backup.json file.
Hopefully this will save someone some time.

https://gist.github.com/Cactiw/c2397561966e2f343b0563a6c5b7f17f

@hubertdeng123
Member

Seems like backup and restore is working in getsentry/sentry for 22.6.0, but breaks in 22.7.0

@hubertdeng123
Member

The deserialization error should now be resolved in the latest image that will be published shortly

#44328

@xSirrioNx

xSirrioNx commented Feb 16, 2023

@hubertdeng123 Hi!
In the latest release (23.2.0), restore is still not working

@hubertdeng123
Member

@xSirrioNx Oh drat. Is it because there are still headers in the backup file? That is something I'm aware of but the deserialization errors should now be fixed.

@xSirrioNx

xSirrioNx commented Feb 16, 2023

> @xSirrioNx Oh drat. Is it because there are still headers in the backup file? That is something I'm aware of but the deserialization errors should now be fixed.

No, unquoted fields again
How to reproduce it:

  1. Backup from old sentry instance (get JSON with unquoted fields)
  2. Restore with this JSON in new instance
  3. Get error like django.core.serializers.base.DeserializationError: Input string must be text, not bytes: (sentry.option:pk=1) field_value was '1664883278.4451144'

PS: I upgraded the new instance from 23.1.1 to 23.2.0 through ./install.sh, if that matters

@matthewbyrne

^^ Sounds like the sentry:last_worker_ping key value. We had that issue too, and yes, wrapping in quotes resolved the restore process.

@hubertdeng123
Member

@xSirrioNx I've just double checked the backup/restore process with my own setup on 23.2.0 and it isn't giving me Deserialization errors anymore. I actually got the same error as you when I checked out 23.1.1. Are you sure you're on 23.2.0?

@xSirrioNx

xSirrioNx commented Feb 16, 2023

@hubertdeng123
Oh, I just checked my version in the UI and it is 23.1.1...
I'm really sorry...

@hubertdeng123
Member

hubertdeng123 commented Feb 16, 2023

@xSirrioNx Great to hear 😁
And no problem haha

@hubertdeng123
Member

I'm going to close this since the Deserialization errors are now resolved. Investigating db integrity errors here:
getsentry/self-hosted#1972

@github-actions github-actions bot locked and limited conversation to collaborators Mar 5, 2023