
Session Replay not showing any replays #2002

Closed
edgolub opened this issue Mar 1, 2023 · 54 comments

Comments

@edgolub

edgolub commented Mar 1, 2023

Self-Hosted Version

23.3.0.dev0

CPU Architecture

x86_64

Docker Version

20.10.21

Docker Compose Version

1.29.2

Steps to Reproduce

  1. Updated to the latest version by pulling the master branch
  2. Enabled session replay and session replay UI under features in the config file
  3. Followed the prompt to configure the frontend client to send replays
  4. Sentry reported receiving the first replay, but nothing shows up in the Replays list, and waiting for an hour did not surface it either
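For reference, step 3 amounts to SDK setup along these lines (a sketch for a Vue app; the DSN is a placeholder and `Sentry.Replay` assumes a 7.x-era JavaScript SDK):

```javascript
import * as Sentry from "@sentry/vue";

Sentry.init({
  app, // the Vue app instance
  dsn: "https://<key>@<your-sentry-host>/<project-id>", // placeholder DSN
  integrations: [new Sentry.Replay()],
  // Sample 10% of ordinary sessions; always capture sessions with an error.
  replaysSessionSampleRate: 0.1,
  replaysOnErrorSampleRate: 1.0,
});
```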

Expected Result

Recorded replays appear in the Replays list for the project.

Actual Result

The Replays list in the Sentry dashboard shows nothing for the project. After checking the logs, I see a couple of errors in snuba-replays-consumer, attached below.

snuba-replays-consumer_1 | 2023-03-01T08:32:52.220558504Z snuba.clickhouse.errors.ClickhouseWriterError: Cannot parse JSON string: expected opening quote: (while read the value of key title): (at row 1)
snuba-replays-consumer_1 | 2023-03-01T08:32:53.236009713Z 2023-03-01 08:32:53,235 Initializing Snuba...
snuba-replays-consumer_1 | 2023-03-01T08:32:56.125327791Z 2023-03-01 08:32:56,125 Snuba initialization took 2.8903919691219926s
snuba-replays-consumer_1 | 2023-03-01T08:32:56.627888461Z 2023-03-01 08:32:56,627 Initializing Snuba...
snuba-replays-consumer_1 | 2023-03-01T08:32:59.388766641Z 2023-03-01 08:32:59,388 Snuba initialization took 2.761911698617041s
snuba-replays-consumer_1 | 2023-03-01T08:32:59.396832190Z 2023-03-01 08:32:59,396 Consumer Starting
snuba-replays-consumer_1 | 2023-03-01T08:32:59.397231560Z 2023-03-01 08:32:59,397 librdkafka log level: 6
snuba-replays-consumer_1 | 2023-03-01T08:33:02.245125423Z 2023-03-01 08:33:02,244 New partitions assigned: {Partition(topic=Topic(name='ingest-replay-events'), index=0): 21}
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056774048Z 2023-03-01 08:33:04,055 Caught exception, shutting down...
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056819569Z Traceback (most recent call last):
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056827338Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 175, in run
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056832589Z self._run_once()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056837249Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 215, in _run_once
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056842118Z self.__processing_strategy.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056846189Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/dead_letter_queue/dead_letter_queue.py", line 39, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056867938Z self.__next_step.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056872649Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 62, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056876729Z self.__next_step.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056880258Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/reduce.py", line 140, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056884258Z self.__next_step.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056887909Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 127, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056892018Z result = future.result()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056895678Z File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056899538Z return self.__get_result()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056903438Z File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056907438Z raise self._exception
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056911038Z File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056915129Z result = self.fn(*self.args, **self.kwargs)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056919329Z File "/usr/src/snuba/snuba/consumers/strategy_factory.py", line 120, in flush_batch
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056923878Z message.payload.close()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056928089Z File "/usr/src/snuba/snuba/consumers/consumer.py", line 288, in close
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056932449Z self.__insert_batch_writer.close()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056936629Z File "/usr/src/snuba/snuba/consumers/consumer.py", line 127, in close
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056941158Z self.__writer.write(
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056945009Z File "/usr/src/snuba/snuba/clickhouse/http.py", line 328, in write
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056949098Z batch.join()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056952978Z File "/usr/src/snuba/snuba/clickhouse/http.py", line 266, in join
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056957389Z raise ClickhouseWriterError(message, code=code, row=row)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.056961458Z snuba.clickhouse.errors.ClickhouseWriterError: Cannot parse JSON string: expected opening quote: (while read the value of key title): (at row 1)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.065375948Z 2023-03-01 08:33:04,065 Closing <arroyo.backends.kafka.consumer.KafkaConsumer object at 0x7fc5cb4fda00>...
snuba-replays-consumer_1 | 2023-03-01T08:33:04.065903508Z 2023-03-01 08:33:04,065 Partitions revoked: [Partition(topic=Topic(name='ingest-replay-events'), index=0)]
snuba-replays-consumer_1 | 2023-03-01T08:33:04.067680488Z 2023-03-01 08:33:04,067 Processor terminated
snuba-replays-consumer_1 | 2023-03-01T08:33:04.072085497Z Traceback (most recent call last):
snuba-replays-consumer_1 | 2023-03-01T08:33:04.072105348Z File "/usr/local/bin/snuba", line 33, in
snuba-replays-consumer_1 | 2023-03-01T08:33:04.072897027Z sys.exit(load_entry_point('snuba', 'console_scripts', 'snuba')())
snuba-replays-consumer_1 | 2023-03-01T08:33:04.072924827Z File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
snuba-replays-consumer_1 | 2023-03-01T08:33:04.072941897Z return self.main(*args, **kwargs)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.072946087Z File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
snuba-replays-consumer_1 | 2023-03-01T08:33:04.073385877Z rv = self.invoke(ctx)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.073402197Z File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
snuba-replays-consumer_1 | 2023-03-01T08:33:04.074031247Z return _process_result(sub_ctx.command.invoke(sub_ctx))
snuba-replays-consumer_1 | 2023-03-01T08:33:04.074044957Z File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
snuba-replays-consumer_1 | 2023-03-01T08:33:04.074668317Z return ctx.invoke(self.callback, **ctx.params)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.074683677Z File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
snuba-replays-consumer_1 | 2023-03-01T08:33:04.074689787Z return __callback(*args, **kwargs)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.074694847Z File "/usr/src/snuba/snuba/cli/consumer.py", line 189, in consumer
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075384857Z consumer.run()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075398927Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 175, in run
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075405047Z self._run_once()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075409717Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 215, in _run_once
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075414377Z self.__processing_strategy.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075418757Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/dead_letter_queue/dead_letter_queue.py", line 39, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075423727Z self.__next_step.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075428297Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 62, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075432817Z self.__next_step.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.075437377Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/reduce.py", line 140, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076195707Z self.__next_step.poll()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076210517Z File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 127, in poll
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076216767Z result = future.result()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076220997Z File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076225547Z return self.__get_result()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076230177Z File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076234877Z raise self._exception
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076239027Z File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076965347Z result = self.fn(*self.args, **self.kwargs)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.076981907Z File "/usr/src/snuba/snuba/consumers/strategy_factory.py", line 120, in flush_batch
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077002947Z message.payload.close()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077007447Z File "/usr/src/snuba/snuba/consumers/consumer.py", line 288, in close
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077011197Z self.__insert_batch_writer.close()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077014777Z File "/usr/src/snuba/snuba/consumers/consumer.py", line 127, in close
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077018467Z self.__writer.write(
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077021987Z File "/usr/src/snuba/snuba/clickhouse/http.py", line 328, in write
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077648317Z batch.join()
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077660327Z File "/usr/src/snuba/snuba/clickhouse/http.py", line 266, in join
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077663597Z raise ClickhouseWriterError(message, code=code, row=row)
snuba-replays-consumer_1 | 2023-03-01T08:33:04.077666007Z snuba.clickhouse.errors.ClickhouseWriterError: Cannot parse JSON string: expected opening quote: (while read the value

Event ID

No response

@edgolub changed the title from "Session Replay not showing any replaces" to "Session Replay not showing any replays" on Mar 1, 2023
@TsubasaBE

Same here... But it worked yesterday, so I'll try using one of the previously pushed snuba images.

@hubertdeng123
Member

Just to be sure, you ran the install script after pulling master so your images are up to date?

@hieunhit

hieunhit commented Mar 2, 2023

Same issue here; I only saw the first session.

2023-03-02 09:19:53,864 Snuba initialization took 2.5034212339669466s
2023-03-02 09:19:54,269 Initializing Snuba...
2023-03-02 09:19:56,821 Snuba initialization took 2.5527064446359873s
2023-03-02 09:19:56,827 Consumer Starting
2023-03-02 09:19:56,827 librdkafka log level: 6
2023-03-02 09:19:59,711 New partitions assigned: {Partition(topic=Topic(name='ingest-replay-events'), index=0): 3}
2023-03-02 09:20:01,741 Caught exception, shutting down...
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 179, in run
    self._run_once()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 219, in _run_once
    self.__processing_strategy.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/dead_letter_queue/dead_letter_queue.py", line 39, in poll
    self.__next_step.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 62, in poll
    self.__next_step.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/reduce.py", line 140, in poll
    self.__next_step.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 127, in poll
    result = future.result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/snuba/snuba/consumers/strategy_factory.py", line 120, in flush_batch
    message.payload.close()
  File "/usr/src/snuba/snuba/consumers/consumer.py", line 288, in close
    self.__insert_batch_writer.close()
  File "/usr/src/snuba/snuba/consumers/consumer.py", line 127, in close
    self.__writer.write(
  File "/usr/src/snuba/snuba/clickhouse/http.py", line 328, in write
    batch.join()
  File "/usr/src/snuba/snuba/clickhouse/http.py", line 266, in join
    raise ClickhouseWriterError(message, code=code, row=row)
snuba.clickhouse.errors.ClickhouseWriterError: Cannot parse JSON string: expected opening quote: (while read the value of key title): (at row 1)

@edgolub
Author

edgolub commented Mar 2, 2023

Just to be sure, you ran the install script after pulling master so your images are up to date?

Yes. In fact, I created a fresh server build and installed Sentry again from scratch to see if that would work.

Replays worked!

...for about 2 minutes, then snuba started throwing these errors again.

@hubertdeng123
Member

I've attempted to reproduce this with the react SDK. Replays have been going through for over 30 minutes and nothing is breaking 🤔. I've dropped a line internally so this will be looked at shortly by the team that owns Session Replay

@edgolub
Author

edgolub commented Mar 2, 2023

I've attempted to reproduce this with the react SDK. Replays have been going through for over 30 minutes and nothing is breaking 🤔. I've dropped a line internally so this will be looked at shortly by the team that owns Session Replay

For reference, the first env I tested in was using the Vue.js SDK, while the new instance I set up on a fresh server was hooked up with two projects, one using Vue, the other React.

@TsubasaBE

Same here... But it worked yesterday, so I'll try using one of the previously pushed snuba images.

Re-deploying older images doesn't fix the problem.
Resetting the Kafka consumers and removing the messages in the Kafka topics temporarily fixes the problem.
I'm guessing the JavaScript Replay SDK sometimes generates specific events that trigger the issue.

@edgolub
Author

edgolub commented Mar 3, 2023

Same here... But it worked yesterday, so I'll try using one of the previously pushed snuba images.

Re-deploying older images doesn't fix the problem. Resetting the Kafka consumers and removing the messages in the Kafka topics temporarily fixes the problem. I'm guessing the JavaScript Replay SDK sometimes generates specific events that trigger the issue.

I haven't tested it yet, but I believe it only stopped working for me when I caught a REST API exception. I save parts of the JSON output and send it to Sentry via extra data as key:value pairs.

I'm guessing that this is what might be causing issues in Replay. I'll get around to testing it later today when I get some time.

@TsubasaBE

TsubasaBE commented Mar 3, 2023

I haven't tested it yet, but I believe it only stopped working for me when I caught a REST API exception. I save parts of the JSON output and send it to Sentry via extra data as key:value pairs.

I'm guessing that this is what might be causing issues in Replay. I'll get around to testing it later today when I get some time.

Interesting indeed. I use the same approach, where I store the JSON output of the REST call as extra data on the JS SDK Sentry object.

I'm guessing Snuba doesn't escape the JSON string correctly when trying to store the data in Clickhouse.
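If that guess is right, the failure mode can be illustrated in isolation (a sketch of the suspected mechanism, not Snuba's actual code): the JSONEachRow insert expects the `title` column to be a quoted, escaped JSON string, so an unserialized nested value breaks parsing.

```javascript
const extra = { requestId: "BCD", items: [{ id: "SOMEID" }] };

// Broken: template interpolation stringifies the object as "[object Object]",
// so the value after the colon has no opening quote -- invalid JSON.
const brokenRow = `{"title": ${extra}}`;

// Correct: serialize nested data first, so it is stored as an escaped string.
const goodRow = JSON.stringify({ title: JSON.stringify(extra) });

console.log(brokenRow);                 // {"title": [object Object]}
console.log(JSON.parse(goodRow).title); // round-trips cleanly
```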

@edgolub
Author

edgolub commented Mar 6, 2023

Can confirm this is happening even with Replay on sentry.io, so it's not really limited to the self-hosted version of Replays.

Where would be the correct place to report this? @hubertdeng123 Can you forward this new info to that internal ticket?
Here is the Replay ID I used in the test account I made on sentry.io, if that helps debug it: javascript-vue:224700ccd4e544e295eaa37ad71c9d35

@JoshFerge
Member

@edgolub @TsubasaBE Could you provide an example of a JSON structure you're storing (with PII removed if posting here)? Is it nested?

@edgolub
Author

edgolub commented Mar 7, 2023

Hi @JoshFerge, yes, it can be nested in some cases. I save both the request and response JSON for my XHR requests as extra data for easy lookup.
Response
{"success":false,"errorMessage":"[ERROR MSG]!"}
Request
{ requestId: 'BCD', items: [ id: 'SOMEID', subItem: [ someFlag: 1, id: "ABC" ] ] }
(just an example, the actual JSON data is too sensitive to post here)

These are added to the "Additional Data" section of an Issue.
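A hypothetical client-side workaround (not official SDK guidance) would be to pre-serialize nested extra values before attaching them, so every value the server stores is already a flat, escaped JSON string:

```javascript
// Serialize any non-string extra value to a JSON string; strings pass through.
function safeExtras(extras) {
  const out = {};
  for (const [key, value] of Object.entries(extras)) {
    out[key] = typeof value === "string" ? value : JSON.stringify(value);
  }
  return out;
}

// Usage with the JS SDK would then look like:
// Sentry.setExtras(safeExtras({ request: requestJson, response: responseJson }));
```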

@JoshFerge
Member

Hi @JoshFerge , yes it can be nested in some cases. I save both the request and response JSON for my XHR requests as extra data for easy lookup. Response {"success":false,"errorMessage":"[ERROR MSG]!"} Request { requestId: 'BCD', items: [ id: 'SOMEID', subItem: [ someFlag: 1, id: "ABC" ] ] } (just an example, the actual JSON data is too sensitive to post here)

These are added to the "Additional Data" section of an Issue.

Thanks for that information. What methods are you calling on your JavaScript SDK to set these values? Also, what version of the JavaScript SDK are you on?

@JoshFerge
Member

Can confirm this is happening even with Replay on sentry.io, so it's not really limited to the self-hosted version of Replays.

Where would be the correct place to report this? @hubertdeng123 Can you forward this new info to that internal ticket? Here is the Replay ID I used in the test account I made in sentry.io, if that helps debug it: javascript-vue:224700ccd4e544e295eaa37ad71c9d35

And I'm noticing that we ingested the replay on our side -- we don't currently support adding extra fields onto a replay, but I do see that we ingested your replay, so I don't think we are actually erroring out in production. Does that make sense?

@DarkByteZero

I have exactly the same problem. I saw a replay and was excited, but then snuba-replays-consumer ran into the problem and replays stopped working.

@edgolub
Author

edgolub commented Mar 9, 2023

@JoshFerge I've actually sent multiple sessions, not just that first one. The first replay is simply the only one that is visible.
I'm quite sure I tried it again after half an hour, and nothing was visible on sentry.io, the same behavior I get when I run it locally.

I disabled extra data for our issue events, and a few hours ago I saw a new replay in our self-hosted Sentry.

@bruno-garcia
Member

@DarkByteZero got 2 replays that crash in Self Hosted sent to this project in SaaS: https://sentry-sdks.sentry.io/replays/?project=4504833789001728

@gierschv

gierschv commented Mar 15, 2023

I have the exact same logs after trying replays in dev with last week's SDK (sentry.javascript.vue/7.42.0).

We use various extra properties with nested objects, and breadcrumbs with objects as well, so I'm not sure from the current trace what is causing the crash. Let me know if you need more details or debugging from this service.

2023-03-15 22:21:33,896 Initializing Snuba...
2023-03-15 22:21:37,334 Snuba initialization took 3.438622100977227s
2023-03-15 22:21:37,803 Initializing Snuba...
2023-03-15 22:21:41,139 Snuba initialization took 3.3382527821231633s
2023-03-15 22:21:41,147 Consumer Starting
2023-03-15 22:21:41,147 librdkafka log level: 7
2023-03-15 22:21:41,152 Starting
2023-03-15 22:21:42,371 New partitions assigned: {Partition(topic=Topic(name='ingest-replay-events'), index=0): 17}
2023-03-15 22:21:42,372 Initialized processing strategy: <arroyo.processing.strategies.dead_letter_queue.dead_letter_queue.DeadLetterQueue object at 0x7f8e0bb93f10>
2023-03-15 22:21:43,386 Starting new HTTP connection (1): clickhouse:8123
2023-03-15 22:21:43,387 Finished sending data from <HTTPWriteBatch: [] rows (25744 bytes)>.
2023-03-15 22:21:43,389 http://clickhouse:8123 "POST /?load_balancing=in_order&insert_distributed_sync=1&query=INSERT+INTO+default.replays_local++FORMAT+JSONEachRow HTTP/1.1" 400 None
2023-03-15 22:21:43,390 Received response for <HTTPWriteBatch: [] rows (25744 bytes)>.
2023-03-15 22:21:44,386 Caught exception, shutting down...
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 181, in run
    self._run_once()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 221, in _run_once
    self.__processing_strategy.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/dead_letter_queue/dead_letter_queue.py", line 39, in poll
    self.__next_step.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 71, in poll
    self.__next_step.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/reduce.py", line 149, in poll
    self.__next_step.poll()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/strategies/run_task.py", line 151, in poll
    result = future.result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/snuba/snuba/consumers/strategy_factory.py", line 120, in flush_batch
    message.payload.close()
  File "/usr/src/snuba/snuba/consumers/consumer.py", line 296, in close
    self.__insert_batch_writer.close()
  File "/usr/src/snuba/snuba/consumers/consumer.py", line 135, in close
    self.__writer.write(
  File "/usr/src/snuba/snuba/clickhouse/http.py", line 328, in write
    batch.join()
  File "/usr/src/snuba/snuba/clickhouse/http.py", line 266, in join
    raise ClickhouseWriterError(message, code=code, row=row)
snuba.clickhouse.errors.ClickhouseWriterError: Cannot parse JSON string: expected opening quote: (while read the value of key title): (at row 1)
2023-03-15 22:21:44,392 Terminating <arroyo.processing.strategies.dead_letter_queue.dead_letter_queue.DeadLetterQueue object at 0x7f8e0bb93f10>...
2023-03-15 22:21:44,393 Terminating <arroyo.processing.strategies.dead_letter_queue.policies.produce.ProduceInvalidMessagePolicy object at 0x7f8e0bb93eb0>...
2023-03-15 22:21:44,393 Terminating <arroyo.processing.strategies.transform.TransformStep object at 0x7f8e0bb93700>...
2023-03-15 22:21:44,393 Closing <arroyo.backends.kafka.consumer.KafkaConsumer object at 0x7f8e0e941a90>...
2023-03-15 22:21:44,393 Partitions revoked: [Partition(topic=Topic(name='ingest-replay-events'), index=0)]
2023-03-15 22:21:44,395 Processor terminated

@davidflypei

Same error here. I somehow got one replay from production. There are no errors attached to it, just a random session. But that's it, and it's been running for 4 hours. Even with session sampling set to 1.0 in dev, I still wasn't getting anything.

I'm assuming the recording data is compressed? After all the event info and context, it's just garbled characters in the browser network tab and Postman interceptor.

I might set session sampling to 1 on production and see if I get any more replays, maybe even some from my own session so I can look at the data it's sending.

@DarkByteZero

Yeah, one replay goes through; after that, it dies. But I provided 2 replays to help, so maybe it will be fixed soon.

@davidflypei

Hopefully. The one replay I got looks awesome. I can't wait to actually use it.

@jonhassall

I had one Replay appear on the dashboard, and no further replays after that. I also upgraded to the latest release. I look forward to using it, as that replay looked very useful.

@kgonella

Got the same error here: only one record, then Cannot parse JSON string: expected opening quote: (while read the value of key title): (at row 1) in snuba-replays-consumer.

@JoshFerge
Member

Was able to reproduce and put a PR fix up. Had trouble replicating because of the Clickhouse version, but figured it out. See getsentry/snuba#3878.

@hubertdeng123
Member

We are currently avoiding upgrading Clickhouse for self-hosted users due to the risk that data ingested on 20.3 may not be compatible with Clickhouse 21.8. Please be aware of this risk; it would be helpful to know if any of you experience issues after upgrading to 21.8.

@DarkByteZero

DarkByteZero commented Mar 20, 2023

I have no problems as such, but I am missing some issue events; I don't know if that is because of Clickhouse. They are just gone. If I open URLs from e-mails, they just say: "The issue you were looking for was not found."

@edgolub
Author

edgolub commented Mar 20, 2023

I just upgraded Clickhouse a few hours ago, and I'm not seeing any issues at all so far.

Session Replay works now! So cool!

@davidflypei

davidflypei commented Mar 20, 2023 via email

@jonhassall

It did not fix the replays issue for me. Looking forward to the official release.

@davidflypei

davidflypei commented Mar 20, 2023 via email

@jonhassall

I get
[WARNING] root: Recording segment was already processed.
Is this relevant?

@davidflypei

davidflypei commented Mar 20, 2023 via email

@jonhassall

I didn't get a message like that. I am thinking of reinstalling.

@TsubasaBE

So far I haven't noticed any issues. Replays are working great.
Thanks everyone for your help.

@jonhassall

Replays are working great for me too now that I've reinstalled. Thanks all, and I hope the team gets this integrated into the release.

@JoshFerge
Member

Yep! This will be included in the April CalVer release of self-hosted :) @jonhassall

@hubertdeng123
Member

hubertdeng123 commented Mar 21, 2023

We've decided to cut a 23.3.1 release to fix this for you all. It will go out later this week.

@hubertdeng123
Member

v23.3.1 is now out. The issue is still open, but only pending some documentation PR reviews now.

@joacub

joacub commented Mar 24, 2023

We are currently avoiding upgrading Clickhouse for self-hosted users due to the risk where ingested data on 20.3 may not be compatible with Clickhouse 21.8. So please be aware of this risk and it'd be helpful to know if you all are experiencing issues after upgrading to 21.8.

Hi, I'm experiencing many issues using that version of Clickhouse. How can I restore this or make it work again?

The error I'm receiving is this:

Container sentry-self-hosted-snuba-subscription-consumer-transactions-1  Started
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
!! Configuration error: ConfigurationError("NameError: name 'v' is not defined")
Error in install/set-up-and-migrate-database.sh:12.
'$dcr web upgrade' exited with status 1
-> ./install.sh:main:33
--> install/set-up-and-migrate-database.sh:source:12

@alexphili

alexphili commented Mar 24, 2023

Hi, thanks for the update. I'm still having issues with the Kafka topic "ingest-replay-events", the replay-consumer container (it keeps restarting), and the offset out of range error. How can I clear the content? Thanks

Log of the snuba-replays-consumer container:

2023-03-24 08:32:48,087 Initializing Snuba...
2023-03-24 08:32:52,917 Snuba initialization took 4.831170398974791s
2023-03-24 08:32:53,589 Initializing Snuba...
2023-03-24 08:32:56,607 Snuba initialization took 3.0193493450060487s
2023-03-24 08:32:56,616 Consumer Starting
2023-03-24 08:32:56,617 librdkafka log level: 6
2023-03-24 08:32:59,409 New partitions assigned: {Partition(topic=Topic(name='ingest-replay-events'), index=0): 1}
2023-03-24 08:32:59,415 Caught exception, shutting down...
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 181, in run
    self._run_once()
  File "/usr/local/lib/python3.8/site-packages/arroyo/processing/processor.py", line 212, in _run_once
    self.__message = self.__consumer.poll(timeout=1.0)
  File "/usr/local/lib/python3.8/site-packages/arroyo/backends/kafka/consumer.py", line 407, in poll
    raise OffsetOutOfRange(str(error))
arroyo.errors.OffsetOutOfRange: KafkaError{code=_AUTO_OFFSET_RESET,val=-140,str="fetch failed due to requested offset not available on the broker: Broker: Offset out of range (broker 1001)"}

EDIT: found a fix thanks to #478 (comment)

  1. Stop all containers
    docker stop $(docker ps -a -q)
  2. Launch only zookeeper and kafka
    docker start sentry-self-hosted_zookeeper_1
    docker start sentry-self-hosted_kafka_1
  3. Log into kafka container
    docker exec -it sentry-self-hosted_kafka_1 bash
  4. Set offset to latest and execute
    kafka-consumer-groups --bootstrap-server 127.0.0.1:9092 --group snuba-consumers --topic ingest-replay-events --reset-offsets --to-latest --execute
  5. Stop all containers
    docker stop $(docker ps -a -q)
  6. Start all containers
    docker start $(docker ps -a -q)

After that, replays work without problems.

@hubertdeng123
Member

hi im experiencing so much issues using that version of click house, how I can restore this or make this working again ?

Unfortunately, if you've upgraded to Clickhouse 21.8, we don't have a recommendation on how to restore your previous data unless you have a backup of your docker volumes. You may need to consider a clean install otherwise.

@hubertdeng123
Member

Great to hear that it's working for you, @alexphili.

@joacub

joacub commented Mar 24, 2023

hi im experiencing so much issues using that version of click house, how I can restore this or make this working again ?

Unfortunately, if you've upgraded to Clickhouse 21.8, we don't have a recommendation on how to restore your previous data unless you have a backup of your docker volumes. You may need to consider a clean install otherwise

Not even a fresh install after removing all volumes worked. I executed the script to test integrations, and that changed something I was never able to recover from by reinstalling. I ended up rebuilding the whole server with a fresh installation, and now everything works perfectly.

@hubertdeng123
Member

Going to close this as it seems to be resolved now.

@sashamorozov

Hi,

Please confirm that with this config

replaysSessionSampleRate: 0,
replaysOnErrorSampleRate: 1.0

replays should still be sent on errors, because as soon as I set replaysSessionSampleRate = 0 (it was more than 0 for testing), replays stopped coming at all; errors continue to come in as usual, but they don't have replays.

I only want sessions with errors; I'm not interested in normal replays.

Thanks :)
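For reference, the error-only setup being described looks like this (a sketch; the DSN is a placeholder, and the exact buffer-until-error behavior depends on the SDK version):

```javascript
Sentry.init({
  dsn: "https://<key>@<host>/<project-id>", // placeholder
  integrations: [new Sentry.Replay()],
  replaysSessionSampleRate: 0,   // no ambient session replays
  replaysOnErrorSampleRate: 1.0, // capture a replay only when an error occurs
});
```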

@JoshFerge
Member

JoshFerge commented Mar 27, 2023

@sashamorozov The issue you are describing is possibly an SDK problem and not related to this issue. Please report it in https://github.com/getsentry/sentry-javascript if you are still having issues after upgrading your SDK to the latest version (7.45.0). Thanks!
