
long export time when the db is big #100

Closed
aniel300 opened this issue Aug 24, 2024 · 23 comments
Labels
low priority This will be worked on but with low priority

Comments

@aniel300

aniel300 commented Aug 24, 2024

This is in reference to #72 (comment). I was trying to capture an interesting log line that day but couldn't, and I think I just bumped into it by mistake. The line is: `2024/08/23 22:57:17 [DEBUG] Cache miss. Retrieving streams from Redis...` Not sure if this will be helpful or not.

@aniel300 added the bug (Something isn't working) label on Aug 24, 2024
@aniel300
Author

Here are some more logs. It looks like it is stuck; however, I just confirmed the two groups are present in the db/m3u, so we are good there. Now the only problem is when the db is big.
[screenshot: debug log output]

@aniel300
Author

Please let me know if I'm leaking info in this screenshot. Also, if you can add a way to prevent info leaks when using debug mode, that would be awesome, because manually checking and removing the URLs is a pain.

@sonroyaalmerol
Owner

Can you test and see if #124 results in faster M3U returns? I've reworked the caching behind the scenes. It should still have similar behavior where the cache needs to be built completely on first request. Also, CACHE_ON_SYNC will require BASE_URL to be set as well.

BASE_URL sets the base URL for the stream URLs in the M3U file to be generated (e.g. http://192.168.1.10:8080). This value is usually gathered from the HTTP requests made by the client. However, CACHE_ON_SYNC does its work in the background without any HTTP requests, so it needs to be given the base URL manually.
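
For reference, here is a minimal docker-compose sketch of how these variables fit together. The service name, port mapping, and the omission of the Redis container are assumptions for illustration; only the image tag and the BASE_URL, CACHE_ON_SYNC, and SAFE_LOGS variables come from this thread.

```yaml
# Minimal sketch, not a full deployment (Redis wiring omitted).
services:
  m3u-proxy:                  # assumed service name
    image: sonroyaalmerol/m3u-stream-merger-proxy:dev
    ports:
      - "8080:8080"           # assumed port; match whatever clients use
    environment:
      # The URL clients use to reach the proxy itself, not the source streams.
      - BASE_URL=http://192.168.1.10:8080
      # Build the cache in the background on sync; needs BASE_URL above.
      - CACHE_ON_SYNC=true
      # Per the discussion later in this thread, keeps stream URLs out of debug logs.
      - SAFE_LOGS=true
```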

@aniel300
Author

aniel300 commented Aug 25, 2024

The base URL is where the proxy server lives?

@sonroyaalmerol
Owner

Yep! Not the base URL of the source streams.

@aniel300
Author

OK, let me test and report back.

@aniel300
Author

Every time you mention a specific PR, does that mean it is also available in the dev tag?

@sonroyaalmerol
Owner

Only if the PR has been merged. There are specific PRs that I try not to merge immediately because they change a lot of components.

@aniel300
Author

I have tried all possible combinations, such as http://172.18.0.1:8086 and http://172.18.0.1:8080, and I'm still getting `2024/08/25 18:55:43 [DEBUG] Cache miss. Retrieving streams from Redis...` in the logs. How do I know the new implementation is taking effect? Also, does SAFE_LOGS go as SAFE_LOGS=true?

@sonroyaalmerol
Owner

Well, the fact that you're seeing that log means that you're not using the image of the PR I mentioned. PRs that I mention will have a comment from a bot containing the image URL for that specific PR.

For #124, it should be this comment. Use that image URL instead of the usual sonroyaalmerol/m3u-stream-merger-proxy:dev to test a specific PR.

SAFE_LOGS=true is correct. However, the base URLs you provided don't look like the IP addresses I would expect in a local setup.

To make things simpler for you, the base URL can be derived from the URL you use to access the generated M3U.

For example, if you access the M3U with this URL: http://192.168.2.5:8080/playlist.m3u, then the base URL would be http://192.168.2.5:8080. You use that as the value for BASE_URL when trying out the PR image.

@aniel300
Author

Rest assured, I am trying the correct image, and those IPs are from the Docker network on my Debian 12 Linux machine. I will share screenshots of everything once I get back home. Thank you.

@sonroyaalmerol
Owner

Sure. Do double-check, as it is actually impossible for that PR image to return that log line; it doesn't exist in the code of that PR.

@aniel300
Author

aniel300 commented Aug 26, 2024

Current image being used and the Docker IP/network:
[screenshots: running image tag and Docker network configuration]

@sonroyaalmerol
Owner

I guess the workflow isn't doing what it's supposed to and is no longer building the right image in PRs for some reason. 🤦 I'll merge it to dev. You can test it from there instead.

@aniel300
Author

OK, thanks.

@aniel300
Author

Unfortunately, the issue has not been fixed. Here are some screenshots of what it has been doing. The db is around 300 MB, by the way. I left it running all night because I had to go to sleep.
[browser screenshots of the stalled M3U request]

@sonroyaalmerol
Owner

Can you give me more context: what makes you say it doesn't work? What is the output when you request the M3U URL?

@aniel300
Author

It keeps loading in the browser, or if I try to download the M3U via IDM it never downloads. There is a small CPU spike on the db container and the proxy container (I believe), but that is about it. For example, the small channel sample I'm using for debugging (~38 channels) works as expected: sync is quick and so is the export. But when you give it multiple URLs/providers that come with VOD and so on, which all together are around 300 MB in size, the export doesn't seem to work. I don't mind the sync taking time the first time, but I would like the exports after that to be quick.

@sonroyaalmerol
Owner

I did more work on this issue. I've merged the changes to the dev build if you want to try it out again.

@aniel300

This comment was marked as outdated.

@aniel300
Author

aniel300 commented Aug 27, 2024

Could the slowness be caused by the sorting? It's still doing a lot of this, and there is no progress bar, so I don't know how much time is left before it finishes.
[screenshot of the repeated log output]

@sonroyaalmerol
Owner

Please be patient and wait for everything to be processed into the cache. We're talking about ~300 MB worth of strings being processed. How I wish it were as easy as simply setting the "amount of CPU" to be used; that is just not how software works. Most of the processes required for the proxy to work are single-threaded and cannot be parallelized.

Also, do disable debugging mode if you want the maximum performance possible. Logging affects performance more than you might expect, especially in cases like these. Logging is a single-threaded process, which pretty much forces even the parallelized processes to wait for the log line to be printed to the terminal before the next job is executed.

The sorting is index-based and does not add any time complexity at all. Once everything is processed, the M3U will be stored in plain text, both in memory and as a file. At that point, the only bottleneck would be your RAM speed and disk I/O.

The time complexity will always be directly proportional to the amount of ingress data.
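
As a generic illustration of the logging point above (this is a sketch, not code from this repository): Go's standard-library logger serializes every write with an internal mutex, so worker goroutines that log per item end up queueing on the logger and on terminal I/O even though their actual work runs in parallel. The DEBUG flag and the worker layout below are hypothetical.

```go
package main

import (
	"log"
	"os"
	"sync"
)

func main() {
	logger := log.New(os.Stderr, "[DEBUG] ", log.LstdFlags)
	debug := os.Getenv("DEBUG") == "true" // hypothetical flag, for this sketch only

	items := make([]int, 100_000)
	var wg sync.WaitGroup

	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			for i := worker; i < len(items); i += 4 {
				items[i] = i * i // stand-in for per-stream processing
				if debug {
					// Each call contends on the logger's mutex and on
					// terminal I/O, stalling all four workers.
					logger.Printf("worker %d processed item %d", worker, i)
				}
			}
		}(w)
	}
	wg.Wait()
}
```

With debug logging off, each goroutine only touches its own slice indices; with it on, the shared logger becomes the bottleneck, which is why disabling debug mode speeds things up.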

@sonroyaalmerol added the low priority (This will be worked on but with low priority) label and removed the bug (Something isn't working) and planned (This feature is planned) labels on Aug 27, 2024
@sonroyaalmerol
Owner

I've just merged more optimizations to dev. This will probably be something that improves over time as a side effect of fixing other issues. I won't be focusing on this in the near future. Converting this to a discussion instead.

Repository owner locked and limited conversation to collaborators Aug 27, 2024
@sonroyaalmerol sonroyaalmerol converted this issue into discussion #138 Aug 27, 2024

