Avoid making several identical HTTPS calls when a single would be enough #28

Open
Benjamin-Loison opened this issue Sep 15, 2022 · 4 comments
Labels
discussion enhancement New feature or request medium priority A high-priority issue noticeable by the user but he can still work around it. medium A task that should take less than a day to complete.

Comments

@Benjamin-Loison
Owner

Benjamin-Loison commented Sep 15, 2022

The YouTube operational API Videos: list endpoint is a good example of code that needs this optimization.

We should also make clear which combinations of parameters are possible, if not all of them are. We should also think about how to handle the case where the YouTube operational API processes, for a single request, data retrieved from multiple YouTube Data UI endpoints: this may involve multiple page tokens, whereas we currently return only one page token per request to the YouTube operational API.
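One possible way to keep returning a single page token while several upstream endpoints each have their own is to pack them into one opaque token. The following is only a minimal sketch of that idea, not something the YouTube operational API does: the function names `encode_page_token` and `decode_page_token`, the endpoint names and the token values are all hypothetical.

```python
import base64
import json


def encode_page_token(tokens):
    """Pack a dict of per-endpoint page tokens into one URL-safe string."""
    raw = json.dumps(tokens, sort_keys=True, separators=(',', ':')).encode('utf-8')
    return base64.urlsafe_b64encode(raw).decode('ascii')


def decode_page_token(token):
    """Recover the dict of per-endpoint page tokens from a combined token."""
    raw = base64.urlsafe_b64decode(token.encode('ascii'))
    return json.loads(raw.decode('utf-8'))


# Hypothetical endpoint names and token values, for illustration only.
tokens = {'community': 'tokenA', 'playlists': 'tokenB'}
combined = encode_page_token(tokens)
assert decode_page_token(combined) == tokens
```

The client would send `combined` back unchanged, and the server would split it again to resume each upstream endpoint where it stopped.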

A cache could easily be used, for instance.

Related to #69.

@Benjamin-Loison Benjamin-Loison added the enhancement New feature or request label Sep 15, 2022
@Benjamin-Loison Benjamin-Loison added discussion low priority Nice to have feature. medium A task that should take less than a day to complete. labels Oct 16, 2022
@Benjamin-Loison
Owner Author

Implementing a cache scoped to each YouTube operational API request seems to solve this issue and is, I would say, the most appropriate solution.
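To illustrate what a cache scoped to a single request could look like, here is a minimal Python sketch. It is not the project's code: `RequestCache` and `fake_fetch` are hypothetical names, and the URL is a placeholder. The point is only that identical fetches within one request hit the network once.

```python
class RequestCache:
    """Memoizes fetches for the lifetime of one incoming API request."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable performing the actual HTTPS call
        self._responses = {}

    def get(self, url):
        # Only call the fetcher the first time a given URL is requested.
        if url not in self._responses:
            self._responses[url] = self._fetch(url)
        return self._responses[url]


calls = []


def fake_fetch(url):
    # Stand-in for a real HTTPS call; it only records that it was called.
    calls.append(url)
    return f'response for {url}'


cache = RequestCache(fake_fetch)
first = cache.get('https://www.youtube.com/watch?v=VIDEO_ID')
second = cache.get('https://www.youtube.com/watch?v=VIDEO_ID')
assert first == second
assert len(calls) == 1  # the second call was served from the cache
```

Because the cache object is created per incoming request and discarded afterwards, no invalidation logic is needed.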

@Benjamin-Loison
Owner Author

Related to #112.

@Benjamin-Loison Benjamin-Loison added medium priority A high-priority issue noticeable by the user but he can still work around it. and removed low priority Nice to have feature. labels Apr 6, 2024
@Benjamin-Loison
Owner Author

It seems that we at least have to pay attention to whether the request is performed in English or in the local language, I think.
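A cache keyed on the URL alone would mix up such requests. Below is a minimal sketch of a key that also covers the request headers, assuming the language is selected through an `Accept-Language` header, which is an assumption here. `make_cache_key` and the URL are hypothetical.

```python
import json


def make_cache_key(url, headers):
    """Build a cache key from the URL and the headers that may change the response."""
    # Header names are case-insensitive, so normalize them before sorting.
    normalized = {name.lower(): value for name, value in headers.items()}
    return json.dumps([url, sorted(normalized.items())])


URL = 'https://www.youtube.com/channel/CHANNEL_ID/about'  # placeholder URL
english = make_cache_key(URL, {'Accept-Language': 'en'})
french = make_cache_key(URL, {'Accept-Language': 'fr'})

# Same URL but different languages must not share a cache entry.
assert english != french
# The same request written with different header casing shares one entry.
assert english == make_cache_key(URL, {'accept-language': 'en'})
```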

@Benjamin-Loison
Owner Author

Benjamin-Loison commented Apr 6, 2024

I originally thought of this issue as a way to solve this Discord user's one, but the URLs involved are different, hence caching would not solve it in theory. Actually, I finally solved that issue thanks to a5ecbdf.

import matplotlib.pyplot as plt
import requests
from tqdm import tqdm

CHANNEL_ID = 'UCWeg2Pkate69NFdBeuRFTAw'

url = 'http://localhost/YouTube-operational-API/channels'
params = {
    'part': ','.join(['community', 'about']),
    'id': CHANNEL_ID,
}
    
def getRequestTime():
    # Perform one request and return its duration in seconds.
    response = requests.get(url, params=params)
    data = response.json()
    item = data['items'][0]
    # Sanity check that the response describes the expected channel.
    looksCorrect = item['community'][0]['channelId'] == CHANNEL_ID and item['about']['details']['location'] == 'France'
    assert looksCorrect, 'The response does not look correct!'
    return response.elapsed.total_seconds()

def getRequestsTime():
    return [getRequestTime() for _ in tqdm(range(25))]

beforeCaching = getRequestsTime()
# Manually add caching feature to the concerned YouTube operational API instance.
afterCaching = getRequestsTime()

all_data = [beforeCaching, afterCaching]
labels = ['Before caching', 'After caching']

# rectangular box plot
bplot1 = plt.boxplot(all_data,
                     patch_artist=True,  # fill with color
                     labels=labels)  # will be used to label x-ticks
plt.title('Request time before and after caching')

# fill with colors
colors = ['pink', 'lightblue']
for patch, color in zip(bplot1['boxes'], colors):
    patch.set_facecolor(color)

# adding horizontal grid lines
plt.grid(True)
plt.xlabel('Before and after caching')
plt.ylabel('Request time (s)')

plt.show()

Modified from https://matplotlib.org/3.8.0/gallery/statistics/boxplot_color.html

Typical results without actually implementing caching, but computing beforeCaching and afterCaching one after the other, just to show what caching possibly done on YouTube's servers results in.
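To put a number on such a comparison rather than relying only on the box plot, the two timing lists can be summarized by their medians. This is a minimal sketch: `summarize` is a hypothetical helper and the timings below are made up, not measurements from this issue.

```python
import statistics


def summarize(before, after):
    """Return the median of each sample and the ratio between the two medians."""
    median_before = statistics.median(before)
    median_after = statistics.median(after)
    return {
        'median_before': median_before,
        'median_after': median_after,
        'speedup': median_before / median_after,
    }


# Made-up timings in seconds, for illustration only.
summary = summarize([0.8, 1.0, 1.2], [0.4, 0.5, 0.6])
assert summary['speedup'] == 2.0
```

With the script above, `summarize(beforeCaching, afterCaching)` would give the same kind of summary for the real measurements. A ratio close to 1 when no caching was added would suggest that upstream caching is not what drives the difference.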

diff --git a/common.php b/common.php
index d74359b..f0bee99 100644
--- a/common.php
+++ b/common.php
@@ -61,6 +61,21 @@
         return $headers;
     }
 
+    $cache = [];
+
+    function fileGetContentsWithCaching($url, $context)
+    {
+        global $cache;
+        $cacheKey = serialize([$url, stream_context_get_options($context)]);
+        if(array_key_exists($cacheKey, $cache))
+        {
+            return $cache[$cacheKey];
+        }
+        $result = file_get_contents($url, false, $context);
+        $cache[$cacheKey] = [$result, $http_response_header];
+        return $cache[$cacheKey];
+    }
+
     function fileGetContentsAndHeadersFromOpts($url, $opts)
     {
         if(HTTPS_PROXY_ADDRESS !== '')
@@ -79,7 +94,7 @@
             }
         }
         $context = getContextFromOpts($opts);
-        $result = file_get_contents($url, false, $context);
+        [$result, $http_response_header] = fileGetContentsWithCaching($url, $context);
         return [$result, $http_response_header];
     }
