There are two types of plugins for now.
A first type modifies the results.
A second type modifies the UX/UI using javascript / css.
Right now, all plugins are synchronized (see SearchWithPlugins): on_result can take time, even when it does not modify the results at all (the current implementation of the HTTPS everywhere plugin, for example).
We can imagine a third type: slow, but it doesn't modify the results or the UI/UX. The plugin is just notified of the results.

Improve the response time

The on_result functions of the plugins are called only after all results have been collected by the engines (see the call in search.py). One way to improve the response time is to call on_result as soon as one engine returns some results. To do that, ResultContainer must call on_result: no global wait at the end of the process.
class SearchWithPlugins(Search):
    ...
    def processResult(self, result):
        plugins.call(self.ordered_plugin_list, 'on_result', self.request, self, result)
    ...
class ResultContainer(object):
    def __init__(self, process_result):
        ...
        self.process_result = process_result

    def extend(self, engine_name, results):
        ...
        # if there is no duplicate found, append result
        else:
            result['positions'] = [position]
            self.process_result(result)
            with RLock():
                self._merged_results.append(result)
''' should be thread safe? it is not: the same URL may be processed multiple times (should not be a problem) '''
def processUrl(self, url):
    if url not in self.processedUrl:
        # self.ordered_plugin_list, self.request and self.search must be initialized
        self.processedUrl[url] = plugins.call(self.ordered_plugin_list, 'on_result',
                                              self.request, self.search, url)
    return self.processedUrl[url]
I'm not very happy with this solution because ResultContainer gets references to everything.
One solution: define processUrl(url) inside search.py, and let ResultContainer hold a reference to that function only (this is what the summary below does with processResult).
Implement a third type of plugin: asynchronous notification
Use case: plugins which won't modify the results but which are slow. Examples: store the results, compute some statistics on the results, etc.
Note: most probably these use cases break the stateless nature of searx.
As @asciimoo has suggested in #1224, it can be done using the current plugin architecture.
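A rough sketch of that idea, assuming the slow work is attached to the existing post_search hook and runs in its own thread; the _process_results helper and its body are placeholders, not part of the suggestion:

from threading import Thread

def _process_results(request, search):
    # slow work: store the results, compute some statistics, etc. --
    # done in its own thread so the user does not wait for it
    for result in search.result_container.get_ordered_results():
        pass  # store / analyze the result here

def post_search(request, search):
    # spawn one worker thread per request; the search response is not blocked
    Thread(target=_process_results, args=(request, search)).start()
    return True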
The drawbacks:
- it creates one thread per plugin and per request (which can increase the global response time)
- the plugin is not sure to have the fully processed URL (some other plugins may modify the URL after that call)
If this is a problem, a solution: we can imagine one Queue and one thread processing the results. The thread gets results from the Queue and calls the on_notify_result function of the plugins (on_notify_result can't modify the results, but can be slow).
The drawback of that solution: if one plugin hangs indefinitely, the asynchronous processing stops.
The solution to that drawback: one Queue/thread per plugin (a sketch is given in the plugins.py part below).
To sum up
search.py:
class SearchWithPlugins(Search):
    def __init__(self, search_query):
        # init vars
        super(SearchWithPlugins, self).__init__()
        self.search_query = search_query
        self.result_container = ResultContainer(self.processResult)

    def processResult(self, result):
        plugins.call(self.ordered_plugin_list, 'on_result', self.request, self, result)

    def search(self):
        if plugins.call(self.ordered_plugin_list, 'pre_search', self.request, self):
            super(SearchWithPlugins, self).search()
            plugins.call(self.ordered_plugin_list, 'post_search', self.request, self)
            # ResultContainer calls processResult, which calls on_result for each plugin
            results = self.result_container.get_ordered_results()
            for result in results:
                plugins.async_on_result(self.ordered_plugin_list, self.request, self, result)
        return self.result_container
results.py:
class ResultContainer(object):
    def __init__(self, process_result):
        ...
        self.process_result = process_result

    def extend(self, engine_name, results):
        ...
        # if there is no duplicate found, append result
        else:
            result['positions'] = [position]
            self.process_result(result)
            with RLock():
                self._merged_results.append(result)
plugins.py:
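A minimal sketch of what async_on_result could look like, following the one Queue/worker thread per plugin idea above; only async_on_result and on_notify_result are names used in this issue, the other helpers are hypothetical:

from threading import Thread
from queue import Queue  # "Queue" module on Python 2

# one Queue and one worker thread per plugin: a plugin that hangs only
# blocks its own queue, the others keep being notified
_plugin_queues = {}


def _notify_worker(plugin, pending):
    while True:
        request, search, result = pending.get()
        try:
            # on_notify_result may be slow, but must not modify the result
            plugin.on_notify_result(request, search, result)
        except Exception:
            pass
        finally:
            pending.task_done()


def _queue_for(plugin):
    if plugin not in _plugin_queues:
        pending = Queue()
        worker = Thread(target=_notify_worker, args=(plugin, pending))
        worker.daemon = True
        worker.start()
        _plugin_queues[plugin] = pending
    return _plugin_queues[plugin]


def async_on_result(ordered_plugin_list, request, search, result):
    for plugin in ordered_plugin_list:
        if hasattr(plugin, 'on_notify_result'):
            _queue_for(plugin).put((request, search, result))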