Results in random order should be displayed without replacement #1667

Closed
alanfgh opened this issue Sep 11, 2018 · 9 comments
Labels
enhancement Issue that describes a problem that requires a change in the current functionalities of Tatoeba.
Milestone

Comments

Contributor

alanfgh commented Sep 11, 2018

When I do a search for English-Russian sentence pairs where the English is "push" (search: "push" in Russian), the same sentence (sentence 252087) shows up on multiple pages of the results. It can be seen on page 2, page 3, and probably more. It would be better if a result only showed up on one page, since displaying it on multiple pages is inefficient and distracting.

@jiru jiru added the enhancement Issue that describes a problem that requires a change in the current functionalities of Tatoeba. label Sep 11, 2018
Member

jiru commented Sep 11, 2018

Well, when the order is random, it is not kept among different pages. Going to a different page shows the results in a different random order, so the same sentence may show up again. To put it differently, reloading page 2 will always show different results. I understand this is not very intuitive though.
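To illustrate why duplicates appear, here is a toy sketch in Python. It is not Tatoeba's or Sphinx's actual code; `unseeded_page` and its parameters are hypothetical names used only to model a search that reshuffles the whole result set on every page request.

```python
import random


def unseeded_page(result_ids, page, per_page=20):
    """Return one page of results after an independent shuffle.

    Hypothetical model of the current behavior: no seed is shared
    between requests, so every request sees a different order.
    """
    shuffled = list(result_ids)
    random.shuffle(shuffled)  # fresh random order on every request
    start = (page - 1) * per_page
    return shuffled[start:start + per_page]


if __name__ == "__main__":
    ids = list(range(60))  # stand-in for 60 matching sentence ids
    page2 = unseeded_page(ids, 2)
    page3 = unseeded_page(ids, 3)
    # Each page is a slice of a *different* shuffle, so the two pages
    # can share ids; the overlap varies from run to run.
    print(sorted(set(page2) & set(page3)))
```

Because each call draws a new order, nothing prevents a sentence shown on page 2 from landing in the page 3 slice of the next shuffle.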

I can think of a way to solve this problem. We could keep the order among pages of a given randomly sorted result set by using Sphinx’s rand_seed option:

  1. When clicking on the search button, generate a random value and give it to Sphinx as rand_seed.
  2. Add this value to the GET parameters of all the page links.
  3. Run the Sphinx query with this value, if provided as GET parameter.

With this design, when clicking on the search button, you’ll see the results of page 1 randomly sorted. If you click again, or reload the page using F5, it will show different results. From there, if you go to page 2 (or 3, 4 etc.), sentences of page 1 won’t show up, and if you reload page 2, it will show the same sentences again. From there, if you go to page 1 again, you will see the same results as when you last clicked on the search button. Reloading it won’t load different results. The only way to get a new random order is to click on the search button.
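The three steps above can be sketched roughly as follows. This is a minimal Python model under stated assumptions, not the real implementation: Python's `random.Random(seed)` stands in for Sphinx's seeded ordering, and the URL path, the `rand_seed` GET parameter name, and all function names are hypothetical.

```python
import random
from urllib.parse import urlencode


def new_seed():
    """Step 1: pick a fresh seed when the search button is clicked."""
    return random.randrange(2**31)


def page_link(query, page, seed):
    """Step 2: carry the seed in the GET parameters of every page link.

    The path and parameter names are assumptions for illustration.
    """
    params = {"query": query, "page": page, "rand_seed": seed}
    return "/sentences/search?" + urlencode(params)


def seeded_page(result_ids, page, seed, per_page=20):
    """Step 3: reuse the seed so each page is a slice of one fixed order."""
    shuffled = list(result_ids)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    start = (page - 1) * per_page
    return shuffled[start:start + per_page]


if __name__ == "__main__":
    ids = list(range(60))  # stand-in for 60 matching sentence ids
    seed = new_seed()
    pages = [seeded_page(ids, p, seed) for p in (1, 2, 3)]
    # Pages are disjoint slices of a single shuffle, so no id repeats.
    print(len(set(pages[0] + pages[1] + pages[2])))
    print(page_link("push", 2, seed))
```

Since every page is cut from the same seeded order, reloading a page link returns the same sentences, and only a new search (a new seed) changes the order.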

What do you think?

Contributor Author

alanfgh commented Sep 12, 2018

That sounds fine.

Member

jiru commented Sep 24, 2018

Actually it’s not easy to implement because the rand_seed option is only available with SphinxQL whereas we’re using the PHP API.

Member

trang commented Nov 8, 2018

Is there actually any use case where pagination is needed when the order is random? Should we maybe replace the pagination by a "Reload" button?

Contributor Author

alanfgh commented Nov 8, 2018

Yes, pagination is still useful when the order is random, and even when the same string might appear on multiple pages. The reason is that it helps users pace themselves. If I've gotten 60 search results, sorted randomly, for a query, and if I have 20 results per page, I know that if I'm on page 3, I've already looked at about 60 results, even if actually I've only looked at 56 unique results because there were some duplicates. That's a lot different from trying to remember how many times I've pressed the "Reload" button.

Member

trang commented Nov 9, 2018

If I understand correctly, the pagination basically helps you keep track of how many sentences you've looked at. Is there a specific reason for this?

Is it because you set yourself some goal or some limit on how many sentences to look at? For instance you tell yourself "today I want to check 200 sentences", so you know that once you've reached page 10 (assuming 20 sentences per page), you're done.

Or is it because you want to measure your performance? In the sense that you don't have a specific number to reach during your session, but you have a time limit and you want to have an idea how much you've accomplished during this time.

Or perhaps something else?

When you order by random, do you ever feel the need to go back to previous pages? Or do you always move forward and only want to count how many sentences you've looked at?
I suppose if there are 1000 results, you're not likely to look at all of them. What is the highest number of sentences you would look at within one session?

Contributor Author

alanfgh commented Nov 9, 2018

> Is it because you set yourself some goal or some limit on how many sentences to look at?
> Or is it because you want to measure your performance?

Both of these.

I have the number of sentences per page set to 20. It's conceivable that I would look at 10 pages, so that's 200 sentences. One example might be if I want to look for sentences containing a specific word. I would mentally discard a bunch of the sentences quickly, so 10 pages is reasonable.

Member

jiru commented Aug 2, 2020

@alanfgh I solved this issue; you can try out the new random search on https://dev.tatoeba.org/.

Contributor Author

alanfgh commented Aug 2, 2020

@jiru, this works very nicely now. I performed a search for "=push" in English, with translations shown in Russian, limited to sentences with a Russian translation. I verified with several examples that:

  • a sentence found on the first page does not appear on any other page of the result
  • the results on a given page remained stable when I moved from one page to another and back again
  • the results appeared in a different order when I performed a new search

This is exactly the behavior I wanted, so I'm happy to see it. Many thanks, @jiru!

@jiru jiru closed this as completed Aug 3, 2020