Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter out private posts before limiting the number of results #1334

Conversation

tpokorra
Copy link

@tpokorra tpokorra commented Jan 2, 2018

fixes #1333

@dsevillamartin
Copy link
Member

What if the user has permission to see the private posts?
Maybe we want to do something like in https://github.com/flarum/core/blob/ea353a2f2c8557e05a2f8f5c08f53e4a4776e87f/src/Post/PostPolicy.php#L73-L82

@tobyzerner
Copy link
Contributor

@tpokorra I see the issue, but @datitisev is right, this solution is flawed as now private posts will never be returned even if the user has permission to see them.

It's hard, because the only way to properly filter post visibility is to get the discussions for all of the matched posts, which will not be performant before they are limited. 🤔

@tpokorra
Copy link
Author

tpokorra commented Jan 3, 2018

I agree with you. I was not sure if support for private messages was implemented yet at all because I could not find references in the code where is_private is set or used other than in PostPolicy.php.

But my solution is not clean, I agree.

Another thought: perhaps in the call /api/posts?filter%5Buser%5D=12345&filter%5Btype%5D=comment&page%5Blimit%5D=20&sort=-time we could add another parameter is_private=false, which would be used in getPostIds in ListPostsController. For users with privileges to see private posts we would need to leave out is_private in the api call.

Should I try to prepare a pull request for that?

@dsevillamartin
Copy link
Member

I like your idea. Maybe use private=1 to say that you do want private posts to appear in the results, as an opt-in instead of an opt-out.
is_private is not referenced anywhere else in the code apart from the migrations and that call because it’s purpose is to be used by extensions. That way, once an extension using it is disabled, the posts aren’t suddenly made publicly visible.

@franzliedke
Copy link
Contributor

Do I misunderstand something or would that mean anybody could see private posts by sending private=1 to the API? We still need to take care of this on the server side.

Honestly, the beginning of all this trouble seems to be implementing PMs as a subtype of discussions - I am becoming convinced that this causes too much pain / hacks - the current being is_private.

@dsevillamartin
Copy link
Member

dsevillamartin commented Jan 3, 2018

@franzliedke Sorry if that was confusing, it was an idea to not always show private posts in the results (if any available). The user/application could add private=1 as an argument to get the private posts as well, of course only the ones they have permission to see.

@luceos
Copy link
Member

luceos commented Jan 3, 2018

@tpokorra

For users with privileges to see private posts we would need to leave out is_private in the api call.

Byobu is an implementation of is_private. Only the extension knows which users and groups are recipients (authorized) of a private discussion. The frontend doesn't necessarily know who is authorized to see specifically what post, especially not on the user profile page.

@tobyzerner
Copy link
Contributor

tobyzerner commented Jan 4, 2018

Can I just try and clarify what is_private is and how it works so we're all on the same page.

is_private is a flag which extensions can set on discussions/posts which signals to Flarum that they should be hidden from ALL users unless an extension adds specific logic to show it.

So, for example, byobu (private discussions). When you create a private discussion, byobu sets is_private = 1 on the discussion, and then listens for the ScopePrivateDiscussionVisibility event and adds logic to allow the discussion to be seen if the user is in the list of recipients.

If you disable byobu, no one will be able to see any private discussions, because they've been flagged as is_private = 1.

Ideally we probably should add this same logic to the Tags extension, so that if you restrict a tag's visibility and then accidentally disable the Tags extension, all of the discussions in private tags won't be revealed to the public.

That's discussions.is_private. Now, posts.is_private has a different use-case. It's to mark an individual post as private in the context of any (private or non-private) discussion. For example, say I wanted to make a "Whispers" extension where you can post a reply in a public discussion, but the reply is only visible to a certain person/group.

@luceos Note that byobu should not actually be marking posts as private, that's not the way it's intended to be used. In byobu, the posts themselves are not private - rather, the posts are public in the context of a private discussion.

When Flarum is querying posts it makes sure the user is allowed to view all the discussions that the posts are in, so you don't have to worry about controlling the visibility of each individual post. It currently does this by:

  1. Getting a page of post IDs (query 1)
  2. Getting the discussions for each of these posts, filtering out the discussions that the user can't see (query 2)
  3. Getting the remaining posts, also filtering out posts that the user can't see in the context of their discussions (query 3)

It's not feasible to do that for all of the posts in the resultset, flag or no flag.

That ScopePostVisibility event is the real killer. It means we can't construct a join or subquery to put query 2 into query 1 (which would otherwise solve the issue). But it has a legitimate use-case in the Approval extension, to hide unapproved posts from users who don't have permission to moderate a discussion. For example:

  • Unapproved Post A is in Discussion A which is in Tag A which gives User A permission to approve posts, hence they can see it
  • Unapproved Post B is in Discussion B which is in Tag B which does not give User A permission to approve posts, hence they can't see it

Perhaps we need to try and move that logic from PHP into SQL though, so we can do complex queries like this...

@ffl096
Copy link

ffl096 commented Jan 4, 2018

It's hard, because the only way to properly filter post visibility is to get the discussions for all of the matched posts, which will not be performant before they are limited. 🤔

One can be a bit smarter about it: Lets say we want to display 20 posts. You really only need to load all posts until you have 20 public posts (of public discussions). This number is reasonable low when the ratio of private posts to all posts is small.
Then check for all loaded posts if they are visible to the user and limit the list to the first 20 elements that pass the check.

This ensures exactly 20 entries with just minor changes to the current architecture and keeps the processing involved low (statistically, of course). As a safeguard for many private posts that push the number of loaded posts high, its probably good to also add a hard limit of e.g. 100. If we hit this limit, we have to accept displaying too few posts, but this should rarely be the case.

@tobyzerner
Copy link
Contributor

tobyzerner commented Jan 7, 2018

@ShatteredMotion

This number is reasonable low when the ratio of private posts to all posts is small.

Can't really guarantee this. What if 99% of a forum is private (via restricted tags) and there's only a small public area?

Doing a while (count($results) < 20) { /* get more results */ } (or equivalent) sets alarm bells ringing for me. Not to mention the problems it would cause pagination.

No, we really do need to be able to get the posts, with all relevant permission logic, in a single query. I'm currently working on improving search performance (#1042) and I've run into basically the same issue.

I'm going to look into how we can change our API so that the ScopePostVisibility event no longer receives a $discussion model.

@ffl096
Copy link

ffl096 commented Jan 7, 2018

Can't really guarantee this. What if 99% of a forum is private (via restricted tags) and there's only a small public area?

You can filter for that case directly: Just get all tags that are accessible to the user and then filter for those discussions.

Doing a while (count($results) < 20) { /* get more results */ } (or equivalent) sets alarm bells ringing for me. Not to mention the problems it would cause pagination.

That's not exactly what I meant. It's still only one query, but instead of a simple LIMIT 20 (or equivalent) you would limit the number of returned entries such that the set has exactly 20 public posts plus all the non-public posts before the 20th public post. This can be achieved with one extra subquery.

But you are right, this approach does not play well with pagination. A simple solution could be to replace the page number with something like lastPostId to indicate the last post that was already fetched. The returned set then contains the next 20 posts after the post with that id.

However, thinking more about it, this probably is harder to implement for search. The assumption that the ratio of private posts is low cannot easily be made here, depending on the search query.

@luceos
Copy link
Member

luceos commented Jan 8, 2018

@ShatteredMotion great thoughts! But let me point out assumptions like these kill the flexibility Flarum tries to guarantee. I'm going to push back to one of the examples I've been using for Byobu (private discussions), which is a role playing forum. Each and every discussion might be private, restricted to a random set of users. As such (depending on the number of posts, but with role playing that might escalate pretty fast) over 99% of all posts might be private.

You can filter for that case directly: Just get all tags that are accessible to the user and then filter for those discussions.

Assuming all forums use the tags extension isn't very wise either while implementing a generic query mechanism. I wouldn't be surprised to see a classical category based hierarchy extension before (or soon after) a stable release.

@tobyzerner
Copy link
Contributor

I've made some serious headway on solving this. Will hopefully push in the next few days.

@tobyzerner
Copy link
Contributor

Closing in favour of #1342

@tobyzerner tobyzerner closed this Jan 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The posts list of a user shows no posts if there are too many posts marked as is_private
6 participants