feat: classify emails by importance based on subjects #10277
base: main
ebbed72 to 21d45eb
First impression: looks good 😎
lib/Service/Classification/FeatureExtraction/SubjectExtractor.php
Outdated
I'll observe my account for a while. So far the stats are (suspiciously) excellent:
Signed-off-by: Christoph Wurst <[email protected]>
This reverts commit fb2475f.
a7ea9c0 to f4fc5bc
I will keep an eye on the classification ;) However, I haven't cleaned my inbox in a while and fear the countless undeleted "update xyz announcements" or "calendar notifications" might influence the model 🙈
mail/lib/Service/Classification/ImportanceClassifier.php Lines 157 to 159 in e2215d9
} catch (DoesNotExistException $e) {
public function loadLatest(Account $account): ?ClassifierPipeline {
    $cached = $this->getCached((string)$account->getId());
    if ($cached == null) {
Suggested change:
- if ($cached == null) {
+ if ($cached === null) {
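To illustrate why the strict comparison is the safer habit, here is a minimal sketch. The helper names `looselyNull` and `strictlyNull` are hypothetical and not part of the Mail app. For a value that is only ever an object or `null`, as the `?ClassifierPipeline` return type suggests, both checks agree; the strict form simply rules out surprises with other falsy values.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers for illustration only; they are not part of the Mail app.

/** Loose check: null is coerced before comparing, so several falsy values match. */
function looselyNull(mixed $value): bool {
    return $value == null;
}

/** Strict check: true only for an actual null. */
function strictlyNull(mixed $value): bool {
    return $value === null;
}

// Falsy values that are not null but still pass the loose check.
$samples = [
    'empty string' => '',
    'zero' => 0,
    'empty array' => [],
    'false' => false,
];

foreach ($samples as $label => $value) {
    printf(
        "%s: loose=%s strict=%s\n",
        $label,
        var_export(looselyNull($value), true),
        var_export(strictlyNull($value), true)
    );
}
```

An object compared against `null` is never loosely equal to it, so the suggestion does not change behaviour for a pipeline instance; it mainly documents intent and guards against future changes to the return type.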
    'exception' => $e,
]);
}

if ($estimator === null) {
if ($pipeline === null) {
Previously, this condition was only met for new accounts where no training had happened yet. Now, servers without a memory cache will always fall back to the rule-based classifier. I don't think this is desirable, because the results will be bad: the rule-based classifier is only meant to cold-start the classification.
How about we skip classification altogether if there is no distributed cache available?
Or, instead of skipping, we generate a pipeline on the fly? It's slow but gives good results
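The options discussed in this thread could be sketched roughly as follows. This is only an illustration of the decision, not the code in this PR: the function `chooseClassificationStrategy`, its parameters, and the returned strategy names are all hypothetical, and how the app would actually detect existing training data is left open.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical decision helper; none of these names exist in the Mail app.
 *
 * @param bool $hasCachedPipeline true if a trained pipeline was found in the cache
 * @param bool $hasTrainingData   true if the account has been trained before (assumed to be detectable)
 * @param bool $trainOnCacheMiss  true to rebuild a pipeline on a cache miss, false to skip classification
 */
function chooseClassificationStrategy(
    bool $hasCachedPipeline,
    bool $hasTrainingData,
    bool $trainOnCacheMiss
): string {
    if ($hasCachedPipeline) {
        // Normal case: reuse the trained pipeline.
        return 'cached-pipeline';
    }
    if (!$hasTrainingData) {
        // Cold start: the only case the rule-based classifier is meant for.
        return 'rule-based';
    }
    // Cache miss on a trained account (for example, no distributed cache):
    // either pay the cost of training now or do not classify at all.
    return $trainOnCacheMiss ? 'train-on-the-fly' : 'skip';
}

// A trained account on a server without a cache, with on-the-fly training disabled.
echo chooseClassificationStrategy(false, true, false), "\n";
```

Keeping the rule-based branch limited to accounts without training data preserves its cold-start role, while the last branch makes the trade-off between slow-but-good results and no classification an explicit choice.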
Partly addresses #3968
Closes #8257
Part 5: The great rebasing ...
Improvements
oc_mail_classifiers table
How to test?
occ mail:account:train -vvv <account-id>
(extract the account id with occ mail:account:export <user-id>)