Added a custom repository method for indexation with "meilisearch:import" command #345

Open
wants to merge 3 commits into
base: main

Conversation

remigarcia
Contributor

Pull Request

Related issue

Fixes #344

What does this PR do?

  • add a configuration setting repository_method in the indices
  • when the command meilisearch:import is invoked, this method is used instead of "findBy"

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

I will be happy to hear any feedback and make changes to this PR if needed.


codecov bot commented Aug 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.17%. Comparing base (e110588) to head (e9863b8).
Report is 1 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #345      +/-   ##
============================================
+ Coverage     87.75%   88.17%   +0.41%     
  Complexity        1        1              
============================================
  Files            20       21       +1     
  Lines           874      905      +31     
============================================
+ Hits            767      798      +31     
  Misses          107      107              


@norkunas
Collaborator

norkunas commented Aug 9, 2024

Thanks, but I'm not sure it's worth introducing an option that we'll later need to deprecate, because the goal is to support indexing from more than just an ORM source, which will allow better extensibility.

@remigarcia
Contributor Author

remigarcia commented Aug 14, 2024

> Thanks, but I'm not sure it's worth introducing an option that we'll later need to deprecate, because the goal is to support indexing from more than just an ORM source, which will allow better extensibility.

@norkunas what about an interface like ObjectRepositoryInterface, implemented by the user, with a method to retrieve the objects? The user could then apply their own logic, independent of any ORM. The default would be a "DoctrineObjectRepository" that calls the entity's repository, preserving the current behaviour, but the user could set an ObjectRepositoryInterface service in an index's configuration to plug in their own logic.
Would that be more future-proof?

If this works for you, I'll be glad to update this PR.

@norkunas
Collaborator

norkunas commented Aug 14, 2024

Basically, it's the same as a data provider.
Don't forget that indexing is done in batches, so the data provider must support that.
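
A minimal sketch of what batch support could look like. Every name here (`DataProvider`, `provide()`, `ArrayDataProvider`, `importInBatches()`) is an assumption made for illustration and not the bundle's actual API; the in-memory provider stands in for an ORM-backed one.

```php
<?php

declare(strict_types=1);

// Hypothetical interface: not part of the bundle's public API.
interface DataProvider
{
    /** @return array<int, object> at most $limit objects, starting at $offset */
    public function provide(int $limit, int $offset): array;
}

// In-memory stand-in for an ORM-backed provider, for illustration only.
final class ArrayDataProvider implements DataProvider
{
    /** @param list<object> $objects */
    public function __construct(private array $objects)
    {
    }

    public function provide(int $limit, int $offset): array
    {
        return array_slice($this->objects, $offset, $limit);
    }
}

/**
 * Pulls pages from the provider until a page comes back shorter than $batchSize.
 *
 * @return list<int> size of each batch that was handed to $index
 */
function importInBatches(DataProvider $provider, int $batchSize, callable $index): array
{
    $sizes = [];
    $page = 0;
    do {
        $batch = $provider->provide($batchSize, $page * $batchSize);
        if ([] !== $batch) {
            $index($batch);
            $sizes[] = count($batch);
        }
        ++$page;
    } while (count($batch) >= $batchSize);

    return $sizes;
}
```

The loop only ever asks the provider for one page at a time, so the provider decides how a page is fetched (query, API call, file read) while the caller keeps control of the batch size.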

@remigarcia
Contributor Author

> Basically, it's the same as a data provider. Don't forget that indexing is done in batches, so the data provider must support that.

@norkunas I named it "DataProvider" and it handles batches as you recommended.
Let me know what you think about it ;)

Collaborator

@norkunas norkunas left a comment


When I started working on something similar, I saw that the current SearchService depends on the ORM. I think it's a candidate for deprecation, with a new interface introduced that has no such dependency.

Also, one of my goals was to introduce an Importer service (or something similarly named) that would be injected into the import command, so that imports could also be run manually outside of the CLI.
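
One way such an Importer service might be shaped is sketched below. `DocumentSource`, `DocumentIndexer`, and `Importer` are hypothetical names introduced only for this sketch; nothing here is taken from the bundle.

```php
<?php

declare(strict_types=1);

// All names below are hypothetical; they illustrate the idea, not the bundle's API.
interface DocumentSource
{
    /** @return list<object> at most $limit objects, starting at $offset */
    public function fetch(int $limit, int $offset): array;
}

interface DocumentIndexer
{
    /** @param list<object> $documents */
    public function index(string $indexName, array $documents): void;
}

final class Importer
{
    public function __construct(
        private DocumentSource $source,
        private DocumentIndexer $indexer,
    ) {
    }

    /** Sends everything the source yields to the indexer and returns the number of objects sent. */
    public function import(string $indexName, int $batchSize = 100): int
    {
        if ($batchSize < 1) {
            throw new InvalidArgumentException('Batch size must be at least 1.');
        }

        $total = 0;
        for ($offset = 0; ; $offset += $batchSize) {
            $batch = $this->source->fetch($batchSize, $offset);
            if ([] === $batch) {
                break;
            }
            $this->indexer->index($indexName, $batch);
            $total += count($batch);
        }

        return $total;
    }
}
```

Because the service has no dependency on console input or output, a CLI command, a controller, or a queue worker could all call `import()` in the same way.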

*/
private function getEntities($prefixedIndexName, $entityClassName, $batchSize, $page): array
{
$dataProvider = $this->searchService->getDataProvider($this->getIndexNameWithoutPrefix($prefixedIndexName));
Collaborator


We could use Symfony's ServiceLocator instead of coupling this to the SearchService.
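
A rough sketch of the decoupling idea. `ProviderLocator` is a hand-written stand-in exposing PSR-11 style `has()`/`get()` methods so the example runs without Symfony; in a real application that role would be played by a locator wired by the container. `ImportStep` and the index names are likewise assumptions.

```php
<?php

declare(strict_types=1);

// Stand-in for a PSR-11 style locator (has()/get()). Hypothetical, for illustration only.
final class ProviderLocator
{
    /** @var array<string, object> providers that have already been created */
    private array $resolved = [];

    /** @param array<string, callable(): object> $factories keyed by index name */
    public function __construct(private array $factories)
    {
    }

    public function has(string $id): bool
    {
        return isset($this->factories[$id]);
    }

    public function get(string $id): object
    {
        if (!$this->has($id)) {
            throw new InvalidArgumentException(sprintf('No data provider registered for index "%s".', $id));
        }

        // Create the provider on first use, then reuse the same instance.
        return $this->resolved[$id] ??= ($this->factories[$id])();
    }
}

// The consumer depends only on the locator, not on SearchService.
final class ImportStep
{
    public function __construct(private ProviderLocator $providers)
    {
    }

    public function providerFor(string $indexName): object
    {
        return $this->providers->get($indexName);
    }
}
```

Keying the locator by index name means the import code never needs to know which class provides the data for a given index.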

Comment on lines +62 to +65
->scalarNode('data_provider')
->info('Method of the entity repository called when the meilisearch:import command is invoked.')
->defaultNull()
->end()
Collaborator


I think we should introduce something like:

persistence:
    driver: orm # later odm, custom (default to orm, to not introduce BC break immediately)
    data_provider: null # if set, use it, otherwise register the default orm provider and add it to service locator

*
* @return array
*/
public function getAll(int $limit = 100, int $offset = 0): array;
Collaborator


Suggested change
- public function getAll(int $limit = 100, int $offset = 0): array;
+ public function provide(int $limit = 100, int $offset = 0): array;

maybe ? :)


namespace Meilisearch\Bundle\DataProvider;

interface DataProvider
Collaborator


Suggested change
- interface DataProvider
+ /**
+  * @template T of object
+  */
+ interface DataProvider

* @param int $limit
* @param int $offset
*
* @return array
Collaborator


Suggested change
- * @return array
+ * @return array<T>
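
Taken together, the review suggestions on this interface (the `provide` name, the `@template` annotation, and the `array<T>` return type) could read roughly as follows. This is a sketch: the namespace is left out so the snippet runs standalone, and `Post` and `PostProvider` are hypothetical classes added to show how an implementation would bind `T`.

```php
<?php

declare(strict_types=1);

/**
 * @template T of object
 */
interface DataProvider
{
    /**
     * @return array<T>
     */
    public function provide(int $limit = 100, int $offset = 0): array;
}

// Hypothetical entity, for illustration only.
final class Post
{
    public function __construct(public int $id)
    {
    }
}

/**
 * @implements DataProvider<Post>
 */
final class PostProvider implements DataProvider
{
    public function provide(int $limit = 100, int $offset = 0): array
    {
        $all = [new Post(1), new Post(2), new Post(3)];

        return array_slice($all, $offset, $limit);
    }
}
```

The annotations have no runtime effect; they only let static analysers infer that `PostProvider::provide()` returns `Post` objects.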

Comment on lines +19 to +22
public function setEntityClassName(string $entityClassName): void
{
$this->entityClassName = $entityClassName;
}
Collaborator


Why make this stateful? I'd register this provider as a separate service for each index and pass the class name via the constructor.
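
A sketch of the constructor-based alternative, under the assumption that one provider instance is registered per index. `EntityDataProvider` is a hypothetical name, and the repository lookup is replaced by a closure so the example runs without Doctrine.

```php
<?php

declare(strict_types=1);

// Hypothetical provider: the entity class is fixed at construction, so there is no setter
// and no state that can change between calls.
final class EntityDataProvider
{
    /**
     * @param string  $entityClassName class this provider instance is bound to
     * @param Closure $finder          stand-in for a repository lookup; receives (class, limit, offset)
     */
    public function __construct(
        private string $entityClassName,
        private Closure $finder,
    ) {
    }

    /** @return array<int, object> */
    public function provide(int $limit = 100, int $offset = 0): array
    {
        return ($this->finder)($this->entityClassName, $limit, $offset);
    }
}
```

With this shape, two indices get two independent provider instances, and nothing can change the entity class of either one after it has been built.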

@norkunas
Collaborator

> > Basically, it's the same as a data provider. Don't forget that indexing is done in batches, so the data provider must support that.
>
> @norkunas I named it "DataProvider" and it handles batches as you recommended. Let me know what you think about it ;)

Sorry, I'm on paternity leave and rushed my reply. What I meant to say is that the best fit would be DocumentProvider, since that is how Meilisearch and most other search systems describe this.

2 participants