Unnecessary string distance calculation when using @Field [DATAMONGO-1991] #2862
Comments
Oliver Drotbohm commented Can you please clarify in how far this affects Spring Data MongoDB? Given the information provided so far, I don't see any connection here. I briefly checked and we only use |
Ludek Novotny commented Yes, it's not caused by |
Oliver Drotbohm commented Can you please provide more information on what code you're executing? It feels unusual that you have code that's triggering that exception repeatedly |
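Ludek Novotny commented This is a simplified example. We use an account entity which has around 70 fields, most of them annotated with

import lombok.Data;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.core.mapping.Field;

@Document(collection = "accounts")
@Data
public class AccountTest {

    @Field("field1")
    private String fieldOne;
}

Each entity is loaded from the database and processed. We have to process millions of unique entities several times, and in the process we generate new ones. Let's say that in total we have to load 100,000,000 entities from the database. And this is the test we used to debug it. The first query doesn't trigger the exception because it uses the bean field name (fieldOne). The second query, which uses the name from the annotation (field1), triggers the string distance calculation.

import java.util.List;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)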
@SpringBootTest
public class FieldTest {

    @Autowired
    private MongoTemplate template;

    @Before
    public void setup() {
        template.dropCollection("accounts");
    }

    @Test
    public void test() {
        AccountTest account = new AccountTest();
        account.setFieldOne("123");
        template.save(account);
        // Query by bean field name: no string distance calculation.
        List<AccountTest> list = template.find(Query.query(Criteria.where("fieldOne").is("123")), AccountTest.class);
        // Query by the name given in @Field: triggers the string distance calculation.
        List<AccountTest> list2 = template.find(Query.query(Criteria.where("field1").is("123")), AccountTest.class);
    }
}
|
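The cost pattern described in this comment can be sketched outside of Spring. The following is a minimal illustration, not the code in `PropertyMatches` or Spring Data: the class `LookupCostSketch`, its `lookup` method, and the call counter are hypothetical, and a standard Levenshtein distance is assumed as the metric. It only shows that an exact hit costs no distance computations, while a miss costs one computation per candidate property.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Spring.
final class LookupCostSketch {

    // Counts how often the distance function runs.
    static int distanceCalls = 0;

    // Standard Levenshtein edit distance using two rolling rows.
    static int distance(String a, String b) {
        distanceCalls++;
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Exact hit: no distance work. Miss: one distance computation per candidate.
    static boolean lookup(String name, List<String> propertyNames) {
        if (propertyNames.contains(name)) {
            return true;
        }
        for (String candidate : propertyNames) {
            distance(name, candidate);
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> properties = Arrays.asList("fieldOne", "batchId", "owner");
        lookup("fieldOne", properties); // bean field name, exact hit
        lookup("field1", properties);   // name from the annotation, miss
        System.out.println("distance computations: " + distanceCalls);
    }
}
```

With around 70 properties per entity, each miss in this model would mean around 70 distance computations, repeated for every lookup unless the result is cached.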
Oliver Drotbohm commented Thanks for the detailed writeup, Ludek. I have a couple of follow-up questions:
|
Ludek Novotny commented Oh, this test doesn't represent our backend; it's just something we put together to debug and identify the sequence of calls which leads to the string distance calculation. Maybe I should have posted the actual backend here in the first place. Sorry about that. So this is how it actually works: We get the Stream<Document> of all documents to be processed. A single matching criterion, batchId, is used to identify a set of documents.

MongoTemplate template;
.......
MongoDatabase db = template.getDb();
MongoCollection<Document> collection = db.getCollection("account");
FindIterable<Document> cursor = collection.find(new BasicDBObject("batchId", batchId));
return StreamSupport.stream(cursor.spliterator(), false);

Each document from the stream is then converted to an account. We don't have a custom Bson2Account converter. It relies only on Spring and the driver.

MongoTemplate template;
.....
public Account ConvertBsonDocument2Account(Document accountObj) {
    return template.getConverter().read(Account.class, accountObj);
}

When the conversion happens, the distance is calculated.
|
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed. |
Superseded by spring-projects/spring-data-commons#2837 |
Ludek Novotny opened DATAMONGO-1991 and commented
When the @Field annotation is used to give a Mongo document field a different name than the bean field, the string distance (org.springframework.beans.PropertyMatches.calculateStringDistance) is calculated for all combinations of fields. If the distance is too big, the name from the annotation is used as a fallback.

This causes a big performance hit in our application. Our solution was to implement a cache in PropertyMatches, but a more permanent solution would be appreciated, as we don't really want to maintain our own version of spring-beans. We also believe the cache isn't the best solution. Is there a reason why the field name from @Field isn't used with the highest priority, with the string distance as the fallback?

This issue is somewhat related to BATCH-1876, but our use case is with Mongo. Our application is running on Spring Boot 2.0.0-RELEASE
Reference URL: https://jira.spring.io/browse/BATCH-1876
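The two ideas in the description (a cache for the distance, and using the annotated name first with the distance only as a fallback) can be sketched together. This is a hedged illustration under stated assumptions, not the reporter's patch and not Spring's implementation: `FieldNameResolverSketch`, `register`, `resolve`, and the cache key format are all hypothetical, and a standard Levenshtein distance stands in for the real metric.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not part of spring-beans or Spring Data.
final class FieldNameResolverSketch {

    // Document field name (as it would appear in an annotation) -> bean property name.
    private final Map<String, String> propertyByDocumentField = new LinkedHashMap<>();
    // Memoized distances, keyed by both strings joined with a newline.
    private final Map<String, Integer> distanceCache = new ConcurrentHashMap<>();
    // Counts real (uncached) distance computations; not thread-safe, for the demo only.
    int distanceComputations = 0;

    void register(String propertyName, String documentFieldName) {
        propertyByDocumentField.put(documentFieldName, propertyName);
    }

    // Returns the matching property, or the closest property name as a suggestion.
    String resolve(String key) {
        // 1. Highest priority: exact match on the annotated document field name.
        String mapped = propertyByDocumentField.get(key);
        if (mapped != null) {
            return mapped;
        }
        // 2. Exact match on the bean property name.
        if (propertyByDocumentField.containsValue(key)) {
            return key;
        }
        // 3. Fallback: closest property by (cached) string distance.
        String closest = null;
        int best = Integer.MAX_VALUE;
        for (String property : propertyByDocumentField.values()) {
            int d = cachedDistance(key, property);
            if (d < best) {
                best = d;
                closest = property;
            }
        }
        return closest;
    }

    private int cachedDistance(String a, String b) {
        return distanceCache.computeIfAbsent(a + "\n" + b, k -> {
            distanceComputations++;
            return levenshtein(a, b);
        });
    }

    // Standard Levenshtein edit distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        FieldNameResolverSketch resolver = new FieldNameResolverSketch();
        resolver.register("fieldOne", "field1");
        System.out.println(resolver.resolve("field1"));
        System.out.println("distance computations: " + resolver.distanceComputations);
    }
}
```

In this sketch a key that matches an annotated name never reaches the distance code, and a repeated miss for the same key is served from the cache. Whether the real lookup order can be changed this way depends on where the distance is actually triggered, which the thread leaves open.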