Commit

[ML] Use map and filter instead of flatMap in find_file_structure (#42534)

Using map and filter avoids the garbage created by all the
Stream.of calls that flatMap required. Performance
is better when there are large numbers of fields.
droberts195 authored May 24, 2019
1 parent 5eb38ec commit 5720a32
Showing 1 changed file with 2 additions and 5 deletions.
@@ -187,11 +187,8 @@ static Tuple<SortedMap<String, Object>, SortedMap<String, FieldStats>> guessMapp

 for (String fieldName : uniqueFieldNames) {

-    List<Object> fieldValues = sampleRecords.stream().flatMap(record -> {
-            Object fieldValue = record.get(fieldName);
-            return (fieldValue == null) ? Stream.empty() : Stream.of(fieldValue);
-        }
-    ).collect(Collectors.toList());
+    List<Object> fieldValues = sampleRecords.stream().map(record -> record.get(fieldName)).filter(fieldValue -> fieldValue != null)
+        .collect(Collectors.toList());

     Tuple<Map<String, String>, FieldStats> mappingAndFieldStats =
         guessMappingAndCalculateFieldStats(explanation, fieldName, fieldValues, timeoutChecker);
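
The change replaces a flatMap that wrapped each non-null value in its own one-element Stream with a plain map followed by a null filter. The standalone sketch below is not the Elasticsearch code itself: the class name FieldValueExtraction, the method names viaFlatMap, viaMapAndFilter and sampleRecords, and the simplified record type List<Map<String, Object>> are all assumptions made for illustration. It only aims to show that the two forms collect the same values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical standalone class; not part of the commit.
public class FieldValueExtraction {

    // Old form: every non-null value is wrapped in a temporary one-element Stream.
    static List<Object> viaFlatMap(List<Map<String, Object>> sampleRecords, String fieldName) {
        return sampleRecords.stream().flatMap(record -> {
            Object fieldValue = record.get(fieldName);
            return (fieldValue == null) ? Stream.empty() : Stream.of(fieldValue);
        }).collect(Collectors.toList());
    }

    // New form: look the value up, then drop nulls, with no per-value Stream.
    static List<Object> viaMapAndFilter(List<Map<String, Object>> sampleRecords, String fieldName) {
        return sampleRecords.stream()
            .map(record -> record.get(fieldName))
            .filter(Objects::nonNull) // same effect as fieldValue -> fieldValue != null
            .collect(Collectors.toList());
    }

    // Made-up sample data: one record lacks "status", one maps "host" to null.
    static List<Map<String, Object>> sampleRecords() {
        Map<String, Object> first = new HashMap<>();
        first.put("host", "a");
        first.put("status", 200);

        Map<String, Object> second = new HashMap<>();
        second.put("host", "b");

        Map<String, Object> third = new HashMap<>();
        third.put("host", null);
        third.put("status", 404);

        List<Map<String, Object>> records = new ArrayList<>();
        records.add(first);
        records.add(second);
        records.add(third);
        return records;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = sampleRecords();
        System.out.println(viaFlatMap(records, "status"));
        System.out.println(viaMapAndFilter(records, "status"));
    }
}
```

Both methods treat a missing key and an explicit null value the same way, because Map.get returns null in either case. The sketch does not measure allocation or speed; the performance claim is the commit author's.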
