Additional fix for columns with cardinality 0 #88

Merged 1 commit on Feb 15, 2013

28 changes: 20 additions & 8 deletions index-common/src/main/java/com/metamx/druid/index/v1/IndexIO.java
@@ -21,6 +21,7 @@

import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.common.base.Preconditions;
+ import com.google.common.base.Predicate;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Iterables;
import com.google.common.collect.Lists;
@@ -62,7 +63,6 @@
import com.metamx.druid.utils.SerializerUtils;
import it.uniroma3.mat.extendedset.intset.ConciseSet;
import it.uniroma3.mat.extendedset.intset.ImmutableConciseSet;

import org.joda.time.Interval;

import java.io.ByteArrayOutputStream;
@@ -369,8 +369,8 @@ public static void convertV8toV9(File v8Dir, File v9Dir) throws IOException
);
}

- LinkedHashSet<String> skippedFiles = Sets.newLinkedHashSet();
- Set<String> skippedDimensions = Sets.newLinkedHashSet();
+ final LinkedHashSet<String> skippedFiles = Sets.newLinkedHashSet();
+ final Set<String> skippedDimensions = Sets.newLinkedHashSet();
for (String filename : v8SmooshedFiles.getInternalFilenames()) {
log.info("Processing file[%s]", filename);
if (filename.startsWith("dim_")) {
@@ -570,25 +570,37 @@ public int size()
final ByteBuffer indexBuffer = v8SmooshedFiles.mapFile("index.drd");

indexBuffer.get(); // Skip the version byte
- final GenericIndexed<String> dims = GenericIndexed.read(
+ final GenericIndexed<String> dims8 = GenericIndexed.read(
indexBuffer, GenericIndexed.stringStrategy
);
+ final GenericIndexed<String> dims9 = GenericIndexed.fromIterable(
+ Iterables.filter(
+ dims8, new Predicate<String>()
+ {
+ @Override
+ public boolean apply(String s)
+ {
+ return !skippedDimensions.contains(s);
+ }
+ }
+ ),
+ GenericIndexed.stringStrategy
+ );
final GenericIndexed<String> availableMetrics = GenericIndexed.read(
indexBuffer, GenericIndexed.stringStrategy
);
final Interval dataInterval = new Interval(serializerUtils.readString(indexBuffer));

Set<String> columns = Sets.newTreeSet();
- columns.addAll(Lists.newArrayList(dims));
+ columns.addAll(Lists.newArrayList(dims9));
columns.addAll(Lists.newArrayList(availableMetrics));
- columns.removeAll(skippedDimensions);

GenericIndexed<String> cols = GenericIndexed.fromIterable(columns, GenericIndexed.stringStrategy);

- final int numBytes = cols.getSerializedSize() + dims.getSerializedSize() + 16;
+ final int numBytes = cols.getSerializedSize() + dims9.getSerializedSize() + 16;
final SmooshedWriter writer = v9Smoosher.addWithSmooshedWriter("index.drd", numBytes);
cols.writeToChannel(writer);
- dims.writeToChannel(writer);
+ dims9.writeToChannel(writer);
serializerUtils.writeLong(writer, dataInterval.getStartMillis());
serializerUtils.writeLong(writer, dataInterval.getEndMillis());
writer.close();
@@ -33,6 +33,7 @@

import java.io.File;
import java.util.Arrays;
import java.util.List;

Do you need to import java.util.List when you are already using Guava?

ya srsly Gian, what's up with that?

Joshu began the study of Zen when he was sixty years old and continued until he was eighty, when he realized Zen.
He taught from the age of eighty until he was one hundred and twenty.
A student once asked him: "If I haven't anything in my mind, what shall I do?"
Joshu replied: "Throw it out."
"But if I haven't anything, how can I throw it out?" continued the questioner.
"Well," said Joshu, "then carry it out."

Joshu is the senior engineer giving advice to the junior engineer. The jr engineer is all like 'wtf did Joshu just say' and Joshu is all like 'so is what I asked for coded up yet?'.

That java.util.List import is stray. The code isn't using List. I accepted the merge and just pushed a cleanup commit to remove the import.


/**
*/
@@ -116,29 +117,47 @@ public void testPersistMergeCaseInsensitive() throws Exception
@Test
public void testPersistEmptyColumn() throws Exception
{
- final IncrementalIndex toPersist = new IncrementalIndex(0L, QueryGranularity.NONE, new AggregatorFactory[]{});
- final File tmpDir = Files.createTempDir();
+ final IncrementalIndex toPersist1 = new IncrementalIndex(0L, QueryGranularity.NONE, new AggregatorFactory[]{});
+ final IncrementalIndex toPersist2 = new IncrementalIndex(0L, QueryGranularity.NONE, new AggregatorFactory[]{});
+ final File tmpDir1 = Files.createTempDir();
+ final File tmpDir2 = Files.createTempDir();
+ final File tmpDir3 = Files.createTempDir();

try {
- toPersist.add(
+ toPersist1.add(
new MapBasedInputRow(
1L,
ImmutableList.of("dim1", "dim2"),
ImmutableMap.<String, Object>of("dim1", ImmutableList.of(), "dim2", "foo")
)
);

+ toPersist2.add(
+ new MapBasedInputRow(
+ 1L,
+ ImmutableList.of("dim1", "dim2"),
+ ImmutableMap.<String, Object>of("dim1", ImmutableList.of(), "dim2", "bar")
+ )
+ );

+ final QueryableIndex index1 = IndexIO.loadIndex(IndexMerger.persist(toPersist1, tmpDir1));
+ final QueryableIndex index2 = IndexIO.loadIndex(IndexMerger.persist(toPersist1, tmpDir2));
final QueryableIndex merged = IndexIO.loadIndex(
- IndexMerger.persist(toPersist, tmpDir)
+ IndexMerger.mergeQueryableIndex(Arrays.asList(index1, index2), new AggregatorFactory[]{}, tmpDir3)
);

+ Assert.assertEquals(1, index1.getTimeColumn().getLength());
+ Assert.assertEquals(ImmutableList.of("dim2"), ImmutableList.copyOf(index1.getAvailableDimensions()));

+ Assert.assertEquals(1, index2.getTimeColumn().getLength());
+ Assert.assertEquals(ImmutableList.of("dim2"), ImmutableList.copyOf(index2.getAvailableDimensions()));

Assert.assertEquals(1, merged.getTimeColumn().getLength());
- Assert.assertEquals(ImmutableList.of("dim1", "dim2"), ImmutableList.copyOf(merged.getAvailableDimensions()));
- Assert.assertEquals(null, merged.getColumn("dim1"));
+ Assert.assertEquals(ImmutableList.of("dim2"), ImmutableList.copyOf(merged.getAvailableDimensions()));
} finally {
- FileUtils.deleteQuietly(tmpDir);
+ FileUtils.deleteQuietly(tmpDir1);
+ FileUtils.deleteQuietly(tmpDir2);
+ FileUtils.deleteQuietly(tmpDir3);
}


}
}
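
For anyone reading the `convertV8toV9` change in `IndexIO.java` without a Druid checkout handy, the filtering step can be sketched in plain Java. This is only an illustration, not the project's code: it uses `java.util` collections as a stand-in for Guava's `Iterables.filter` and for `GenericIndexed`, and the class name `SkippedDimensionFilter`, its two helper methods, and the `count` metric are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the dimension filtering in the diff above.
class SkippedDimensionFilter
{
  /** Returns the dimensions in their original order, minus any that were skipped. */
  static List<String> filterSkipped(List<String> dims, Set<String> skipped)
  {
    final List<String> kept = new ArrayList<>();
    for (String dim : dims) {
      if (!skipped.contains(dim)) {
        kept.add(dim);
      }
    }
    return kept;
  }

  /** Builds the sorted column set from the already-filtered dimensions plus the metrics. */
  static Set<String> columns(List<String> keptDims, List<String> metrics)
  {
    final Set<String> columns = new TreeSet<>();
    columns.addAll(keptDims);
    columns.addAll(metrics);
    return columns;
  }

  public static void main(String[] args)
  {
    // "dim1" plays the role of a dimension that was skipped during conversion.
    final Set<String> skipped = new LinkedHashSet<>(Arrays.asList("dim1"));
    final List<String> dims9 = filterSkipped(Arrays.asList("dim1", "dim2"), skipped);

    System.out.println(dims9);                                  // prints [dim2]
    System.out.println(columns(dims9, Arrays.asList("count"))); // prints [count, dim2]
  }
}
```

Because the dimension list is filtered before the column set is built, a separate removal of the skipped dimensions from the column set is no longer needed, and the dimension list and the column list written to `index.drd` stay consistent with each other.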