Merge pull request #26 from hkuich/attributes_in_relation
Attributes in relation
hkuich authored Apr 8, 2021
2 parents 81b0e4e + 7fc3cb6 commit f49a1af
Showing 27 changed files with 521 additions and 251 deletions.
17 changes: 10 additions & 7 deletions README.md
@@ -22,7 +22,7 @@ Use GraMi (**Gra**kn**Mi**grator) to take care of your data migration for you. G
- supports any tabular data file with your separator of choice (e.g. csv, tsv, whatever-sv...)
- supports gzipped files
- ignores unnecessary columns
- [Entity](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities), [Relation](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations), and [Nested Relations](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations) Migration:
- [Attribute](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Attributes), [Entity](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities), [Relation](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations), [Nested Relations](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations), [Attribute-Player Relations](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Attribute-Player-Relations) Migration:
- migrate required/optional attributes of any grakn type (string, boolean, long, double, datetime)
- migrate required/optional role players (entity & relations)
- migrate list-like attribute columns as n attributes (recommended procedure until attribute lists are fully supported by Grakn)
@@ -53,9 +53,11 @@ The processor configuration file describes how you want data to be migrated give

Depending on what you would like to migrate, see here:

- [Attribute Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Attributes#processor-config)
- [Entity Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities#processor-config)
- [Relation Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations#processor-config)
- [Nested Relation Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations#processor-config)
- [Attribute-Player Relation](https://github.com/bayer-science-for-a-better-life/grami/wiki/Attribute-Player-Relations#processor-config)

### Data Configuration

@@ -65,10 +67,13 @@ A good point to start the performance optimization is to set the number of threa

See Example here:

- [Attribute Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Attributes#data-config)
- [Entity Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities#data-config)
- [Relation Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations#data-config)
- [Nested Relation - Match by Attribute(s) - Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations#data-config---attribute-matching)
- [Nested Relation - Match by Player(s) - Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations#data-config---player-matching)
- [Attribute-Player Relation - Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Attribute-Player-Relations#data-config)
- [Custom Migration Order](https://github.com/bayer-science-for-a-better-life/grami/wiki/Custom-Migration-Order)

### Migrate Data

@@ -104,7 +109,7 @@ public class Migration {

public static void main(String[] args) throws IOException {
GraknMigrator mig = new GraknMigrator(migrationConfig, migrationStatus, true);
mig.migrate(true, true, true, true);
mig.migrate();
}
}
```
@@ -122,13 +127,11 @@ There is this [example repository](https://github.com/bayer-science-for-a-better

## Compatibility

GraMi version >= 0.1.0-alpha-12 is tested for:
- [grakn-core](https://github.com/graknlabs/grakn) >= 2.0-alpha-9
- using [client-java](https://github.com/graknlabs/client-java) >= 2.0.0-alpha-12
GraMi version == 0.1.1 is tested for:
- [grakn-core](https://github.com/graknlabs/grakn) == 2.0.1

GraMi version < 0.1.0 is tested for:
- [grakn-core](https://github.com/graknlabs/grakn) >= 1.8.2
- [client-java](https://github.com/graknlabs/client-java) >= 1.8.3
- [grakn-core](https://github.com/graknlabs/grakn) >= 1.8.2

Find the Readme for GraMi for grakn < 2.0 [here](https://github.com/bayer-science-for-a-better-life/grami/blob/b3d6d272c409d6c40254354027b49f90b255e1c3/README.md)

2 changes: 1 addition & 1 deletion build.gradle
@@ -5,7 +5,7 @@ plugins {
}

group 'com.github.bayer-science-for-a-better-life'
version '0.1.0'
version '0.1.1'

repositories {
mavenCentral()
19 changes: 2 additions & 17 deletions src/main/java/cli/GramiCLI.java
@@ -8,7 +8,7 @@

import java.io.IOException;

@CommandLine.Command(description="Welcome to the CLI of GraMi - your grakn data migration tool", name = "grami", version = "0.1.0-alpha-12", mixinStandardHelpOptions = true)
@CommandLine.Command(description="Welcome to the CLI of GraMi - your grakn data migration tool", name = "grami", version = "0.1.1", mixinStandardHelpOptions = true)
public class GramiCLI {

public static void main(String[] args) {
@@ -46,9 +46,6 @@ class MigrateCommand implements Runnable {
@CommandLine.Option(names = {"-cm", "--cleanMigration"}, description = "optional - delete old schema and data and restart migration from scratch - default: continue previous migration, if exists")
private boolean cleanMigration;

@CommandLine.Option(names = {"-sc", "--scope"}, description = "optional - set migration scope: 0 - apply schema only (Note: this has no effect unless you also set the cleanMigration flag to true.); 1 - migrate entities; 2 - migrate entities & relations; 3 - migrate entites, relations, & relation-with-relations; everything else defaults to 4 - migrate all (entities, relations, relation-with-relations, append-attributes")
private int scope = 4;

@Override
public void run() {
spec.commandLine().getOut().println("############## GraMi migration ###############");
@@ -60,24 +57,12 @@ public void run() {
spec.commandLine().getOut().println("\tdatabase: " + databaseName);
spec.commandLine().getOut().println("\tgrakn server: " + graknURI);
spec.commandLine().getOut().println("\tdelete database and all data in it for a clean new migration?: " + cleanMigration);
spec.commandLine().getOut().println("\tmigration scope: " + scope);

final MigrationConfig migrationConfig = new MigrationConfig(graknURI, databaseName, schemaFilePath, dataConfigFilePath, processorConfigFilePath);

try {
GraknMigrator mig = new GraknMigrator(migrationConfig, migrationStatusFilePath, cleanMigration);

if (scope != 0 && scope != 1 && scope != 2 && scope != 3) {
scope = 4;
}

switch (scope) {
case 0: mig.migrate(false, false, false, false); break;
case 1: mig.migrate(true, false, false, false); break;
case 2: mig.migrate(true, true, false, false); break;
case 3: mig.migrate(true, true, true, false); break;
case 4: mig.migrate(true, true, true, true); break;
}
mig.migrate();
} catch (IOException e) {
e.printStackTrace();
}
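Reviewer note: for anyone who relied on the removed `--scope` option, the sketch below restates what the deleted `switch` did, namely turn a scope number into the four boolean arguments of the old `migrate(...)` overload. The class name `ScopeSketch` and the method `flagsForScope` are hypothetical, and the flag order (entities, relations, nested relations, append-attributes) is an assumption taken from the removed option description rather than from the `GraknMigrator` source.

```java
import java.util.Arrays;

class ScopeSketch {
    // Returns the four flags in the assumed order:
    // {entities, relations, nestedRelations, appendAttributes}.
    static boolean[] flagsForScope(int scope) {
        // Mirrors the removed guard: anything outside 0..3 falls back to 4 (migrate all).
        int effective = (scope >= 0 && scope <= 3) ? scope : 4;
        boolean[] flags = new boolean[4];
        for (int i = 0; i < flags.length; i++) {
            // Flag i is enabled once the scope number exceeds its position.
            flags[i] = effective > i;
        }
        return flags;
    }

    public static void main(String[] args) {
        // prints [true, true, false, false]
        System.out.println(Arrays.toString(flagsForScope(2)));
    }
}
```

After this commit the CLI always calls `mig.migrate()` with no arguments, so this mapping no longer exists in the code.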
10 changes: 10 additions & 0 deletions src/main/java/configuration/DataConfigEntry.java
@@ -13,6 +13,8 @@ public class DataConfigEntry {
private DataConfigGeneratorMapping[] relationPlayers;
private int batchSize;
private int threads;
private Integer orderBefore;
private Integer orderAfter;

public String getDataPath() {
return dataPath;
@@ -46,6 +48,14 @@ public int getThreads() {
return threads;
}

public Integer getOrderBefore() {
return orderBefore;
}

public Integer getOrderAfter() {
return orderAfter;
}

public ArrayList<DataConfigEntry.DataConfigGeneratorMapping> getMatchAttributes() {
ArrayList<DataConfigEntry.DataConfigGeneratorMapping> matchAttributes = new ArrayList<>();
for (DataConfigEntry.DataConfigGeneratorMapping attributeMapping: getAttributes()) {
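Reviewer note: this hunk only adds the nullable `orderBefore` and `orderAfter` fields and their getters; how the migrator consumes them is not visible here. The sketch below shows one plausible reading, offered purely as an assumption: entries with `orderBefore` run first in ascending order, unordered entries follow, and entries with `orderAfter` run last in ascending order. `OrderSketch`, `Entry`, and `plan` are hypothetical names, and `Entry` is a stand-in that models only the two new fields, not the real `DataConfigEntry`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class OrderSketch {
    // Hypothetical stand-in for DataConfigEntry; only the two new fields are modeled.
    static final class Entry {
        final String name;
        final Integer orderBefore; // null when not configured
        final Integer orderAfter;  // null when not configured

        Entry(String name, Integer orderBefore, Integer orderAfter) {
            this.name = name;
            this.orderBefore = orderBefore;
            this.orderAfter = orderAfter;
        }
    }

    // Assumed semantics: orderBefore entries first, then unordered ones, then orderAfter entries.
    static List<String> plan(List<Entry> entries) {
        List<Entry> before = new ArrayList<>();
        List<Entry> regular = new ArrayList<>();
        List<Entry> after = new ArrayList<>();
        for (Entry e : entries) {
            if (e.orderBefore != null) {
                before.add(e);
            } else if (e.orderAfter != null) {
                after.add(e);
            } else {
                regular.add(e);
            }
        }
        // List.sort is stable, so entries with equal order values keep their config order.
        before.sort(Comparator.comparing((Entry e) -> e.orderBefore));
        after.sort(Comparator.comparing((Entry e) -> e.orderAfter));

        List<String> names = new ArrayList<>();
        for (Entry e : before) names.add(e.name);
        for (Entry e : regular) names.add(e.name);
        for (Entry e : after) names.add(e.name);
        return names;
    }
}
```

Using `Integer` rather than `int` lets a missing key in the data config stay `null`, which is what makes the three-way split possible. The actual ordering rules are described on the Custom Migration Order wiki page linked from the README.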
2 changes: 1 addition & 1 deletion src/main/java/configuration/ProcessorConfigEntry.java
@@ -66,7 +66,7 @@ public ConceptGenerator getRelationPlayerGenerator(String key) {

public HashMap<String,ConceptGenerator> getRelationRequiredPlayers() {
HashMap<String,ConceptGenerator> relationPlayers = new HashMap<>();
if (processorType.equals("relation") || processorType.equals("relation-with-relation")) {
if (processorType.equals("relation") || processorType.equals("nested-relation") || processorType.equals("attribute-relation")) {
HashMap<String, ConceptGenerator> playerGenerators = getConceptGenerators().get("players");
for (Map.Entry<String, ConceptGenerator> pg: playerGenerators.entrySet()) {
if (pg.getValue().isRequired()) {
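Reviewer note: the condition now compares `processorType` against three string literals, and `relation-with-relation` is no longer accepted. As a possible follow-up, the same check can be written as a set lookup so that adding a processor type touches one place. This is a minimal sketch; `ProcessorTypeSketch`, `TYPES_WITH_PLAYERS`, and `hasPlayers` are hypothetical names that do not appear in the commit.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ProcessorTypeSketch {
    // The three processor types that carry role players after this commit.
    private static final Set<String> TYPES_WITH_PLAYERS =
            new HashSet<>(Arrays.asList("relation", "nested-relation", "attribute-relation"));

    // Equivalent to the chained equals() calls in getRelationRequiredPlayers().
    static boolean hasPlayers(String processorType) {
        return TYPES_WITH_PLAYERS.contains(processorType);
    }
}
```

Unlike `processorType.equals(...)`, the set lookup returns `false` instead of throwing when `processorType` is `null`.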
27 changes: 15 additions & 12 deletions src/main/java/generator/AppendAttributeGenerator.java
@@ -15,14 +15,15 @@
import java.util.HashMap;
import java.util.Map;

import static generator.GeneratorUtil.*;
import static generator.GeneratorUtil.addAttribute;
import static generator.GeneratorUtil.malformedRow;

public class AppendAttributeGenerator extends InsertGenerator {

private final DataConfigEntry dce;
private final ProcessorConfigEntry pce;
private static final Logger appLogger = LogManager.getLogger("com.bayer.dt.grami");
private static final Logger dataLogger = LogManager.getLogger("com.bayer.dt.grami.data");
private final DataConfigEntry dce;
private final ProcessorConfigEntry pce;

public AppendAttributeGenerator(DataConfigEntry dataConfigEntry,
ProcessorConfigEntry processorConfigEntry) {
@@ -33,24 +34,25 @@ public AppendAttributeGenerator(DataConfigEntry dataConfigEntry,
}

public HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> graknAppendAttributeInsert(ArrayList<String> rows,
String header) throws Exception {
String header, int rowCounter) throws Exception {
HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> matchInsertPatterns = new HashMap<>();

ArrayList<ArrayList<ThingVariable<?>>> matchPatterns = new ArrayList<>();
ArrayList<ArrayList<ThingVariable<?>>> insertPatterns = new ArrayList<>();

int insertCounter = 0;

int batchCounter = 1;
for (String row : rows) {
ArrayList<ArrayList<ThingVariable<?>>> tmp = graknAppendAttributeQueryFromRow(row, header, insertCounter);

ArrayList<ArrayList<ThingVariable<?>>> tmp = graknAppendAttributeQueryFromRow(row, header, insertCounter, rowCounter + batchCounter);
if (tmp != null) {
if (tmp.get(0) != null && tmp.get(1) != null) {
matchPatterns.add(tmp.get(0));
insertPatterns.add(tmp.get(1));
insertCounter++;
}
}

batchCounter = batchCounter + 1;
}
matchInsertPatterns.put("match", matchPatterns);
matchInsertPatterns.put("insert", insertPatterns);
@@ -59,7 +61,8 @@ public HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> graknAppendAttrib

public ArrayList<ArrayList<ThingVariable<?>>> graknAppendAttributeQueryFromRow(String row,
String header,
int insertCounter) throws Exception {
int insertCounter,
int rowCounter) throws Exception {
String fileSeparator = dce.getSeparator();
String[] rowTokens = row.split(fileSeparator);
String[] columnNames = header.split(fileSeparator);
@@ -77,7 +80,7 @@ public ArrayList<ArrayList<ThingVariable<?>>> graknAppendAttributeQueryFromRow(S
Thing appendAttributeMatchPattern = addEntityToMatchPattern(insertCounter);
for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForMatchAttribute : dce.getAttributes()) {
if (generatorMappingForMatchAttribute.isMatch()) {
appendAttributeMatchPattern = addAttribute(rowTokens, appendAttributeMatchPattern, columnNames, generatorMappingForMatchAttribute, pce, generatorMappingForMatchAttribute.getPreprocessor());
appendAttributeMatchPattern = addAttribute(rowTokens, appendAttributeMatchPattern, rowCounter, columnNames, generatorMappingForMatchAttribute, pce, generatorMappingForMatchAttribute.getPreprocessor());
}
}
matchPatterns.add(appendAttributeMatchPattern);
@@ -87,7 +90,7 @@ public ArrayList<ArrayList<ThingVariable<?>>> graknAppendAttributeQueryFromRow(S
Thing appendAttributeInsertPattern = null;
for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAppendAttribute : dce.getAttributes()) {
if (!generatorMappingForAppendAttribute.isMatch()) {
appendAttributeInsertPattern = addAttribute(rowTokens, thingVar, columnNames, generatorMappingForAppendAttribute, pce, generatorMappingForAppendAttribute.getPreprocessor());
appendAttributeInsertPattern = addAttribute(rowTokens, thingVar, rowCounter, columnNames, generatorMappingForAppendAttribute, pce, generatorMappingForAppendAttribute.getPreprocessor());
}
}
if (appendAttributeInsertPattern != null) {
@@ -100,10 +103,10 @@ public ArrayList<ArrayList<ThingVariable<?>>> graknAppendAttributeQueryFromRow(S


if (isValid(assembledPatterns)) {
appLogger.debug("valid query: <" + assembleQuery(assembledPatterns).toString() + ">");
appLogger.debug("valid query: <" + assembleQuery(assembledPatterns) + ">");
return assembledPatterns;
} else {
dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c does not contain at least one match attribute and one insert attribute. Faulty tokenized row: " + Arrays.toString(rowTokens));
dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row " + rowCounter + " b/c does not contain at least one match attribute and one insert attribute. Faulty tokenized row: " + Arrays.toString(rowTokens));
return null;
}
}
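Reviewer note: the generators now pass `rowCounter + batchCounter` down to the per-row methods so that skipped-row warnings can name the offending row. The sketch below isolates that arithmetic. It assumes `rowCounter` is the number of data rows handed out before the current batch, which the diff does not confirm; `RowNumberSketch` and `rowNumbers` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class RowNumberSketch {
    // Computes the row number reported for each row of one batch,
    // following the loop shape used in graknAppendAttributeInsert().
    static List<Integer> rowNumbers(int rowCounter, int batchSize) {
        List<Integer> numbers = new ArrayList<>();
        int batchCounter = 1; // starts at 1, as in the commit
        for (int i = 0; i < batchSize; i++) {
            numbers.add(rowCounter + batchCounter);
            batchCounter = batchCounter + 1;
        }
        return numbers;
    }

    public static void main(String[] args) {
        // prints [101, 102, 103]
        System.out.println(rowNumbers(100, 3));
    }
}
```

Whether the reported number lines up with the line number in the source file depends on how the caller counts the header row, so the value is best read as a batch-relative offset added to whatever the caller supplies.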
12 changes: 7 additions & 5 deletions src/main/java/generator/AttributeInsertGenerator.java
@@ -31,23 +31,25 @@ public AttributeInsertGenerator(DataConfigEntry dataConfigEntry, ProcessorConfig
}

public ArrayList<ThingVariable<?>> graknAttributeInsert(ArrayList<String> rows,
String header) throws IllegalArgumentException {
String header, int rowCounter) throws IllegalArgumentException {
ArrayList<ThingVariable<?>> patterns = new ArrayList<>();
int batchCount = 1;
for (String row : rows) {
try {
ThingVariable<?> temp = graknAttributeQueryFromRow(row, header);
ThingVariable<?> temp = graknAttributeQueryFromRow(row, header, rowCounter + batchCount);
if (temp != null) {
patterns.add(temp);
}
} catch (Exception e) {
e.printStackTrace();
}
batchCount = batchCount + 1;
}
return patterns;
}

public ThingVariable<Attribute> graknAttributeQueryFromRow(String row,
String header) throws Exception {
String header, int rowCounter) throws Exception {
String fileSeparator = dce.getSeparator();
String[] rowTokens = row.split(fileSeparator);
String[] columnNames = header.split(fileSeparator);
@@ -58,7 +60,7 @@ public ThingVariable<Attribute> graknAttributeQueryFromRow(String row,
Attribute attributeInsertStatement = null;

for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute : dce.getAttributes()) {
attributeInsertStatement = addValue(rowTokens, attributeInitialStatement, columnNames, generatorMappingForAttribute, pce, generatorMappingForAttribute.getPreprocessor());
attributeInsertStatement = addValue(rowTokens, attributeInitialStatement, rowCounter, columnNames, generatorMappingForAttribute, pce, generatorMappingForAttribute.getPreprocessor());
}

if (attributeInsertStatement != null) {
@@ -68,7 +70,7 @@ public ThingVariable<Attribute> graknAttributeQueryFromRow(String row,
appLogger.debug("valid query: <insert " + attributeInsertStatement.toString() + ";>");
return attributeInsertStatement;
} else {
dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c does not have a proper <isa> statement or is missing required attributes. Faulty tokenized row: " + Arrays.toString(rowTokens));
dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row " + rowCounter + " b/c does not have a proper <isa> statement or is missing required attributes. Faulty tokenized row: " + Arrays.toString(rowTokens));
return null;
}
} else {