The DS4DM-Backend is a web service that works in conjunction with the Data Search for Data Mining (DS4DM) RapidMiner Extension. The memory-intensive and processing-intensive functionality of the DS4DM RapidMiner Extension has been outsourced to the DS4DM Backend. This includes, among other things, various data searches, data pre-processing functions and data repository management functions. For more information, please refer to the website of the DS4DM Backend.
- Download this GitHub repository
- Go to the releases page (https://github.com/BenediktKleppmann/DS4DM-Backend/releases) and download the three jar-files: "CreateCorrespondenceFiles-0.0.1-SNAPSHOT.jar", "CreateLuceneIndex-0.0.1-SNAPSHOT-jar-with-dependencies.jar" and "winter-1.0-jar-with-dependencies.jar".
Copy them to the following location in the downloaded GitHub repository: DS4DM-Backend\DS4DM-Webservice\DS4DM_webservice\lib
- Make sure that the environment variable JAVA_HOME points to a jdk_8... folder
- Open a terminal and execute:
cd <path_to_downloaded_folder>/DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice
java -Xms1024m -Xmx1024m -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=256m -jar activator-launch-1.2.12.jar "run -Dhttp.port=9004"
- In your RapidMiner process, set the url parameter of the Data Search operator to "http://localhost:9004".
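
Before launching, it can help to confirm that the prerequisites from the steps above are in place. The following is a minimal sketch for a Unix-like shell, not part of the DS4DM repository: the function names `check_java_home` and `check_ds4dm_jars` are hypothetical helpers, and the check only verifies that a `java` executable exists under JAVA_HOME, not that it is a JDK 8.

```shell
# Hypothetical pre-flight helpers; they are not shipped with DS4DM-Backend.

# Report whether JAVA_HOME is set and contains an executable bin/java.
# This does not verify the JDK version.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No java executable under $JAVA_HOME/bin" >&2
        return 1
    fi
    echo "JAVA_HOME=$JAVA_HOME"
}

# Check that the three release jars are present in the given lib folder.
# Prints one "missing: <name>" line per absent jar and returns non-zero
# if at least one jar is missing.
check_ds4dm_jars() {
    lib_dir="$1"
    missing=0
    for jar in \
        "CreateCorrespondenceFiles-0.0.1-SNAPSHOT.jar" \
        "CreateLuceneIndex-0.0.1-SNAPSHOT-jar-with-dependencies.jar" \
        "winter-1.0-jar-with-dependencies.jar"
    do
        if [ ! -f "$lib_dir/$jar" ]; then
            echo "missing: $jar"
            missing=1
        fi
    done
    return "$missing"
}
```

Typical use would be `check_java_home && check_ds4dm_jars DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/lib`, run from the folder that contains the downloaded repository.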
- Download the virtual machine image 'Ubuntu Server 16.04.4 (32bit).vdi' from the project's Google Drive repository
- Launch the virtual machine
- Log in as user 'osboxes.org' with password 'osboxes.org'
- Open a terminal and execute the following commands:
cd /home/osboxes/Desktop/DS4DM-Backend-master/DS4DM-Webservice/DS4DM_webservice
java -Xms1024m -Xmx1024m -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=256m -jar activator-launch-1.2.12.jar "run -Dhttp.port=9004"
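
If port 9004 is already in use, the same launch command can be run with a different port, as long as the url parameter in RapidMiner is changed to match. The sketch below only assembles the two strings. `ds4dm_launch_cmd` and `ds4dm_url` are hypothetical helper names, and the URL assumes that RapidMiner runs on the same machine as the backend.

```shell
# Hypothetical helpers; they are not part of the DS4DM repository.

# Print the launch command for a given HTTP port (default: 9004).
ds4dm_launch_cmd() {
    port="${1:-9004}"
    printf '%s\n' "java -Xms1024m -Xmx1024m -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=256m -jar activator-launch-1.2.12.jar \"run -Dhttp.port=${port}\""
}

# Print the matching URL for the Data Search operator.
# Assumes RapidMiner and the backend run on the same machine.
ds4dm_url() {
    port="${1:-9004}"
    printf 'http://localhost:%s\n' "$port"
}
```

For example, `ds4dm_launch_cmd 9005` prints the command to run from the DS4DM_webservice folder, and `ds4dm_url 9005` prints the value to enter in RapidMiner.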
This backend component (the CreateCorrespondenceFiles Maven project) contains methods for finding and creating correspondences between tables. These methods are used by DS4DM-Webservice (the main backend component). To make them available, the CreateCorrespondenceFiles Maven project is compiled into a jar file. The jar file with dependencies is saved to the folder DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/lib/ and added to the Build Path of the DS4DM_webservice Maven project.
This backend component (the CreateLuceneIndex Maven project) contains methods for indexing tables. These methods are also used by the DS4DM-Webservice. As with CreateCorrespondenceFiles, the CreateLuceneIndex Maven project is compiled into a jar file, and the jar file with dependencies is saved to DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/lib/, from where it is included in the Build Path of DS4DM_webservice.
This is the main backend component, DS4DM-Webservice. The Maven project is structured according to the guidelines of the Play framework for Java. This allows the program activator-launch-1.2.12.jar to provide an API endpoint which calls various methods in this backend component. The file DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/conf/routes specifies which API calls are possible and which method each of them invokes. The majority of the called methods are in the class DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/app/controllers/ExtendTable.java. (All of the executed code is in the folder DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/app.) The DS4DM-Webservice uses repositories of tables. These repositories are in the folder DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/public/repositories. Each repository has one folder containing csv tables, one folder containing indexes and another folder containing correspondences, as well as a file with repository statistics.
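
To get a quick overview of the available API calls and the installed repositories, the routes file and the repositories folder can be inspected from a terminal. The following is a minimal sketch with two hypothetical helper functions, `list_routes` and `list_repositories`. It assumes the usual Play routes layout (HTTP method, path, handler, with `#` starting a comment line) and makes no assumption about the names of the folders inside a repository.

```shell
# Hypothetical inspection helpers; they are not part of the DS4DM repository.

# Print every route of a Play routes file with whitespace normalised.
# Comment lines (first field starts with '#') and blank lines are skipped.
list_routes() {
    routes_file="$1"
    awk '$1 ~ /^#/ { next } NF >= 3 { $1 = $1; print }' "$routes_file"
}

# Print the name of every repository, i.e. every subdirectory of the
# given repositories folder. Plain files in that folder are ignored.
list_repositories() {
    repo_root="$1"
    for dir in "$repo_root"/*/; do
        [ -d "$dir" ] || continue
        basename "$dir"
    done
}
```

Run from the DS4DM_webservice folder, `list_routes conf/routes` shows one API call per line, and `list_repositories public/repositories` shows one repository name per line.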
This isn't a backend component, but a collection of the csv files that were used for the evaluations. For more information on the evaluations, please refer to http://web.informatik.uni-mannheim.de/ds4dm/#evaluation.