Merge pull request #1982 from jplag/feature/documentationCore

Updated documentation
jplag · Sep 22, 2024 · fc4f6a1
2 parents 96c5119 + 82a6dc4
commit fc4f6a1

Showing 4 changed files with 204 additions and 168 deletions.
diff --git a/docs/1.-How-to-Use-JPlag.md b/docs/1.-How-to-Use-JPlag.md
@@ -119,7 +119,15 @@ The report will always be zipped unless there is an error during the zipping pro
 
 ## Viewing Reports
 
-The newest version of the report viewer is always accessible at https://jplag.github.io/JPlag/. Drop your `result.zip` folder on the page to start inspecting the results of your JPlag run. Your submissions will neither be uploaded to a server nor stored permanently. They are saved in the application as long as you view them. Once you refresh the page, all information will be erased.
+Starting with version v6.0.0, the report viewer is bundled with JPlag and will be launched automatically. The `--mode` option controls this behavior.
+By default, JPlag will process the input files and produce a zipped result file. After that, the report viewer is launched (on localhost), and the report will be shown in your browser.
+
+The option `--mode show` will only open the report viewer.
+This allows you to view existing reports.
+You can optionally provide the path to a report file to immediately display it in the viewer; otherwise, the viewer will require you to select a report, just like the online version.
+With `--mode run`, JPlag will run and generate the zipped report but will not open the report viewer.
+
+An online version of the viewer is still hosted at https://jplag.github.io/JPlag/ in order to view pre-v6.0.0 reports. Your submissions will neither be uploaded to a server nor stored permanently. They are stored as long as you view them. Once you refresh the page, all information will be erased.
 
 
 ## Basic Concepts

diff --git a/docs/3.-Contributing-to-JPlag.md b/docs/3.-Contributing-to-JPlag.md
@@ -21,10 +21,12 @@ Please try to make well-documented and clearly structured submissions:
 
 ## Building from sources 
 1. Download or clone the code from this repository.
+
 ### Core
 2. Run `mvn clean package` from the root of the repository to compile and build all submodules.
    Run `mvn clean package assembly:single -P with-report-viewer` instead if you need the full jar, which includes all dependencies.
 3. You will find the generated JARs in the subdirectory `jplag.cli/target`.
+
 ### Report Viewer
 2. Run `npm install` to install all dependencies.
 3. Run `npm run dev` to launch the development server. The report viewer will be available at `http://localhost:8080/`.

diff --git a/docs/4.-Adding-New-Languages.md b/docs/4.-Adding-New-Languages.md
@@ -84,7 +84,7 @@ For example, if ANTLR is used, the setup is as follows:
 | Lexer and Parser           | `Lexer`, `Parser` (ANTLR)      | transform code into AST            | generated from grammar files by antlr4-maven-plugin                                          |
 | Traverser                  | `ParseTreeWalker` (ANTLR)      | traverses AST and calls listener   | included in antlr4-runtime library, can be used as is                                        |
 | TraverserListener class    | `ParseTreeListener` (ANTLR)    | creates tokens when called         | **implement new**                                                                            |
-| ParserAdapter class        | `de.jplag.AbstractParser`      | sets up Parser and calls Traverser | copy with small adjustments                                                                  | 
+| ParserAdapter class        | `de.jplag.AbstractAntlrParser` | sets up Parser and calls Traverser | copy with small adjustments                                                                  | 
 
 As the table shows, much of a language module can be reused, especially when using ANTLR. The only parts left to implement specifically for each language module are
  - the ParserAdapter (for custom parsers)
@@ -95,7 +95,130 @@ As the table shows, much of a language module can be reused, especially when usi
   - It should still be rather easy to implement the ParserAdapter from the library documentation.
   - Instead of using a listener pattern, the library may require you to do the token extraction in a _Visitor subclass_. In that case, there is only one method call per element, called e.g. `traverseClassDeclaration`. The advantage of this version is that the traversal of the subtrees can be controlled freely. See the Scala language module for an example.
 
-### Basic procedure outline
+## Setting up a new language module with ANTLR
+
+JPlag provides a small framework to make it easier to implement language modules with ANTLR.
+
+### Create the Language class
+
+Extend the `AbstractAntlrLanguage` class and implement all required methods. There are two options for creating the parser.
+It can either be passed to the superclass in the constructor, as shown below, or created later by overriding the `initializeParser` method.
+The latter option should be used if the parser requires dynamic parameters.
+
+```java
+public class TestLanguage extends AbstractAntlrLanguage {
+    public TestLanguage() {
+        super(new TestParserAdapter());
+    }
+
+    @Override
+    public String[] suffixes() {
+        return new String[] {"expression"}; //return a list of file suffixes for your language
+    }
+
+    @Override
+    public String getName() {
+        return "Test"; //return the name of the language (e.g. Java). Can be anything that briefly describes the language module
+    }
+
+    @Override
+    public String getIdentifier() {
+        return "test"; //return the identifier for the language (e.g. java). Should be something simple and unique
+    }
+
+    @Override
+    public int minimumTokenMatch() {
+        return 9; //The minimum number of tokens required to form a match. Leave this at 9 if your module doesn't require anything different
+    }
+}
+```
+
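The two ways of supplying the parser can be illustrated with a minimal, self-contained sketch. All types here (`SketchParser`, `SketchLanguage`, `FixedLanguage`, `ConfigurableLanguage`) are hypothetical stand-ins and not the JPlag classes; the sketch only assumes that a parser not passed to the constructor is created on first use through an overridable method.

```java
// Hypothetical stand-in for a parser adapter; NOT a JPlag type.
interface SketchParser {
    String describe();
}

// Hypothetical stand-in for the language base class; NOT a JPlag type.
abstract class SketchLanguage {
    private SketchParser parser; // stays null until supplied or lazily created

    // Option 1: the subclass passes a ready-made parser to the superclass.
    protected SketchLanguage(SketchParser parser) {
        this.parser = parser;
    }

    // Option 2: no parser yet; the subclass overrides initializeParser().
    protected SketchLanguage() {
        this(null);
    }

    protected SketchParser initializeParser() {
        throw new UnsupportedOperationException(
                "Pass a parser to the constructor or override initializeParser()");
    }

    public final SketchParser parser() {
        if (parser == null) {
            parser = initializeParser(); // created on first use, then cached
        }
        return parser;
    }
}

// Uses option 1: the parser needs no runtime information.
class FixedLanguage extends SketchLanguage {
    FixedLanguage() {
        super(() -> "fixed parser");
    }
}

// Uses option 2: the parser depends on a value that is only known later.
class ConfigurableLanguage extends SketchLanguage {
    private String dialect = "default";

    void setDialect(String dialect) {
        this.dialect = dialect;
    }

    @Override
    protected SketchParser initializeParser() {
        String chosen = dialect; // captured when the parser is first requested
        return () -> "parser for dialect " + chosen;
    }
}
```

In this sketch, `ConfigurableLanguage` can pick up a setting that is applied after construction, which is the kind of situation where deferring parser creation is useful.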
+### Implement the parser adapter
+
+The code generated by ANTLR looks slightly different for every grammar. The `AbstractAntlrParserAdapter` class is able to perform most of the required steps automatically.
+The implementation only needs to call the correct generated methods. They should be named roughly the same as in the example below. The javadoc of each method contains additional information.
+
+```java
+public class TestParserAdapter extends AbstractAntlrParserAdapter<TestParser> {
+    private static final TestListener listener = new TestListener();
+
+    @Override
+    protected Lexer createLexer(CharStream input) {
+        return new TestLexer(input);
+    }
+
+    @Override
+    protected TestParser createParser(CommonTokenStream tokenStream) {
+        return new TestParser(tokenStream);
+    }
+
+    @Override
+    protected ParserRuleContext getEntryContext(TestParser parser) {
+        return parser.expressionFile();
+    }
+
+    @Override
+    protected AbstractAntlrListener getListener() {
+        return listener;
+    }
+}
+```
+
+### Implement the token type enum
+
+This is the same as for non-ANTLR modules. The enum should look something like this:
+
+```java
+public enum TestTokenType implements TokenType {
+    TOKEN_NAME("TOKEN_DESCRIPTION"); //the description works as a visual name. Look at other language modules for examples 
+
+    private final String description;
+
+    TestTokenType(String description) {
+        this.description = description;
+    }
+
+    @Override
+    public String getDescription() {
+        return description;
+    }
+}
+```
+
+### Implement the listener
+
+In contrast to the Java module, the ANTLR framework does not use a traditional listener; instead, a set of extraction rules has to be defined.
+All rules are independent of each other, which makes it easier to debug the token extraction.
+
+The basic structure looks like this:
+
+```java
+class TestListener extends AbstractAntlrListener {
+
+    TestListener() {
+        //add rules
+    }
+}
+```
+
+To keep the class readable, the constructor should only call methods that contain the rules. Each of these methods should be short and hold the rules for one specific category of token.
+
+Extraction rules can be very complicated, but in most cases simple ones will suffice. The easiest option is to directly map ANTLR tokens to JPlag tokens:
+
+```java
+visit(VarDefContext.class).map(VARDEF);
+```
+
+There are several variants of `map`, which determine the length of the tokens; the javadoc contains the details. `map` can also receive two JPlag token types, which creates one JPlag token for the start of the context and one for the end.
+`visit` can also receive an ANTLR terminal node type to create tokens from terminal nodes.
+
+Additional features for rules:
+
+1. Condition - Can be passed as a second argument to `visit`. The rule only applies if the condition returns true (see CPP language module for examples)
+2. Semantics - Can be passed by using `withSemantics` after the `map` call (see CPP language module for examples)
+3. Delegate - To have more precise control over the token position and length, a delegated visitor can be used (see Go language module for examples)
+
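The rule-based approach described above can be modelled in a few lines of plain Java. The following is a toy sketch, not the JPlag implementation: `Node`, `VarDef`, `Block`, and `RuleListener` are hypothetical stand-ins for ANTLR contexts and the listener, and tokens are plain strings. It shows independent rules registered from small per-category methods, a rule with start and end tokens, and a rule guarded by a condition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for parse-tree contexts; NOT ANTLR or JPlag classes.
class Node {
    final List<Node> children = new ArrayList<>();

    Node add(Node child) {
        children.add(child);
        return this;
    }
}

class VarDef extends Node {}

class Block extends Node {}

// Toy listener: a list of independent rules instead of enter/exit overrides.
class RuleListener {
    private static final class Rule {
        final Class<? extends Node> type;
        final Predicate<Node> condition;
        final String onEnter;
        final String onExit; // null if the rule creates only one token

        Rule(Class<? extends Node> type, Predicate<Node> condition, String onEnter, String onExit) {
            this.type = type;
            this.condition = condition;
            this.onEnter = onEnter;
            this.onExit = onExit;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final List<String> tokens = new ArrayList<>();

    RuleListener() {
        // The constructor only calls one method per token category.
        variableRules();
        blockRules();
    }

    private void variableRules() {
        rules.add(new Rule(VarDef.class, node -> true, "VARDEF", null));
    }

    private void blockRules() {
        // Condition: empty blocks produce no tokens. Two tokens: start and end.
        rules.add(new Rule(Block.class, node -> !node.children.isEmpty(), "BLOCK_BEGIN", "BLOCK_END"));
    }

    List<String> extract(Node root) {
        tokens.clear();
        walk(root);
        return List.copyOf(tokens);
    }

    private void walk(Node node) {
        List<Rule> matching = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.type.isInstance(node) && rule.condition.test(node)) {
                matching.add(rule);
            }
        }
        for (Rule rule : matching) {
            tokens.add(rule.onEnter);
        }
        for (Node child : node.children) {
            walk(child);
        }
        for (Rule rule : matching) {
            if (rule.onExit != null) {
                tokens.add(rule.onExit);
            }
        }
    }
}
```

Because each rule is evaluated on its own, a missing or unexpected token in the output can be traced back to a single rule, which is the debugging benefit the section mentions.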
+## Basic procedure outline
 
 ```mermaid
 flowchart LR
@@ -113,7 +236,7 @@ flowchart LR
 Note: In existing language modules, the token list is managed by the ParserAdapter, and from there it is returned to the
 Language class and then to JPlag.
 
-### Integration into JPlag
+## Integration into JPlag
 
 The following adjustments have to be made beyond creating the language module submodule itself: