[Doc] Provide Java's reference library, documentation for users and developers #242
Merged
10 commits
fd8bc44 Generate javadoc referencej (Thespica)
168f0f4 Merge branch 'alibaba:main' into java-doc (Thespica)
3460a5b Merge branch 'alibaba:main' into java-doc (Thespica)
280a28d Finish java docs (Thespica)
46e84f7 fix javadoc (Thespica)
c0652b2 fix javadoc (Thespica)
1df457f refine command to generate javadoc and scaladoc (Thespica)
2a996c7 Fix (Thespica)
52eb574 Merge branch 'alibaba:main' into java-doc (Thespica)
06d1803 Refine doc and set java code style as AOSP (Thespica)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
Java Development | ||
================ | ||
|
||
Introduction | ||
------------ | ||
|
||
The GraphAr Java library is built on the GraphAr C++ library and on an efficient FFI | ||
for Java and C++ called | ||
`FastFFI <https://github.com/alibaba/fastFFI>`__. | ||
|
||
Source Code Level | ||
~~~~~~~~~~~~~~~~~ | ||
|
||
- Interface | ||
|
||
- Class | ||
|
||
- JNI code | ||
|
||
- GraphAr C++ library | ||
|
||
If you want to use classes or functions of the GraphAr C++ library through the Java SDK, you only need to write interfaces with | ||
annotations. Once the interfaces are ready, FastFFI automatically generates the Java code for the interfaces and the C++ code, | ||
which includes the JNI code for the native methods. For the usage of the | ||
annotations, please refer to | ||
`FastFFI <https://github.com/alibaba/fastFFI>`__. | ||
|
||
|
||
|
||
Runtime Level | ||
~~~~~~~~~~~~~ | ||
|
||
Interfaces and classes will be compiled to bytecode. Usually, JNI code will be compiled to bitcode as a part of | ||
a dynamic library which can be called by native methods directly. | ||
If llvm4jni is enabled, suitable JNI methods will be translated into bytecode. | ||
|
||
To decouple the C++ and Java implementations, we use a bridge dynamic library called gar-jni to connect them. It | ||
integrates all C++ dependencies (e.g. JNI code, the GraphAr C++ library and Arrow C++) | ||
and can be called directly by native methods in Java. | ||
Most JNI code is generated by FastFFI, but some is written by hand, such as the JNI code for | ||
converting a VectorSchemaRoot into an arrow::Table. | ||
|
||
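As a rough illustration of what "called by native methods directly" implies on the Java side, the sketch below probes whether a JNI bridge library can be loaded at all. It uses only the standard ``System.loadLibrary`` call; the class name ``NativeBridgeProbe`` is hypothetical, and whether GraphAr itself loads gar-jni exactly this way is an assumption, not something this document states.

```java
/** Sketch: probe whether a JNI bridge library is visible to the JVM. */
final class NativeBridgeProbe {
    private NativeBridgeProbe() {}

    /**
     * Tries to load a native library by its platform-independent name.
     * Returns true on success, false if the JVM cannot find or link it.
     */
    static boolean tryLoad(String libraryName) {
        try {
            // The JVM maps the name to a platform file name (see
            // System.mapLibraryName) and searches java.library.path.
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Cannot load " + libraryName + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // "gar-jni" is the bridge library name used in this document.
        System.out.println("Expected file name: " + System.mapLibraryName("gar-jni"));
        System.out.println("gar-jni loaded: " + tryLoad("gar-jni"));
    }
}
```

If the probe returns false, the directory containing the built library is usually missing from ``java.library.path``.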
To build the bridge dynamic library, here is the main part of our CMakeLists.txt: | ||
|
||
.. code-block:: cmake | ||
|
||
# use the auto-generated JNI code and the handwritten JNI code as source files | ||
file(GLOB SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/target/generated-sources/annotations/*.cc" "${CMAKE_CURRENT_SOURCE_DIR}/target/generated-test-sources/test-annotations/*.cc" | ||
"${CMAKE_CURRENT_SOURCE_DIR}/src/main/cpp/ffi/*.cc") | ||
# remove the auto-generated JNI code for a specific method because we have handwritten JNI code for it | ||
list(REMOVE_ITEM SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/target/generated-sources/annotations/jni_com_alibaba_graphar_arrow_ArrowTable_Static_cxx_0x58c7409.cc") | ||
|
||
set(LIBNAME "gar-jni") | ||
|
||
# find JNI related libraries | ||
find_package(JNI REQUIRED) | ||
include_directories(SYSTEM ${JAVA_INCLUDE_PATH}) | ||
include_directories(SYSTEM ${JAVA_INCLUDE_PATH2}) | ||
|
||
# some JNI code depends on arrow | ||
find_package(Arrow REQUIRED) | ||
# build graphar-cpp in specific version | ||
include(graphar-cpp) | ||
build_graphar_cpp() | ||
|
||
# build the bridge JNI library | ||
add_library(${LIBNAME} SHARED ${SOURCES}) | ||
# include graphar-cpp headers | ||
target_include_directories(${LIBNAME} SYSTEM BEFORE PRIVATE ${GAR_INCLUDE_DIR}) | ||
# link graphar-cpp and arrow | ||
target_link_libraries(${LIBNAME} ${CMAKE_JNI_LINKER_FLAGS} gar_shared) | ||
target_link_libraries(${LIBNAME} ${CMAKE_JNI_LINKER_FLAGS} Arrow::arrow_static) | ||
|
||
For more about the usage of CMake, please refer to `CMake's official website <https://cmake.org/>`__. | ||
|
||
Building GraphAr Java | ||
--------------------- | ||
|
||
Please refer to `GraphAr Java Library user guide <../user-guide/java-lib.html>`__. | ||
|
||
Code Style | ||
---------- | ||
|
||
We follow `AOSP Java code | ||
style <https://source.android.com/docs/setup/contribute/code-style>`__. To make sure | ||
the CI code style check passes, please make sure the check below | ||
succeeds: | ||
|
||
.. code-block:: bash | ||
|
||
mvn spotless:check | ||
|
||
If there are violations, run the command below to format the code automatically: | ||
|
||
.. code-block:: bash | ||
|
||
mvn spotless:apply |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Java API Reference (javadoc) | ||
============================== | ||
|
||
Stub page for the Java reference docs; actual source is located in the java-api/ directory. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,218 @@ | ||
GraphAr Java Library | ||
==================== | ||
|
||
Overview | ||
-------- | ||
|
||
Based on an efficient FFI for Java and C++ called | ||
`fastFFI <https://github.com/alibaba/fastFFI>`__, the GraphAr Java | ||
library allows users to write Java for generating, loading and | ||
transforming GAR files. It consists of several components: | ||
|
||
- **Information Classes**: As in the C++ library, the | ||
information classes are implemented to construct and access the meta | ||
information about the **graphs**, **vertices** and **edges** in | ||
GraphAr. | ||
|
||
- **Writers**: The GraphAr Java writer provides a set of interfaces | ||
that can be used to write an Apache Arrow VectorSchemaRoot into GAR | ||
files. Each time, it takes a VectorSchemaRoot as the logical table | ||
for a type of vertices or edges, converts it to an ArrowTable, and | ||
then dumps it to standard GAR files (CSV, ORC or Parquet files) under | ||
the specified directory path. | ||
|
||
- **Readers**: The GraphAr Java reader provides a set of interfaces | ||
that can be used to read GAR files. It reads a collection of vertices | ||
or edges at a time and assembles the result into an ArrowTable. | ||
Similar to the reader in the C++ library, it lets users | ||
specify the data they need, e.g., reading a single property group | ||
instead of all properties. | ||
|
||
Get GraphAr Java Library | ||
------------------------ | ||
|
||
Building from source | ||
~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Currently, only building from source is supported; installing from | ||
Maven will be supported in the future. | ||
|
||
First, install llvm-11. ``LLVM11_HOME`` should point to the home of | ||
LLVM 11. On Ubuntu, it is at ``/usr/lib/llvm-11``. Basically, the build | ||
procedure requires the following paths: | ||
|
||
- ``$LLVM11_HOME/bin/clang++`` | ||
|
||
- ``$LLVM11_HOME/bin/ld.lld`` | ||
|
||
- ``$LLVM11_HOME/lib/cmake/llvm`` | ||
|
||
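Before starting a long Maven build, it can help to confirm that ``LLVM11_HOME`` really contains the three paths listed above. The following is a minimal sketch using only the Java standard library; the class name ``Llvm11HomeCheck`` is hypothetical and is not part of GraphAr or fastFFI.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Sketch: verify that an LLVM 11 home contains the paths the build relies on. */
final class Llvm11HomeCheck {
    // Paths relative to LLVM11_HOME, taken from the list in this guide.
    private static final String[] REQUIRED = {"bin/clang++", "bin/ld.lld", "lib/cmake/llvm"};

    private Llvm11HomeCheck() {}

    /** Returns the required relative paths that do not exist under llvmHome. */
    static List<String> missingEntries(Path llvmHome) {
        List<String> missing = new ArrayList<>();
        for (String relative : REQUIRED) {
            if (!Files.exists(llvmHome.resolve(relative))) {
                missing.add(relative);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("LLVM11_HOME");
        if (home == null || home.isEmpty()) {
            System.err.println("LLVM11_HOME is not set");
            return;
        }
        List<String> missing = missingEntries(Paths.get(home));
        System.out.println(
                missing.isEmpty() ? "LLVM 11 layout looks complete" : "Missing: " + missing);
    }
}
```

An empty result only means the files exist; it does not check that the installed version is actually LLVM 11.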
Tips: | ||
|
||
- Using Ubuntu as an example: | ||
|
||
.. code-block:: bash | ||
|
||
$ sudo apt-get install llvm-11 clang-11 lld-11 libclang-11-dev libz-dev -y | ||
$ export LLVM11_HOME=/usr/lib/llvm-11 | ||
|
||
- Or compile from source with this | ||
`script <https://github.com/alibaba/fastFFI/blob/main/docker/install-llvm11.sh>`__: | ||
|
||
.. code-block:: bash | ||
|
||
$ export LLVM11_HOME=/usr/lib/llvm-11 | ||
$ export LLVM_VAR=11.0.0 | ||
$ sudo ./install-llvm11.sh | ||
|
||
Make the GraphAr Java library directory the current working | ||
directory: | ||
|
||
.. code-block:: bash | ||
|
||
$ git clone https://github.com/alibaba/GraphAr.git | ||
$ cd GraphAr | ||
$ git submodule update --init | ||
$ cd java | ||
|
||
Compile the package: | ||
|
||
.. code-block:: bash | ||
|
||
$ mvn clean install -DskipTests | ||
|
||
Then add GraphAr as a dependency in your Maven project: | ||
|
||
.. code-block:: xml | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>com.alibaba.graphar</groupId> | ||
<artifactId>gar-java</artifactId> | ||
<version>0.1.0</version> | ||
</dependency> | ||
</dependencies> | ||
|
||
How to use | ||
---------- | ||
|
||
Information classes | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
The Java library for GraphAr provides distinct information classes for | ||
constructing and accessing meta information about graphs, vertices, and | ||
edges. These classes act as essential parameters for constructing | ||
readers and writers, and they can be built either from the existing meta | ||
files (in the YAML format) or in-memory from scratch. | ||
|
||
To construct information from a YAML file, please refer to the following | ||
example code. | ||
|
||
.. code-block:: java | ||
|
||
// read graph yaml and construct information | ||
String path = ...; // the path to the yaml file | ||
Result<GraphInfo> graphInfoResult = GraphInfo.load(path); | ||
if (!graphInfoResult.hasError()) { | ||
GraphInfo graphInfo = graphInfoResult.value(); | ||
// use information classes | ||
StdMap<StdString, VertexInfo> vertexInfos = graphInfo.getVertexInfos(); | ||
StdMap<StdString, EdgeInfo> edgeInfos = graphInfo.getEdgeInfos(); | ||
} | ||
|
||
See `test for | ||
graphinfo <https://github.com/alibaba/GraphAr/tree/main/java/src/test/java/com/alibaba/graphar/graphinfo>`__ | ||
for the complete example. | ||
|
||
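The example above checks ``hasError()`` before calling ``value()``. When many calls return a result object, that check can be wrapped in a small helper. The sketch below is self-contained: ``DemoResult`` is a hypothetical stand-in that exposes only the two methods used in this guide, not the real fastFFI-backed ``Result`` type, and ``valueOrThrow`` is an illustrative helper rather than part of the library.

```java
/** Sketch: a check-before-unwrap helper for result objects. */
final class ResultPatternSketch {
    private ResultPatternSketch() {}

    /** Hypothetical stand-in exposing only hasError() and value(). */
    interface DemoResult<T> {
        boolean hasError();

        T value();
    }

    /** Builds a successful stand-in result. */
    static <T> DemoResult<T> ok(T value) {
        return new DemoResult<T>() {
            public boolean hasError() {
                return false;
            }

            public T value() {
                return value;
            }
        };
    }

    /** Builds a failed stand-in result. */
    static <T> DemoResult<T> error() {
        return new DemoResult<T>() {
            public boolean hasError() {
                return true;
            }

            public T value() {
                throw new IllegalStateException("result holds an error");
            }
        };
    }

    /** Returns the value, or throws with a description of the failed step. */
    static <T> T valueOrThrow(DemoResult<T> result, String step) {
        if (result.hasError()) {
            throw new IllegalStateException("Failed to " + step);
        }
        return result.value();
    }

    public static void main(String[] args) {
        System.out.println(valueOrThrow(ok("graph info"), "load graph info"));
    }
}
```

With the real library, the same idea would apply to the result objects returned by calls such as ``GraphInfo.load``, provided they behave as shown in the example above.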
Writers | ||
~~~~~~~ | ||
|
||
The GraphAr Java writers wrap C++ interfaces to write arrow::Table into GraphAr | ||
formatted files in a batch-import fashion. However, an arrow::Table is not easy | ||
to build in Java, so the GraphAr Java library provides a static | ||
method to convert a VectorSchemaRoot into an arrow::Table. Warning: this method | ||
currently has known problems that lead to memory leaks. We will | ||
fix it or rewrite the writers with Apache Arrow Java. | ||
|
||
With the VertexWriter, users can specify a particular property group to | ||
be written into its corresponding chunks, or choose to write all | ||
property groups. For edge chunks, besides the meta data (edge info), the | ||
adjList type should also be specified. The adjList/properties can be | ||
written alone, or alternatively, all adjList, properties, and the offset | ||
(for CSR and CSC format) chunks can be written simultaneously. | ||
|
||
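To make the role of the offset chunks more concrete, the sketch below computes CSR-style offsets for an edge list that is already sorted by source vertex. It is plain Java and does not call the GraphAr writer; the class name ``CsrOffsetSketch`` is hypothetical, and the sketch ignores how GraphAr splits offsets across chunk files.

```java
import java.util.Arrays;

/** Sketch: CSR offsets for edges sorted by source vertex. */
final class CsrOffsetSketch {
    private CsrOffsetSketch() {}

    /**
     * offsets[v] is the index of the first edge whose source is v, and
     * offsets[vertexCount] is the total number of edges, so the edges of
     * vertex v occupy the half-open range [offsets[v], offsets[v + 1]).
     */
    static long[] buildOffsets(long[] sortedSources, int vertexCount) {
        long[] offsets = new long[vertexCount + 1];
        for (long source : sortedSources) {
            offsets[(int) source + 1]++; // count the out-degree of each source
        }
        for (int v = 0; v < vertexCount; v++) {
            offsets[v + 1] += offsets[v]; // prefix sum turns counts into offsets
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Four edges with sources 0, 0, 1, 3 in a graph of four vertices.
        long[] offsets = buildOffsets(new long[] {0, 0, 1, 3}, 4);
        System.out.println(Arrays.toString(offsets)); // prints [0, 2, 3, 3, 4]
    }
}
```

Vertex 2 has no outgoing edges here, so its range ``[3, 3)`` is empty. The CSC case is analogous, with edges sorted and counted by destination instead of source.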
To utilize the GAR Java writer, please refer to the following example | ||
code. | ||
|
||
.. code-block:: java | ||
|
||
// common steps to construct VectorSchemaRoot | ||
String uri = "file:" + ...; // data source | ||
ScanOptions options = new ScanOptions(/*batchSize*/ 32768); | ||
StdSharedPtr<ArrowTable> table = null; | ||
try (BufferAllocator allocator = new RootAllocator(); | ||
DatasetFactory datasetFactory = | ||
new FileSystemDatasetFactory( | ||
allocator, NativeMemoryPool.getDefault(), FileFormat.PARQUET, uri); | ||
Dataset dataset = datasetFactory.finish(); | ||
Scanner scanner = dataset.newScan(options); | ||
ArrowReader reader = scanner.scanBatches()) { | ||
while (reader.loadNextBatch()) { | ||
try (VectorSchemaRoot root = reader.getVectorSchemaRoot()) { | ||
// convert VectorSchemaRoot to ArrowTable | ||
table = ArrowTable.fromVectorSchemaRoot(allocator, root, reader); | ||
} | ||
} | ||
} catch (Exception e) { | ||
e.printStackTrace(); | ||
} | ||
|
||
// construct writer object | ||
String path = ...; // path to the edge meta (YAML) file | ||
StdString edgeMetaFile = StdString.create(path); | ||
StdSharedPtr<Yaml> edgeMeta = Yaml.loadFile(edgeMetaFile).value(); | ||
EdgeInfo edgeInfo = EdgeInfo.load(edgeMeta).value(); | ||
EdgeChunkWriter writer = EdgeChunkWriter.factory.create( | ||
edgeInfo, StdString.create("/tmp/"), AdjListType.ordered_by_source); | ||
|
||
// write table with writer object | ||
writer.sortAndWriteAdjListTable(table, 0, 0); // Write adj list of vertex chunk 0 to files | ||
|
||
See `test for | ||
writers <https://github.com/alibaba/GraphAr/tree/main/java/src/test/java/com/alibaba/graphar/writers>`__ | ||
for the complete example. | ||
|
||
Readers | ||
~~~~~~~ | ||
|
||
The GraphAr Java reader provides an extensive set of interfaces to read | ||
GAR files. It reads a collection of vertices or edges at a time as an | ||
ArrowTable. Similar to the reader in the C++ library, it lets | ||
users specify the data they need, e.g., a single property group. | ||
|
||
To utilize the GAR Java reader, please refer to the following example | ||
code. | ||
|
||
.. code-block:: java | ||
|
||
// construct vertex chunk reader | ||
GraphInfo graphInfo = ...; // load graph meta info | ||
StdString label = StdString.create("person"); | ||
StdString propertyName = StdString.create("id"); | ||
if (graphInfo.getVertexInfo(label).hasError()) { | ||
// throw Exception or do other things | ||
} | ||
PropertyGroup group = graphInfo.getVertexPropertyGroup(label, propertyName).value(); | ||
Result<VertexPropertyArrowChunkReader> maybeReader = | ||
GrapharStaticFunctions.INSTANCE.constructVertexPropertyArrowChunkReader( | ||
graphInfo, label, group); | ||
// check reader's status if needed | ||
VertexPropertyArrowChunkReader reader = maybeReader.value(); | ||
Result<StdSharedPtr<ArrowTable>> result = reader.getChunk(); | ||
// check table's status if needed | ||
StdSharedPtr<ArrowTable> table = result.value(); | ||
StdPair<Long, Long> range = reader.getRange().value(); | ||
|
||
See `test for | ||
readers <https://github.com/alibaba/GraphAr/tree/main/java/src/test/java/com/alibaba/graphar/readers>`__ | ||
for the complete example. |
The Maven project has not been set up yet, so maybe we should not include this part. We can add it back once Maven is ready.
Users need to add this part to pom.xml so that their project can import the gar-java library.
You mean after building the project, the user can use the project directly?