General Design question #3138

Closed
lonnietc opened this issue Oct 18, 2021 · 12 comments
Labels
type/question Type: question about the product

Comments

@lonnietc

Hi All,

Hope that everyone is doing well today.

I am working on a type of P2P eCommerce store system such that various merchants can run a small single-app node (application) which will effectively join in a cluster of nodes that will store data across nodes and balance data as needed.

The user application could be an independent application that makes calls to a cluster node, or alternatively if the node has a web UI already built in then perhaps that could be modified to just work within that local node. I'm still trying to determine the best approach here, but maybe an independent node and independent application UI (i.e. external web service) would be a good approach.

Ultimately I need stability and horizontal scaling as more nodes come online but at the same time some nodes may drop offline and that data should be still available and accessible from within the cluster from an API call.

Currently, I am investigating Graph database solutions and am trying to determine which offers the simplest implementation while still being able to potentially handle millions of graph nodes and edges while also easily allowing me to add more API functionality (Golang payment processing libraries, User & Merchant Authentication, Searching libraries, etc.) by extending the existing graph database API so that the external application just makes simple API calls.

The systems that so far seem to hold the most promise are:

  1. GraphikDB ---- https://github.com/graphikDB/graphik
  2. Dgraph ---- https://github.com/dgraph-io/dgraph
  3. Nebula ---- https://github.com/vesoft-inc/nebula

Each brings similar graph database functionality to the table but also each is a bit different and unique in their own way so I am trying to get some information and a feel for which might be best to start with for a P2P eCommerce based solution that can grow over time to handle massive physical nodes as well as massive graphs and user/merchant data.

Any feedback or ideas would be greatly appreciated.

Best Regards and have a great day

@wey-gu
Contributor

wey-gu commented Oct 19, 2021

Dear @lonnietc!

Welcome to the nebula community, we are happy to have you here~~~~

Nebula is excellent at hyper-scale data and is designed to be scalable. You could try it with docker-compose, a binary installation, or the k8s operator :).

Nebula core doesn't support HTTP access for now, so it can be accessed either from your backend application (Go/Java/Python via a client library/SDK) or via a proxy: nebula-http-gateway (https://github.com/vesoft-inc/nebula-http-gateway).

As for "also easily allowing me to add more API functionality...", if I understand it correctly, you could create the backend service yourself with the Nebula Go client (as you mentioned Golang). Each of its functions would make nGQL calls to Nebula Graph; nGQL is the graph API that Nebula Graph exposes.
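A minimal sketch of what one such backend function could look like in Go. Everything named here is an assumption for illustration: the `GraphSession` interface is a hypothetical stand-in for a real client session (it is not the nebula-go API), and the `sells` edge type and the `merchant_42` vertex ID belong to an invented schema. The point is only the shape: validate the input, build one nGQL statement, run it, and return plain values to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// GraphSession is a hypothetical abstraction over a graph client session.
// In a real service it would wrap a session from the Nebula Go client.
type GraphSession interface {
	Execute(stmt string) ([][]string, error)
}

// safeID only admits characters that cannot terminate a quoted nGQL string.
var safeID = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// ErrBadID is returned when a merchant ID fails validation.
var ErrBadID = errors.New("invalid merchant id")

// ProductsQuery builds the nGQL statement for an assumed "sells" edge type.
func ProductsQuery(merchantID string) (string, error) {
	if !safeID.MatchString(merchantID) {
		return "", ErrBadID
	}
	return fmt.Sprintf(`GO FROM "%s" OVER sells YIELD sells._dst AS product;`, merchantID), nil
}

// ProductsOf runs the query and returns the first column of every row.
func ProductsOf(s GraphSession, merchantID string) ([]string, error) {
	stmt, err := ProductsQuery(merchantID)
	if err != nil {
		return nil, err
	}
	rows, err := s.Execute(stmt)
	if err != nil {
		return nil, fmt.Errorf("execute nGQL: %w", err)
	}
	products := make([]string, 0, len(rows))
	for _, row := range rows {
		if len(row) > 0 {
			products = append(products, row[0])
		}
	}
	return products, nil
}

// fakeSession is an in-memory stand-in so the sketch runs without a cluster.
type fakeSession struct{ last string }

func (f *fakeSession) Execute(stmt string) ([][]string, error) {
	f.last = stmt
	return [][]string{{"product_1"}, {"product_2"}}, nil
}

func main() {
	s := &fakeSession{}
	products, err := ProductsOf(s, "merchant_42")
	if err != nil {
		panic(err)
	}
	fmt.Println(s.last)
	fmt.Println(products)
}
```

Keeping the session behind a small interface means the external application only sees functions like `ProductsOf`, and the same code can be exercised against a fake before any Nebula cluster exists.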

Also, Zhihu Inc. has open-sourced and maintains a Golang ORM for Nebula Graph here: https://github.com/zhihu/norm

Thanks!

@lonnietc
Author

Hello @wey-gu,

Thanks so very much for getting back to me on my inquiry.

Yesterday, I compiled the Nebula sources on an Ubuntu 20.04 (AMD) system that I have here for testing. It took what seemed to be about 5 hours, but it did compile successfully. After the compile, there were about 8 binaries created, each of which was about 400+ MB in size.

For this particular P2P project, those would be just too large for commodity systems and would definitely need dedicated servers with some good resources to run all of that on the system.

Instead, I also looked at your Docker setup and fired it up.

REPOSITORY              TAG      IMAGE ID      CREATED      SIZE
vesoft/nebula-metad     nightly  311dc01778e2  4 hours ago  282MB
vesoft/nebula-storaged  nightly  bcde7bcc1ee9  4 hours ago  283MB
vesoft/nebula-graphd    nightly  9a7f34bb5c1d  4 hours ago  277MB

The sizes are definitely better, but it seems that I would need to run 3 containers on each system, if I understand correctly, which is not too desirable for this project. Generally, a single Docker container for each node would be most desirable.

Actually, for the next Hybrid-P2P project that I will be working on once this one is established, this may be an option. That project will depend upon massive scaling for the graph database, so I truly see great promise in Nebula in that regard.

I am still looking into the best design approach for the current project. I may be able to work out a backend service in Golang or C/C++ (the major languages that I use) that makes nGQL calls to dedicated Nebula Graph servers in the way that you describe. A hybrid approach is also possible, but I will have to see.

Thanks again,
Lonnie

@jievince
Contributor

After the compile, there were about 8 binaries created each of which was about 400+ MB in size.

Did you compile it in Debug mode?
In Release mode, all the binaries add up to about 70 MB

@lonnietc
Author

It was probably debug mode since I just:

  1. created a build directory
  2. then did a "cmake .." from within there and did a "make"

Maybe I should try again to do a release build.

@lonnietc
Author

Hello @jievince

I just completed the release build of Nebula and have these binaries created:

-rwxrwxr-x 1 lonnie lonnie 57442088 Oct 19 11:43 db_dump
-rwxrwxr-x 1 lonnie lonnie 57561952 Oct 19 11:44 db_upgrader
-rwxrwxr-x 1 lonnie lonnie 57453240 Oct 19 11:43 meta_dump
-rwxrwxr-x 1 lonnie lonnie 52784504 Oct 19 11:20 nebula-graphd
-rwxrwxr-x 1 lonnie lonnie 56633304 Oct 19 11:43 nebula-metad
-rwxrwxr-x 1 lonnie lonnie 57105824 Oct 19 11:15 nebula-storaged
-rwxrwxr-x 1 lonnie lonnie 58311600 Oct 19 11:42 storage_integrity
-rwxrwxr-x 1 lonnie lonnie 58427488 Oct 19 11:43 storage_perf

Although they are much smaller than the Debug build versions, each one is about 50+ MB in size.
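As a rough check on the footprint, the sizes from the listing above can be totalled. The sketch below is only arithmetic over the byte counts shown; treating nebula-graphd, nebula-metad, and nebula-storaged as the server daemons (and the rest as tools) is an assumption based on the three Docker images listed earlier in this thread.

```go
package main

import "fmt"

// binary records one file from the release-build listing.
type binary struct {
	name   string
	bytes  int64
	daemon bool // assumption: true for the three services that also ship as Docker images
}

// releaseBuild holds the sizes exactly as reported by ls -l in the thread.
var releaseBuild = []binary{
	{"db_dump", 57442088, false},
	{"db_upgrader", 57561952, false},
	{"meta_dump", 57453240, false},
	{"nebula-graphd", 52784504, true},
	{"nebula-metad", 56633304, true},
	{"nebula-storaged", 57105824, true},
	{"storage_integrity", 58311600, false},
	{"storage_perf", 58427488, false},
}

// totals returns the combined size of all binaries and of the daemons only.
func totals(bins []binary) (all, daemons int64) {
	for _, b := range bins {
		all += b.bytes
		if b.daemon {
			daemons += b.bytes
		}
	}
	return all, daemons
}

func main() {
	all, daemons := totals(releaseBuild)
	fmt.Printf("all binaries: %d bytes (%.1f MB)\n", all, float64(all)/1e6)
	fmt.Printf("daemons only: %d bytes (%.1f MB)\n", daemons, float64(daemons)/1e6)
}
```

With the listed sizes this prints 455720000 bytes (455.7 MB) for all eight binaries and 166523632 bytes (166.5 MB) for the three daemons, so a node that runs only the services would carry roughly a third of the full build output.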

I am also still trying to determine exactly which ones are needed for:

  1. Control Node
  2. Data Storage node.

The idea would be that I have a main control node server and then as other P2P nodes join in, they become storage and API nodes that can be called.

Thanks again

@wey-gu
Contributor

wey-gu commented Oct 20, 2021

Dear @lonnietc ,

I am also still trying to determine exactly which ones are needed for:

  • Control Node
  • Data Storage node.

Some refs:

Also for deployment, you can check this out:

Thanks!!

@lonnietc
Author

Thanks for the references and I am really starting to dig into Nebula since I truly like some of the features and especially the scaling information that I have read about so far.

One thing that may be a problem is that the storage nodes would be running on commodity hardware and connecting into a P2P mesh. That would probably not be a problem in itself, but I just realized that for a P2P design in which the storage nodes contribute to the total graph space, I would also need a Windows version of Nebula, which it does not seem to support.

Maybe it could be cross-compiled on Linux to create some Windows executables, but that might take some time to develop and work out the details.

Also, I would need to consider how much horsepower the users' commodity hardware might have and whether it could actually support running a Nebula node or a scaled-down Nebula node.

These are some additional thoughts that I have just come across and would like to get any input or suggestions that you may be willing to offer.

Thanks again as this discussion has already been extremely helpful

@lonnietc
Author

lonnietc commented Nov 8, 2021

Hello Wey-gu,

Hope that all is well.

The project that I am working on needs to have:

  1. Massively scalable (thousands of nodes) capabilities.
  2. Role-Based ACL for database users
  3. Support for different platforms: Linux, Windows, Mac, etc.

I am wondering if Nebula Graph will be able to have a Windows version anytime soon?

I know that I could run it in a Docker container, but I need to be able to run it natively on Windows, Linux, and Mac if at all possible.

Any thoughts on this?
Best Regards

@wey-gu
Contributor

wey-gu commented Nov 9, 2021

Dear @lonnietc :)

  1. Nebula is excellent at scale. By thousands of nodes, do you mean thousands of VMs as storageD/graphD workload machines?

  2. RBAC is supported: https://docs.nebula-graph.io/2.6.1/7.data-security/1.authentication/3.role-list/

  3. For the server side, Nebula is currently released for Linux (ARM as a Docker image; x86 as both binaries and Docker images). For the client side, nebula-console has Windows and macOS versions.

I am using an M1 MacBook and can run the nightly build via Docker. Do you really need native builds for macOS and Windows? If so, Nebula cannot do that now, but you can raise an issue for this requirement. For most cases, running via Docker can fulfill the requirements, as I understand it :-P.

Thanks!

@lonnietc
Author

lonnietc commented Nov 9, 2021

Hello @wey-gu ,

Thanks for getting back to me so quickly and I REALLY do like what I have read about Nebula since I think that it holds a huge potential for my use case.

Maybe you can shoot me an email and I can explain more of what I am trying to do away from this list since the core idea is still in development.

Please contact me at [email protected] so that we can chat just a bit more and see what might be possible, ok.

Thanks and have a great night

@wey-gu
Contributor

wey-gu commented Nov 9, 2021

Hi @lonnietc ,
I will ping you via slack as I see you there ;-)
Thank you!

@lonnietc
Author

lonnietc commented Nov 9, 2021

That would be perfect. I will be able to chat more tomorrow as it is getting late here, but really look forward to speaking with you on slack, my friend. Thanks again.


4 participants