Problem with Multithreading in cayley #117

Closed
igormarfin opened this issue Aug 13, 2014 · 4 comments · Fixed by #120

Comments

@igormarfin

Hi All,

First of all, thanks a lot for this very nice software. After testing it for a while, I've come across a problem with the "cayley http" mode which I can't solve myself. I have developed a Fraud Prevention System which uses cayley together with a Levenshtein automaton (running as an external C++ HTTP server) to solve the entity resolution problem in the input data (bookings). I have also created a small Gremlin API implementing a few graph-theory methods, for example the so-called "walkman" algorithms, to find all/some bookings connected via their properties.

Everything was working like a charm until I switched the Virtual Machine (http://downloads.sourceforge.net/virtualboximage/debian_6.0.6.vdi.7z) to the "many-cpu" mode.
Then the cayley server was able to run several threads in parallel.

This has brought the following problem:

Running this code consecutively in two open browsers with the Web UI,

function getsizeID()
{
    var counter = 0;
    g.V().Has("is","ID").ForEach(
        function (d) {
            counter++;
        }
    );
    return {"id":"1", "source":counter, "target":counter};
}
var num_ids = getsizeID(); g.Emit(num_ids);

gives no problems. But if I start it simultaneously in two open browsers with the Web UI, cayley crashes and the response is

<title>502 Bad Gateway</title>

502 Bad Gateway


nginx/1.6.0

The log file can be viewed here
https://drive.google.com/file/d/0B5OwgVT-YmdbdGhLUWdWMkR4Mzg/edit?usp=sharing

This query

g.V().Has("is","ID").All();

also causes a crash if I start it simultaneously.
However, if I send the two "g.V().Has("is","ID").All();" queries with a delay of >=10 s between them, there is no crash.

Also, if I switch the VM to single-cpu mode, I can send the "getsizeID()" requests and get responses back in parallel without cayley crashing.

The data, without the connections between bookings, looks like this:

69759_ID has "William Kennedy_69759ID_Booker" .
69759_ID has "[email protected]_69759ID_BookerEmail" .
69759_ID has "Jesse Harrison_69759ID_Traveler" .
69759_ID has "[email protected]_69759ID_TravelerEmail" .
69759_ID is ID .
"William Kennedy_69759ID_Booker" is Booker .
"Jesse Harrison_69759ID_Traveler" is Traveler .
"[email protected]_69759ID_BookerEmail" is BookerEmail .
"[email protected]_69759ID_TravelerEmail" is TravelerEmail .

The size of data is about 150K bookings.
The issue can be tested at
http://43bd3b8a.ngrok.com/

During all these tests, the limit of memory of VM has not been reached.
I have tested 0.3.0 and 0.3.1, the situation is the same.

Any ideas why this happens?

Cheers,
Igor.

@barakmich
Member

First of all, thank you for the detailed bug report! This is immensely helpful; in fact, I can easily repro it on my machine with the sample dataset. Same thing, loaded into LevelDB, with a simple query of
g.V().Out().All() in two browser windows.

Given that the panic is in convertStringToByteHash, it looks like the hasher isn't thread-safe. Which is a pity, but I'm looking at it now.
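
For readers following along, here is a minimal sketch of the simplest way to avoid sharing a hasher between requests: create a new one on every call. This is not cayley's actual code. The name hashString is a hypothetical stand-in for a helper like convertStringToByteHash, and the use of SHA-1 is an assumption made for illustration.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"sync"
)

// hashString is a hypothetical stand-in for a helper such as
// convertStringToByteHash. It allocates a fresh hasher on every call,
// so no internal hasher state is shared between goroutines.
// SHA-1 is an assumption made for this sketch.
func hashString(s string) string {
	h := sha1.New()
	h.Write([]byte(s))
	return fmt.Sprintf("%x", h.Sum(nil))
}

// allAgree hashes the same input from n goroutines at once and reports
// whether every goroutine produced the same digest as a serial call.
func allAgree(s string, n int) bool {
	want := hashString(s)
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = hashString(s) // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	for _, got := range results {
		if got != want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allAgree("69759_ID", 8))
}
```

The trade-off is one allocation per call, which may or may not matter on a hot path; reusing hashers safely needs some form of synchronization or pooling.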

@kortschak
Contributor

Yeah, that's not thread-safe. The hash.Hash has an internal buffer.

Try using sync.Pool[1] instead of keeping the Hasher in the value. This will limit our compatibility though - it was introduced in go1.3.

Ihar, if you are running from source, can you rebuild with -race, run this again, and post the race detector report? It should be pretty obvious.

[1]http://golang.org/pkg/sync/#Pool
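
A rough sketch of the sync.Pool idea, under stated assumptions: hashPool and pooledHash are hypothetical names, SHA-1 is assumed as the hash function, and none of this is taken from the actual fix. Each call borrows a hasher from the pool, resets it, and returns it afterwards, so two goroutines never write into the same internal buffer at the same time.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"hash"
	"sync"
)

// hashPool hands out reusable hashers (hypothetical name).
// New is called only when the pool has no free hasher.
var hashPool = sync.Pool{
	New: func() interface{} { return sha1.New() },
}

// pooledHash (hypothetical helper) borrows a hasher, uses it, and
// returns it to the pool. While borrowed, the hasher is owned by
// exactly one goroutine.
func pooledHash(s string) string {
	h := hashPool.Get().(hash.Hash)
	defer hashPool.Put(h)
	h.Reset() // clear any state left by a previous user
	h.Write([]byte(s))
	// Sum(nil) appends the digest to a new slice, so the result
	// does not alias the hasher's internal buffer.
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println(pooledHash("69759_ID"))
		}()
	}
	wg.Wait()
}
```

Compared with allocating a new hasher per call, the pool amortizes allocations across requests, at the cost of requiring go1.3 or later.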

@igormarfin
Author

Hi All,

I can at least recompile 0.3.0 with -race, since I am running from source. I will let you know the details asap.

Cheers,
Igor.

@igormarfin
Author

If it is still relevant, I've recompiled 0.3.0 with -race. The race detector report regarding the problem can be found here:
https://drive.google.com/file/d/0B5OwgVT-YmdbQkd1RFlSM0h4ZUk/edit?usp=sharing

Anyway, thanks for the commits in #118; I will try to test them.
