Error parsing WhatsApp DB (Comparison method violates its general contract) #2337

Closed
wladimirleite opened this issue Oct 15, 2024 · 4 comments · Fixed by #2338 or #2354

@wladimirleite
Member

Another user reported the following error (it happens with both 4.1.x and master).
After analysing the database, I found that the issue is caused by the way messages are sorted.
Sorting may fail when a very specific combination of records (with zeros or null values in the columns used for comparison) is present, which is the case in the triggering database.
I will submit a fix shortly.

2024-10-15 09:32:48	[ERROR]	[parsers.whatsapp.WhatsAppParser]			Error parsing WhatsApp: Item: XXX/msgstore.db type: sqlite size: 61648896
org.apache.tika.exception.TikaException: WAExtractorException Exception
	at iped.parsers.whatsapp.WhatsAppParser.parseWhatsappMessages(WhatsAppParser.java:399) ~[iped-parsers-impl-4.2-snapshot.jar:?]
	at iped.parsers.whatsapp.WhatsAppParser.parse(WhatsAppParser.java:259) [iped-parsers-impl-4.2-snapshot.jar:?]
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) [tika-core-2.4.0-p1.jar:2.4.0]
	at iped.parsers.standard.StandardParser.parse(StandardParser.java:245) [iped-parsers-impl-4.2-snapshot.jar:?]
	at iped.engine.io.ParsingReader$BackgroundParsing.run(ParsingReader.java:247) [iped-engine-4.2-snapshot.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
	at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
	at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.ComparableTimSort.mergeHi(Unknown Source) ~[?:?]
	at java.util.ComparableTimSort.mergeAt(Unknown Source) ~[?:?]
	at java.util.ComparableTimSort.mergeForceCollapse(Unknown Source) ~[?:?]
	at java.util.ComparableTimSort.sort(Unknown Source) ~[?:?]
	at java.util.Arrays.sort(Unknown Source) ~[?:?]
	at java.util.Arrays.sort(Unknown Source) ~[?:?]
	at java.util.ArrayList.sort(Unknown Source) ~[?:?]
	at java.util.Collections.sort(Unknown Source) ~[?:?]
	at iped.parsers.whatsapp.ExtractorAndroidNew.extractChatList(ExtractorAndroidNew.java:150) ~[iped-parsers-impl-4.2-snapshot.jar:?]
	at iped.parsers.whatsapp.Extractor.getChatList(Extractor.java:34) ~[iped-parsers-impl-4.2-snapshot.jar:?]
	at iped.parsers.whatsapp.WhatsAppParser.parseWhatsappMessages(WhatsAppParser.java:391) ~[iped-parsers-impl-4.2-snapshot.jar:?]
	... 9 more

@wladimirleite wladimirleite self-assigned this Oct 15, 2024
@wladimirleite
Member Author

For the curious: the problem arises when we sort items whose comparison is not transitive, so that some of them have no "linear" order (A > B, B > C and C > A).
The comparison used by the WhatsAppParser is something like:

public int compareTo(Message o) {
    if (a != 0 && o.a != 0) {
        int cmp = Integer.compare(a, o.a);
        if (cmp != 0) return cmp;
    }
    if (b != 0 && o.b != 0) {
        int cmp = Integer.compare(b, o.b);
        if (cmp != 0) return cmp;
    }
    return Integer.compare(c, o.c);
}

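If we have items like X = {a=0, b=2, c=1}, Y = {a=2, b=1, c=0}, Z = {a=1, b=0, c=2}, then:
X > Y (a is skipped because X.a is 0, so b decides), Y > Z (a decides) and Z > X (a and b are both skipped, so c decides). This cycle can cause an exception when we try to sort them (in fact, there must be at least 32 items, because shorter lists are sorted without the merge function, which is where the violation is detected).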
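The cycle can be checked directly with a small program. This is only a sketch: `Msg` and `CycleDemo` are hypothetical names, and `Msg` models just the three fields `a`, `b` and `c` from the simplified comparison quoted above, not the real Message class.

```java
// Hypothetical stand-in for the parser's Message class; only the three
// compared fields from the simplified example are modelled.
final class Msg implements Comparable<Msg> {
    final int a, b, c;

    Msg(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    // Same shape as the comparison quoted above: a field is skipped
    // whenever it is zero on either side.
    @Override
    public int compareTo(Msg o) {
        if (a != 0 && o.a != 0) {
            int cmp = Integer.compare(a, o.a);
            if (cmp != 0) return cmp;
        }
        if (b != 0 && o.b != 0) {
            int cmp = Integer.compare(b, o.b);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(c, o.c);
    }
}

final class CycleDemo {
    static final Msg X = new Msg(0, 2, 1);
    static final Msg Y = new Msg(2, 1, 0);
    static final Msg Z = new Msg(1, 0, 2);

    // True when x > y, y > z and z > x all hold, i.e. the three items
    // form a cycle and the ordering is not transitive.
    static boolean isCycle(Msg x, Msg y, Msg z) {
        return x.compareTo(y) > 0 && y.compareTo(z) > 0 && z.compareTo(x) > 0;
    }

    public static void main(String[] args) {
        System.out.println("X > Y: " + (X.compareTo(Y) > 0)); // prints "X > Y: true"
        System.out.println("Y > Z: " + (Y.compareTo(Z) > 0)); // prints "Y > Z: true"
        System.out.println("Z > X: " + (Z.compareTo(X) > 0)); // prints "Z > X: true"
        System.out.println("cycle: " + isCycle(X, Y, Z));     // prints "cycle: true"
    }
}
```

Each pair is decided by a different field, which is why every pairwise comparison looks reasonable on its own while the three together have no consistent order.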

@lfcnassif
Member

Closed by #2352.

@wladimirleite
Member Author

Sorry @lfcnassif, but I will reopen this once again, as there is still an issue when backups are merged.
The merging process sorts messages, but it cannot use Message.sort(), because later there are binary searches that rely on the "regular" sorting (Collections.sort(), which uses only the comparison implemented by the Message class).
The solution I found is to keep the merging code as it is and, after the merging process, call Message.sort().
I will submit a PR with this additional fix.
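The general pattern (merge using the natural ordering that the binary searches depend on, then apply the final ordering once at the end) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `Rec`, `MergeThenSort`, `merge` and `FINAL_ORDER` are hypothetical names, the natural order by `id` and the final order by `timestamp` are invented, and `FINAL_ORDER` merely stands in for whatever Message.sort() actually does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical simplified record; its natural order (by id only) plays the
// role of the "regular" ordering that the binary searches rely on.
final class Rec implements Comparable<Rec> {
    final long id;
    final long timestamp;

    Rec(long id, long timestamp) {
        this.id = id;
        this.timestamp = timestamp;
    }

    @Override
    public int compareTo(Rec o) {
        return Long.compare(id, o.id);
    }
}

final class MergeThenSort {
    // Hypothetical stand-in for Message.sort(): a separate total ordering
    // that is applied only once, after merging has finished.
    static final Comparator<Rec> FINAL_ORDER =
            Comparator.comparingLong((Rec r) -> r.timestamp).thenComparingLong(r -> r.id);

    static List<Rec> merge(List<Rec> main, List<Rec> backup) {
        List<Rec> merged = new ArrayList<>(main);
        // Natural order is required for Collections.binarySearch below.
        Collections.sort(merged);
        for (Rec r : backup) {
            int pos = Collections.binarySearch(merged, r);
            if (pos < 0) {
                // Not present yet: insert at the insertion point so the
                // list stays sorted for the next search.
                merged.add(-pos - 1, r);
            }
        }
        // All binary searches are done, so the list can now be reordered.
        merged.sort(FINAL_ORDER);
        return merged;
    }

    public static void main(String[] args) {
        List<Rec> main = List.of(new Rec(1, 30), new Rec(3, 10));
        List<Rec> backup = List.of(new Rec(2, 20), new Rec(3, 10));
        for (Rec r : merge(main, backup)) {
            System.out.println(r.id + " @ " + r.timestamp);
        }
    }
}
```

The point of the design is ordering of operations: reordering the list before the binary searches would break them, so the final sort has to be the last step.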

@lfcnassif
Member

lfcnassif commented Oct 31, 2024

Don't worry, and thank you @wladimirleite for continuously checking the changes. I thought this could be very tricky when merging DBs with different sorting criteria, but I didn't test the changes. I'm sorry about that.

lfcnassif added a commit that referenced this issue Oct 31, 2024
Sort messages properly, after merging process (Additional fix to #2337)