Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync against davmail results in several 'Server sent item twice' and an exception #144

Closed
benklop opened this issue Dec 2, 2014 · 44 comments

Comments

@benklop
Copy link

benklop commented Dec 2, 2014

I've been trying to figure out why kcalendar / akonadi is missing some of my appointments, but thunderbird's calendar component, lightning, is not missing them. As part of this troubleshooting, I decided to use vdirsyncer to more easily examine the data the clients are being sent.

Upon trying to sync, I got this output:

ben@ben-Latitude-E6430:~/Projects/vdirsyncer > vdirsyncer sync        
Syncing my_contacts
Syncing my_calendar
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPEAAA%3D.EML
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPdAAA%3D.EML
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_COAAAA%3D.EML
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_COrAAA%3D.EML
Unhandled exception occured while syncing my_calendar.
Traceback (most recent call last):
  File "/home/ben/.local/lib/python2.7/site-packages/vdirsyncer/cli.py", line 472, in sync_collection
    force_delete=status_name in force_delete
  File "/home/ben/.local/lib/python2.7/site-packages/vdirsyncer/sync.py", line 129, in sync
    b_idents = _prepare_idents(storage_b, storage_a, b_href_to_status)
  File "/home/ben/.local/lib/python2.7/site-packages/vdirsyncer/sync.py", line 76, in _prepare_idents
    _prefetch(storage, hrefs, download)
  File "/home/ben/.local/lib/python2.7/site-packages/vdirsyncer/sync.py", line 86, in _prefetch
    for href, item, etag in storage.get_multi(hrefs):
  File "/home/ben/.local/lib/python2.7/site-packages/vdirsyncer/storage/dav.py", line 345, in get_multi
    raise exceptions.NotFoundError(href)
NotFoundError: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPDAAA%3D.EML

In order to see what was going on a little better, I decided to change the exception raise to a warning:

dav.py line 344:

        for href in hrefs_left:
        dav_logger.warning('Server did not send item: {}'
                                       .format(href))
        return rv

This gave me the output:

warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPEAAA%3D.EML
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPdAAA%3D.EML
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_COAAAA%3D.EML
warning: Server sent item twice: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_COrAAA%3D.EML
warning: Server did not find item: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPDAAA%3D.EML
warning: Server did not find item: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_COaAAA%3D.EML
warning: Server did not find item: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPeAAA%3D.EML
warning: Server did not find item: /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CORAAA%3D.EML

This of course had an exception after that, ( SyncError: Etag changed during sync. ) so that isn't a solution, but it does show something interesting.

The server appears to be mixing the case of the 3rd character in the last portion of the name:

sent twice: CPEAAA%3D.EML
expected:  CPeAAA%3D.EML

sent twice: CPdAAA%3D.EML
expected:  CPDAAA%3D.EML

sent twice: COAAAA%3D.EML
expected:  COaAAA%3D.EML

sent twice: COrAAA%3D.EML
expected:  CORAAA%3D.EML

So it looks to me like in these last 6-7 characters, case doesn't seem to be consistent, and entries seem to be present for both. I have checked a couple of these, and the actual file retrieved seems to be identical for both entries.

I'm not sure at this point how to fix this in vdirsyncer - I'm not really a python developer.

@untitaker
Copy link
Member

This issue has come up before with a different server. It basically boils down to whether hrefs are case-sensitive or not, and the way I interpret the CalDAV RFC, they aren't supposed to be.

I'd be interested in how Thunderbird handles this, and how @dmfs deals with this situation.

@untitaker
Copy link
Member

Also this seems to be a common phenomenon with Windows-based (or tangentially Microsoft-related) services?

@benklop
Copy link
Author

benklop commented Dec 2, 2014

I'd say yes and no to the windows related comment. The behavior in KCalendar / akonadi which I suspect is caused by this has been present for me no matter whether my mailbox was hosted on zimbra or exchange - it has been migrated a few times, and this has always occurred, though admittedly I can't actually verify that we're talking about the exact same cause.

@benklop
Copy link
Author

benklop commented Dec 2, 2014

Also, I'd agree they aren't supposed to be case sensitive, especially given that the actual retrieved files seem to be identical. In that case, couldn't duplicate case differing entries be handled a bit more gracefully?

@untitaker
Copy link
Member

Eugh, sorry for mentioning you again dmfs, I re-read the old thread at #113 and saw you didn't see any issues either.

@benklop Do you have any possibility of capturing the requests made to your server? I think davmail has an option for this (not sure if you can reproduce the issue with that), otherwise mitmproxy should be able to do this.

@untitaker
Copy link
Member

In that case, couldn't duplicate case differing entries be handled a bit more gracefully?

I already thought about introducing a new parameter for case-insensitive behavior. I am very vary about making this the default because different cases might actually mean different items (better fail hard than fai silently)

@benklop
Copy link
Author

benklop commented Dec 2, 2014

@untitaker if you can give me some suggestions on how to make davmail capture the requests, I can certainly provide that, possibly scrubbed a bit, depending on the included data. I am currently reproducing this through davmail, so that would be a convenient capture method.

@benklop
Copy link
Author

benklop commented Dec 2, 2014

@untitaker I would certainly expect any such parameter would definitely not be the default; I couldn't imagine that being a good idea.

@untitaker
Copy link
Member

Minimally configured davmail drops a "davmail.log" into my homefolder (or CWD?)

You can change the path in the settings.

@dmfs
Copy link

dmfs commented Dec 2, 2014

@untitaker no problem.
FYI: We handle hrefs to be case-sensitive. RFC 4918 refers to RFC 3986 which says in section 6.2.2.1

   When a URI uses components of the generic syntax, the component
   syntax equivalence rules always apply; namely, that the scheme and
   host are case-insensitive and therefore should be normalized to
   lowercase.

It doesn't mention the path. Of course it doesn't explicitly say the path would be case-insensitive, but it does say that about scheme and host (not just in this section, but in other places as well). That leads me to the conclusion that the path is indeed case-sensitive and I think that's how most web servers and HTTP clients handle it.

@dmfs
Copy link

dmfs commented Dec 2, 2014

That's also how Java's URI class compares URIs. Only scheme and host are compared in a case-insensitive manner.

@untitaker
Copy link
Member

Yeah the HTTP and URI specs are definetly case-sensitive, I thought this behavior could be Web/Cal/CardDAV-specific.

I think I will have to implement a case-sensitivity parameter for the DAV storages.

@benklop
Copy link
Author

benklop commented Dec 2, 2014

@untitaker I've enabled debug level logging in davmail, and ran a vdirsyncer sync, the log contains a pretty fair amount of sensitive data, however. No passwords, but a lot of email addresses and the contents of all the appointments. I can probably use sed to make this a lot better, but i don't want to get rid of anything you'd need.

@untitaker
Copy link
Member

The content of the iCalendar files don't matter I think. Anything that is XML is important, but may be censored.

@untitaker
Copy link
Member

Oh, I meant running Thunderbird against it to see how it deals with this situation.

@benklop
Copy link
Author

benklop commented Dec 2, 2014

@untitaker let me know if you'd like the logging data and I can provide it to you offline. it doesn't seem particularly feasable to sanitize it, unfortunately.

@untitaker
Copy link
Member

@benklop I forgot about this: Could you try using the vdirsyncer version from the outlook_fixes branch? For this, set the case_sensitive parameter to false:

[storage something]
type = caldav
case_sensitive = false

and install vdirsyncer like this:

pip install --user --upgrade git+https://github.com/untitaker/vdirsyncer.git#outlook_fixes

@benklop
Copy link
Author

benklop commented Dec 3, 2014

@untitaker well, i got a whole lot of 'skipping identical href', then eventually:

Unhandled exception occured while syncing my_calendar.
Traceback (most recent call last):
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/cli.py", line 480, in sync_collection
    force_delete=status_name in force_delete
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/sync.py", line 129, in sync
    b_idents = _prepare_idents(storage_b, storage_a, b_href_to_status)
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/sync.py", line 76, in _prepare_idents
    _prefetch(storage, hrefs, download)
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/sync.py", line 86, in _prefetch
    for href, item, etag in storage.get_multi(hrefs):
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/storage/dav.py", line 327, in get_multi
    .find('{DAV:}prop') \
AttributeError: 'NoneType' object has no attribute 'find'

@untitaker
Copy link
Member

Could you re-run this with --verbosity=DEBUG and also give me the last ~10 lines before the traceback?

@untitaker
Copy link
Member

Actually it would be more helpful if you send it to the email address on my profile. Thank you!

@untitaker
Copy link
Member

@benklop Any update on this?

@benklop
Copy link
Author

benklop commented Dec 5, 2014

@untitaker sorry about that, had a very busy few days. given the huge mass of XML in the end of the output, I think i'm going to send you a little more than 10 lines :)

Briefly, the last several pages of output seem to be a gigantic amount of XML containing:

<D:response>
    <D:href>
        /users/[email protected]/calendar/aamkadezmzy4mjq0lwnkogitngy5ny04odfllta2yzg5mmvlngu4ywbgaaaaaace1jvdir03q7tv-mnsj6jrbwctv0mj6-aqsjpr2wwclygjaaaaaaeoaactv0mj6-aqsjpr2wwclygjaab5nse6aaa=.eml
    </D:href>
    <D:propstat>
        <D:status>
            HTTP/1.1 404 Not Found
        </D:status>
    </D:propstat>
</D:response>

for lots of different .eml resources. I'm fairly certain all of them are getting a 404.

@benklop
Copy link
Author

benklop commented Dec 5, 2014

From earlier in the log, it looks like the case sensitive href is:

<D:response>
  <D:href>
    /users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAAAAWmAAA%3D.EML
  </D:href>
  <D:propstat>
    <D:prop>
      <D:getcontenttype>
    text/calendar; component=vevent
      </D:getcontenttype>
      <D:getetag>
    2014-12-05T23:11:47Z
      </D:getetag>
    </D:prop>
    <D:status>HTTP/1.1 200 OK</D:status>
  </D:propstat>
</D:response>

From this I can only assume that the server is in fact not case insensitive.

Looking back at the previous data about the files that did cause an issue, I grabbed the first problem URL from my initial report and started changing the case of various characters. Certain parts seem to be case sensitive, but not others. Everything before the second dash seems case sensitive, while everything after does not.
so:

/users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-AqSJPr2WwcLyGJAAAAAAEOAACtv0MJ6-AqSJPr2WwcLyGJAAAq_CPEAAA%3D.EML

seems equivalent to

/users/[email protected]/calendar/AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-aqsjpr2wwclygjaaaaaaeoaactv0mj6-aqsjpr2wwclygjaaaq_cpeaaa%3D.eml

but not for every entry, unfortunately.

AAMkADEzMzY4MjQ0LWNkOGItNGY5Ny04ODFlLTA2Yzg5MmVlNGU4YwBGAAAAAACE1jVdIr03Q7Tv-mnsJ6JrBwCtv0MJ6-aqsjpr2wwclygjaaaaaaeoaactv0mj6-aqsjpr2wwclygjaaaaaawmaaa%3D.EML

does not work, for instance.

The only URLs I've noticed that seem to be partially case insensitive are ones that contain an underscore near the end.

@untitaker
Copy link
Member

@benklop Can you verify that all case-variations of one href actually represent the same item (return same content on GET request)

@benklop
Copy link
Author

benklop commented Dec 6, 2014

diff claims the variations on the same hrefs that are retrievable lead to the same files.

@untitaker
Copy link
Member

Could you re-try:

pip install --user --upgrade git+https://github.com/untitaker/vdirsyncer.git#outlook_fixes

@untitaker
Copy link
Member

The updates I made to that branch don't actually try to generate the "correct" href, instead it relies on the server sending two responses for each href, which seems to be the case here.

@benklop
Copy link
Author

benklop commented Dec 6, 2014

unfortunately it seems to fail with the same problem as before. I won't be able to resume troubleshooting this until tomorrow or next week, unfortunately.

After the large block of XML with 404s, the traceback is:

Unhandled exception occured while syncing my_calendar.
Traceback (most recent call last):
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/cli.py", line 480, in sync_collection
    force_delete=status_name in force_delete
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/sync.py", line 129, in sync
    b_idents = _prepare_idents(storage_b, storage_a, b_href_to_status)
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/sync.py", line 76, in _prepare_idents
    _prefetch(storage, hrefs, download)
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/sync.py", line 86, in _prefetch
    for href, item, etag in storage.get_multi(hrefs):
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/storage/dav.py", line 327, in get_multi
    .find('{DAV:}prop') \
AttributeError: 'NoneType' object has no attribute 'find'
Exception in thread Thread-2 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 763, in run
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/cli.py", line 330, in worker
  File "/home/ben/Projects/src/vdirsyncer/vdirsyncer/cli.py", line 314, in process_job
  File "/usr/lib/python2.7/Queue.py", line 168, in get
  File "/usr/lib/python2.7/threading.py", line 332, in wait
<type 'exceptions.TypeError'>: 'NoneType' object is not callable

the hrefs in the last block of XML are all lower case still.

@untitaker
Copy link
Member

That's highly irritating, I just added code to catch exactly that exception. The exception from the threading module seems like an unrelated heisenbug. Are you sure this is happening always?

@untitaker
Copy link
Member

I just tried this with my own Office365 account and while vdirsyncer yields a lot of warnings due to duplicate hrefs, synchronization works just fine.

@untitaker
Copy link
Member

During testing I also fixed some unrelated bugs, and I am surprised you didn't run into those. The outlook_fixes should contain all of it.

@benklop
Copy link
Author

benklop commented Dec 6, 2014

@untitaker it's possible somehow the executable didn't actually update to the newer code you pushed... I don't seem to be able to restore it to its original state either. I'll work on that next week.

@untitaker
Copy link
Member

I edited history in the branch. Easiest way to get around it is to reclone the repo. I'll avoid doing that for now.

@untitaker
Copy link
Member

@benklop It seems I didn't correctly push my changes, please try again.

@untitaker
Copy link
Member

@benklop Any updates on this? I edited the history of the outlook_fixes branch (rebasing etc) so you'll need to do something like:

git fetch
git reset --hard origin/outlook_fixes

Or just re-clone the repo.

@jludwig Could you also try the outlook_fixes branch? Quite some things have changed.

@untitaker
Copy link
Member

@benklop @jludwig Some other user reported this and they were using version 4.3.3. I was able to reproduce this issue fully with 4.3.3, but not with 4.5.1. Please try upgrading.

@flart
Copy link

flart commented Jan 19, 2015

Just to acknowledge untitakers comment: I tried with version 4.5.1 and the master branch of vdirsyncer. It works without any problems.

@untitaker
Copy link
Member

Closing due to inactivity.

@Nicop06
Copy link

Nicop06 commented Dec 15, 2016

I know this is an old issue but I still have the problem with the latest DavMail (4.7.3-2438). Would you consider reopening it?

Basically, the first run will return the following calendar events:

AAAAAM7eAAA%3D.EML
AAAAAM7fAAA%3D.EML

While the second one will return the following (notice the upper case e and f):

AAAAAM7EAAA%3D.EML
AAAAAM7FAAA%3D.EML

I couldn't find anything in the DavMail logs (this is always in lower case). Currently, I fixed the problem by removing the exception risen if the key match in lower case.

@untitaker
Copy link
Member

Could you try with 4.5.1?

@untitaker
Copy link
Member

Also tbh this really should be filed against DavMail

@Nicop06
Copy link

Nicop06 commented Dec 15, 2016

In 4.5.0, I have this error:

error: The server returned something vdirsyncer doesn't understand. Error message: InvalidXMLResponse('Invalid XML encountered: mismatched tag: line 1, column 1269\nDouble-check the URLs in your config.',)

Between 4.5.1 and 4.6.1 I have a NullPointerException in DavMail. After that, I have the same behavior than the latest. I tested all version from 4.5.0 to 4.7.3 except for 4.7.0 and 4.7.1 (I used the deb packages).

You are right, I will try to fix this problem on DavMail side. Also, this might come from Outlook. I have other problems with it like repeated event not repeating the event title, leading to empty event in khal.

Anyway, thank you for this great tool.

@untitaker
Copy link
Member

For that error there's a repair tool by Microsoft linked in the vdirsyncer docs

@Nicop06
Copy link

Nicop06 commented Dec 15, 2016

Thanks for the tip, I'll try that tomorrow at work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants