Skip to content
This repository has been archived by the owner on Jan 16, 2024. It is now read-only.

SDK crashed when network unstable #180

Closed
jikaihu opened this issue Sep 21, 2017 · 3 comments
Closed

SDK crashed when network unstable #180

jikaihu opened this issue Sep 21, 2017 · 3 comments

Comments

@jikaihu
Copy link

jikaihu commented Sep 21, 2017

I found when network changed from connect to disconnect, sometimes SDK crashed.
Here are the coredump stacks :
[2017/09/21 11:25:36.591]#1 0xb5df74bc in alexaClientSDK::acl::HTTP2Stream::getCurlHandle (this=0x0)
[2017/09/21 11:25:36.591] at /home/jikaihu/project/MT5658/apollo/third_party/source/avs/sdk/alexa-client-sdk-master/ACL/src/Transport/HTTP2Stream.cpp:347
[2017/09/21 11:25:36.591]#2 0xb5dff034 in alexaClientSDK::acl::HTTP2Transport::networkLoop (
[2017/09/21 11:25:36.591] this=0xa3eaf968)
[2017/09/21 11:25:36.591] at /home/jikaihu/project/MT5658/apollo/third_party/source/avs/sdk/alexa-client-sdk-master/ACL/src/Transport/HTTP2Transport.cpp:469
[2017/09/21 11:25:36.622]#3 0xb5e08c9c in std::_Mem_fn<void (alexaClientSDK::acl::HTTP2Transport::)()>::operator()<, void>(alexaClientSDK::acl::HTTP2Transport) const (
[2017/09/21 11:25:36.638] this=0xb4987990, __object=0xa3eaf968)
[2017/09/21 11:25:36.638] at /mtkoss/gnuarm/neon_4.8.2_2.6.35_cortex-a9-ubuntu/x86_64/armv7a-mediatek-linux-gnueabi/include/c++/4.8.2/functional:601
[2017/09/21 11:25:36.638]#4 0xb5e08bc8 in std::_Bind_simple<std::_Mem_fn<void (alexaClientSDK::acl::HTTP2Transport::)()> (alexaClientSDK::acl::HTTP2Transport)>::_M_invoke<0u>(std::_Index_tuple<0u>) (this=0xb498798c)
[2017/09/21 11:25:36.669] at /mtkoss/gnuarm/neon_4.8.2_2.6.35_cortex-a9-ubuntu/x86_64/armv7a-mediatek-linux-gnueabi/include/c++/4.8.2/functional:1732
[2017/09/21 11:25:36.669]#5 0xb5e08a64 in std::_Bind_simple<std::_Mem_fn<void (alexaClientSDK::acl::HTTP2Transport::)()> (alexaClientSDK::acl::HTTP2Transport)>::operator()() (
[2017/09/21 11:25:36.669] this=0xb498798c)
[2017/09/21 11:25:36.669] at /mtkoss/gnuarm/neon_4.8.2_2.6.35_cortex-a9-ubuntu/x86_64/armv7a-mediatek-linux-gnueabi/include/c++/4.8.2/functional:1720
[2017/09/21 11:25:36.684]#6 0xb5e089f0 in std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (alexaClientSDK::acl::HTTP2Transport::)()> (alexaClientSDK::acl::HTTP2Transport)> >::_M_run() (this=0xb4987980)
[2017/09/21 11:25:36.700] at /mtkoss/gnuarm/neon_4.8.2_2.6.35_cortex-a9-ubuntu/x86_64/armv7a-mediatek-linux-gnueabi/include/c++/4.8.2/thread:115
[2017/09/21 11:25:36.700]#7 0xb54abbdc in ?? () from /lib/libstdc++.so.6
[2017/09/21 11:25:36.731]#8 0xb54abbdc in ?? () from /lib/libstdc++.so.6

Here are the SDK logs:
[2017/09/21 10:46:52.020]2017-09-21 02:47:36.621 [14d] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:46:52.020]2017-09-21 02:47:36.621 [14e] 0 AuthDelegate:calculatedTimeToRetry:delayMs=2422
[2017/09/21 10:46:54.423]2017-09-21 02:47:39.044 [14f] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:46:54.438]2017-09-21 02:47:39.045 [150] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:46:54.438]2017-09-21 02:47:39.045 [151] 0 AuthDelegate:calculatedTimeToRetry:delayMs=3807
[2017/09/21 10:46:58.229]2017-09-21 02:47:42.853 [152] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:46:58.245]2017-09-21 02:47:42.853 [153] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:46:58.245]2017-09-21 02:47:42.853 [154] 0 AuthDelegate:calculatedTimeToRetry:delayMs=5509
[2017/09/21 10:47:03.736]2017-09-21 02:47:48.363 [155] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:47:03.751]2017-09-21 02:47:48.363 [156] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:47:03.751]2017-09-21 02:47:48.363 [157] 0 AuthDelegate:calculatedTimeToRetry:delayMs=41014
[2017/09/21 10:47:44.748]2017-09-21 02:48:29.378 [158] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:47:44.764]2017-09-21 02:48:29.378 [159] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:47:44.764]2017-09-21 02:48:29.379 [15a] 0 AuthDelegate:calculatedTimeToRetry:delayMs=74652
[2017/09/21 10:47:49.101]2017-09-21 02:48:33.723 [15b] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=15
[2017/09/21 10:48:36.166]2017-09-21 02:49:20.774 [15c] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=16
[2017/09/21 10:48:59.410]2017-09-21 02:49:44.031 [15d] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:48:59.425]2017-09-21 02:49:44.031 [15e] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:48:59.425]2017-09-21 02:49:44.032 [15f] 0 AuthDelegate:calculatedTimeToRetry:delayMs=40855
[2017/09/21 10:49:23.902]2017-09-21 02:50:08.528 [160] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=17
[2017/09/21 10:49:40.266]2017-09-21 02:50:24.887 [161] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:49:40.266]2017-09-21 02:50:24.888 [162] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:49:40.266]2017-09-21 02:50:24.888 [163] 0 AuthDelegate:calculatedTimeToRetry:delayMs=45605
[2017/09/21 10:50:25.865]2017-09-21 02:51:10.494 [164] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:50:25.881]2017-09-21 02:51:10.494 [165] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:50:25.881]2017-09-21 02:51:10.494 [166] 0 AuthDelegate:calculatedTimeToRetry:delayMs=74915
[2017/09/21 10:50:50.842]2017-09-21 02:51:35.466 [167] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=18
[2017/09/21 10:51:40.356]2017-09-21 02:52:24.944 [168] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=19
[2017/09/21 10:51:40.793]2017-09-21 02:52:25.410 [169] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:51:40.793]2017-09-21 02:52:25.410 [16a] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:51:40.793]2017-09-21 02:52:25.410 [16b] 0 AuthDelegate:calculatedTimeToRetry:delayMs=55798
[2017/09/21 10:52:36.578]2017-09-21 02:53:21.209 [16c] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:52:36.594]2017-09-21 02:53:21.209 [16d] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:52:36.594]2017-09-21 02:53:21.209 [16e] 0 AuthDelegate:calculatedTimeToRetry:delayMs=50683
[2017/09/21 10:53:03.535]2017-09-21 02:53:48.158 [16f] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=20
[2017/09/21 10:53:27.263]2017-09-21 02:54:11.893 [170] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:53:27.278]2017-09-21 02:54:11.893 [171] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:53:27.278]2017-09-21 02:54:11.893 [172] 0 AuthDelegate:calculatedTimeToRetry:delayMs=38164
[2017/09/21 10:53:47.309]2017-09-21 02:54:31.926 [173] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=21
[2017/09/21 10:54:05.467]2017-09-21 02:54:50.058 [174] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:54:05.467]2017-09-21 02:54:50.058 [175] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:54:05.483]2017-09-21 02:54:50.058 [176] 0 AuthDelegate:calculatedTimeToRetry:delayMs=32242
[2017/09/21 10:54:37.681]2017-09-21 02:55:22.301 [177] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:54:37.697]2017-09-21 02:55:22.301 [178] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:54:37.697]2017-09-21 02:55:22.301 [179] 0 AuthDelegate:calculatedTimeToRetry:delayMs=84154
[2017/09/21 10:54:44.920]2017-09-21 02:55:29.515 [17a] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=22
[2017/09/21 10:55:36.368]2017-09-21 02:56:20.993 [17b] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=23
[2017/09/21 10:56:01.828]2017-09-21 02:56:46.456 [17c] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:56:01.843]2017-09-21 02:56:46.456 [17d] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:56:01.843]2017-09-21 02:56:46.457 [17e] 0 AuthDelegate:calculatedTimeToRetry:delayMs=56787
[2017/09/21 10:56:33.418]2017-09-21 02:57:18.038 [17f] E HTTP2Transport:networkLoopRetryingToConnect:reason=establishConnectionFailed,retryCount=24
[2017/09/21 10:56:51.248]2017-09-21 02:57:35.882 [180] 0 AuthDelegate:onAuthStateChangeCalled:state=2,error=1
[2017/09/21 10:56:58.627]2017-09-21 02:57:43.244 [181] E HttpPost:doPostFailed:reason=curl_easy_performFailed,result=6,error=Couldn't resolve host name
[2017/09/21 10:56:58.627]2017-09-21 02:57:43.245 [182] E AuthDelegate:handleLwaResponseFailed:reason=parseJsonFailed,position=0,error=The document is empty.
[2017/09/21 10:56:58.627]2017-09-21 02:57:43.245 [183] 0 AuthDelegate:calculatedTimeToRetry:delayMs=89756
[2017/09/21 10:57:37.175]2017-09-21 02:58:21.774 [184] E HTTP2Transport:setupDownchannelStreamFailed:reason=getAuthTokenFailed
[2017/09/21 10:57:37.175]2017-09-21 02:58:21.774 [185] E HTTP2Transport:establishConnectionFailed:reason=setupDownchannelStreamFailed,error=INVALID_AUTH
[2017/09/21 10:57:37.175]2017-09-21 02:58:21.774 [186] I HTTP2Transport:networkLoop:reason=connect successful
[2017/09/21 10:57:37.175]2017-09-21 02:58:21.774 [187] I HTTP2Transport:networkLoop:reason=networkLoop thread stopped.

Liburl reports this error:
[2017/09/21 10:57:37.144]* getaddrinfo(3) failed for avs-alexa-na.amazon.com:443
[2017/09/21 10:57:37.144]* Couldn't resolve host 'avs-alexa-na.amazon.com'
[2017/09/21 10:57:37.144]* Closing connection 24

Thanks.

@kjkh
Copy link

kjkh commented Sep 21, 2017

Hi,

The AuthDelegate failures seem normal because it wasn't able to authenticate using LWA (on account of no connection), but the SDK should not have crashed.

  1. Which version of the SDK are you using?
  2. What state was the Sample App in before the disconnect happens?
  3. Do you also have the SDK logs leading up to the segfault? Or does it happen right after:

[2017/09/21 10:57:37.175]2017-09-21 02:58:21.774 [187] I HTTP2Transport:networkLoop:reason=networkLoop thread stopped

@jikaihu
Copy link
Author

jikaihu commented Sep 22, 2017

Hi kjjh,
The version of SDK is 0.6. Before the disconnect happened, The Sample App was connected to server.

It happened after the log "HTTP2Transport:networkLoop:reason=networkLoop thread stopped". I added this log for debug. 

Hear are the code that i added log:

    cleanupFinishedStreams();
    cleanupStalledStreams();
    if (isStopping()) {
           ACSDK_ERROR(LX("networkLoop")
            .d("reason", "networkLoop thread stopped"));
        break;
    }

    if (canProcessOutgoingMessage()) {
        processNextOutgoingMessage();
    }

  From the corefile , the crash happened at the below code . Because m_downchannelStream is a null pointer.

CURLMcode ret = curl_multi_remove_handle(m_multi->handle, m_downchannelStream->getCurlHandle());
if (ret != CURLM_OK) {
    ACSDK_ERROR(LX("networkLoopCleanupFailed")
                .d("reason", "curlFailure")
                .d("method", "curl_multi_remove_handle")
                .d("error", curl_multi_strerror(ret)));
    //Don't do anything here since we should to cleanup the downchannel stream anyway
}

Thanks.

@dhpp
Copy link
Contributor

dhpp commented Oct 10, 2017

Hi jikaihu,

We have fixed several issues like this within ACL since the 0.6 version, and I would suggest you update at least that library to our current release. We would definitely like to know if this issue exists in the current version.

If you are not able to update that library, you may still find value in diffing the code within the ACL/src/Transport directory, where the curl / stream management exists. The key files there would be HTTP2Transport and HTTP2Stream.

Hope this helps. Please re-open this ticket if there are direct follow-up questions, or if, as I mention, the issue still persists in the current release. Thanks.

-David

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants