Skip to content
This repository has been archived by the owner on Apr 6, 2019. It is now read-only.

Inconsistent async_read behaviour after 2.4.3 #17

Closed
willxinc opened this issue Jul 1, 2017 · 6 comments
Closed

Inconsistent async_read behaviour after 2.4.3 #17

willxinc opened this issue Jul 1, 2017 · 6 comments
Labels

Comments

@willxinc
Copy link

willxinc commented Jul 1, 2017

On versions 2.4.2 and before async_read would always call the callback with read_result.sucess == false after the read is finished.

After 2.4.2, this no longer appears to always be the case, in fact, after the client has finished sending all its info and disconnected, some of the time the callback won't be triggered at all.

Tested on Visual Studio 2017.

@Cylix
Copy link
Owner

Cylix commented Jul 2, 2017

Hi,

Thanks for reporting this issue.
This is an unexpected behavior that might indeed have appeared with the change submitted in v2.4.3!

The change done on v2.4.3 aimed to reduce CPU consumption on unix, but I still did a change on windows just in case.

void
io_service::process_rd_event(const fd_t& fd, tracked_socket& socket) {
  __TACOPIE_LOG(debug, "processing read event");

  auto rd_callback = socket.rd_callback;

  socket.is_executing_rd_callback = true;

  m_callback_workers << [=] {
    __TACOPIE_LOG(debug, "execute read callback");
    rd_callback(fd);

    std::lock_guard<std::mutex> lock(m_tracked_sockets_mtx);
    auto it = m_tracked_sockets.find(fd);

    if (it == m_tracked_sockets.end()) { return; }

    auto& socket                    = it->second;
    socket.is_executing_rd_callback = false;

    if (socket.marked_for_untrack && !socket.is_executing_wr_callback) {
      __TACOPIE_LOG(debug, "untrack socket");
      m_tracked_sockets.erase(it);
      m_wait_for_removal_condvar.notify_all();
+      m_notifier.notify();
    }
-      m_notifier.notify();
  };
}

I made a similar change for the process_wr_event.

At first sight, I thought it would have no impact.
Unfortunately, it has!
Indeed, the read_callback will start to execute. But if the read callback is executing, the socket is not set in the fd sets watched by select in the next select iteration.
Thus, select will sleep undefinitely.
Before, the call to m_notifier.notify() ensured that select would be waken up after the execution of any read/write callback, leading to reseting the fd sets and detect new event.
Now, it fails to wake up select that keep sleeping and do not detect new events.

Thanks a lot for reporting this issue which is indeed a bug, I truly appreciate!

I will submit a fix by tomorrow night (simply reverting the unnecessary buggy change).

Best!

@willxinc
Copy link
Author

willxinc commented Jul 2, 2017

The odd thing is that even though it fails to wake, sometimes I'm seeing the event still fire.

Regardless, thanks for the prompt response.

@Cylix
Copy link
Owner

Cylix commented Jul 2, 2017

Yeah, this is normal: if the callback finishes to execute before the next setup/call to select, it will work.

Using this assumption, I was able to reproduce the bug really easily by using tacopie itself without cpp_redis:

#include <tacopie/tacopie>

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <signal.h>

#ifdef _WIN32
#include <Winsock2.h>
#endif /* _WIN32 */

std::condition_variable cv;

void
signint_handler(int) {
  cv.notify_all();
}

void
on_new_message(tacopie::tcp_client& client, const tacopie::tcp_client::read_result& res) {
  if (res.success) {
    std::cout << "Client recv data" << std::endl;
    client.async_write({res.buffer, nullptr});
    client.async_read({1024, std::bind(&on_new_message, std::ref(client), std::placeholders::_1)});
    std::this_thread::sleep_for(std::chrono::seconds(2));
  }
  else {
    std::cout << "Client disconnected" << std::endl;
    client.disconnect();
  }
}

int
main(void) {
#ifdef _WIN32
  //! Windows netword DLL init
  WORD version = MAKEWORD(2, 2);
  WSADATA data;

  if (WSAStartup(version, &data) != 0) {
    std::cerr << "WSAStartup() failure" << std::endl;
    return -1;
  }
#endif /* _WIN32 */

  tacopie::tcp_client client;
  client.connect("127.0.0.1", 3001);
  client.async_read({1024, std::bind(&on_new_message, std::ref(client), std::placeholders::_1)});

  signal(SIGINT, &signint_handler);

  std::mutex mtx;
  std::unique_lock<std::mutex> lock(mtx);
  cv.wait(lock);

#ifdef _WIN32
  WSACleanup();
#endif /* _WIN32 */

  return 0;
}

This is just a slightly modified version of the tacopie example where I introduced a 2 second sleep in the read callback.

I can't reproduce it with cpp_redis though. My guess is that all the publish responses arrive too quick and at the same time on my computer, resulting in only one async_read necessary.

@Cylix Cylix closed this as completed in ae0f2e9 Jul 2, 2017
@Cylix
Copy link
Owner

Cylix commented Jul 2, 2017

I published a fix in v2.4.4.
I closed this issue, but feel free to re-open it if the bug is still there or if you have any other issues to report!

Best

@LazyPlanet
Copy link

Multi thread get
_client.sync_commit(std::chrono::milliseconds(1000));

#0 0x00007fbc25bcb6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fbc259699ec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x000000000053c20d in std::condition_variable::waitstd::__future_base::_State_base::wait()::{lambda()#1}(std::unique_lockstd::mutex&, std::__future_base::_State_base::wait()::{lambda()#1}) (this=0x7fbc00afb070, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3 0x0000000000533500 in std::__future_base::_State_base::wait (this=0x7fbc00afb038) at /usr/include/c++/4.8.2/future:327
#4 0x00000000005449b9 in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7fbc1ddb6770) at /usr/include/c++/4.8.2/future:596
#5 0x000000000053c3de in std::future<cpp_redis::reply>::get (this=0x7fbc1ddb6770) at /usr/include/c++/4.8.2/future:675
#6 0x00000000005347e1 in Adoter::Redis::Get (this=0x7fbc00afacb0, key="user:guest_1179", value=..., async=false) at ../Include/RedisManager.h:184

@Cylix
Copy link
Owner

Cylix commented Nov 20, 2017

Hi @LazyPlanet ,

Can you detail the issue you are experiencing?

Best

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants