Connection reset by peer errors #689

Closed
silverskysoft opened this issue Nov 21, 2017 · 13 comments

@silverskysoft commented Nov 21, 2017

ISSUE TYPE
  • Bug Report
COMPONENT NAME
  • UI
  • Installer
SUMMARY

Connection reset by peer messages appear in the logs during a fresh install. Later, the same message shows up during job execution.

ENVIRONMENT
  • AWX version: 1.0.1.223
  • AWX install method: docker on linux
  • Ansible version: 2.4.1
  • Operating System: Centos 7.3
  • Web Browser: Chrome 62
STEPS TO REPRODUCE

Install as per the instructions for the Docker/playbook method in the install guide.

EXPECTED RESULTS

Install logs contain no errors.

ACTUAL RESULTS

We see the exception "Control command error: error(104, 'Connection reset by peer')":

[2017-11-20 22:31:04,214: DEBUG/ForkPoolWorker-1] Channel open
[2017-11-20 22:31:04,227: DEBUG/ForkPoolWorker-1] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@a518adfcba80', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-11-20 22:31:04,229: DEBUG/ForkPoolWorker-1] using channel_id: 1
[2017-11-20 22:31:04,229: DEBUG/ForkPoolWorker-1] Channel open
[2017-11-20 22:31:04,231: DEBUG/MainProcess] pidbox received method active() [reply_to:{u'routing_key': u'96b1ee4c-5253-3275-99fe-0dc342057215', u'exchange': u'reply.celery.pidbox'} ticket:222bd936-4e7e-4036-b512-ac65dae9eef7]
[2017-11-20 22:31:04,232: ERROR/MainProcess] Control command error: error(104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/worker/pidbox.py", line 42, in on_message
    self.node.handle_message(body, message)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 129, in handle_message
    return self.dispatch(**body)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 112, in dispatch
    ticket=ticket)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 135, in reply
    serializer=self.mailbox.serializer)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 265, in _publish_reply
    **opts
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 194, in _publish
    [maybe_declare(entity) for entity in declare]
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 102, in maybe_declare
    return maybe_declare(entity, self.channel, retry, **retry_policy)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/common.py", line 129, in maybe_declare
    return _maybe_declare(entity, declared, ident, channel, orig)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/common.py", line 135, in _maybe_declare
    entity.declare(channel=channel)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/entity.py", line 185, in declare
    nowait=nowait, passive=passive,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/channel.py", line 616, in exchange_declare
    wait=None if nowait else spec.Exchange.DeclareOk,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/abstract_channel.py", line 50, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/method_framing.py", line 166, in write_frame
    write(view[:offset])
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/transport.py", line 258, in write
    self._write(s)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 104] Connection reset by peer
[2017-11-20 22:31:04,242: DEBUG/MainProcess] Closed channel #2
[2017-11-20 22:31:04,242: DEBUG/MainProcess] using channel_id: 2
[2017-11-20 22:31:04,243: DEBUG/MainProcess] Channel open
ADDITIONAL INFORMATION

Generally, things and jobs seem to run OK, but this exception keeps occurring on each job run.

Other errors seen during Job Runs:

2017-11-21 00:29:28,216 DEBUG    awx.main.models.inventory Going to update inventory computed fields
[2017-11-21 00:29:28,216: DEBUG/ForkPoolWorker-129] Going to update inventory computed fields
[2017-11-21 00:29:28,228: ERROR/MainProcess] Pool callback raised exception: TypeError('handle_work_error() takes at least 4 arguments (4 given)',)
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/billiard/pool.py", line 1747, in safe_apply_callback
    fun(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/worker/request.py", line 366, in on_failure
    self.id, exc, request=self, store_result=self.store_errors,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/backends/base.py", line 168, in mark_as_failure
    self._call_task_errbacks(request, exc, traceback)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/backends/base.py", line 175, in _call_task_errbacks
    errback(request, exc, traceback)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/canvas.py", line 178, in __call__
    return self.type(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 630, in __protected_call__
    return orig(self, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/task.py", line 380, in __call__
    return self.run(*args, **kwargs)
TypeError: handle_work_error() takes at least 4 arguments (4 given)
2017-11-21 00:29:28,266 DEBUG    awx.main.models.inventory Finished updating inventory computed fields
[2017-11-21 00:29:28,266: DEBUG/ForkPoolWorker-129] Finished updating inventory computed fields



2017-11-21 00:29:28,016 DEBUG    awx.main.models.inventory Going to update inventory computed fields
2017-11-21 00:29:28,058 DEBUG    awx.main.models.inventory Finished updating inventory computed fields
2017-11-21 00:29:28,058 WARNING  awx.main.consumers Group sending is currently disabled. Would have sent the following message
Channel: jobs-summary, Payload: {'unified_job_id': 4, 'group_name': 'jobs'}
2017-11-21 00:29:28,194 WARNING  awx.main.consumers Group sending is currently disabled. Would have sent the following message
Channel: jobs-status_changed, Payload: {'status': 'failed', 'unified_job_id': 4, 'group_name': 'jobs'}
[2017-11-21 00:29:28,194: WARNING/ForkPoolWorker-128] Group sending is currently disabled. Would have sent the following message
Channel: jobs-status_changed, Payload: {'status': 'failed', 'unified_job_id': 4, 'group_name': 'jobs'}
[2017-11-21 00:29:28,197: INFO/MainProcess] Received task: awx.main.tasks.update_inventory_computed_fields[f30c4d04-d33c-4f51-9b64-db4a5374fda7]
[2017-11-21 00:29:28,198: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x3014cf8> (args:('awx.main.tasks.update_inventory_computed_fields', 'f30c4d04-d33c-4f51-9b64-db4a5374fda7', {'origin': 'gen469@awx', 'lang': 'py', 'task': 'awx.main.tasks.update_inventory_computed_fields', 'group': None, 'root_id': '12b3968e-303c-4a8d-9182-289365f0161f', u'delivery_info': {u'priority': None, u'redelivered': False, u'routing_key': u'tower', u'exchange': u''}, 'expires': None, u'correlation_id': 'f30c4d04-d33c-4f51-9b64-db4a5374fda7', 'retries': 0, 'timelimit': [None, None], 'argsrepr': '(2, True)', 'eta': None, 'parent_id': '75da9786-f194-46e3-a29f-c433da73f055', u'reply_to': '418ff41a-8535-3908-9e46-afed985bd76c', 'id': 'f30c4d04-d33c-4f51-9b64-db4a5374fda7', 'kwargsrepr': '{}'}, u'[[2, true], {}, {"chord": null, "callbacks": null, "errbacks": null, "chain": null}]', 'application/json', 'utf-8') kwargs:{})
[2017-11-21 00:29:28,200: DEBUG/MainProcess] Task accepted: awx.main.tasks.update_inventory_computed_fields[f30c4d04-d33c-4f51-9b64-db4a5374fda7] pid:470
[2017-11-21 00:29:28,204: WARNING/ForkPoolWorker-128] /var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py:549: RuntimeWarning: Exception raised outside body: TypeError('handle_work_error() takes at least 4 arguments (4 given)',):
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 388, in trace_task
    I, R, state, retval = on_error(task_request, exc, uuid)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 330, in on_error
    task, request, eager=eager, call_errbacks=call_errbacks,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 164, in handle_error_state
    call_errbacks=call_errbacks)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 209, in handle_failure
    call_errbacks=call_errbacks,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/backends/base.py", line 168, in mark_as_failure
    self._call_task_errbacks(request, exc, traceback)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/backends/base.py", line 175, in _call_task_errbacks
    errback(request, exc, traceback)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/canvas.py", line 178, in __call__
    return self.type(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 630, in __protected_call__
    return orig(self, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/task.py", line 380, in __call__
    return self.run(*args, **kwargs)
TypeError: handle_work_error() takes at least 4 arguments (4 given)
  exc, exc_info.traceback)))
2017-11-21 00:29:28,216 DEBUG    awx.main.models.inventory Going to update inventory computed fields
[2017-11-21 00:29:28,216: DEBUG/ForkPoolWorker-129] Going to update inventory computed fields

These look like simple argument-count errors.
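
For what it's worth, the misleading "(4 given)" count is a known Python 2 quirk: keyword arguments are included in the "given" total even when a required positional argument is missing. A minimal, purely illustrative reproduction (the signature below is invented for the example, it is not AWX's actual handle_work_error):

# Python 2 toy reproduction of the misleading arity message above.
# The signature is invented for illustration -- it is NOT AWX's real
# handle_work_error().
def handle_work_error(task_id, exc, traceback, request, subtasks=None):
    pass

try:
    # Three positionals plus one keyword argument: Python 2 counts the keyword
    # in the "(N given)" total even though the required positional 'request'
    # is missing, so the message reads "takes at least 4 arguments (4 given)".
    handle_work_error("task-id", ValueError("boom"), None, subtasks=[])
except TypeError as e:
    print(e)  # under Python 2: handle_work_error() takes at least 4 arguments (4 given)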

@matburt (Member) commented Nov 27, 2017

I think this occurred while we were working out some kinks integrating new dependencies. Can you try again with the latest code?

matburt closed this as completed Nov 27, 2017
@ewjoachim (Contributor) commented:

I've got the same traceback with 1.0.1.229, and it seems I have tasks stopping in the middle and not updating. Should I open a new ticket?

matburt reopened this Dec 1, 2017
@matburt (Member) commented Dec 1, 2017

No, let's reopen this one.

@ewjoachim (Contributor) commented:

It's interesting to note that we realised the server was low on RAM; we resized it, restarted everything, and it seems we don't have the problem anymore. I don't know whether the problem was caused by an OOM, or whether it was simply solved by restarting the server. (It may be worth noting that AWX with local Docker needs at least 3 to 4 GB.)
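
One way to check whether the OOM killer was actually involved is to scan the kernel log on the Docker host; a minimal sketch, assuming dmesg is available and readable there (this is just a diagnostic helper, not part of AWX):

# Hedged sketch: look for OOM-killer activity in the kernel ring buffer.
# Assumes `dmesg` is available and readable on the Docker host.
import subprocess

kernel_log = subprocess.check_output(["dmesg"]).decode("utf-8", "replace")
for line in kernel_log.splitlines():
    if "Out of memory" in line or "oom-killer" in line:
        print(line)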

@matburt (Member) commented Dec 1, 2017

Yep, that fits with the Tower system requirements as well.

@ewjoachim (Contributor) commented Dec 3, 2017

Oh, I just saw it. It could be worth linking to that page from the AWX installation guide, because if you only follow the AWX requirements, you don't know you've fallen short of the Tower requirements :/

Note: the Tower requirements say "2 GB minimum, 4 GB recommended". Given that with 2 GB, a single project, 2 inventories of 3 machines each, a single user, a single job running, and nothing else on the machine except the 5 Docker containers on a fresh Debian, I already get OOMs, it may be a bit optimistic to say 2 GB is enough. We don't know whether that is what caused this problem in the first place (all I know is that I haven't reproduced it since I resized, which doesn't prove much about causality), but as far as I can tell:

  • If the problem doesn't reappear in the next few days, I'd be quite inclined to believe this issue is due to the lack of RAM. Additional reports would help narrow this down.
  • Adding a link to the physical requirements in the AWX install guide wouldn't hurt.
  • Raising the RAM recommendation to 3 or 4 GB in the Tower docs would feel safer, in my very personal opinion. I can only guess that AWX had a smaller memory footprint when this requirement was written, and back then 2 GB was tight but enough.

@ewjoachim (Contributor) commented Dec 3, 2017

A bit of data in my case: this is my AWX installation with a single [everything]. No Celery job or web request is being processed at the moment; the system is ready but idle.

awx@awx:~$ docker ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS                                NAMES
e7c1cdf1e963        ansible/awx_task:latest   "/tini -- /bin/sh ..."   3 days ago          Up 3 days           8052/tcp                             awx_task
bf810b722b61        ansible/awx_web:latest    "/tini -- /bin/sh ..."   3 days ago          Up 3 days           0.0.0.0:8052->8052/tcp               awx_web
c90e72ad48d2        memcached:alpine          "docker-entrypoint..."   6 days ago          Up 5 days           11211/tcp                            memcached
c4c429e528c5        rabbitmq:3                "docker-entrypoint..."   6 days ago          Up 5 days           4369/tcp, 5671-5672/tcp, 25672/tcp   rabbitmq
bf7691976645        postgres:9.6              "docker-entrypoint..."   6 days ago          Up 5 days           5432/tcp                             postgres
awx@awx:~$ docker stats --no-stream
CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
e7c1cdf1e963        0.10%               717.2MiB / 3.863GiB   18.13%              466MB / 548MB       369kB / 23.4MB      18
bf810b722b61        0.13%               1.063GiB / 3.863GiB   27.52%              109MB / 98.9MB      631kB / 4.1kB       12
c90e72ad48d2        0.04%               19.01MiB / 3.863GiB   0.48%               141MB / 96.6MB      1.98MB / 0B         10
c4c429e528c5        0.09%               77.89MiB / 3.863GiB   1.97%               721MB / 434MB       30.2MB / 205MB      75
bf7691976645        0.00%               22.04MiB / 3.863GiB   0.56%               552MB / 539MB       47.5MB / 4.64GB     11

As you can see, about 1.8 GB is already used. Given that the OS itself and the Docker daemon need a bit of memory too, 2 GB really feels like pointing a gun at our own feet.
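
For reference, one quick way to tally what the containers alone consume is to sum the "used" half of the MemUsage column; a rough helper sketch, assuming a Docker CLI new enough to support --format (this is just a convenience script, not part of AWX):

# Rough helper: sum per-container memory usage from `docker stats`.
# Assumes the Docker CLI supports --format with the Name/MemUsage fields.
import re
import subprocess

# Binary units as printed by `docker stats` for memory; decimal ones are
# included defensively in case a different Docker version prints them.
UNITS = {"B": 1.0, "KiB": 1024.0, "MiB": 1024.0 ** 2, "GiB": 1024.0 ** 3,
         "kB": 1e3, "MB": 1e6, "GB": 1e9}

def to_bytes(text):
    value, unit = re.match(r"([\d.]+)\s*([A-Za-z]+)", text.strip()).groups()
    return float(value) * UNITS[unit]

output = subprocess.check_output(
    ["docker", "stats", "--no-stream", "--format", "{{.Name}}\t{{.MemUsage}}"]
).decode("utf-8", "replace")

total = 0.0
for line in output.splitlines():
    name, mem_usage = line.split("\t")
    used = to_bytes(mem_usage.split("/")[0])  # "717.2MiB / 3.863GiB" -> used part
    total += used
    print("%-15s %8.1f MiB" % (name, used / UNITS["MiB"]))
print("%-15s %8.1f MiB" % ("TOTAL", total / UNITS["MiB"]))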

@matburt (Member) commented Dec 4, 2017

I've added some basic system requirements here: cf0cc2e

We'll keep this open to investigate. I worry we'll run into this when the system is under heavy load.

@Aletor93 commented Dec 12, 2017

As mentioned in #767, I meet the requirements and still have the same issue:
ISSUE
The error loops (multiple installs/reinstalls, from scratch or not, across several different commits). The UI isn't available, but I do get the awx_web favicon when I reach the server's IP over HTTP.

SPECS
Host running Docker:

Hostname        Dist          CPU    Cores    vCPUs    RAM (GiB)
tcblora0484b    RedHat 7.3    2      8        32       251.7

Docker stats:

# docker stats --no-stream
CONTAINER           CPU %               MEM USAGE / LIMIT       MEM %               NET I/O               BLOCK I/O             PIDS
492a538278ad        0.11%               466.2 MiB / 251.7 GiB   0.18%               834.1 kB / 1.012 MB   89.08 MB / 1.778 MB   11
50f7dd9f95c4        0.15%               615.4 MiB / 251.7 GiB   0.24%               96.96 kB / 5.67 MB    93.37 MB / 0 B        12
26cde9fd47cd        0.03%               9.523 MiB / 251.7 GiB   0.00%               187.5 kB / 131 kB     2.606 MB / 0 B        10
5cdf1c293e69        0.47%               142 MiB / 251.7 GiB     0.06%               210.8 kB / 208.4 kB   32.54 MB / 63.81 MB   553
9d0fc83c61ac        0.00%               45.88 MiB / 251.7 GiB   0.02%               674.7 kB / 547.7 kB   26.42 MB / 65.93 MB   6

AWX BUILD

# git show
commit 1bc2d83403f980e0a1f55e3280c6ef25ae65f806
Merge: 9be438a 8c90d36
Author: Ryan Petrello 
Date:   Mon Dec 11 20:35:27 2017 -0500

    Merge pull request #789 from ryanpetrello/multivault-acceptance
    
    add some more tests and acceptance docs to wrap up multivault support

LOGS

# sudo docker logs -f awx_task
Using /etc/ansible/ansible.cfg as config file
[DEPRECATION WARNING]: The sudo command line option has been deprecated in 
favor of the "become" command line arguments. This feature will be removed in 
version 2.6. Deprecation warnings can be disabled by setting 
deprecation_warnings=False in ansible.cfg.
127.0.0.1 | SUCCESS => {
    "changed": false, 
    "db": "awx"
}
Operations to perform:
  Apply all migrations: auth, conf, contenttypes, django_celery_results, main, sessions, sites, social_django, sso, taggit
Running migrations:
  Applying main.0010_saved_launchtime_configs... OK
  Applying main.0011_blank_start_args... OK
Traceback (most recent call last):
  File "/usr/bin/awx-manage", line 9, in 
    load_entry_point('awx==1.0.1.316', 'console_scripts', 'awx-manage')()
  File "/usr/lib/python2.7/site-packages/awx/__init__.py", line 109, in manage
    execute_from_command_line(sys.argv)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/core/management/commands/shell.py", line 101, in handle
    exec(sys.stdin.read())
  File "", line 1, in 
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/contrib/auth/models.py", line 170, in create_superuser
    return self._create_user(username, email, password, **extra_fields)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/contrib/auth/models.py", line 153, in _create_user
    user.save(using=self._db)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/contrib/auth/base_user.py", line 80, in save
    super(AbstractBaseUser, self).save(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/base.py", line 808, in save
    force_update=force_update, update_fields=update_fields)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/base.py", line 838, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/base.py", line 924, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/base.py", line 963, in _do_insert
    using=using, raw=raw)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/query.py", line 1076, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 1107, in execute_sql
    cursor.execute(sql, params)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: duplicate key value violates unique constraint "auth_user_username_key"
DETAIL:  Key (username)=(admin) already exists.
Instance already registered awx
Instance Group already registered tower
Instance already registered awx
2017-12-12 13:48:39,231 CRIT Supervisor running as root (no user in config file)
2017-12-12 13:48:39,231 WARN For [program:awx-celeryd-beat], redirect_stderr=true but stderr_logfile has also been set to a filename, the filename has been ignored
2017-12-12 13:48:39,245 INFO RPC interface 'supervisor' initialized
2017-12-12 13:48:39,245 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-12-12 13:48:39,245 INFO supervisord started with pid 92
2017-12-12 13:48:40,248 INFO spawned: 'celery' with pid 95
2017-12-12 13:48:40,249 INFO spawned: 'channels-worker' with pid 96
2017-12-12 13:48:40,250 INFO spawned: 'callback-receiver' with pid 97
2017-12-12 13:48:40,252 INFO spawned: 'awx-celeryd-beat' with pid 98
2017-12-12 13:48:41,252 INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-12-12 13:48:41,252 INFO success: channels-worker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-12-12 13:48:41,252 INFO success: callback-receiver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-12-12 13:48:41,252 INFO success: awx-celeryd-beat entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-12-12 13:48:42,964 INFO     awx.main.tasks Syncing Schedules
2017-12-12 13:48:42,964 INFO     awx.main.tasks Syncing Schedules
[2017-12-12 13:48:42,988: DEBUG/MainProcess] Setting default socket timeout to 30
[2017-12-12 13:48:42,988: INFO/MainProcess] beat: Starting...
2017-12-12 13:48:43,084 - INFO - runworker - Using single-threaded worker.
2017-12-12 13:48:43,084 - INFO - runworker - Running worker against channel layer default (asgi_amqp.core.AMQPChannelLayer)
2017-12-12 13:48:43,085 - INFO - worker - Listening on channels websocket.connect, websocket.disconnect, websocket.receive
/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/platforms.py:795: RuntimeWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!
Please specify a different user using the -u option.
User information: uid=0 euid=0 gid=0 egid=0
  uid=uid, euid=euid, gid=gid, egid=egid,
[2017-12-12 13:48:43,139: DEBUG/MainProcess] Current schedule:
[2017-12-12 13:48:43,139: DEBUG/MainProcess] beat: Ticking with max interval->1.00 minute
[2017-12-12 13:48:43,141: DEBUG/MainProcess] beat: Waking up in 19.96 seconds.
[2017-12-12 13:48:43,239: DEBUG/MainProcess] | Worker: Preparing bootsteps.
[2017-12-12 13:48:43,242: DEBUG/MainProcess] | Worker: Building graph...
[2017-12-12 13:48:43,242: DEBUG/MainProcess] | Worker: New boot order: {Beat, Timer, Hub, Pool, Autoscaler, StateDB, Consumer}
[2017-12-12 13:48:43,267: DEBUG/MainProcess] | Consumer: Preparing bootsteps.
[2017-12-12 13:48:43,267: DEBUG/MainProcess] | Consumer: Building graph...
[2017-12-12 13:48:43,290: DEBUG/MainProcess] | Consumer: New boot order: {Connection, Events, Mingle, Tasks, Control, Gossip, Agent, Heart, event loop}
[2017-12-12 13:48:43,300: DEBUG/MainProcess] | Worker: Starting Hub
[2017-12-12 13:48:43,300: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:43,300: DEBUG/MainProcess] | Worker: Starting Pool
[2017-12-12 13:48:43,300: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:43,300: DEBUG/MainProcess] | Worker: Starting Autoscaler
[2017-12-12 13:48:43,300: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:43,301: DEBUG/MainProcess] | Worker: Starting Consumer
[2017-12-12 13:48:43,301: DEBUG/MainProcess] | Consumer: Starting Connection
[2017-12-12 13:48:43,315: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@5cdf1c293e69', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-12-12 13:48:43,316: INFO/MainProcess] Connected to amqp://guest:**@rabbitmq:5672/awx
[2017-12-12 13:48:43,317: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:43,317: DEBUG/MainProcess] | Consumer: Starting Events
[2017-12-12 13:48:43,331: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@5cdf1c293e69', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-12-12 13:48:43,333: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:43,333: DEBUG/MainProcess] | Consumer: Starting Mingle
[2017-12-12 13:48:43,333: INFO/MainProcess] mingle: searching for neighbors
[2017-12-12 13:48:43,333: DEBUG/MainProcess] using channel_id: 1
[2017-12-12 13:48:43,334: DEBUG/MainProcess] Channel open
[2017-12-12 13:48:43,358: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@5cdf1c293e69', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-12-12 13:48:43,359: DEBUG/MainProcess] using channel_id: 1
[2017-12-12 13:48:43,360: DEBUG/MainProcess] Channel open
[2017-12-12 13:48:44,380: INFO/MainProcess] mingle: all alone
[2017-12-12 13:48:44,380: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:44,380: DEBUG/MainProcess] | Consumer: Starting Tasks
[2017-12-12 13:48:44,401: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:44,401: DEBUG/MainProcess] | Consumer: Starting Control
[2017-12-12 13:48:44,401: DEBUG/MainProcess] using channel_id: 2
[2017-12-12 13:48:44,402: DEBUG/MainProcess] Channel open
[2017-12-12 13:48:44,405: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:44,405: DEBUG/MainProcess] | Consumer: Starting Gossip
[2017-12-12 13:48:44,405: DEBUG/MainProcess] using channel_id: 3
[2017-12-12 13:48:44,406: DEBUG/MainProcess] Channel open
[2017-12-12 13:48:44,412: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:44,412: DEBUG/MainProcess] | Consumer: Starting Heart
[2017-12-12 13:48:44,412: DEBUG/MainProcess] using channel_id: 1
[2017-12-12 13:48:44,413: DEBUG/MainProcess] Channel open
[2017-12-12 13:48:44,414: DEBUG/MainProcess] ^-- substep ok
[2017-12-12 13:48:44,414: DEBUG/MainProcess] | Consumer: Starting event loop
[2017-12-12 13:48:44,414: DEBUG/MainProcess] | Worker: Hub.register Autoscaler...
[2017-12-12 13:48:44,414: DEBUG/MainProcess] | Worker: Hub.register Pool...
[2017-12-12 13:48:44,416: INFO/MainProcess] celery@localhost ready.
[2017-12-12 13:48:44,416: DEBUG/MainProcess] basic.qos: prefetch_count->4
[2017-12-12 13:49:03,113: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-12-12 13:49:03,130: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@5cdf1c293e69', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-12-12 13:49:03,132: INFO/MainProcess] Scheduler: Sending due task task_manager (awx.main.scheduler.tasks.run_task_manager)
[2017-12-12 13:49:03,158: DEBUG/MainProcess] using channel_id: 1
[2017-12-12 13:49:03,159: DEBUG/MainProcess] Channel open
[2017-12-12 13:49:03,161: DEBUG/MainProcess] awx.main.scheduler.tasks.run_task_manager sent. id->90d960bf-f68d-4572-954d-0dc50be6d192
[2017-12-12 13:49:03,161: DEBUG/MainProcess] beat: Waking up in 9.94 seconds.
[2017-12-12 13:49:03,167: INFO/MainProcess] Received task: awx.main.scheduler.tasks.run_task_manager[90d960bf-f68d-4572-954d-0dc50be6d192]   expires:[2017-12-12 13:49:23.138570+00:00]
[2017-12-12 13:49:03,167: INFO/MainProcess] Scaling up 1 processes.
[2017-12-12 13:49:03,322: DEBUG/MainProcess] TaskPool: Apply  (args:('awx.main.scheduler.tasks.run_task_manager', '90d960bf-f68d-4572-954d-0dc50be6d192', {'origin': 'gen98@awx', 'lang': 'py', 'task': 'awx.main.scheduler.tasks.run_task_manager', 'group': None, 'root_id': '90d960bf-f68d-4572-954d-0dc50be6d192', u'delivery_info': {u'priority': 0, u'redelivered': False, u'routing_key': u'tower', u'exchange': u''}, 'expires': '2017-12-12T13:49:23.138570+00:00', u'correlation_id': '90d960bf-f68d-4572-954d-0dc50be6d192', 'retries': 0, 'timelimit': [None, None], 'argsrepr': '()', 'eta': None, 'parent_id': None, u'reply_to': 'fa6cb10b-32bf-3bc7-9f4b-4b848b91bbe0', 'id': '90d960bf-f68d-4572-954d-0dc50be6d192', 'kwargsrepr': '{}'}, u'[[], {}, {"chord": null, "callbacks": null, "errbacks": null, "chain": null}]', 'application/json', 'utf-8') kwargs:{})
[2017-12-12 13:49:03,323: DEBUG/MainProcess] basic.qos: prefetch_count->8
[2017-12-12 13:49:03,324: DEBUG/ForkPoolWorker-1] Closed channel #1
[2017-12-12 13:49:03,344: DEBUG/MainProcess] Task accepted: awx.main.scheduler.tasks.run_task_manager[90d960bf-f68d-4572-954d-0dc50be6d192] pid:139
2017-12-12 13:49:03,354 DEBUG    awx.main.scheduler Running Tower task manager.
2017-12-12 13:49:03,354 DEBUG    awx.main.scheduler Running Tower task manager.
[2017-12-12 13:49:03,354: DEBUG/ForkPoolWorker-1] Running Tower task manager.
2017-12-12 13:49:03,362 DEBUG    awx.main.scheduler Starting Scheduler
2017-12-12 13:49:03,362 DEBUG    awx.main.scheduler Starting Scheduler
[2017-12-12 13:49:03,362: DEBUG/ForkPoolWorker-1] Starting Scheduler
2017-12-12 13:49:03,364 DEBUG    awx.main.scheduler Failing inconsistent running jobs.
2017-12-12 13:49:03,364 DEBUG    awx.main.scheduler Failing inconsistent running jobs.
[2017-12-12 13:49:03,364: DEBUG/ForkPoolWorker-1] Failing inconsistent running jobs.
[2017-12-12 13:49:03,391: DEBUG/ForkPoolWorker-1] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@5cdf1c293e69', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-12-12 13:49:03,392: DEBUG/ForkPoolWorker-1] using channel_id: 1
[2017-12-12 13:49:03,393: DEBUG/ForkPoolWorker-1] Channel open
[2017-12-12 13:49:03,412: DEBUG/ForkPoolWorker-1] Start from server, version: 0.9, properties: {'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'product': 'RabbitMQ', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'capabilities': {'exchange_exchange_bindings': True, 'connection.blocked': True, 'authentication_failure_close': True, 'direct_reply_to': True, 'basic.nack': True, 'per_consumer_qos': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'publisher_confirms': True}, 'cluster_name': 'rabbit@5cdf1c293e69', 'platform': 'Erlang/OTP 19.2.1', 'version': '3.6.14'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: [u'en_US']
[2017-12-12 13:49:03,412: DEBUG/ForkPoolWorker-1] using channel_id: 1
[2017-12-12 13:49:03,413: DEBUG/ForkPoolWorker-1] Channel open
[2017-12-12 13:49:03,415: DEBUG/MainProcess] pidbox received method active() [reply_to:{u'routing_key': u'f2253e23-a877-3fc4-b5d2-2b4ce1e78c69', u'exchange': u'reply.celery.pidbox'} ticket:60cb820f-4daa-499e-9759-da71326d9aed]
[2017-12-12 13:49:03,416: ERROR/MainProcess] Control command error: error(104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/worker/pidbox.py", line 42, in on_message
    self.node.handle_message(body, message)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 129, in handle_message
    return self.dispatch(**body)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 112, in dispatch
    ticket=ticket)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 135, in reply
    serializer=self.mailbox.serializer)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 265, in _publish_reply
    **opts
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 194, in _publish
    [maybe_declare(entity) for entity in declare]
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 102, in maybe_declare
    return maybe_declare(entity, self.channel, retry, **retry_policy)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/common.py", line 129, in maybe_declare
    return _maybe_declare(entity, declared, ident, channel, orig)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/common.py", line 135, in _maybe_declare
    entity.declare(channel=channel)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/entity.py", line 185, in declare
    nowait=nowait, passive=passive,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/channel.py", line 616, in exchange_declare
    wait=None if nowait else spec.Exchange.DeclareOk,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/abstract_channel.py", line 50, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/method_framing.py", line 166, in write_frame
    write(view[:offset])
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/transport.py", line 258, in write
    self._write(s)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 104] Connection reset by peer
[2017-12-12 13:49:03,737: DEBUG/MainProcess] Closed channel #2
[2017-12-12 13:49:03,738: DEBUG/MainProcess] using channel_id: 2

@matburt (Member) commented Dec 12, 2017

You'll have to excuse us a bit as we're working through some of these issues after upgrading some dependencies. In my experience the connection reset doesn't seem to be a functional problem; it seems to reconnect after the fact anyway.
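
For context on the "reconnects after the fact" behaviour: the messaging layer (kombu) can retry a broker connection transparently after errors such as [Errno 104]. The sketch below is illustrative only, with placeholder credentials and retry values rather than AWX's actual settings:

# Illustration only: kombu can re-establish a broker connection after errors
# such as [Errno 104].  The URL and retry values are placeholders, not AWX's
# real configuration.
from kombu import Connection

def on_connection_error(exc, interval):
    # Called on each failed attempt before kombu sleeps and retries.
    print("Broker connection error %r, retrying in %s seconds" % (exc, interval))

conn = Connection("amqp://user:password@rabbitmq:5672/awx")
conn.ensure_connection(
    errback=on_connection_error,
    max_retries=None,   # retry forever
    interval_start=2,
    interval_step=2,
    interval_max=30,
)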

@Aletor93 commented:

No problem, thank you for your time. I'll wait for the updates, but I want to run more tests.

As a reminder, I'm not able to use the web UI after running the playbook, no matter what I try, including editing the proxy parameters.
The issue's author seems to have access to the web UI, since he mentions also seeing Python errors during job execution, so it's not exactly my case.

I'll try running the playbook installer on another host just to make sure it's not a local issue. I'll update this post with the results of my tests.

@Aletor93 commented:

OK, sorry: a colleague tried the web UI in Google Chrome and it worked, whereas I had only tried IE11 & FF40 before. I'm so angry...

@chrismeyersfsu (Member) commented:

@silverskysoft PR #921 should fix the handle_work_error problem. Can you verify?
