Deployment Issue at step 6 #4

Closed
aphexyuri opened this issue Apr 11, 2023 · 13 comments

Comments

@aphexyuri

aphexyuri commented Apr 11, 2023

First, thanks for the work on the TF/Ansible deployment. I did, however, run into an issue at step 6 with the following:

Hoping it's a simple fix or something I'm missing; perhaps some steps required to set up Prometheus. Help would be greatly appreciated.


The error appears to be in '/Users/myuser/Desktop/terraform-polygon-supernets/ansible/site.yml': line 36, column 7, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  roles:
    - prometheus.prometheus.node_exporter
      ^ here
@tinom9

tinom9 commented Apr 15, 2023

Have you installed the Ansible requirements?

cd ansible
ansible-galaxy install -r requirements.yml

@aphexyuri
Author

@tinom9 that did the trick. Now it can't seem to reach the nodes: after setting alias ansible='ansible --inventory inventory/aws_ec2.yml --vault-password-file=password.txt --extra-vars "@local-extra-vars.yml"', running ansible -m all ping gives me:

(base) ➜  ansible git:(main) ✗ alias ansible='ansible --inventory inventory/aws_ec2.yml --vault-password-file=password.txt --extra-vars "@local-extra-vars.yml"'
ansible -m all ping
[WARNING]: Could not match supplied host pattern, ignoring: ping
[WARNING]: No hosts matched, nothing to do

inventory looks okay:

(base) ➜  ansible git:(main) ✗ ansible-inventory --graph
@all:
  |--@ungrouped:
  |--@aws_ec2:
  |  |--i-07d77f8eeb0b41523
  |  |--i-084e52f1f7ed26860
  |  |--i-0a3c97f08afce6885
  |  |--i-0d6132a33909c748f
  |--@validator:
  |  |--i-07d77f8eeb0b41523
  |  |--i-084e52f1f7ed26860
  |  |--i-0a3c97f08afce6885
  |  |--i-0d6132a33909c748f
  |--@devnet01_edge_rg_private:
  |  |--i-07d77f8eeb0b41523
  |  |--i-084e52f1f7ed26860
  |  |--i-0a3c97f08afce6885
  |  |--i-0d6132a33909c748f
  |--@validator_001:
  |  |--i-07d77f8eeb0b41523
  |--@validator_004:
  |  |--i-084e52f1f7ed26860
  |--@validator_002:
  |  |--i-0a3c97f08afce6885
  |--@validator_003:
  |  |--i-0d6132a33909c748f

@tinom9

tinom9 commented Apr 15, 2023

Try ansible all -m ping :)

@aphexyuri
Author

Great, ty! Ping runs but seems like the instances aren't reachable:

(base) ➜  ansible git:(main) ✗ ansible all -m ping
i-084e52f1f7ed26860 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
    "unreachable": true
}
i-07d77f8eeb0b41523 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
    "unreachable": true
}
i-0d6132a33909c748f | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
    "unreachable": true
}
i-0a3c97f08afce6885 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
    "unreachable": true
}

I do see them all running in the AWS console though.

@tinom9

tinom9 commented Apr 16, 2023

I'd say it's not getting the right ssh key. Make sure it's accessible and you've set it up properly.

You can always test it by connecting to a validator instance with the specified params:

ssh -i $SSH_KEY_FILE ubuntu@$VALIDATOR_01_INSTANCE_ID \
    -o IdentitiesOnly=yes \
    -o StrictHostKeyChecking=no \
    -o ProxyCommand="sh -c \"aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'\""

@aphexyuri
Author

What should I use for VALIDATOR_01_INSTANCE_ID?

On a side note, there's a discrepancy in step 7 with the private key location ~/.ssh/ vs ~/cert/ paths - I made it all ~/.ssh/ (also ansible_ssh_private_key_file: ~/.ssh/devnet_private.key in local-extra-vars.yml)
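
For `VALIDATOR_01_INSTANCE_ID`, a plausible value (an assumption, not confirmed by the maintainers in this thread) is the EC2 instance ID listed under the `validator_001` group in the `ansible-inventory --graph` output posted earlier, since `aws ssm start-session --target` expects an instance ID. A minimal Python sketch of reading that ID from the graph text; `hosts_by_group` is a hypothetical helper, not part of the repository, and `SAMPLE` is an excerpt of the graph above.

```python
import re


def hosts_by_group(graph_text: str) -> dict[str, list[str]]:
    """Parse `ansible-inventory --graph` text into {group: [hosts]}.

    Hosts are attributed to the nearest group line above them only;
    they are not propagated up to parent groups such as `all`.
    """
    groups: dict[str, list[str]] = {}
    current = None
    for line in graph_text.splitlines():
        # Group lines look like "@all:" or "  |--@validator_001:".
        group = re.search(r"@([\w.-]+):\s*$", line)
        if group:
            current = group.group(1)
            groups.setdefault(current, [])
            continue
        # Host lines look like "  |  |--i-07d77f8eeb0b41523".
        host = re.search(r"\|--(\S+)\s*$", line)
        if host and current is not None:
            groups[current].append(host.group(1))
    return groups


# Excerpt of the graph posted earlier in this thread.
SAMPLE = """\
@all:
  |--@ungrouped:
  |--@validator_001:
  |  |--i-07d77f8eeb0b41523
  |--@validator_002:
  |  |--i-0a3c97f08afce6885
"""

print(hosts_by_group(SAMPLE)["validator_001"])  # prints ['i-07d77f8eeb0b41523']
```

With a live deployment, the same function could be fed the captured output of `ansible-inventory --inventory inventory/aws_ec2.yml --graph`.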

@aphexyuri
Author

@tinom9 can I ask what OS you're using? I was just trying out run.sh without success:

[WARNING]:  * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with auto plugin: Failed
to describe instances: An error occurred (UnauthorizedOperation) when calling the DescribeInstances operation: You are not authorized to
perform this operation.
[WARNING]:  * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with yaml plugin: Plugin
configuration YAML file, not YAML inventory
[WARNING]:  * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with ini plugin: Invalid
host pattern '---' supplied, '---' is normally a sign this is a YAML file.
[WARNING]: Unable to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
Starting galaxy collection install process
Nothing to do. All requested collections are already installed. If you want to reinstall them, consider using `--force`.
[WARNING]:  * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with auto plugin: Failed
to describe instances: An error occurred (UnauthorizedOperation) when calling the DescribeInstances operation: You are not authorized to
perform this operation.
[WARNING]:  * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with yaml plugin: Plugin
configuration YAML file, not YAML inventory
[WARNING]:  * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with ini plugin: Invalid
host pattern '---' supplied, '---' is normally a sign this is a YAML file.
[WARNING]: Unable to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
[WARNING]: Collection prometheus.prometheus does not support Ansible version 2.14.4

PLAY [all] *********************************************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: devnet01_edge_polygon_private

PLAY [all:&devnet01_edge_polygon_private] **************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: geth

PLAY [geth:&devnet01_edge_polygon_private] *************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: fullnode
[WARNING]: Could not match supplied host pattern, ignoring: validator

PLAY [fullnode:validator:&devnet01_edge_polygon_private] ***********************************************************************************
skipping: no hosts matched

PLAY [fullnode:validator:&devnet01_edge_polygon_private] ***********************************************************************************
skipping: no hosts matched

PLAY [fullnode:validator:&devnet01_edge_polygon_private] ***********************************************************************************
skipping: no hosts matched

PLAY RECAP *********************************************************************************************************************************

@gatsbyz
Contributor

gatsbyz commented May 1, 2023

Are you still stuck on this? Please confirm whether ansible all -m ping is working.

@ajruizvargas

I am actually stuck at the same place. ansible all -m ping returns:

i-0cb0636a89d0d03cf | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-0cb0636a89d0d03cf: nodename nor servname provided, or not known",
    "unreachable": true
}
i-023e1176957aa6b8b | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-023e1176957aa6b8b: nodename nor servname provided, or not known",
    "unreachable": true
}
i-04f517e27dc91d8e7 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-04f517e27dc91d8e7: nodename nor servname provided, or not known",
    "unreachable": true
}
i-0c07fca0d42ae4147 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-0c07fca0d42ae4147: nodename nor servname provided, or not known",
    "unreachable": true
}
i-01766b8b0e3f5d461 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-01766b8b0e3f5d461: nodename nor servname provided, or not known",
    "unreachable": true
}

Any ideas?

@imarkus8787

I was running into the same issue and was able to make some progress. The main issue I found was an AWS permissions problem: I had to add a dedicated policy to my AWS IAM user that grants specific permissions for the 4 validators and the geth-001 node. I also had to install the Session Manager plugin, since along the way I got another error (https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:StartSession"
      ],
      "Resource": [
        "arn:aws:ec2:us-west-2:010531221017:instance/i-0c21d328536a23815",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-029da648f72eca49e",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-05c1dc647f0ab13ca",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-07cc225ea7b1f8cfd",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-0efe99c6be33073d4",
        "arn:aws:ssm:us-west-2::document/AWS-StartSSHSession"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ssm:TerminateSession",
        "ssm:ResumeSession"
      ],
      "Resource": [
        "arn:aws:ssm:::session/${aws:username}-*"
      ]
    }
  ]
}
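
Because the instance IDs change with every deployment, a policy like the one above may be easier to generate than to edit by hand. A minimal Python sketch, assuming the same two statements; `ssm_ssh_policy` and its parameters are hypothetical names, not part of the repository. One assumption to note: the session ARN as posted has empty region and account fields (`arn:aws:ssm:::session/...`), while the sketch defaults to the `*:*` wildcard pattern used in AWS's Session Manager example policies; pass `session_arn` explicitly to keep the posted form.

```python
import json


def ssm_ssh_policy(
    region: str,
    account_id: str,
    instance_ids: list[str],
    # Assumption: wildcard region/account, following AWS's example policies.
    session_arn: str = "arn:aws:ssm:*:*:session/${aws:username}-*",
) -> dict:
    """Build an IAM policy allowing SSH-over-SSM sessions to the given instances."""
    instance_arns = [
        f"arn:aws:ec2:{region}:{account_id}:instance/{instance_id}"
        for instance_id in instance_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Start sessions on the listed instances via the SSH document.
                "Effect": "Allow",
                "Action": ["ssm:StartSession"],
                "Resource": instance_arns
                + [f"arn:aws:ssm:{region}::document/AWS-StartSSHSession"],
            },
            {
                # Let users end or resume only their own sessions.
                "Effect": "Allow",
                "Action": ["ssm:TerminateSession", "ssm:ResumeSession"],
                "Resource": [session_arn],
            },
        ],
    }


# Region, account ID and instance ID taken from the policy posted above.
policy = ssm_ssh_policy("us-west-2", "010531221017", ["i-0c21d328536a23815"])
print(json.dumps(policy, indent=2))
```

The instance IDs could come from the `ansible-inventory --graph` output, so the policy stays in sync with what Terraform actually created.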

@praetoriansentry
Contributor

It's going to be important to use the full command:
ansible --inventory inventory/aws_ec2.yml --extra-vars "@local-extra-vars.yml" all -m ping

In local-extra-vars.yml there are some lines like this:

ansible_ssh_private_key_file: ~/devnet_private.key
ansible_ssh_common_args: >
  -o IdentitiesOnly=yes
  -o StrictHostKeyChecking=no
  -o ProxyCommand="sh -c \"aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'\""

These can be edited to suit your needs, but basically they tell Ansible where to find your SSH key and configure it to use SSH over SSM.
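
To see what those settings amount to, it can help to build the equivalent plain `ssh` invocation for one host. In the `ProxyCommand`, ssh substitutes `%h` with the host name (here the EC2 instance ID from the inventory) and `%p` with the port, so the connection is tunnelled through `aws ssm start-session` rather than resolved through DNS. That is consistent with the "Could not resolve hostname i-..." errors reported above when the extra vars were not passed. A minimal Python sketch; `ssh_over_ssm_argv` is a hypothetical helper, and the `ubuntu` user is taken from the test command earlier in this thread.

```python
import os
import shlex

# Same ProxyCommand as in local-extra-vars.yml; ssh expands %h and %p itself.
PROXY_COMMAND = (
    'sh -c "aws ssm start-session --target %h '
    "--document-name AWS-StartSSHSession --parameters 'portNumber=%p'\""
)


def ssh_over_ssm_argv(instance_id: str, key_file: str, user: str = "ubuntu") -> list[str]:
    """Return the ssh argument list that mirrors ansible_ssh_common_args."""
    return [
        "ssh",
        "-i", os.path.expanduser(key_file),  # resolve a leading ~ explicitly
        "-o", "IdentitiesOnly=yes",
        "-o", "StrictHostKeyChecking=no",
        "-o", f"ProxyCommand={PROXY_COMMAND}",
        f"{user}@{instance_id}",
    ]


argv = ssh_over_ssm_argv("i-07d77f8eeb0b41523", "~/devnet_private.key")
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(argv))
```

If the printed command fails when run by hand, the problem lies in the key, the AWS credentials, or the Session Manager plugin rather than in Ansible.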

@aphexyuri
Author

Finally got back to giving this a try, with the latest edge release and the additions and changes to the docs. Turned out my AWS SSM setup and my variable settings (example.env step) were broken.
Ty and great work on making the docs clearer and adding some verification commands along the way!

@praetoriansentry
Contributor

Awesome - thanks @aphexyuri - We'll be adding more documentation around tuning, load testing, and regular operations (e.g. looking at logs, etc). If you have any other thoughts on documentation that would be helpful, feel free to drop us a line.
