This repository was archived by the owner on Feb 7, 2023. It is now read-only.

tests: OpenShift Ansible Installer sanity test #162

Merged
merged 4 commits on Jun 13, 2017

Conversation


@miabbott miabbott commented May 25, 2017

This is a 'meta' playbook that borrowed heavily from the work done in
'ansible-ansible-openshift-ansible' [0]. It will install an OpenShift
Origin cluster on a single host via the openshift/openshift-ansible
installer.

This playbook can be invoked as you would for any other tests in this
repo, by supplying an inventory file or hostname/IP address to
ansible-playbook. The host from the inventory is used to template
out a second inventory file which is used when calling
ansible-playbook on the openshift/openshift-ansible playbook. This,
in theory, allows users to maintain the same workflow that is familiar
from running other tests in this repo.

There are variables defined in the playbook that control the version
of OpenShift Origin to be installed, the git tag of the
openshift/openshift-ansible repo to use, the host where the cluster
should be installed, and the Ansible user to use when installing the
cluster. These can all be overridden via CLI parameters.
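
As a sketch, the overridable defaults described above might look like this in the playbook. The variable names and values here are illustrative (only `cli_oo_host` appears later in this thread); check the playbook itself for the real ones:

```yaml
# Hypothetical defaults; override any of these with `ansible-playbook -e var=value`
vars:
  cli_origin_version: "1.5.1"            # OpenShift Origin release to install
  cli_oa_tag: "openshift-ansible-3.6.2"  # git tag of openshift/openshift-ansible to use
  cli_oo_host: "{{ ansible_default_ipv4.address }}"  # host to install the cluster on
  cli_ansible_user: "cloud-user"         # Ansible user for the install
```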

This lacks a couple of things that could be added later:

  • additional checks to determine health of cluster
  • cleanup/uninstall of the cluster
  • maybe deploying a project?

[0] https://pagure.io/ansible-ansible-openshift-ansible
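
The flow described above (template a second inventory, then call `ansible-playbook` on the cloned openshift-ansible repo) can be sketched roughly as follows. The task names mirror the run log later in this thread; the variable and template names are assumptions:

```yaml
- name: Make temp directory of holding
  command: mktemp -d
  register: tmp_dir
  delegate_to: localhost

- name: git clone openshift-ansible repo
  git:
    repo: https://github.com/openshift/openshift-ansible
    dest: "{{ tmp_dir.stdout }}"
    version: "{{ cli_oa_tag }}"
  delegate_to: localhost

- name: Template the inventory file
  template:
    src: cluster-inventory.j2
    dest: "{{ tmp_dir.stdout }}/cluster-inventory"
  delegate_to: localhost

- name: Run the openshift-ansible playbook
  command: ansible-playbook -i cluster-inventory playbooks/byo/config.yml
  args:
    chdir: "{{ tmp_dir.stdout }}"
  delegate_to: localhost
```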

@miabbott
Collaborator Author

@dustymabe Here's the first crack at the OpenShift Ansible Installer test

@miabbott
Collaborator Author

Worth noting that I only tested this against F25 AH. Will run this against CAHC and RHELAH shortly.

@cgwalters
Member

That's pretty straightforward indeed. Looks sane offhand to me.

Though last time I was using o-a an issue I hit was that it was pulling :latest of the origin container which means that in practice this is going to be a moving target.

@miabbott
Collaborator Author

Though last time I was using o-a an issue I hit was that it was pulling :latest of the origin container which means that in practice this is going to be a moving target.

Agreed. It will be up to the executor (be it human or computer) of the test to supply the desired values for the OpenShift Origin release and openshift-ansible release.

@miabbott
Collaborator Author

Another enhancement would be some sort of sanity check that the version of OpenShift Origin is compatible with openshift-ansible. The project links the two as described here.

@dustymabe
Contributor

@dustymabe Here's the first crack at the OpenShift Ansible Installer test

thanks man - looks pretty sweet. A few questions:

  • Is it possible to play around with the callback plugins to get the output to look like normal ansible output, rather than all jumbled up when it returns?
  • I had to set this variable because the memory on my machine was only 8G: `openshift_disable_check=memory_availability`. Is there any way to set these vars in the top-level a-h-t inventory, or do I need to dig into the openshift-ansible templated inventory in order to set them?
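
One way the meta-playbook could surface vars like `openshift_disable_check` without users editing the templated inventory is to forward them on the nested `ansible-playbook` call. A sketch (the variable plumbing here is an assumption, not the PR's actual mechanism):

```yaml
- name: Run the openshift-ansible playbook, forwarding check overrides
  command: >
    ansible-playbook -i cluster-inventory playbooks/byo/config.yml
    -e openshift_disable_check={{ cli_disabled_checks | default('') }}
  delegate_to: localhost
```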

I talked with Scott Dodson and we should be able to use the CentOS PaaS SIG rpms as a way to determine which stable version of openshift-ansible to use. We can either use those rpms directly or we could use the version info from the rpm to dictate a tag for us to do a git checkout on.
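
Deriving a stable checkout from the CentOS PaaS SIG rpms, as suggested, could look something like this (the `rpm` query format is standard; the tag naming scheme is an assumption):

```yaml
- name: Read the openshift-ansible version from the PaaS SIG rpm
  command: rpm -q --qf '%{VERSION}' openshift-ansible
  register: oa_rpm
  changed_when: false

- name: Derive a git tag to check out from that version
  set_fact:
    cli_oa_tag: "openshift-ansible-{{ oa_rpm.stdout }}"
```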

@mike-nguyen
Collaborator

@dustymabe do you have an example of normal ansible output vs. jumbled-up output, for reference?

@dustymabe
Contributor

@dustymabe do you have an example of normal ansible output vs. jumbled-up output, for reference?

this is jumbled:

$ ansible-playbook -i inventory tests/openshift-ansible-test/main.yml                                                                                              

PLAY [OpenShift Ansible Installer Test] **************************************************************************************************************************************************************************

TASK [Gathering Facts] *******************************************************************************************************************************************************************************************
ok: [34.201.54.182]

TASK [Setup vars for templating the inventory, etc.] *************************************************************************************************************************************************************
ok: [34.201.54.182]

TASK [Make temp directory of holding] ****************************************************************************************************************************************************************************
changed: [34.201.54.182 -> localhost]

TASK [git clone openshift-ansible repo] **************************************************************************************************************************************************************************
changed: [34.201.54.182 -> localhost]

TASK [Template the inventory file] *******************************************************************************************************************************************************************************
changed: [34.201.54.182 -> localhost]

TASK [Run the openshift-ansible playbook] ************************************************************************************************************************************************************************
fatal: [34.201.54.182 -> localhost]: FAILED! => {"changed": true, "cmd": ["ansible-playbook", "-i", "cluster-inventory", "playbooks/byo/config.yml", "-e", "ansible_python_interpreter=/usr/bin/python3"], "delta"
: "0:07:19.030171", "end": "2017-05-25 15:56:15.770011", "failed": true, "rc": 2, "start": "2017-05-25 15:48:56.739840", "stderr": "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..
\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.\n[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = n
o instead..\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.\n[DEPRECATION WARNING]: always_run is deprecated. Use che
ck_mode = no instead..\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.\n[DEPRECATION WARNING]: always_run is deprecat
ed. Use check_mode = no instead..\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.\n[DEPRECATION WARNING]: always_run 
is deprecated. Use check_mode = no instead..\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.\n[DEPRECATION WARNING]: 
always_run is deprecated. Use check_mode = no instead..\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.\n[DEPRECATION
 WARNING]: always_run is deprecated. Use check_mode = no instead..\n\nThis feature will be removed in version 2.4. Deprecation warnings can be \ndisabled by setting deprecation_warnings=False in ansible.cfg.", 
"stderr_lines": ["[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..", "", "This feature will be removed in version 2.4. Deprecation warnings can be ", "disabled by setting deprecati
on_warnings=False in ansible.cfg.", "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..", "", "This feature will be removed in version 2.4. Deprecation warnings can be ", "disabled b
y setting deprecation_warnings=False in ansible.cfg.", "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..", "", "This feature will be removed in version 2.4. Deprecation warnings ca
n be ", "disabled by setting deprecation_warnings=False in ansible.cfg.", "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..", "", "This feature will be removed in version 2.4. Depr
ecation warnings can be ", "disabled by setting deprecation_warnings=False in ansible.cfg.", "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..", "", "This feature will be removed i
n version 2.4. Deprecation warnings can be ", "disabled by setting deprecation_warnings=False in ansible.cfg.", "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..", "", "This featur
e will be removed in version 2.4. Deprecation warnings can be ", "disabled by setting deprecation_warnings=False in ansible.cfg.", "[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..
", "", "This feature will be removed in version 2.4. Deprecation warnings can be ", "disabled by setting deprecation_warnings=False in ansible.cfg."], "stdout": "\nPLAY [Create initial host groups for localhost
] ********************************\n\nTASK [include_vars] ************************************************************\nok: [localhost]\n\nPLAY [Verify Requirements] ********************************************
*********\n\nTASK [Gathering Facts] *********************************************************\nok:

Basically, is there a way to make that one task (that essentially runs another ansible playbook) have output that looks more like ansible output, but maybe shifted over 4 spaces or something?


mike-nguyen commented May 25, 2017

@dustymabe Oh, the symlink to the callback plugin is missing. @miabbott can you drop that in?

What you are seeing is the default ("normal") Ansible output not the formatted output from our callback_plugins. This is the output with the callback_plugin symlink in the test directory:


TASK [openshift_health_check] **************************************************

CHECK [disk_availability : fedora] *********************************************

CHECK [memory_availability : fedora] *******************************************
fatal: [fedora]: FAILED! => {
    "changed": false, 
    "checks": {
        "disk_availability": {
            "skipped": true, 
            "skipped_reason": "Not active for this host"
        }, 
        "memory_availability": {
            "failed": true, 
            "msg": "Available memory (2.0 GB) below recommended value (16.0 GB)"
        }
    }, 
    "failed": true, 
    "playbook_context": "install"
}

MSG:

One or more checks failed

PLAY RECAP *********************************************************************
fedora                     : ok=41   changed=8    unreachable=0    failed=1   
localhost                  : ok=1    changed=0    unreachable=0    failed=0   


Failure summary:

  1. Host:     fedora
     Play:     Verify Requirements
     Task:     openshift_health_check
     Message:  One or more checks failed
     Details:  check "memory_availability":
               Available memory (2.0 GB) below recommended value (16.0 GB)

The execution of "playbooks/byo/config.yml"
includes checks designed to fail early if the requirements
of the playbook are not met. One or more of these checks
failed. To disregard these results, you may choose to
disable failing checks by setting an Ansible variable:

   openshift_disable_check=memory_availability

Failing check names are shown in the failure details above.
Some checks may be configurable by variables if your requirements
are different from the defaults; consult check documentation.
Variables can be set in the inventory or passed on the
command line using the -e flag to ansible-playbook.
---

PLAY RECAP *********************************************************************
fedora                     : ok=5    changed=3    unreachable=0    failed=1   

[openshift-ansible-test]$ ll
lrwxrwxrwx. 1 me me   23 May 25 19:01 callback_plugins -> ../../callback_plugins/
-rw-rw-r--. 1 me me 1866 May 25 18:57 main.yml
drwxrwxr-x. 2 me me   60 May 25 18:57 templates

@miabbott miabbott changed the title tests: OpenShift Ansible Installer sanity test [WIP] tests: OpenShift Ansible Installer sanity test May 26, 2017
@miabbott
Collaborator Author

Slapped on the 'WIP' label as there are a number of issues to iron out.

@miabbott
Collaborator Author

@dustymabe Oh, the symlink to the callback plugin is missing. @miabbott can you drop that in?

That is an easy fix. I thought that 'openshift-ansible' itself had a callback plugin to format the output, but maybe that is getting overridden by the 'a-h-t' plugin.

- added symlinks to callback_plugins + roles
- introduced new vars to handle public vs private ip addresses
- added conditional to skip memory check
- added conditional to use Python3 for Fedora only
- added README.md

miabbott commented Jun 2, 2017

Added a new commit with some changes based on additional testing ⬆️

@mike-nguyen Added the symlink to callback_plugin so the output should be pretty now.

@dustymabe The playbook will set the necessary command line argument to skip the memory check auto-magically now.

I tested this against F25, CentOS, and RHEL Atomic Host in OpenStack and local libvirt. If someone can give this a spin against AWS, that would be interesting to see how it works out.
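
The "auto-magic" skip of the memory check could be implemented along these lines (a sketch; the actual task in the PR may differ). The 16 GB figure is the recommendation quoted in the failure output above:

```yaml
- name: Disable the memory check on hosts below the 16 GB recommendation
  set_fact:
    oa_extra_args: "-e openshift_disable_check=memory_availability"
  when: ansible_memtotal_mb < 16384
```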

@miabbott miabbott changed the title [WIP] tests: OpenShift Ansible Installer sanity test tests: OpenShift Ansible Installer sanity test Jun 2, 2017
To run the test, simply invoke as any other Ansible playbook:

```
$ ansible-playbook -i inventory tests/system-containers/main.yml
```
Contributor


is that the right playbook yml file?

@dustymabe
Contributor

I tested this against F25, CentOS, and RHEL Atomic Host in OpenStack and local libvirt. If someone can give this a spin against AWS, that would be interesting to see how it works out.

I suspect this will fail with AWS, but maybe some of the updates you just did will allow me to work around that. I'll try it.

@dustymabe
Contributor

This actually passed on AWS with an inventory file like:

[masters]
54.89.211.109 ansible_user=fedora cli_oo_host=10.0.36.120

@dustymabe
Contributor

+1 from me


miabbott commented Jun 2, 2017

@dustymabe Thanks for the AWS test. I'll make a note in the README and in the playbook file about setting up that variable when running in EC2.

I may try to run my own EC2 instance and see if there is a way to programmatically determine the value for that variable, too.
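
The EC2 instance metadata service exposes the private address at a well-known URL, so a programmatic lookup could plausibly look like this (a sketch; `cli_oo_host` is the inventory variable used earlier in this thread):

```yaml
- name: Look up the private IPv4 from the EC2 metadata service
  uri:
    url: http://169.254.169.254/latest/meta-data/local-ipv4
    return_content: yes
  register: ec2_meta

- name: Use it as the internal address for the cluster
  set_fact:
    cli_oo_host: "{{ ec2_meta.content }}"
```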

@samvarankashyap

@miabbott: I have been testing the PR on Fedora 25 using the following image: https://download.fedoraproject.org/pub/fedora/linux/releases/25/CloudImages/x86_64/images/Fedora-Cloud-Base-25-1.3.x86_64.qcow2
and running into issues. Could you please share the link to the image on which you are running the installer?
Thanks.


miabbott commented Jun 6, 2017

@samvarankashyap I've tested this using the latest F25 Atomic Host:

https://dl.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-25-20170522.0/CloudImages/x86_64/images/Fedora-Atomic-25-20170522.0.x86_64.qcow2

This worked for me in OpenStack and on local libvirt.

Feel free to share the errors you are encountering.

Note, I did not test this PR against a non-Atomic Host, so there may be issues there that I have not accounted for.

@dustymabe
Contributor

@samvarankashyap - The image you tested against was the cloud base image, not the atomic host image.

@samvarankashyap

@miabbott @dustymabe: Thanks, I will run the PR on the Atomic image and get back to you with feedback.


miabbott commented Jun 6, 2017

@samvarankashyap I ran this playbook against an up-to-date F25 Cloud VM in OpenStack and encountered some problems along the way.

  • needed to install NetworkManager and python
  • needed to enable + start NetworkManager
  • openshift-ansible installer would blow up when specifying version 1.5.1 (the default for this PR) of OpenShift Origin (version 1.5.0 worked fine)

Since the PR is really targeting Atomic Host platforms, I'm not terribly worried about the first two problems but wanted to just report my findings.

I'm going to work with @dustymabe on the third problem, because that is probably something that should be fixed in openshift-ansible.


samvarankashyap commented Jun 7, 2017

@miabbott:
Running this in a container gives me the following error, whereas running the same on a Duffy node didn't give me any. The error is as follows:


TASK [etcd_common : Install etcd for etcdctl] **********************************
fatal: [192.168.122.85]: FAILED! => {"failed": true, "msg": "The conditional check 'o' failed. The error was: error while evaluating conditional (o): 'o' is undefined\n\nThe error appears to have been in '/tmp/tmp.DIIPhdvi90/roles/etcd/tasks/main.yml': line 123, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- include_role:\n  ^ here\n"}

NO MORE HOSTS LEFT *************************************************************
        to retry, use: --limit @/tmp/tmp.DIIPhdvi90/playbooks/byo/config.retry

PLAY RECAP *********************************************************************
192.168.122.85             : ok=217  changed=44   unreachable=0    failed=1   
localhost                  : ok=10   changed=0    unreachable=0    failed=0   


Failure summary:

  1. Host:     192.168.122.85
     Play:     Configure etcd
     Task:     etcd_common : Install etcd for etcdctl
     Message:  The conditional check 'o' failed. The error was: error while evaluating conditional (o): 'o' is undefined
               
               The error appears to have been in '/tmp/tmp.DIIPhdvi90/roles/etcd/tasks/main.yml': line 123, column 3, but may
               be elsewhere in the file depending on the exact syntax problem.
               
               The offending line appears to be:
               
               
               - include_role:
                 ^ here
               
---
        to retry, use: --limit @/atomic-host-tests/tests/openshift-ansible-test/main.retry

PLAY RECAP *********************************************************************
node1                      : ok=6    changed=4    unreachable=0    failed=1  

Complete run log: https://gist.github.com/samvarankashyap/80f3e325e89d6630e3ea465c98f183f4
@miabbott: Am I missing any dependency, by any chance?

@samvarankashyap

@miabbott, if it helps, I found the same issue reported but unsolved in the openshift-ansible repo:
openshift/openshift-ansible#4121


miabbott commented Jun 7, 2017

I may try to run my own EC2 instance and see if there is a way to programatically determine the value for that variable, too.

@dustymabe I got access via the OpenShift group to AWS and was able to run the playbook like so:

$ ansible-playbook -v -i 54.242.57.236, -u fedora --private-key ~/.ssh/libra.pem tests/openshift-ansible-test/main.yml

It failed checking to see if 8443 was open, but I think that is because the default security group had blocked that port for me.


miabbott commented Jun 8, 2017

It failed checking to see if 8443 was open, but I think that is because the default security group had blocked that port for me.

Using a different security group alleviated this issue. I did notice that checking for the pods to be Running took a little bit longer than in my previous tests with OpenStack/libvirt, but that's why the retry logic is there. :)
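
The retry logic mentioned here is the usual Ansible `until`/`retries` pattern; a minimal sketch of waiting for pods to settle (the exact task and counts in the PR may differ):

```yaml
- name: Wait for all pods to reach Running
  command: oc get pods --no-headers
  register: pods
  until: pods.stdout_lines | select('search', 'Running') | list | length == pods.stdout_lines | length
  retries: 60
  delay: 10
```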

@dustymabe
Contributor

did you have to use cli_oo_host=x.x.x.x ?


miabbott commented Jun 8, 2017

did you have to use cli_oo_host=x.x.x.x ?

Nope, it Just Worked For Me™


miabbott commented Jun 8, 2017

@samvarankashyap I ran this PR from within a container without trouble. I wasn't using a duffy node, but rather my own F25 workstation.

In my setup I booted an F25 Atomic Host using local libvirt, then locally created a Docker image using a slightly modified Dockerfile from #167, and manually ran the playbook from inside the container targeting the F25AH VM. (The modification was just to check out this PR before running ansible-playbook.)

With that proved out, I ran it automatically like so:

sudo docker run -e TEST_PATH='tests/openshift-ansible-test/main.yml' -v /home/miabbott/.ssh/insecure_id_rsa:/root/.ssh/id_rsa:z aht-image -i 192.168.122.179, -u cloud-user

And that worked, too. Not sure if there is something specific to the Duffy environment, but I'm not able to reproduce that error. (Granted, I only ran it twice in a container, but I feel pretty good that it should work fine.)

@dustymabe
Contributor

Should we just merge this and fix issues later?

@miabbott
Collaborator Author

I'll give another 24 hours for additional feedback. If nothing else is received, I'll merge this tomorrow.

@miabbott miabbott merged commit d9e13d1 into projectatomic:master Jun 13, 2017