diff --git a/README.md b/README.md
new file mode 100644
index 0000000..36e4824
--- /dev/null
+++ b/README.md
@@ -0,0 +1,84 @@
+# INTRO
+These are ansible playbooks used for deploying an OMD instance as well as a simple haproxy and two web servers. These are the playbooks that were used by Marius Pana at the 2nd Check_MK conference in Munich, Germany. The presentation will be made available online shortly for those who are interested.
+
+Alert handlers (as defined by Check_MK) can be used from within Check_MK to trigger specific handlers (as defined by Ansible) in the ansible playbooks, providing a simple feedback loop for self-healing.
+
+*We are still looking for a good mapping of services between check_mk and ansible. One recommended solution is to use service attributes (nagios macros), which could then be mapped one-to-one to ansible tags. As soon as we have something functional we will update this. If anyone else has ideas we are interested in hearing them.*
+
+These examples are fairly simple but can and should be expanded to include more logic for repairing your specific systems/services. We intend them as a starting point.
+
+## About these playbooks
+We are assuming you are using a RedHat based distro. These playbooks will deploy an OMD instance on a freshly installed system and configure an HAProxy for load balancing between two apache web servers.
+
+We do not do the initial provisioning via these playbooks but this could be included in the future (e.g. deploy to joyent, cobbler or others). In other words, we expect that you have the systems freshly installed and configured with a root user that is allowed SSH access as defined in the cmkconfinv (inventory) file.
+
+### ansible inventory file
+The cmkconfinv file contains our inventory. In it we define three groups of hosts, a variable named folder which is the OMD folder we create via the WATO API for the respective host(s), and the IP address where these hosts can be reached.
+
+You must have these hosts installed and configured before running these playbooks. You will also need to know the root user password.
+
+## Prerequisites
+Make sure you change the users and ssh keys via the common role. Upload your ssh keys to roles/common/files and edit roles/common/vars/usersandpsks.yml.
+
+### Ansible
+You will need a functional ansible set-up. Setting it up can be as easy as cloning the ansible repo or installing via your operating system package manager. More information about installing ansible can be found here: http://docs.ansible.com/ansible/intro_installation.html .
+
+You will also need to clone this repository to play around with these playbooks.
+
+### Check_MK
+We are assuming you are using the CEE (Check_MK Enterprise Edition). While this should work with any recent version of Check_MK, we are specifically targeting the current innovation branch (1.2.7i) because of the new Alert Handlers (Werk #8275).
+
+If you would like to deploy your OMD instance via these playbooks you will need to download Check_MK CEE in RPM format and place it in the following directory:
+
+> roles/omd/files
+
+## Deploying OMD via Ansible
+This is a very simple way to deploy an OMD instance and create a site named "prod". The following command will deploy OMD to your preinstalled server. It will prompt for the root user's password (-k).
+
+> ansible-playbook -i cmkconfinv site.yml -l omd -u root -k --skip-tags check_mk_agent,check_mk_discovery,check_mk_apply
+
+Notice the use of the --skip-tags switch, which is necessary because in this first run we do not yet have an OMD instance running from which to pull the agent, run discovery, etc.
+
+You now need to create an Automation user in your Check_MK site and use that information in the roles/omd/vars/main.yml file.
+
+Now we can deploy the check_mk_agent to our monitoring instance as well. Notice we are running just the check_mk_agent, discovery and apply steps now. Also, after bootstrapping your system you can use your own user if you created one and uploaded the ssh keys. In this case you could use ansible with sudo (-u with your own user plus -s, instead of -u root).
+
+> ansible-playbook -i cmkconfinv site.yml -l omd -u root --tags check_mk_agent,check_mk_discovery,check_mk_apply
+
+
+## Deploying the webserver and loadbalancer
+The following will configure your webservers and loadbalancer. It will prompt for the root user's password (-k). Once it is done you should have 4 hosts in your OMD instance (1 omd, 2 web servers and 1 lb) with their services monitored.
+
+> ansible-playbook -i cmkconfinv site.yml -l loadbalancers,webservers -u root -k
+
+## Check_MK Alert Handlers
+We have created two alert handlers to showcase two different scenarios:
+
+1. services.sh - Restarting the apache web service if it has failed
+2. instantiate.sh - Deploying a loadbalancer if it fails (state DOWN)
+
+These are specific to the setup we were using for the presentation at the conference; however, they serve as a good starting point.
+
+Add the following two Alert Handlers to your Check_MK site and place the scripts in ~/local/share/check_mk/alert_handlers (make sure they are executable):
+
+services.sh
+![image of services.sh ](http://i67.tinypic.com/jgqqzm.png)
+```
+#!/bin/bash
+ansible-playbook -i /omd/sites/prod/ansible/cmkconfinv /omd/sites/prod/ansible/site.yml -l webservers -u root --tags httpd
+```
+
+instantiate.sh
+![image of instantiate.sh ](http://i65.tinypic.com/14c9s8w.png)
+```
+#!/bin/bash
+ssh root@10.88.88.145 vmadm create -f /etc/zones/loadbalancer.json
+
+ansible-playbook -i /omd/sites/prod/ansible/cmkconfinv /omd/sites/prod/ansible/site.yml -l loadbalancers -u root
+```
+The first line is specific to my setup, which uses SmartOS available at 10.88.88.145. There I have already created a manifest file (loadbalancer.json) to create a loadbalancer instance. You will want to change this for your particular set-up.
+
+## TODO
+You may notice two extra check_mk checks named up_scale and down_scale on your loadbalancer instance. These are not finished yet; however, they are an example of how you could use check_mk and ansible to do autoscaling. Based on feedback received via your monitoring you can bring up or take down instances, effectively doing autoscaling. This is a work in progress and will be updated in the near future. The ansible tags are add_backend and del_backend; these may be useful if you plan on extending this.
+
+There are certainly more things to be done here ...
diff --git a/bootstrap.yml b/bootstrap.yml
new file mode 100644
index 0000000..2355a79
--- /dev/null
+++ b/bootstrap.yml
@@ -0,0 +1,7 @@
+---
+# file: bootstrap.yml
+- hosts: all
+  #vars:
+  vars_files: [roles/common/vars/usersandpsks.yml, roles/omd/vars/main.yml]
+  roles:
+    - common
diff --git a/cmkconfinv b/cmkconfinv
new file mode 100644
index 0000000..915893d
--- /dev/null
+++ b/cmkconfinv
@@ -0,0 +1,9 @@
+[loadbalancers]
+lb01 ansible_ssh_host=10.88.88.127 folder=loadbalancers
+
+[webservers]
+web1 ansible_ssh_host=10.88.88.128 folder=webservers
+web2 ansible_ssh_host=10.88.88.129 folder=webservers
+
+[omd]
+omd ansible_ssh_host=10.88.88.150 folder=omd
diff --git a/loadbalancers.yml b/loadbalancers.yml
new file mode 100644
index 0000000..d97fd1f
--- /dev/null
+++ b/loadbalancers.yml
@@ -0,0 +1,7 @@
+---
+# file: loadbalancers.yml
+- hosts: loadbalancers
+  vars_files: [roles/common/vars/usersandpsks.yml, roles/omd/vars/main.yml]
+  roles:
+    - common
+    - loadbalancers
diff --git a/omd.yml b/omd.yml
new file mode 100644
index 0000000..634404c
--- /dev/null
+++ b/omd.yml
@@ -0,0 +1,7 @@
+---
+# file: omd.yml
+- hosts: omd
+  #vars:
+  #vars_files:
+  roles:
+    - omd
diff --git a/roles/common/files/ssh_keys/mariusp.pub b/roles/common/files/ssh_keys/mariusp.pub
new file mode 100644
index 0000000..694f8e5
--- /dev/null
+++ b/roles/common/files/ssh_keys/mariusp.pub
@@ -0,0 +1 @@
+ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/asZXkhLJVGIcPQGUxZDLl/yMwslgn6GyJd6QGKUmR+Snr1hMz01y7WEWPvfXXUqNym6rMU5fAMUr+alcyzMGZYKyymTLfjgp0SUuWG3TGpl3EPxnfGwNcXOvuJE9cnY0q3nhZgQjvn6EdEFDKAmLG1WXlKYjbQUUrHp0wFvEx3TNIXMVJqHxbKi8Uwyvn5EB1emdeJkaAaXJbk1TxALu400Ts0KYJUUyMn5njJjVELwtPVsnb0skmKSXd4dgBLN+wo94YQLpdfCnmho0uPhZfTHHi0+jtJNtUSycOSuOr/TxYGirxOYcb5FoOvzg9L0RyQAj6O+Hzs3RkHB+qast mariusp@marduk.local
diff --git a/roles/common/handlers/main.yml b/roles/common/handlers/main.yml
new file mode 100644
index 0000000..079c4f1
--- /dev/null
+++ b/roles/common/handlers/main.yml
@@ -0,0 +1,15 @@
+---
+# file: roles/common/handlers/main.yml
+- name: restart ntp
+  service: name=ntp state=restarted
+  tags:
+    - ntpd
+
+- name: restart xinetd
+  service: name=xinetd state=restarted
+  tags: xinetd
+
+- name: restart sshd
+  service: name=sshd state=restarted
+  tags:
+    - sshd
diff --git a/roles/common/tasks/main.yml b/roles/common/tasks/main.yml
new file mode 100644
index 0000000..31c70b9
--- /dev/null
+++ b/roles/common/tasks/main.yml
@@ -0,0 +1,73 @@
+---
+# file: roles/common/tasks/main.yml
+- name: make sure ntp,epel,etc. are installed
+  yum: pkg={{ item }} state=installed
+  with_items:
+    - ntp
+    - xinetd
+    - epel-release
+    #- screen
+    #- vim-enhanced
+    #- mc
+  tags: packages
+
+- name: add sphs group
+  action: group name=sphs state=present
+
+- name: add our users
+  action: user name={{ item }} groups=sphs state=present append=yes
+  with_items: "{{ usersAdd }}"
+  when: item != 'none'
+
+- name: Add SSH public key to user mariusp
+  action: authorized_key user=mariusp key="{{ lookup('file', "../files/ssh_keys/mariusp.pub") }}"
+
+- name: Remove users
+  action: user name={{ item }} state=absent remove=yes
+  with_items: "{{ usersDel }}"
+  when: item != 'none'
+
+# Enable sudo for sphs group with no password
+- name: Enable sudo without password for sphs group
+  action: 'lineinfile "dest=/etc/sudoers" state=present regexp="^%sphs ALL" line="%sphs ALL=(ALL) NOPASSWD: ALL"'
+
+- name: install check_mk agent
+  yum: pkg=http://{{ omdhost }}/{{ omdsite }}/check_mk/agents/{{ rpmagent }} state=installed
+  tags:
+    - check_mk_agent
+
+# change to get_uri - do some error checking
+- name: add host to omd
+  uri:
+    method: POST
+    body_format: json
+    url: http://{{omdhost}}/{{omdsite}}/check_mk/webapi.py?action=add_host&_username={{automationuser}}&_secret={{autosecret}}
+    body: 'request={"attributes":{"alias":"{{inventory_hostname}}","ipaddress":"{{ansible_default_ipv4["address"]}}"},"hostname":"{{inventory_hostname}}","folder":"{{folder}}"}'
+  delegate_to: 127.0.0.1
+  tags:
+    - check_mk_agent
+  notify:
+    - cmk_discovery
+    - cmk_apply
+
+- name: cmk_discovery
+  uri:
+    method: POST
+    url: http://{{ omdhost }}/{{ omdsite }}/check_mk/webapi.py?action=discover_services&_username={{ automationuser }}&_secret={{ autosecret }}&mode=refresh
+    body: 'request={"hostname":"{{ inventory_hostname }}"}'
+    body_format: json
+    status_code: 200
+  tags:
+    - check_mk_discovery
+  delegate_to: 127.0.0.1
+
+- name: cmk_apply
+  uri:
+    method: POST
+    url: http://{{ omdhost }}/{{ omdsite }}/check_mk/webapi.py?action=activate_changes&_username={{ automationuser }}&_secret={{ autosecret }}&mode=specific
+    body: request={"sites":["{{ omdsite }}"]}
+    body_format: json
+    status_code: 200
+  tags:
+    - check_mk_apply
+  delegate_to: 127.0.0.1
diff --git a/roles/common/vars/usersandpsks.yml b/roles/common/vars/usersandpsks.yml
new file mode 100644
index 0000000..059f83a
--- /dev/null
+++ b/roles/common/vars/usersandpsks.yml
@@ -0,0 +1,8 @@
+---
+usersAdd:
+  - mariusp
+usersDel:
+  - none
+usersPSK:
+  - name: mariusp
+    psk: ["../files/ssh_keys/mariusp.pub"]
diff --git a/roles/loadbalancers/files/check_ha.sh b/roles/loadbalancers/files/check_ha.sh
new file mode 100644
index 0000000..b1e599c
--- /dev/null
+++ b/roles/loadbalancers/files/check_ha.sh
@@ -0,0 +1,24 @@
+#!/bin/bash
+CONN=`echo "show info" | socat /var/lib/haproxy/stats stdio |grep CurrConns | cut -d' ' -f2`
+SRVS=`cat /etc/haproxy/haproxy.cfg |grep check | grep server |wc -l`
+if [ $CONN = 0 ]; then
+    CONN=4
+fi
+if [ $SRVS = 0 ]; then
+    echo "<<>>"
+    echo "up_scale 1000"
+    echo "<<>>"
+    echo "down_scale 1000"
+else
+    let "CONNPERSRV=$CONN/$SRVS"
+    echo "<<>>"
+    echo "up_scale $CONNPERSRV"
+    if [ $SRVS -le 2 ]; then
+        echo "<<>>"
+        echo "down_scale 16"
+    else
+        echo "<<>>"
+        echo "down_scale $CONNPERSRV"
+    fi
+
+fi
diff --git a/roles/loadbalancers/files/haproxy.cfg.j2 b/roles/loadbalancers/files/haproxy.cfg.j2
new file mode 100644
index 0000000..ea75cc9
--- /dev/null
+++ b/roles/loadbalancers/files/haproxy.cfg.j2
@@ -0,0 +1,57 @@
+global
+    log 127.0.0.1 local2
+    chroot /var/lib/haproxy
+    pidfile /var/run/haproxy.pid
+    maxconn 4000
+    user haproxy
+    group haproxy
+    daemon
+    stats socket /var/lib/haproxy/stats mode 644 level admin
+    stats timeout 2m
+
+#---------------------------------------------------------------------
+# common defaults that all the 'listen' and 'backend' sections will
+# use if not designated in their block
+#---------------------------------------------------------------------
+defaults
+    mode http
+    log global
+    option httplog
+    option dontlognull
+    option http-server-close
+    option forwardfor except 127.0.0.0/8
+    option redispatch
+    retries 3
+    timeout http-request 10s
+    timeout queue 1m
+    timeout connect 10s
+    timeout client 1m
+    timeout server 1m
+    timeout http-keep-alive 10s
+    timeout check 10s
+    maxconn 3000
+
+#---------------------------------------------------------------------
+# main frontend which proxys to the backends
+#---------------------------------------------------------------------
+frontend main *:5000
+    acl url_static path_beg -i /static /images /javascript /stylesheets
+    acl url_static path_end -i .jpg .gif .png .css .js
+
+    #use_backend static if url_static
+    #default_backend appname
+
+##
+listen appname 0.0.0.0:80
+    mode http
+    stats enable
+    stats uri /haproxy?stats
+    stats realm Strictly\ Private
+    stats auth marius:marius
+    balance roundrobin
+    option httpclose
+    option forwardfor
+    # we are adding our hosts manually ..
+    # we could populate this dynamically from our inventory
+    server web1 10.88.88.128:80 check
+    server web2 10.88.88.129:80 check
diff --git a/roles/loadbalancers/handlers/main.yml b/roles/loadbalancers/handlers/main.yml
new file mode 100644
index 0000000..d2c9b73
--- /dev/null
+++ b/roles/loadbalancers/handlers/main.yml
@@ -0,0 +1,29 @@
+---
+- name: restart haproxy
+  service: name=haproxy state=restarted
+  tags: haproxy
+
+# at run time the script will set a new variable via -e, e.g. new_server=' server web2 10.88.88.129:80 check'
+
+- name: add_backend
+  action: 'lineinfile "dest=/etc/haproxy/haproxy.cfg" state=present regexp="{{new_server}}" line="{{new_server}}"'
+  tags:
+    - add_backend
+
+# at run time the script will set a new variable via -e, e.g. old_server=' server web2 10.88.88.129:80 check'
+- name: del_backend
+  action: 'lineinfile "dest=/etc/haproxy/haproxy.cfg" state=absent regexp="{{old_server}}" line="{{old_server}}"'
+  tags:
+    - del_backend
+
+- name: cmk_discovery
+  command: curl 'http://{{ omdhost }}/{{ omdsite }}/check_mk/webapi.py?action=discover_services&_username={{ automationuser }}&_secret={{ autosecret }}&mode=refresh' -d 'request={"hostname":"{{ inventory_hostname }}"}'
+  tags:
+    - check_mk_agent
+    - check_mk_discovery
+
+- name: cmk_apply
+  command: curl 'http://{{ omdhost }}/{{ omdsite }}/check_mk/webapi.py?action=activate_changes&_username={{ automationuser }}&_secret={{ autosecret }}&mode=specific' -d 'request={"sites":["{{ omdsite }}"]}'
+  tags:
+    - check_mk_agent
+    - check_mk_discovery
diff --git a/roles/loadbalancers/tasks/main.yml b/roles/loadbalancers/tasks/main.yml
new file mode 100644
index 0000000..899d49b
--- /dev/null
+++ b/roles/loadbalancers/tasks/main.yml
@@ -0,0 +1,22 @@
+---
+- name: make sure haproxy and socat are installed
+  yum: name={{ item }} state=latest
+  with_items:
+    - socat
+    - haproxy
+  tags: packages
+
+- name: copy haproxy configuration files
+  copy: src=../files/haproxy.cfg.j2 dest=/etc/haproxy/haproxy.cfg backup=yes mode=0644
+  notify:
+    - restart haproxy
+
+- name: deploy check_ha.sh (autoscale)
+  copy: src=../files/check_ha.sh dest=/usr/lib/check_mk_agent/plugins/check_sa.sh mode=0755
+  tags: check_sa
+  notify:
+    - cmk_discovery
+    - cmk_apply
+
+- name: enable haproxy
+  service: name=haproxy enabled=yes state=started
diff --git a/roles/omd/tasks/main.yml b/roles/omd/tasks/main.yml
new file mode 100644
index 0000000..a0a9ca2
--- /dev/null
+++ b/roles/omd/tasks/main.yml
@@ -0,0 +1,27 @@
+---
+# file: roles/omd/tasks/main.yml
+# we need omd host/site from omd role
+- include_vars: roles/omd/vars/main.yml
+
+- name: make sure epel is installed
+  yum: pkg={{ item }} state=installed
+  with_items:
+    - epel-release
+  tags: packages
+
+- name: upload omd package
+  copy: src=roles/omd/files/check-mk-enterprise-1.2.7i3p1-el6-36.x86_64.rpm dest=/tmp
+
+- name: install omd server
+  yum: name=/tmp/check-mk-enterprise-1.2.7i3p1-el6-36.x86_64.rpm state=present
+
+# might be nice to create ansible module for omd
+- name: create prod instance
+  command: /usr/bin/omd create prod
+  tags:
+    - omdcreate
+
+- name: start our prod instance
+  command: /usr/bin/omd start prod
+  tags:
+    - omdstart
diff --git a/roles/omd/vars/main.yml b/roles/omd/vars/main.yml
new file mode 100644
index 0000000..c56c487
--- /dev/null
+++ b/roles/omd/vars/main.yml
@@ -0,0 +1,7 @@
+---
+automationuser: "automaton"
+autosecret: "GUVKRNECLRGFBTQJCRFY"
+omdhost: "10.88.88.150"
+#omdhost: "192.168.217.129"
+omdsite: "prod"
+rpmagent: "check-mk-agent-1.2.7i3p1-1.noarch.rpm"
diff --git a/roles/webservers/files/konf.jpg b/roles/webservers/files/konf.jpg
new file mode 100644
index 0000000..fd91efe
Binary files /dev/null and b/roles/webservers/files/konf.jpg differ
diff --git a/roles/webservers/files/status.conf.j2 b/roles/webservers/files/status.conf.j2
new file mode 100644
index 0000000..cfb047e
--- /dev/null
+++ b/roles/webservers/files/status.conf.j2
@@ -0,0 +1,6 @@
+
+    SetHandler server-status
+    Order deny,allow
+    Deny from all
+    Allow from 10.88.88.150 127.0.0.1 ::1
+
diff --git a/roles/webservers/handlers/main.yml b/roles/webservers/handlers/main.yml
new file mode 100644
index 0000000..df85424
--- /dev/null
+++ b/roles/webservers/handlers/main.yml
@@ -0,0 +1,20 @@
+---
+- name: restart httpd
+  service: name=httpd state=restarted
+  tags:
+    - httpd
+  notify:
+    - cmk_discovery
+    - cmk_apply
+
+- name: cmk_discovery
+  command: curl 'http://{{ omdhost }}/{{ omdsite }}/check_mk/webapi.py?action=discover_services&_username={{ automationuser }}&_secret={{ autosecret }}&mode=refresh' -d 'request={"hostname":"{{ inventory_hostname }}"}'
+  tags:
+    - check_mk_agent
+    - check_mk_discovery
+
+- name: cmk_apply
+  command: curl 'http://{{ omdhost }}/{{ omdsite }}/check_mk/webapi.py?action=activate_changes&_username={{ automationuser }}&_secret={{ autosecret }}&mode=specific' -d 'request={"sites":["{{ omdsite }}"]}'
+  tags:
+    - check_mk_agent
+    - check_mk_discovery
diff --git a/roles/webservers/tasks/main.yml b/roles/webservers/tasks/main.yml
new file mode 100644
index 0000000..13f45a8
--- /dev/null
+++ b/roles/webservers/tasks/main.yml
@@ -0,0 +1,36 @@
+---
+- name: make sure httpd is installed
+  yum: name=httpd state=latest
+  tags: httpd
+
+- name: enable httpd
+  service: name=httpd enabled=yes state=started
+  tags:
+    - httpd
+
+- name: enable http status
+  copy: src=../files/status.conf.j2 dest=/etc/httpd/conf.d/status.conf backup=yes mode=0644
+  notify:
+    - restart httpd
+  tags:
+    - http_status
+    - cmk_discovery
+    - cmk_apply
+
+- name: add apache_status plugin
+  get_url: url=http://{{ omdhost }}/{{ omdsite }}/check_mk/agents/plugins/apache_status dest=/usr/lib/check_mk_agent/plugins/apache_status mode=0755
+  tags:
+    - apache_status
+  notify:
+    - cmk_discovery
+    - cmk_apply
+
+- name: copy images to sites
+  copy: src=../files/konf.jpg dest=/var/www/html/ mode=0644
+  tags:
+    - webcontent
+
+- name: copy index.html to sites
+  template: src=../templates/index.html.j2 dest=/var/www/html/index.html mode=0644
+  tags:
+    - webcontent
diff --git a/roles/webservers/templates/index.html.j2 b/roles/webservers/templates/index.html.j2
new file mode 100644
index 0000000..b91c123
--- /dev/null
+++ b/roles/webservers/templates/index.html.j2
@@ -0,0 +1,16 @@
+
+
+Welcome to the 2ND Check_MK Conference!
+
+
+
+
+
+
+Welcome to the 2ND Check_MK Conference! +
+ +

Im running on {{ inventory_hostname }}.

+

Running on {{ ansible_os_family }} ;-}

+
+
diff --git a/site.yml b/site.yml
new file mode 100644
index 0000000..8431a9f
--- /dev/null
+++ b/site.yml
@@ -0,0 +1,5 @@
+---
+- include: bootstrap.yml
+- include: webservers.yml
+- include: loadbalancers.yml
+- include: omd.yml
diff --git a/webservers.yml b/webservers.yml
new file mode 100644
index 0000000..bab7828
--- /dev/null
+++ b/webservers.yml
@@ -0,0 +1,7 @@
+---
+# file: webservers.yml
+- hosts: webservers
+  vars_files: [roles/common/vars/usersandpsks.yml, roles/omd/vars/main.yml]
+  roles:
+    - common
+    - webservers
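
Note on the add_backend / del_backend entries in roles/loadbalancers/handlers/main.yml: their comments say the calling script passes the server line with -e. Ansible runs a handler only when a task notifies it, so one possible way to make the add_backend tag usable from the command line is to express the same lineinfile edit as a tagged task. The following is only a sketch under assumptions: the file name roles/loadbalancers/tasks/backends.yml is hypothetical and not part of this commit, and it would still need to be included from the role's tasks.

```yaml
# Hypothetical roles/loadbalancers/tasks/backends.yml (illustrative, not in this commit).
# Assumes new_server is supplied with -e, as the handler comment describes.
- name: add backend server line to haproxy.cfg
  lineinfile:
    dest: /etc/haproxy/haproxy.cfg
    state: present
    regexp: "{{ new_server }}"
    line: "{{ new_server }}"
  # skip the task on normal runs where no server line was passed in
  when: new_server is defined
  notify:
    - restart haproxy
  tags:
    - add_backend
```

It could then be invoked along these lines, where web3 and 10.88.88.130 are made-up example values:

> ansible-playbook -i cmkconfinv site.yml -l loadbalancers -u root --tags add_backend -e "new_server='    server web3 10.88.88.130:80 check'"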