Commit e3dc7290, authored 3 years ago by Thomas Schneider (parent a6cd3c46): Add README.md

# Ansible roles for Prometheus and related monitoring tools

A word of warning: these roles are typical FSMPI/AStA RWTH roles, i.e., they
assume certain details about the underlying infrastructure. However, they are
quite simple and thus self-documenting, and _should_ run fine on any
sufficiently recent (≥ buster) Debian.

## Variables of interest

Set `prometheus_host` to the host where Prometheus runs. The exporter roles
will configure their scraping on that host in
`/etc/prometheus/scrape/<exporter type>_{{ ansible_fqdn }}.yml`. This directory
is created and configured to be scanned by the prometheus role.
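
As a minimal sketch, `prometheus_host` could be set once for all hosts via group variables. The file location `group_vars/all.yml` and the host name `monitoring.example.org` are placeholders chosen for illustration; neither comes from this repository. The value presumably has to be a name under which Ansible can reach the Prometheus host, so check the roles' tasks to confirm what they expect.

```yaml
# group_vars/all.yml (hypothetical location; any place Ansible reads variables from works)
# monitoring.example.org is a placeholder for the host the prometheus role runs on.
prometheus_host: monitoring.example.org
```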

Generally, interesting variables are documented in the roles’ defaults. Where
applicable, they tend to mirror the upstream configuration structure; this holds
especially for the various tools that use YAML as well. We will not reproduce
upstream documentation here (it would only become outdated), so please consult it
directly (keep in mind that Debian usually ships an older version than current
upstream).

Several tools have distinct sets of options set via command line and
configuration file, respectively. Debian typically configures the former via
`/etc/default/<name>`. The roles set these command-line options via the various
`<name>_args` variables.

## Roles

You will likely want to use a reverse proxy in front of the user-facing web
interfaces (especially since Prometheus and Alertmanager support only very
simple authentication and authorisation, and recommend using a proxy as well).
Such configuration and setup is explicitly outside the scope of the roles in
this repository.

### Alertmanager

The role will _not_ handle installing the web interface (which Debian does not
ship).

### Grafana

Currently, this role hardcodes Grafana to listen on a UNIX socket at
`/run/grafana/sock`, which will be world-readable and world-writable.

Various variables are still lacking documentation.

### MySQL exporter

The role will _not_ create the user with the required permissions.

### Node exporter

Some file-based collectors are split out into a separate package as of Debian
bullseye.

You may want to disable S.M.A.R.T. checking in VMs, as virtual SCSI disks
surprisingly do not provide such information.

### Prometheus

The `prometheus_rules` variable corresponds to the Prometheus alerting rule
configuration, which is also YAML based and also uses `{{ }}` for templating.
In order not to collide with Ansible’s Jinja2 templating, you can use `[[ ]]`
for templating which is to be interpreted by Prometheus; it will be replaced by
`{{ }}` when creating `/etc/prometheus/rules/ansible_rules.yml` (see
`prometheus/templates/rules.yml.j2` for details).

Example:

```yaml
prometheus_rules:
  groups:
    - name: node
      rules:
        - alert: SmartDiskFault
          expr: smartmon_device_smart_healthy != 1
          annotations:
            summary: >-
              Disk [[ $labels.disk ]] on [[ reReplaceAll ":[\\d]+" ""
              $labels.instance ]] is faulty
            # The long line must not be broken, or Prometheus’/Golang’s
            # templating engine barfs
            # yamllint disable rule:line-length
            description: |-
              Information on the disk:
              [[ with printf "smartmon_device_info{disk='%s',instance='%s'}" $labels.disk $labels.instance | query ]]
              Model: [[ .Labels.device_model ]]
              Serial: [[ .Labels.serial_number ]]
              [[ end ]]
            # yamllint enable
```
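
To illustrate the substitution, the `summary` annotation from the example above would be expected to end up in `/etc/prometheus/rules/ansible_rules.yml` with `[[ ]]` turned into `{{ }}`, roughly as sketched below. This is an assumption about the rendered result, not output copied from a real run: the exact layout, quoting, and whether the top-level key is kept or dropped depend on `prometheus/templates/rules.yml.j2`.

```yaml
# Assumed shape of the generated rules file (not verified against the template).
# Only the template delimiters are expected to differ from the Ansible variable.
groups:
  - name: node
    rules:
      - alert: SmartDiskFault
        expr: smartmon_device_smart_healthy != 1
        annotations:
          summary: >-
            Disk {{ $labels.disk }} on {{ reReplaceAll ":[\\d]+" ""
            $labels.instance }} is faulty
```

If a promtool binary is installed on the Prometheus host, running `promtool check rules` against the generated file is one way to confirm that the result parses as intended.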