Grega Bremec 2 years ago
parent
commit
be55f6838f
4 changed files with 194 additions and 11 deletions
  1. README.adoc (+194 -11)
  2. pics/psacct-sample.png (BIN)
  3. pics/sysstat-sample-io.png (BIN)
  4. pics/sysstat-sample-sched.png (BIN)

+ 194 - 11
README.adoc

@@ -22,7 +22,7 @@ the way in order to keep the disk space utilisation as low as possible.
 
 TBD
 
-== Standalone ==
+== Standalone Containers ==
 
 Start the composition.
 
@@ -44,6 +44,31 @@ d4840ad57bfffd4b069e7c2357721ff7aaa6b6ee77f90ad4866a76a1ceb6adb7
 
 ------
 
+Configure Prometheus with a scrape job that targets the `exporter` container.
+
+[subs=+quotes]
+------
+$ *podman inspect -f '{{.NetworkSettings.IPAddress}}' exporter*
+10.88.0.8
+
+$ *tail -n15 tmp-test/prometheus.yml*
+
+scrape_configs:
+  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
+  - job_name: "prometheus"
+    static_configs:
+      - targets: ["localhost:9090"]
+
+  **- job_name: "exporter"
+    metrics_path: "/q/metrics"
+    scheme: "http"
+    static_configs:
+      - targets: ["10.88.0.8:8080"]
+    scrape_interval: 10s
+    scrape_timeout: 5s**
+
+------
+
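The container IP address changes between runs, so it can help to generate the scrape job instead of typing it by hand. The following is a minimal sketch; `render_exporter_job` is a hypothetical helper that is not part of the repository, and the port default of `8080` is taken from the sample configuration above.

```shell
# Print a Prometheus scrape job for the exporter, given its address.
# render_exporter_job is a hypothetical helper, not part of the repository.
render_exporter_job() {
    addr="$1"          # container IP, e.g. as reported by `podman inspect`
    port="${2:-8080}"  # assumed default, taken from the sample config above
    cat <<EOF
  - job_name: "exporter"
    metrics_path: "/q/metrics"
    scheme: "http"
    static_configs:
      - targets: ["${addr}:${port}"]
    scrape_interval: 10s
    scrape_timeout: 5s
EOF
}
```

Assuming `scrape_configs` is the last section of the file, as in the sample above, the output could be appended with `render_exporter_job "$(podman inspect -f '{{.NetworkSettings.IPAddress}}' exporter)" >> tmp-test/prometheus.yml`.
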
 Add prometheus and grafana.
 
 [subs=+quotes]
@@ -62,17 +87,29 @@ $ *podman run --name grafana -d --rm -p 3000:3000 \*
 78d5bfa7977923b828c1818bb877fa87bdd96086cc8c875fbc46073489f6760e
 ------
 
+Configure Grafana with Prometheus as the data source and dashboard away!
+
+.Process Accounting Graphs from a Single Host
+image::pics/psacct-sample.png[scaledwidth="95%" width="95%"]
+
+.Sysstat Scheduler Information, Single Host
+image::pics/sysstat-sample-sched.png[scaledwidth="95%" width="95%"]
+
+.Sysstat I/O Information, Single Host
+image::pics/sysstat-sample-io.png[scaledwidth="95%" width="95%"]
+
 == Images ==
 
 This set of images requires a valid entitlement for RHEL (and consequently
 either a RHEL system to build on or a RHEL system to create an entitlement
 secret from).
 
-IMPORTANT: You do not have to build the images, they are already provided by the `is-readymade.yml` resource.
+IMPORTANT: You do not have to build the images; I have built them for `x86_64` and made them available on `quay.io/benko/`.
 
 === SAR ===
 
-The _system activity reporting_ image is based on `ubi-minimal` and includes just the `sysstat` package.
+The _system activity reporting_ image is based on `ubi-minimal` and includes
+just the `sysstat` package.
 
 It expects a volume to be attached at `/var/log/sa`.
 
@@ -80,9 +117,11 @@ Entrypoint takes care of initialising the `saXX` files.
 
 // TODO: and rotating any old files out of the way.
 
-It *requires* to be executed under `root` UID (can be rootless, but that may affect your data depending on host and container configuration).
+It *requires* to be executed under `root` UID (can be rootless, but that may
+affect your data depending on host and container configuration).
 
-It also *requires* access to host's network namespace if you want to measure global network statistics.
+It also *requires* access to host's network namespace if you want to measure
+global network statistics.
 
 // NOTE: When running in a pod, the below is irrelevant as the exporter sets
 //	    the hostname, and you can override it there. It does however obtain
@@ -94,17 +133,32 @@ It also *requires* access to host's network namespace if you want to measure glo
 
 ==== Parameters ====
 
-TBD
+`PERIOD`::
+    Sampling period in seconds. Defaults to `10`. Increase this to something
+    like `30` (or more) for hosts with many network interfaces, block devices,
+    and/or CPUs.
+
+`STARTUP_SCRATCH`::
+    Whether to scratch existing `sa1` data at startup. Defaults to `0`; only
+    `1`, `yes`, or `true` activate it, and any other value leaves it off.
+    
+`STARTUP_ROTATE`::
+    Whether to mark data as rotated at startup. Basically just writes a marker
+    in the previous `sadc` data file. Defaults to `0`; only `1`, `yes`, or
+    `true` activate it, and any other value leaves it off.
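
The on/off convention for the `STARTUP_*` flags can be sketched as a small shell test. This is an illustration only, not the actual entrypoint code: `flag_enabled` is a hypothetical name, and the exact (case-sensitive) matching is an assumption.

```shell
# Return success (0) when a flag value should be treated as "on".
# flag_enabled is a hypothetical name; only 1, yes and true activate a flag,
# and matching is assumed to be case-sensitive.
flag_enabled() {
    case "$1" in
        1|yes|true) return 0 ;;
        *)          return 1 ;;
    esac
}
```

An entrypoint could then guard the optional step with something like `if flag_enabled "${STARTUP_SCRATCH:-0}"; then ...; fi`.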
 
 === PSACCT ===
 
-The _process accounting_ image is based on `ubi-minimal` and includes just the `psacct` package.
+The _process accounting_ image is based on `ubi-minimal` and includes just the
+`psacct` package.
 
 It expects a volume to be attached at `/var/account`.
 
 Entrypoint takes care of rotating any old `pacct` files out of the way.
 
-In addition to *requiring* execution under a *real* `root` UID (i.e. *NOT* a rootless container), it also *requires* the `CAP_SYS_PACCT` capability (`--cap-add=SYS_PACCT`) and access to host's PID namespace (`--pid=host`).
+In addition to *requiring* execution under a *real* `root` UID (i.e. *NOT* a
+rootless container), it also *requires* the `CAP_SYS_PACCT` capability
+(`--cap-add=SYS_PACCT`) and access to host's PID namespace (`--pid=host`).
 
 // NOTE: When running in a pod, the below is irrelevant as the exporter sets
 //	    the hostname, and you can override it there. It does however obtain
@@ -116,11 +170,23 @@ In addition to *requiring* execution under a *real* `root` UID (i.e. *NOT* a roo
 
 ==== Parameters ====
 
-TBD
+`PERIOD`::
+    Sampling period in seconds. Defaults to `10`. Increase this to something
+    like `30` (or more) for hosts with many network interfaces, block devices,
+    and/or CPUs.
+
+`CUMULATIVE`::
+    Tells the collection process to never reset the `pacct` file and just keep
+    it growing, thus reporting cumulative stats since container start. Beware
+    that the `pacct` file will grow correspondingly large as time goes by.
+
+`STARTUP_SCRATCH`::
+    Whether to scratch existing `pacct` data at startup. Defaults to `0`; only
+    `1`, `yes`, or `true` activate it, and any other value leaves it off.
 
 === Exporter ===
 
-TBD
+The brain of the group.
 
 // TODO: Add support for hostname overrides in app.
 
@@ -141,7 +207,44 @@ TBD
 
 ==== Parameters ====
 
-TBD
+In `application.properties` or as Java system properties:
+
+`exporter.data.path`::
+    Override the location where the metrics files are expected to show up.
+    Defaults to `/metrics` but obviously can't be that for testing outside of a
+    container.
+
+==== Debugging ====
+
+There are a couple of logger categories that might help you see what's going on.
+
+By default, the routes are fairly noisy. `TRACE` level logging apparently
+doesn't work for some reason, so I had to bump everything up a level. At
+`INFO` you already see a note about every record that's been processed,
+including its unmarshaled body (completely shameless, I know).
+
+These can be bumped up to `DEBUG` if you need more info:
+
+`psacct-reader`::
+    The route reading process accounting records from the `psacct-dump-all` file.
+    Pretty much all the logic is here, but since there can be a large number of
+    process records in the file it is split and each record is processed
+    asynchronously by the dispatch route.
+
+`psacct-dispatch`::
+    The route dispatching the records to the registration service.
+
+`psacct-reset`::
+    To be able to work with instantaneous data, rather than cumulative, all
+    previously registered records are synchronously reset to zero upon the
+    arrival of a new snapshot. This prevents metrics for previously registered
+    processes from disappearing.
+
+`sysstat-reader`::
+    The route that reads the `sysstat-dump.json` file. All the logic is here.
+
+`net.p0f.openshift.metrics`::
+    Non-camel stuff is all logged in this category.
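
Quarkus sets per-category log levels with properties of the form `quarkus.log.category."<name>".level`. A minimal sketch that prints such lines for the categories listed above follows; `log_level_props` is a hypothetical helper, and whether the lines go into `application.properties` or are passed as system properties is up to you.

```shell
# Emit Quarkus logging properties that set the given categories to one level.
# log_level_props is a hypothetical helper; pass the level first, then
# one or more logger category names.
log_level_props() {
    level="$1"; shift
    for category in "$@"; do
        printf 'quarkus.log.category."%s".level=%s\n' "$category" "$level"
    done
}
```

For example, `log_level_props DEBUG psacct-reader sysstat-reader` prints one property line per category.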
 
 === Building with Podman ===
 
@@ -157,8 +260,20 @@ base image by using the `--from` option in `podman build`.
 $ *podman build --from=registry.fedoraproject.org/fedora-minimal:36 -f ./images/Containerfile-sysstat -t collector-sysstat:latest*
 -------------------------------
 
+You will have noticed there is no `Containerfile` for exporter. That is because
+`quarkus-maven-plugin` can do just fine
+https://quarkus.io/guides/container-image[building an image on its own]. Just
+add the `jib` extension and tell it to push the image somewhere.
+
+[subs=+quotes]
+-------------------------------
+$ *mvn package -Dquarkus.container-image.build=true -Dquarkus.container-image.push=true -Dquarkus.container-image.registry=foo*
+-------------------------------
+
 === Building in OpenShift ===
 
+==== Collector Images ====
+
 If building the images in OpenShift Container Platform, you must make sure an
 entitlement secret and corresponding RHSM certificate secret are mounted inside
 the build pod in order for packages to be found and installed.
@@ -229,3 +344,71 @@ NOTE: Key thing in `Containerfile` steps is to remove `/etc/rhsm-host` at some
       point unless `/etc/pki/entitlement-host` contains something (such as for
       example, valid entitlements). Both are symlinks to `/run/secrets`.
 
+==== Exporter Image ====
+
+===== Java Build =====
+
+Java build is relatively simple.
+
+Figure out what OpenJDK image is available in the cluster and create a new build.
+
+[subs=+quotes]
+-------------------------------
+$ *oc new-build openjdk-11-rhel8:1.0~https://github.com/benko/linux-metrics-exporter.git --context-dir=exporter*
+-------------------------------
+
+Wait for the build to complete (it's going to take quite some time to download all deps) and that's it!
+
+If you're experimenting with the code, don't forget to mark the build as incremental.
+
+[subs=+quotes]
+-------------------------------
+$ *oc patch bc/linux-metrics-exporter -p '{"spec": {"strategy": {"sourceStrategy": {"incremental": true}}}}'*
+-------------------------------
+
+===== Native Build =====
+
+TBD
+
+// For the native build, you need a specific Mandrel image. Import it first.
+// 
+// $ oc import-image mandrel --from=registry.redhat.io/quarkus/mandrel-21-rhel8:latest --confirm
+// imagestream.image.openshift.io/mandrel imported
+// ...
+
+===== Publishing Image =====
+
+Make sure the internal OpenShift image registry is exposed if you want to copy the image somewhere else.
+
+[subs=+quotes]
+-------------------------------
+$ *oc patch config.imageregistry/cluster --type=merge -p '{"spec": {"defaultRoute": true}}'*
+-------------------------------
+
+Log in to both the source and target registries.
+
+[subs=+quotes]
+-------------------------------
+$ *podman login quay.io*
+Username: *youruser*
+Password: *yourpassword*
+Login Succeeded!
+
+$ *oc whoami -t*
+sha256~8tIizkcLNroDEcWXJgoPMsVYUriK1sGnJ6N94WSveEU
+
+$ *podman login default-route-openshift-image-registry.apps.your.openshift.cluster*
+Username: _this-is-irrelevant_
+Password: *token-pasted-here*
+Login Succeeded!
+-------------------------------
+
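The token login above can also be done without pasting the token, by piping it straight into `podman login`. This is a sketch under the assumption that `oc` is already logged in to the cluster; `registry_login` is a hypothetical wrapper name, and the user name is a placeholder because the registry ignores it for token logins.

```shell
# Log in to an OpenShift registry route using the current oc session token.
# registry_login is a hypothetical wrapper; the user name is a placeholder.
registry_login() {
    registry="$1"  # e.g. the default-route host of the internal registry
    oc whoami -t | podman login --username unused --password-stdin "$registry"
}
```

Call it as `registry_login default-route-openshift-image-registry.apps.your.openshift.cluster`. Reading the password from standard input keeps the token out of the shell history.
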
+Then simply copy the image using `skopeo`.
+
+[subs=+quotes]
+-------------------------------
+$ *skopeo copy \*
+    *docker://default-route-openshift-image-registry.apps.your.openshift.cluster/project/linux-metrics-exporter:latest \*
+    *docker://quay.io/youruser/yourimage:latest*
+-------------------------------
+
