Configuring the Soveren Agent
We use Helm for managing the deployment of Soveren Agents. To customize values sent to your Soveren Agent, you need to create the
values.yaml file in the folder that you use for custom Helm configuration.
You can change a number of things regarding the Soveren Agent deployment. You can always check our repository for the full list of possible values. But don't forget to run a
helm upgrade command after you've updated the
values.yaml file, providing the
-f path_to/values.yaml as a command line option.
values.yaml only for the values that you want to override!
Never use a complete copy of our
values.yaml from the repository. This leads to a lot of glitches in production that are hard and time consuming to track down.
values.yaml for the values that you want to change.
To save you some keystrokes when installing or updating the Agent, we suggest placing the following snippet into the
digger: token: <TOKEN>
Digger is a component of the Agent that actually sends metadata to the Soveren Cloud. Detection tool gets over-the-air updates of the part of the model from the Soveren Cloud. These are the places where the token value is used. (Detection tool gets the token value from Digger.)
You can adjust resource usage limits for each of the Soveren Agent's components.
As a rule of thumb, we do not recommend to change the
requests values. They are set with regard to a minimum reasonable functionality that the component can provide given that much resources.
limits however differ widely between Agent's components, and are heavily traffic dependent. There is no universal recipe for determining them, except to keep an eye on the actual usage and check how fast the data map is built by the product. General tradeoff here is this: the more resources you allow, the faster the map is built.
Soveren Agent does not persist any data, it is completely normal if any component restarts and virtual storage is flushed. The
ephemeral-storage is just for making sure that the virtual disk space is not overused. You can safely get rid of any of them.
The interceptors are placed on each node of the cluster as a DaemonSet. Their ability to collect the traffic is proportional to how much resources they are allowed to use.
HTTP requests and responses with
Content-type: application/json, reading from virtual network interfaces of the host and building request/response pairs. Thus the memory they use is directly proportional to how large those
The reading is done in a non-blocking fashion, leveraging the
libpcap library. If there is not enough CPU the interceptors may not have enough time to read the traffic and build enough request/response pairs relevant for building the data map.
The default configuration is the following. You are encouraged to observe the actual usage for a while and tune the
limits up or down.
interceptor: resources: requests: cpu: "100m" memory: "128Mi" limits: cpu: "1000m" memory: "2048Mi" ephemeral-storage: 100Mi
Kafka is the only component not built by Soveren and used pretty much as is. It can grow very large in terms of the
The default values here are as follows. Under normal circumstances you don't need to touch any of them.
kafka: embedded: resources: requests: cpu: "100m" memory: "650Mi" ephemeral-storage: 5Gi limits: cpu: "200m" memory: "1024Mi" ephemeral-storage: 10Gi
Heap usage by Kafka
In our testing, Kafka was found to be somewhat heap-hungry. That's why we limited the heap usage separately from the main memory usage limits. You don't need to change it but here's what is set as the default:
kafka: embedded: env: - name: KAFKA_HEAP_OPTS value: -Xmx512m -Xms512m
Digger is a component which reads the data from Kafka, sends relevant requests and responses to the Deection tool and collect the results. Then it forms a metadata packet and sends it to the Soveren Cloud which creates all those beautiful product dashboards.
Digger employs all sorts of data sampling algorithms to make sure that all endpoints and assets in the data map are uniformly covered. In particular, Digger looks into the Kafka topics and moves offsets in there according to what has already been covered.
Normally you should not want to change the resource values for Digger but here they are:
digger: resources: requests: cpu: "100m" memory: "100Mi" limits: cpu: "1500m" memory: "768Mi" ephemeral-storage: 100Mi
The Detection tool does all the heavy lifting when it comes to detecting data types in the flows and their sensitivity. It runs a custom built machine learning models using Python for that.
The values for the Detection tool resource consumption are adjusted for optimal performance regardless of the traffic nature. However, in some cases with a lot of heavy traffic it might make sense to increase the limits, so we encourage you to monitor the actual usage and adjust accordingly.
detectionTool: resources: requests: cpu: "100m" memory: "1680Mi" limits: cpu: "1100m" memory: "2304Mi" ephemeral-storage: 200Mi
We run a Prometheus agent to collect some metrics to check basic performance of the Soveren Agent. Values here are pretty generic for most cases.
prometheusAgent: resources: requests: memory: "192Mi" cpu: "75m" limits: memory: "400Mi" cpu: "75m" ephemeral-storage: 100Mi
Sometimes it makes sense to confine the Soveren Agent to dedicated namespaces to monitor. You can do that by explicitly stating the allowed namespaces (the allow list) or by excluding particular ones (the exclude list).
The syntax is like this:
- if nothing is specified then all namespaces will be covered;
- asterisk means everything;
action: allowincludes this namespace into monitoring;
action: denyexcludes this namespace from monitoring.
Here's an example of how you can do this:
digger: cfg: kubernetesfilterlist: definitions: # - namespace: default # action: allow # - namespace: kube-system # action: deny - namespace: "*" action: allow
When defining names you can use wildcards and globs like
devspace-[1-9] etc as defined in the Go path package
The default policy of the Agent is to work with explicitly mentioned namespaces and ignore everything else.
allow * if you set any
If you've placed some
deny definitions into the filter list and want everything else to be monitored then please make sure you've ended the list with the following:
- namespace: "*" action: allow
Changing the log level
By default log levels of all Soveren Agent components is set to
error. You can change this by specifying different log level for individual components, like this:
[digger|interceptor|detectionTool|prometheusAgent]: cfg: log: level: error
(You need to create different config sections for dirrefent components —
prometheusAgent — but the syntax is the same.)
We don't manage the log level of Kafka, by default it's