VictoriaLogs on k0s: a log database that fits in 128 MiB
Published: 2026-06-10
Metrics on this cluster have been handled by VictoriaMetrics for a while, but logs were still kubectl logs and grep. I wanted Traefik access logs queryable — who hits which site, which requests fail — without renting a bigger VPS. VictoriaLogs turned out to be the whole answer: a single container with a 128 MiB memory limit, a 5 GiB hostPath volume, and a query UI. This post covers the server side; the log shipper is the next post.
Why VictoriaLogs
The contenders for "logs on a 2-core VPS shared with the actual workloads":
- Elasticsearch — needs more heap than this VPS has RAM. Ruled out in seconds.
- Loki — designed for object storage and microservices mode; single-binary mode works but still wants hundreds of MiB, and its label-cardinality rules push complexity onto the agent config.
- VictoriaLogs — one Go binary, no JVM, no object storage, indexes all fields including high-cardinality ones (IPs, paths), and speaks the Loki push protocol so every existing agent works with it.
I already run a clustered VictoriaLogs + Vector setup at work, so the query language (LogsQL) carries over. The homelab version is radically smaller: one replica, one Deployment, no Helm chart — the whole thing is ~90 lines of YAML.
The Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: victoria-logs
  namespace: logging
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: victoria-logs
  template:
    metadata:
      labels:
        app: victoria-logs
    spec:
      tolerations:
        - operator: Exists
      containers:
        - name: victoria-logs
          image: victoriametrics/victoria-logs:v1.11.0-victorialogs
          args:
            - --storageDataPath=/data
            - --retentionPeriod=7d
            - --httpListenAddr=:9428
            - --loggerFormat=json
          ports:
            - containerPort: 9428
              name: http
          resources:
            requests:
              cpu: 10m
              memory: 32Mi
            limits:
              cpu: 200m
              memory: 128Mi
          volumeMounts:
            - name: data
              mountPath: /data
          livenessProbe:
            httpGet:
              path: /health
              port: 9428
            initialDelaySeconds: 10
            periodSeconds: 30
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: victoria-logs-data
Worth pausing on three choices:
strategy: Recreate. The volume is a ReadWriteOnce hostPath claim. The default RollingUpdate would try to start the new pod while the old one still holds the data directory — two writers on one storage directory is how you corrupt an LSM tree. Recreate kills the old pod first. The trade-off is a few seconds of ingestion downtime per deploy, which doesn't matter: agents buffer and retry.
The resource block. 32 MiB request, 128 MiB limit — these are real numbers from a running instance ingesting Traefik access logs for nine sites. VictoriaLogs in steady state sits around 40–60 MiB here.
retentionPeriod=7d. Access logs are operational data, not an archive. A week answers "what happened" questions; anything that needs to live longer is a metric and already lives in VictoriaMetrics with 90-day retention.
Storage
Same pattern as the metrics volume — a hand-made PV pinned to a host directory, bound explicitly:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: victoria-logs-data
spec:
  capacity:
    storage: 5Gi
  accessModes: [ReadWriteOnce]
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /var/lib/victoria-logs
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: victoria-logs-data
  namespace: logging
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 5Gi
  volumeName: victoria-logs-data
5 GiB is oversized on purpose. VictoriaLogs compresses aggressively — a week of access logs for these sites is tens of MiB on disk. The headroom exists so that adding more log sources later doesn't require a storage migration.
Ingestion: three protocols, zero plugins
The Service is a plain ClusterIP on 9428. The same port accepts:
- /insert/jsonline: VictoriaLogs' native format
- /insert/loki/api/v1/push: the Loki protocol, which is what Promtail, Grafana Alloy, and Vector's loki sink speak
- /insert/elasticsearch/_bulk: the Elasticsearch bulk API, for Filebeat/Logstash
This is the practical reason VictoriaLogs slots into an existing setup so easily: any agent you already know how to configure can ship to it unmodified. My shipper points at:
http://victoria-logs.logging.svc.cluster.local:9428/insert/loki/api/v1/push
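The Service itself is not shown in this post. A minimal sketch of what it could look like, assuming the name and namespace implied by the URL above and the app: victoria-logs label from the Deployment (targetPort refers to the container port named http):

```yaml
# Sketch only: reconstructed from the hostname and labels used in this post,
# not copied from the running cluster.
apiVersion: v1
kind: Service
metadata:
  name: victoria-logs
  namespace: logging
spec:
  type: ClusterIP
  selector:
    app: victoria-logs
  ports:
    - name: http
      port: 9428
      targetPort: http   # the named containerPort in the Deployment
```

With that in place, victoria-logs.logging.svc.cluster.local:9428 resolves inside the cluster and serves all three ingestion paths plus the query API.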
Querying: VMUI and LogsQL
The built-in UI lives at :9428/select/vmui. Port-forward and go:
kubectl -n logging port-forward svc/victoria-logs 9428:9428
# → http://localhost:9428/select/vmui
LogsQL reads like a pipeline. A stream selector, then filters:
{job="traefik-access"} # everything from one stream
{job="traefik-access"} DownstreamStatus:~"[45][0-9][0-9]" # only 4xx/5xx
{job="traefik-access"} RequestPath:"/feed" _time:1h # one path, last hour
_time:5m error # word "error" anywhere, 5 min
Unlike Loki, field filters like DownstreamStatus:~"[45][0-9][0-9]" don't require the field to be a stream label — VictoriaLogs indexes the log fields themselves. That removes the whole "which labels are safe" cardinality design exercise.
Grafana datasource
Grafana needs the victoriametrics-logs-datasource plugin, installed and provisioned from the Helm values:
plugins:
  - victoriametrics-logs-datasource
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: VictoriaLogs
        type: victoriametrics-logs-datasource
        url: http://victoria-logs.logging.svc.cluster.local:9428
        access: proxy
After that, logs panels in dashboards take LogsQL expressions directly.
What can go wrong
Liveness kills the pod during startup. After an unclean shutdown VictoriaLogs has to reopen its on-disk data before /health responds. With a tight initialDelaySeconds the kubelet can kill it mid-startup and you get a restart loop. 10 seconds is enough for this data volume; if your volume is bigger, scale the delay with it (I hit exactly this class of problem with another stateful pod on this node and ended up at initialDelaySeconds: 30).
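An alternative to stretching initialDelaySeconds is a startupProbe, which holds off liveness checks until the first successful response. A sketch of the container-level fragment, with thresholds that are illustrative assumptions and not values from this cluster:

```yaml
# Fragment for the victoria-logs container spec. The numbers are assumptions;
# size failureThreshold x periodSeconds to your worst-case startup time.
startupProbe:
  httpGet:
    path: /health
    port: 9428
  periodSeconds: 5
  failureThreshold: 24   # up to 24 x 5s = 120s before the kubelet gives up
livenessProbe:
  httpGet:
    path: /health
    port: 9428
  periodSeconds: 30      # only starts counting once the startupProbe has passed
```

The upside is that a normal start is not slowed down: the liveness probe takes over as soon as /health answers once.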
Provisioned dashboards show "datasource not found". Grafana generates its own datasource UID unless you pin one. A dashboard JSON committed with "uid": "victorialogs" breaks when the provisioned datasource got UID PD775F2863313E6C7 instead. Either set uid: explicitly in the datasource provisioning block, or copy the generated UID into the dashboard JSON. I learned this by staring at an empty logs panel that worked fine in Explore.
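Pinning the UID is a one-line change to the provisioning block. A sketch, assuming the dashboards reference "uid": "victorialogs" as in the example above:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: VictoriaLogs
        type: victoriametrics-logs-datasource
        uid: victorialogs   # must match the "uid" in the dashboard JSON
        url: http://victoria-logs.logging.svc.cluster.local:9428
        access: proxy
```

Grafana may not change the UID of a datasource that already exists, so on a running instance it can be simpler to remove the old datasource and let provisioning recreate it.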
Disk fills despite retention. Retention deletes whole partitions, and the deletion runs on a schedule, so a sudden log flood (a bot hammering a site, an app stuck in an error loop) can outrun it. The fix is at the agent: drop noisy streams before they're shipped instead of trying to filter them out at query time.
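As a server-side backstop, VictoriaLogs also documents a size-based retention flag, -retention.maxDiskSpaceUsageBytes. This is not part of the setup described in this post; the sketch below assumes your version supports the flag (check the -help output) and uses 4GiB as an illustrative value under the 5 GiB volume:

```yaml
# Fragment for the container args. The size cap is an assumption added for
# illustration; the other flags are the ones from the Deployment above.
args:
  - --storageDataPath=/data
  - --retentionPeriod=7d
  - --retention.maxDiskSpaceUsageBytes=4GiB   # drop oldest partitions above this size
  - --httpListenAddr=:9428
  - --loggerFormat=json
```

Like time-based retention it works on whole partitions, so a flood within a single day can still exceed the cap. Dropping noise at the agent stays the primary fix.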
Summary
- One Deployment, one Service, one 5 GiB hostPath PV — the entire log backend in ~90 lines of YAML
- 32 MiB request / 128 MiB limit is genuinely enough for single-node access-log volume
- strategy: Recreate is mandatory with a ReadWriteOnce hostPath volume; RollingUpdate means two writers
- Loki-compatible push endpoint means Promtail/Alloy/Vector work without plugins
- LogsQL filters on any field without cardinality planning — the main day-to-day win over Loki
- Pin the Grafana datasource UID, or provisioned dashboards will point at nothing