What is Cortex?
Cortex is a CNCF sandbox project that seeks to provide long-term storage and a global metrics view for metrics scraped by Prometheus. Prometheus has become the default way of monitoring applications and systems in a cloud-native world. For any real-world use case, Prometheus should be highly available, which brings its own set of challenges. Once you run Prometheus in HA mode, you hit issues such as data duplication and achieving a single pane of glass over that duplicated data. Cortex was born to solve these problems.
Why do we need Cortex?
Make Prometheus highly available for monitoring while de-duplicating the data.
Global metrics view — This provides us with a central location where we can observe the metrics of our entire infrastructure.
Long term storage — Prometheus's local storage is not meant as durable long-term storage. Metrics sent to Cortex are stored in the configured storage service.
Optionally, Cortex also supports object stores for storing chunks, such as GCS and S3.
Thanos vs Cortex — differences and similarities
Thanos and Cortex have very similar goals: aggregate metrics, store them in block storage, and provide a single pane of glass for all your metrics. Thus, it's no surprise that both projects reuse a lot of Prometheus code. However, there are a few key differences that may help you decide to use one over the other.
| Cortex | Thanos |
| --- | --- |
| Recent data is stored in ingesters (a Cortex component) | Recent data is stored in Prometheus |
| Uses the Prometheus remote write API to send data to the remote location | Uses a sidecar approach to write data to a remote location |
| HA is supported | HA is supported |
| A single setup can be integrated with multiple Prometheus instances | A single setup is associated with a single Prometheus instance |
| Only egress is required from the cluster running Prometheus | Requires ingress to the cluster in which Prometheus is running for querying |
| Indexed chunks | Prometheus TSDB blocks |
Cortex Installation
Cortex can be easily installed using the Helm package manager in Kubernetes. We will use the standard Helm chart created by the Cortex team, but first we have to install Consul inside the cluster as the data store.
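If the cortex namespace does not exist in your cluster yet, you will likely need to create it first (an assumption about your setup; adjust the name if you use a different namespace):
$ kubectl create namespace cortex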
$ helm repo add hashicorp https://helm.releases.hashicorp.com
$ helm search repo hashicorp/consul
$ helm install consul hashicorp/consul --set global.name=consul --namespace cortex
Now that we have the data store in place, we need to configure the store gateway to connect with a remote storage backend. We decided to go ahead with an S3 bucket in AWS, so we have to customize the default values file of Cortex according to our use case. You can find the values file here; you need to add your service account under blocks_storage of the config.
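For illustration, here is a minimal sketch of what the relevant part of my-cortex-values.yaml might contain; the bucket name, endpoint and region are placeholders, and the exact keys can differ between chart versions:
config:
  blocks_storage:
    backend: s3
    s3:
      # hypothetical values: replace with your own bucket and region
      bucket_name: <your-cortex-bucket>
      endpoint: s3.<region>.amazonaws.com
      region: <region>
      # credentials can also come from an IAM role bound to the pod's
      # service account instead of static access keys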
$ helm repo add cortex-helm https://cortexproject.github.io/cortex-helm-chart
$ helm install cortex --namespace cortex -f my-cortex-values.yaml cortex-helm/cortex
With this, we are pretty much done with the Cortex setup, and now it's time to configure Prometheus to connect with Cortex. You can verify the Consul and Cortex pods by using kubectl.
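For example, assuming everything was installed in the cortex namespace:
$ kubectl get pods -n cortex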
Prometheus Configuration
For the Prometheus configuration, we just have to add the following block to our prometheus.yaml file. The remote write and read APIs are part of Prometheus and are used to send and receive metric samples to and from a third-party system, in our case Cortex.
remote_write:
  - url: http://cortex.cortex/api/prom/push
Here, we are adding the prometheus-community Helm chart repository so we can search for and install the kube-prometheus-stack chart from it.
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm install stable prometheus-community/kube-prometheus-stack --namespace cortex -f values.yaml
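If you are deploying Prometheus via kube-prometheus-stack as above, the remote write URL is usually set through the chart values rather than a hand-edited prometheus.yml. A minimal sketch of what values.yaml might contain, assuming the kube-prometheus-stack value layout:
prometheus:
  prometheusSpec:
    remoteWrite:
      # push scraped samples to Cortex
      - url: http://cortex.cortex/api/prom/push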
Now you can verify the Prometheus and Grafana pods by using kubectl. Once we are done with the changes in Prometheus, we can go to the S3 bucket to validate that data is being written.
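For example, assuming the AWS CLI is configured with access to the bucket you used in the Cortex values file, you can list its contents:
$ aws s3 ls s3://<your-cortex-bucket>/ --recursive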
PoC — Prometheus Integration with Cortex
You can either instrument your application with the language APIs and their in-process exporters to send metrics data to a Prometheus backend or you can use the OpenTelemetry Collector with its out-of-process exporters to send metrics data to a Prometheus backend. In this exercise, we are going to use a metrics instrumented application with the OpenTelemetry collector.
Here you can see the OpenTelemetry collector ConfigMap, Deployment and the Service that you need to use.
apiVersion: v1
kind: ConfigMap
metadata:
  name: collector-config
data:
  collector.yaml: |
    receivers:
      # accept OTLP data over gRPC (port 4317)
      otlp:
        protocols:
          grpc:
    processors:
      batch:
    exporters:
      logging:
      # expose received metrics for Prometheus to scrape on :8889,
      # prefixed with the "cortex" namespace
      prometheus:
        endpoint: 0.0.0.0:8889
        namespace: cortex
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, prometheus]
OTEL Collector Config Map YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: otel-collector
  template:
    metadata:
      labels:
        app.kubernetes.io/name: otel-collector
    spec:
      containers:
        - name: otelcol
          image: otel/opentelemetry-collector:0.27.0
          args:
            - --config=/conf/collector.yaml
            - --log-level=debug
          volumeMounts:
            - mountPath: /conf
              name: collector-config
      volumes:
        - configMap:
            items:
              - key: collector.yaml
                path: collector.yaml
            name: collector-config
          name: collector-config
OTEL Collector Deployment YAML
apiVersion: v1
kind: Service
metadata:
  annotations:
    serviceloadbalancer/lb.cookie-sticky-session: "false"
  name: otel-collector
spec:
  ports:
    - name: grpc-otlp
      port: 4317
      protocol: TCP
      targetPort: 4317
    - name: prome-otlp
      port: 8889
      protocol: TCP
      targetPort: 8889
  selector:
    app.kubernetes.io/name: otel-collector
  type: LoadBalancer
OTEL Collector Service YAML
The exporters: section is where exporters are configured. Exporters may come with default settings, but many require configuration to specify at least the destination and security settings. Any exporter configuration can be done in the ConfigMap. Here we use the prometheus exporter to expose the application's metrics for scraping. Now you can create the OpenTelemetry Collector ConfigMap, Deployment and Service resources using
$ kubectl create -f FILENAME
and verify the Prometheus, Grafana and OpenTelemetry collector pods by using kubectl.
Before deploying the PoC, we will configure Prometheus to scrape this new otel-collector target. In Prometheus terms, an endpoint you can scrape is called an instance, usually corresponding to a single process. A collection of instances with the same purpose (a process replicated for scalability or reliability, for example) is called a job. To achieve this, add the following job definition to the scrape_configs section in your prometheus.yml and restart your Prometheus instance.
- job_name: 'otel-collector'
  scrape_interval: 2s
  static_configs:
    - targets:
        - <ip_address>:8889
Please note that you must replace <ip_address> with the EXTERNAL-IP of the otel-collector service.
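You can look up that external IP from the service, for example:
$ kubectl get svc otel-collector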
$ helm delete stable --namespace cortex
$ helm install stable prometheus-community/kube-prometheus-stack --namespace cortex -f values.yaml
Next, you have to clone and configure the PoC which generates the metrics. This is a metrics-instrumented application which exports metrics to the OpenTelemetry Collector through the agent.
$ git clone https://github.com/open-telemetry/opentelemetry-collector.git
$ cd opentelemetry-collector/examples/demo/
Change the otlp exporter endpoint in the OpenTelemetry agent configuration to the external IP of your collector service (see the sketch below), then run the application using docker-compose.yaml.
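As an illustration only, and assuming the layout of the demo's agent config at the time of writing, the edited exporter section might look like this (the exact file name and TLS/insecure settings depend on the demo and collector version):
exporters:
  otlp:
    # hypothetical edit: point the agent at the collector's external IP
    endpoint: "<external-ip>:4317"
    insecure: true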
$ docker-compose up
Next you need to use port forwarding to access Prometheus and Grafana dashboards in the cluster.
$ kubectl port-forward svc/prometheus-server 3000:80 -n cortex
Open a new terminal window and forward a local port to a port on the Grafana service.
$ kubectl port-forward svc/stable-grafana 8080:80 -n cortex
Navigate to the Prometheus dashboard at http://127.0.0.1:3000/, click the Status drop-down on the navigation bar, select Targets, and then search for the otel-collector target.
Navigate to the endpoint of the otel-collector job and you will be able to see all the metrics generated by the application.
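You can also query the collector's Prometheus exporter endpoint directly to confirm metrics are being exposed (replace <ip_address> with the service's external IP):
$ curl http://<ip_address>:8889/metrics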
Verify metrics from Grafana
Next, we need to set up a Grafana metrics dashboard with Prometheus as the data source. To create a Prometheus data source, navigate to the Grafana application at http://127.0.0.1:8080/ and log in. The default login is admin. There is no default password; one is generated if it is not set and stored inside a Kubernetes Secret. You can read it this way:
$ kubectl get secret --namespace cortex stable-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Now create a data source of the Prometheus type:
1. Click on the “cogwheel” in the sidebar to open the Configuration menu.
2. Click on “Data Sources”.
3. Click on “Add data source”.
4. Select “Prometheus” as the type.
5. Set the appropriate Prometheus server URL.
6. Adjust other data source settings as desired.
7. Click “Save & Test” to save the new data source.
You can find the URL and the access mode below.
URL: http://cortex-query-frontend-headless:8080/prometheus
Access mode: server (default)
Let's start by adding a panel to show the request latency of the PoC. Press ‘Add Query’ on the panel and fill in the query using the cortex_appdemo_request_latency metric. Then click the Save and Apply buttons, and you will be able to see the metric on the dashboard.