Nginx Plus Monitoring and Tracing: Harnessing the Power of OpenTelemetry

Raffaele Marcello
7 min read · Oct 29, 2023

With the release of R29, Nginx Plus introduced several enhanced features (for more information, see “Announcing NGINX Plus R29”). In this article, I would like to focus on native OpenTelemetry support.

OpenTelemetry is a comprehensive suite of tools, APIs, and SDKs designed for instrumenting, generating, collecting, and exporting telemetry data (including metrics, logs, and traces). It serves as a valuable resource for analyzing the performance and behavior of your software.

Before this official release, it was already possible to add OpenTelemetry distributed tracing support to Nginx using a module provided by the OpenTelemetry contrib project (see instrumentation/nginx in the open-telemetry/opentelemetry-cpp-contrib repository on GitHub and “Learn how to instrument NGINX with OpenTelemetry” on the OpenTelemetry site).

With this new release, the NGINX OTel dynamic module is now available, delivering a high-performance implementation of OTel for tracing HTTP requests with NGINX Plus.

To use the new ngx_otel_module with Nginx Plus, install the nginx-plus-module-otel package. Compared to the previous third-party module, the official module provides several enhancements, including:

  • Better Performance
  • Easy Provisioning
  • Fully Dynamic Variable-Based Sampling

Let’s explore how it works and how to configure it. In this demonstration, I will use Docker to build the Nginx Plus image and Docker Compose to run the example. You can find the code for this demo on my GitHub.

Nginx Plus Image Build with Otel module

First, we need to build an Nginx Plus image and install the ngx_otel_module on it. To do this, you will need an NGINX Plus subscription (purchased or trial). If you don’t have a subscription, you can request a trial here. Detailed build documentation is available here. For this demo, I used the recommended NGINX Plus Dockerfile (Alpine Linux), to which I added the nginx-plus-module-otel package:

FROM alpine:3.17

LABEL maintainer="NGINX Docker Maintainers <docker-maint@nginx.com>"

# Define NGINX versions for NGINX Plus and NGINX Plus modules
# Uncomment this block and the versioned nginxPackages in the main RUN
# instruction to install a specific release
# ENV NGINX_VERSION 29
# ENV NJS_VERSION 0.7.12
# ENV PKG_RELEASE 1

# Download certificate and key from the customer portal (https://account.f5.com)
# and copy to the build context
RUN --mount=type=secret,id=nginx-crt,dst=cert.pem \
--mount=type=secret,id=nginx-key,dst=cert.key \
set -x \
# Create nginx user/group first, to be consistent throughout Docker variants
&& addgroup -g 101 -S nginx \
&& adduser -S -D -H -u 101 -h /var/cache/nginx -s /sbin/nologin -G nginx -g nginx nginx \
# Install the latest release of NGINX Plus and/or NGINX Plus modules
# Uncomment individual modules if necessary
# Use versioned packages over defaults to specify a release
&& nginxPackages=" \
nginx-plus \
# nginx-plus=${NGINX_VERSION}-r${PKG_RELEASE} \
# nginx-plus-module-xslt \
# nginx-plus-module-xslt=${NGINX_VERSION}-r${PKG_RELEASE} \
# nginx-plus-module-geoip \
# nginx-plus-module-geoip=${NGINX_VERSION}-r${PKG_RELEASE} \
# nginx-plus-module-image-filter \
# nginx-plus-module-image-filter=${NGINX_VERSION}-r${PKG_RELEASE} \
# nginx-plus-module-perl \
# nginx-plus-module-perl=${NGINX_VERSION}-r${PKG_RELEASE} \
# nginx-plus-module-njs \
# nginx-plus-module-njs=${NGINX_VERSION}.${NJS_VERSION}-r${PKG_RELEASE} \
nginx-plus-module-otel \
" \
KEY_SHA512="e09fa32f0a0eab2b879ccbbc4d0e4fb9751486eedda75e35fac65802cc9faa266425edf83e261137a2f4d16281ce2c1a5f4502930fe75154723da014214f0655" \
&& wget -O /tmp/nginx_signing.rsa.pub https://nginx.org/keys/nginx_signing.rsa.pub \
&& if echo "$KEY_SHA512 */tmp/nginx_signing.rsa.pub" | sha512sum -c -; then \
echo "key verification succeeded!"; \
mv /tmp/nginx_signing.rsa.pub /etc/apk/keys/; \
else \
echo "key verification failed!"; \
exit 1; \
fi \
&& cat cert.pem > /etc/apk/cert.pem \
&& cat cert.key > /etc/apk/cert.key \
&& apk add -X "https://pkgs.nginx.com/plus/alpine/v$(egrep -o '^[0-9]+\.[0-9]+' /etc/alpine-release)/main" --no-cache $nginxPackages \
&& if [ -f "/etc/apk/keys/nginx_signing.rsa.pub" ]; then rm -f /etc/apk/keys/nginx_signing.rsa.pub; fi \
&& if [ -f "/etc/apk/cert.key" ] && [ -f "/etc/apk/cert.pem" ]; then rm -f /etc/apk/cert.key /etc/apk/cert.pem; fi \
# Bring in tzdata so users could set the timezones through the environment
# variables
&& apk add --no-cache tzdata \
# Bring in curl and ca-certificates to make registering on DNS SD easier
&& apk add --no-cache curl ca-certificates \
# Forward request and error logs to Docker log collector
&& ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log

EXPOSE 80

STOPSIGNAL SIGQUIT

CMD ["nginx", "-g", "daemon off;"]

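To create the Docker image, put the Dockerfile, nginx-repo.crt, and nginx-repo.key files in the same directory and run the following command. The image is tagged nginx-plus:r29, the name that the Docker Compose file in this demo expects. The secret mounts in the Dockerfile require BuildKit:

docker build --no-cache -t nginx-plus:r29 --secret id=nginx-crt,src=nginx-repo.crt --secret id=nginx-key,src=nginx-repo.key .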

Demo Architecture

The architecture that I propose for this demo consists of two Nginx Plus instances: one serves as an API Gateway, and the other as a backend service (a time server). The API Gateway forwards requests to the backend service. Both Nginx Plus instances are instrumented to send traces to an OpenTelemetry collector, which, in turn, forwards these traces to a Zipkin instance.

To demonstrate this setup in action, we will use Docker Compose to create this straightforward architecture. We’ll inject traffic into the system and observe the resulting traces on the Zipkin dashboard.

The Docker compose descriptor

The previous architecture can be reproduced with the following `docker-compose.yml` file:

version: '3'
services:
  zipkin:
    image: openzipkin/zipkin:2.24
    container_name: zipkin
    environment:
      - STORAGE_TYPE=mem
    ports:
      # Port used for the Zipkin UI and HTTP API
      - 9411:9411
  collector:
    image: otel/opentelemetry-collector:0.88.0
    command: ['--config=/etc/otel-collector-config.yaml']
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    depends_on:
      - zipkin
  api-gateway:
    image: nginx-plus:r29
    ports:
      - "8080:80"
    volumes:
      - ./api-gtw/nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - collector
      - backend
  backend:
    image: nginx-plus:r29
    ports:
      - "8081:80"
    volumes:
      - ./backend/nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - collector

This setup includes an in-memory Zipkin instance, an OpenTelemetry collector instance, and two Nginx Plus instances — one serving as an API Gateway and the other as a backend time server.

Now, let’s delve into the detailed configuration of each container.

Nginx configuration

The configuration for the api gateway is the following:

load_module modules/ngx_otel_module.so;
events {
    worker_connections 1024;
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;

    otel_service_name api-gateway;
    otel_exporter {
        endpoint collector:4317;
    }

    server {
        listen 80;

        otel_trace on;
        otel_trace_context propagate;

        location /uuid {
            proxy_pass http://backend:80;
        }

        location /time {
            proxy_pass http://backend:80;
        }
    }
}

This Nginx setup for the API gateway follows the typical configuration, with the exception of specific parts where it’s tailored for OpenTelemetry integration:

  • otel_service_name: This parameter defines the service name that will appear in the trace data;
  • otel_exporter: This block configures how OTel data is exported. Its endpoint parameter is the address of the OTLP/gRPC endpoint that will receive the telemetry data (here, the collector on port 4317);
  • otel_trace on;: This directive enables or disables OpenTelemetry tracing. In this demo, I’ve set it to ‘on’ so that every request is traced and sent to the collector. The directive also accepts a variable, which allows finer-grained control over sampling. Refer to the documentation for further information.
  • otel_trace_context propagate;: Here, we specify how to handle the traceparent/tracestate headers. The ‘propagate’ setting combines the ‘extract’ and ‘inject’ behaviors: if the incoming request already carries a trace context, its trace ID and parent span are inherited; otherwise, a new context is created. In both cases, the context is written to the headers of the proxied request. The other possible values are ‘extract’, ‘inject’, and ‘ignore’; please consult the documentation for more details.
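The variable-based sampling that otel_trace supports can be sketched as follows. This is a minimal sketch, not part of the demo: it assumes the gateway should start a trace for roughly 10% of requests and that the backend should simply follow the decision of its caller. The variable name $ratio_sampler is hypothetical, while split_clients, $otel_trace_id, and $otel_parent_sampled are provided by Nginx and the OTel module. The two http blocks belong to the two separate nginx.conf files.

```nginx
# --- api-gtw/nginx.conf (fragment): ratio-based sampling ---
http {
    otel_service_name api-gateway;
    otel_exporter {
        endpoint collector:4317;
    }

    # $ratio_sampler is a hypothetical variable name. split_clients hashes
    # the trace ID and maps about 10% of the values to "on".
    split_clients "$otel_trace_id" $ratio_sampler {
        10%     on;
        *       off;
    }

    server {
        listen 80;

        location /time {
            # Trace only the requests selected by the sampler
            otel_trace         $ratio_sampler;
            otel_trace_context inject;
            proxy_pass         http://backend:80;
        }
    }
}

# --- backend/nginx.conf (fragment): parent-based sampling ---
http {
    otel_service_name backend;
    otel_exporter {
        endpoint collector:4317;
    }

    server {
        listen 80;

        location /time {
            # Trace a request only if the caller marked it as sampled
            otel_trace         $otel_parent_sampled;
            otel_trace_context propagate;

            default_type application/json;
            return 200 '{"status":"success","time":"$time_local", "msec":"$msec"}';
        }
    }
}
```

With a setup like this, the sampling decision is made once at the edge, so a trace is either recorded by both services or by neither. The 10% ratio is only an illustration; a suitable value depends on traffic volume and on how much trace data the collector and Zipkin should store.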

The configuration for the time server backend is as follows:

load_module modules/ngx_otel_module.so;
events {
    worker_connections 1024;
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;

    otel_service_name backend;
    otel_exporter {
        endpoint collector:4317;
    }

    server {
        listen 80;

        otel_trace on;
        otel_trace_context propagate;

        location /uuid {
            proxy_pass https://httpbin.org/uuid;
        }

        location /time {
            default_type application/json;
            return 200 '{"status":"success","time":"$time_local", "msec":"$msec"}';
        }
    }
}
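Beyond the directives used in the two configurations above, the module offers a few more settings that can help when reading the traces. The fragment below is a hedged sketch of a backend variant, not the configuration used in this demo: the log format name main_otel, the span name, and the span attribute are hypothetical, and the batching values are only illustrative. The directives (interval, batch_size, batch_count, otel_span_name, otel_span_attr) and the variables ($otel_trace_id, $otel_span_id) come from ngx_otel_module.

```nginx
http {
    # main_otel is a hypothetical format name. Writing the trace and span
    # IDs to the access log makes it possible to match a log line with a
    # trace in Zipkin. Both variables are empty when a request is not traced.
    log_format main_otel '$remote_addr [$time_local] "$request" $status '
                         'trace_id=$otel_trace_id span_id=$otel_span_id';
    access_log /var/log/nginx/access.log main_otel;

    otel_service_name backend;
    otel_exporter {
        endpoint    collector:4317;
        # Illustrative batching values, not tuned for any workload
        interval    5s;    # maximum time between two exports
        batch_size  512;   # maximum number of spans per batch, per worker
        batch_count 4;     # number of pending batches per worker
    }

    server {
        listen 80;

        otel_trace on;
        otel_trace_context propagate;

        location /time {
            # Hypothetical span name and attribute, for illustration only
            otel_span_name time_endpoint;
            otel_span_attr demo.component time-server;

            default_type application/json;
            return 200 '{"status":"success","time":"$time_local", "msec":"$msec"}';
        }
    }
}
```

A custom span name and attributes make it easier to filter spans in the Zipkin UI, and the trace ID in the access log gives a simple way to move from a log line to the corresponding trace.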

OTEL Collector Configuration

Here is the configuration for the OpenTelemetry collector:

receivers:
  otlp:
    protocols:
      grpc:
      http:
  otlp/2:
    protocols:
      grpc:
        endpoint: 0.0.0.0:55690
exporters:
  zipkin/nontls:
    endpoint: "http://zipkin:9411/api/v2/spans"
    format: proto
    default_service_name: default-service
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [zipkin/nontls]

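The OpenTelemetry collector configuration defines a single traces pipeline composed of:

  • One OTLP receiver (otlp), which listens for gRPC on the default port 4317. This is the endpoint that both Nginx Plus instances export to.
  • One Zipkin exporter (zipkin/nontls), which sends the spans to the Zipkin instance on port 9411.

The second receiver, otlp/2, is defined but not referenced by the pipeline, so it plays no role in this demo.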

It’s important to note that this demo does not currently employ TLS for data flows. While there’s room for improvement in this regard, the primary focus here is to explore the functionality of OpenTelemetry.

Run the demo

To start the demo, use the following command:

docker compose up -d

This command will create four containers, including two Nginx Plus instances (one for an API gateway and one for a time service), an OpenTelemetry collector, and a Zipkin instance.

Creating network "nginx-otel_default" with the default driver
Creating zipkin ... done
Creating nginx-otel_collector_1 ... done
Creating nginx-otel_backend_1 ... done
Creating nginx-otel_api-gateway_1 ... done

To begin, open a web browser and navigate to the Zipkin dashboard at http://localhost:9411. Upon initial access, the dashboard will be empty. Let’s generate some traffic by running the following command:

for i in {1..5}
do
curl http://localhost:8080/uuid
curl http://localhost:8080/time
done

We can view the traces in Zipkin:

For more detailed information, select a specific trace to display its details:

Zipkin can also generate a graph of dependencies:

In this scenario, the graph consists of only two nodes: one for our API gateway and one for our backend.

Conclusion and next steps

In conclusion, the integration of Nginx Plus with OpenTelemetry is a powerful step towards achieving comprehensive observability and performance optimization for your applications. By implementing this integration, you gain real-time insights into your system’s performance, identify bottlenecks, and effectively troubleshoot issues before they impact your users. The benefits are clear: increased operational efficiency, reduced downtime, and improved user experiences.

This demo showed how to instrument Nginx Plus to push traces to the OpenTelemetry collector. The collector can be configured to send traces to other backends, either local (for example, Jaeger) or in the cloud (for example, Azure Application Insights). In the cloud, these traces can be combined with telemetry from the other components of the architecture.

Written by Raffaele Marcello

I love Cloud technologies and I am an AWS Certified Solutions Architect and SysOps Administrator. Here you can find some articles about my experiences.
