
Using AWS Distro for OpenTelemetry to instrument your Java application

Part of AWS Collective
It is usually a fun experience to see observability in action with AWS. The AWS console can feel like a candy store for telemetry data. With AWS X-Ray, you see traces and spans revealing the end-to-end execution of your code. With Amazon CloudWatch, you can quickly visualize metrics with different charts and configurations. But have you ever wondered how that data ends up there in the first place? Most AWS services push their telemetry data automatically, so you never have to worry about it. But what if you need to send telemetry data from a particular Java microservice that you built? This guide shows you how to instrument a microservice written in Java so it sends telemetry data to AWS using AWS Distro for OpenTelemetry.

💡 You can find the complete code from this tutorial on GitHub. Read more about how to instrument Java applications using OpenTelemetry in this blog post, or watch this hands-on YouTube series.

To illustrate how this is done, we will use a microservice implemented using Spring Boot. This microservice creates both traces and metrics using the OpenTelemetry SDK for Java.

package tutorial.buildon.aws.o11y;

import java.util.Objects;
import javax.annotation.PostConstruct;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.*;

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.metrics.LongCounter;
import io.opentelemetry.api.metrics.Meter;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;
import io.opentelemetry.instrumentation.annotations.WithSpan;

import static tutorial.buildon.aws.o11y.Constants.*;
import static java.lang.Runtime.*;

@RestController
public class HelloAppController {

    private static final Logger log =
        LoggerFactory.getLogger(HelloAppController.class);

    @Value("${otel.traces.api.version}")
    private String tracesApiVersion;

    @Value("${otel.metrics.api.version}")
    private String metricsApiVersion;

    private Tracer tracer;
    private Meter meter;
    private LongCounter numberOfExecutions;

    @PostConstruct
    public void createMetrics() {

        // The @Value fields are only injected after the bean is constructed,
        // so the tracer and meter must be created here rather than in field
        // initializers, where they would still see null values.
        tracer =
            GlobalOpenTelemetry.getTracer("io.opentelemetry.traces.hello",
                tracesApiVersion);

        meter =
            GlobalOpenTelemetry.meterBuilder("io.opentelemetry.metrics.hello")
                .setInstrumentationVersion(metricsApiVersion)
                .build();

        numberOfExecutions =
            meter
                .counterBuilder(NUMBER_OF_EXEC_NAME)
                .setDescription(NUMBER_OF_EXEC_DESCRIPTION)
                .setUnit("1")
                .build();

        meter
            .gaugeBuilder(HEAP_MEMORY_NAME)
            .setDescription(HEAP_MEMORY_DESCRIPTION)
            .setUnit("By")
            .buildWithCallback(
                // Current heap usage: total heap minus free heap, in bytes
                measurement -> measurement.record(
                    getRuntime().totalMemory() - getRuntime().freeMemory()));

    }

    @RequestMapping(method= RequestMethod.GET, value="/hello")
    public Response hello() {
        Response response = buildResponse();
        // Creating a custom span
        Span span = tracer.spanBuilder("mySpan").startSpan();
        try (Scope scope = span.makeCurrent()) {
            if (response.isValid()) {
                log.info("The response is valid.");
            }
            // Update the synchronous metric
            numberOfExecutions.add(1);
        } finally {
            span.end();
        }
        return response;
    }

    @WithSpan
    private Response buildResponse() {
        return new Response("Hello World");
    }

    private record Response (String message) {
        private Response {
            Objects.requireNonNull(message);
        }
        private boolean isValid() {
            return true;
        }
    }

}

This code produces three spans and two metrics. The spans are the root span /hello, created automatically by the agent's Spring instrumentation, and its child spans buildResponse and mySpan, in that order. As for the metrics, the code creates a counter named custom.metric.number.of.exec that increments every time the microservice is invoked, and a gauge named custom.metric.heap.memory that continuously monitors JVM heap utilization.

All of this is possible thanks to the tracer and meter objects created during the microservice initialization. However, the JVM that runs the microservice must be properly instrumented so that these objects can be created in the first place. To instrument the JVM, download the AWS distribution of the OpenTelemetry agent for Java and run the microservice with that agent attached. Let's automate this process with a script. Create a file named run-microservice.sh with the following content:

#!/bin/bash

mvn clean package -Dmaven.test.skip=true

AGENT_FILE=opentelemetry-javaagent-all.jar
if [ ! -f "${AGENT_FILE}" ]; then
  curl -L https://github.com/aws-observability/aws-otel-java-instrumentation/releases/download/v1.19.2/aws-opentelemetry-agent.jar --output ${AGENT_FILE}
fi

export OTEL_TRACES_EXPORTER=otlp
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:5555

export OTEL_RESOURCE_ATTRIBUTES=service.name=hello-app,service.version=1.0
export OTEL_TRACES_SAMPLER=always_on
export OTEL_IMR_EXPORT_INTERVAL=1000
export OTEL_METRIC_EXPORT_INTERVAL=1000

java -javaagent:./${AGENT_FILE} -jar target/hello-app-1.0.jar

By executing the microservice with this script, the agent automatically instruments its JVM, making it possible to create the tracer and meter objects. The agent also instruments all the libraries and frameworks that the microservice code uses. Note that the script run-microservice.sh exports an environment variable named OTEL_EXPORTER_OTLP_ENDPOINT that instructs the agent to send all traces and metrics to the OTLP endpoint http://localhost:5555. This must be an endpoint exposed by an OpenTelemetry collector, which takes care of receiving all the telemetry data produced and sending it to AWS.

To implement the OpenTelemetry collector, you need to create a configuration file that describes how the processing pipeline should work. A processing pipeline comprises three components: one or more receivers, optional processors, and one or more exporters. Create a file named collector-config-aws.yaml with the following content:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:5555

processors:
  batch:
    timeout: 5s
    send_batch_size: 1024

exporters:
  awsemf:
    region: 'us-east-1'
    log_group_name: '/metrics/otel'
    log_stream_name: 'otel-using-java'
  awsxray:
    region: 'us-east-1'

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [awsemf]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [awsxray]

Let's understand this processing pipeline.

  1. In the receivers section, we declared an OTLP endpoint exposed over port 5555, bound to all network interfaces.

  2. In the processors section, we declared a batch processor that sends telemetry data to AWS in batches, flushing either every 5 seconds or when a batch reaches 1,024 items (spans or metric data points), whichever comes first.

  3. In the exporters section, we declared an exporter named awsemf that sends the metrics to Amazon CloudWatch.

  4. Also in the exporters section, we declared another exporter named awsxray that sends the traces to AWS X-Ray.

  5. Once declared, these components are wired together in the pipelines of the service section.

You can learn more about the OpenTelemetry collector configuration in the official documentation.
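Before wiring in AWS credentials, it can help to verify the pipeline locally by swapping the AWS exporters for the collector's logging exporter, which prints received telemetry to the collector's stdout. Below is a minimal sketch; the logging exporter and its loglevel option have shipped with most collector distributions, but check the exporter list of your specific collector version before relying on it:

```yaml
# collector-config-debug.yaml (hypothetical, for local testing only):
# print received telemetry to stdout instead of sending it to AWS.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:5555

exporters:
  logging:
    loglevel: debug

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
    metrics:
      receivers: [otlp]
      exporters: [logging]
```

Once spans and metrics show up in the collector's log output, you can switch back to collector-config-aws.yaml knowing that the microservice and agent sides are working.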

Now all that remains is to run an instance of the OpenTelemetry collector. AWS provides a container image for the collector that is fully compatible with the upstream OpenTelemetry implementation. You can use this image to run a container instance. For example, you can create a Docker Compose file that defines a container named collector that uses the configuration file collector-config-aws.yaml created earlier.

version: '3.0'

services:

  collector:
    image: public.ecr.aws/aws-observability/aws-otel-collector:latest
    container_name: collector
    hostname: collector
    command: ["--config=/etc/collector-config.yaml"]
    environment:
      - AWS_PROFILE=default
    volumes:
      - ./collector-config-aws.yaml:/etc/collector-config.yaml
      - ~/.aws:/root/.aws
    ports:
      - "5555:5555"

Keep in mind, though, that for the collector to use the awsemf and awsxray exporters properly, it needs valid AWS credentials. In the example above, we provided the credentials by creating a volume mount between the host folder ~/.aws and the container folder /root/.aws. This way, the container can read the credentials stored in the credentials file. To add your credentials to this file, run the command aws configure using the AWS CLI, as explained here.
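As a quick sanity check before starting the collector, you can verify that the credentials file the container will mount actually exists on the host. This small helper is not part of the original tutorial, just a convenience sketch:

```shell
#!/bin/bash
# Check for the credentials file that the Compose file mounts into /root/.aws
if [ -f "$HOME/.aws/credentials" ]; then
  echo "AWS credentials file found; the collector container can use it"
else
  echo "AWS credentials file missing; run 'aws configure' to create it"
fi
```

If the file is missing, the collector will start but the awsemf and awsxray exporters will fail when they try to call AWS APIs.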

  • I have a Python application that does something similar to your Java example application. In my case, each time an endpoint is accessed, it increments a counter meter (called ping) by 1 and records a random float value in a histogram meter (called ping_histogram). I can see both metrics in the "All metrics" section of my CloudWatch dashboard, but I'm having a hard time working with those metrics, even for simple tasks. How do I write a query that just tells me the number of times my endpoint has been called per day, for instance? Do you have/know any material on that?
    – Diogo Melo
    Commented Dec 8, 2022 at 12:11
  • 👋🏻 Hi Diogo. One thing you can do is use the Logs Insights feature of Amazon CloudWatch and create a query that sums your custom metric over a single day. Here is how I would do it for the example from this tutorial: stats sum(custom.metric.number.of.exec) by bin(1d) Here is a link with more examples: docs.aws.amazon.com/AmazonCloudWatch/latest/logs/… Alternatively, you can send the metrics to Amazon Managed Service for Prometheus and slice-and-dice your metrics there with a more flexible retention time. Commented Dec 8, 2022 at 15:17
  • Hi @RicardoFerreira. Thanks for the help. I see the metrics coming in both on Logs Insights and on Metrics. Say I make only one request. From Logs Insights, it sends around 2 events (not sure if that's the proper term) per minute, for 20 minutes. So if I try to group by bin(5m), one request generates around 50 ticks/counts. I'm using the image public.ecr.aws/aws-observability/aws-otel-collector:latest with --config=/etc/ecs/ecs-cloudwatch-xray.yaml, on ECS Fargate.
    – Diogo Melo
    Commented Dec 8, 2022 at 17:47
  • When the collector pushes the metrics to the backend, it does so by sending all metrics over again, even if they have the same value or no new data. Hence the multiple "ticks". You may need to individualize your queries for the specific metric you want. Commented Dec 8, 2022 at 18:40