Java Guidelines: Implementation

Configuration

When configuring your client library, particular care must be taken to ensure that the consumer of your client library can properly configure the connectivity to your Azure service both globally (along with other client libraries the consumer is using) and specifically with your client library.

Client configuration

✅ DO use relevant global configuration settings either by default or when explicitly requested to by the user, for example by passing in a configuration object to a client constructor.

✅ DO allow different clients of the same type to use different configurations.

✅ DO allow consumers of your service clients to opt out of all global configuration settings at once.

✅ DO allow all global configuration settings to be overridden by client-provided options. The names of these options should align with any user-facing global configuration keys.

⛔️ DO NOT change behavior based on configuration changes that occur after the client is constructed. Hierarchies of clients inherit parent client configuration unless explicitly changed or overridden. Exceptions to this requirement are as follows:

Log level, which must take effect immediately across the Azure SDK.
Tracing on/off, which must take effect immediately across the Azure SDK.
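
The precedence rules above can be sketched with JDK types only. `ClientSettings`, its constructor parameters, and the `AZURE_EXAMPLE_TIMEOUT` key used in the usage note are hypothetical names chosen for illustration; this is not the azure-core configuration API, and it does not model the log level and tracing exceptions listed above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the azure-core configuration API.
final class ClientSettings {
    private final Map<String, String> resolved;

    /**
     * @param global        global settings (for example, read from environment variables)
     * @param clientOptions options supplied for this client; these override global settings
     * @param useGlobal     pass false to opt out of all global settings at once
     */
    ClientSettings(Map<String, String> global, Map<String, String> clientOptions, boolean useGlobal) {
        Map<String, String> merged = new HashMap<>();
        if (useGlobal) {
            merged.putAll(global);
        }
        // Client-provided options use the same key names and win over global values.
        merged.putAll(clientOptions);
        // Copy once at construction so later changes to the inputs do not alter this client.
        this.resolved = Collections.unmodifiableMap(merged);
    }

    String get(String key, String defaultValue) {
        return resolved.getOrDefault(key, defaultValue);
    }
}
```

Two clients of the same type can be built from different `clientOptions` maps, and a client built with `useGlobal` set to `false` falls back to the default passed to `get` (for example, `get("AZURE_EXAMPLE_TIMEOUT", "10")`).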

Service-specific environment variables

✅ DO prefix Azure-specific environment variables with AZURE_.

✔️ YOU MAY use client library-specific environment variables for portal-configured settings which are provided as parameters to your client library. This generally includes credentials and connection details. For example, Service Bus could support the following environment variables:

AZURE_SERVICEBUS_CONNECTION_STRING
AZURE_SERVICEBUS_NAMESPACE
AZURE_SERVICEBUS_ISSUER
AZURE_SERVICEBUS_ACCESS_KEY

Storage could support:

AZURE_STORAGE_ACCOUNT
AZURE_STORAGE_ACCESS_KEY
AZURE_STORAGE_DNS_SUFFIX
AZURE_STORAGE_CONNECTION_STRING

✅ DO get approval from the Architecture Board for every new environment variable.

✅ DO use this syntax for environment variables specific to a particular Azure service:

AZURE_<ServiceName>_<ConfigurationKey>

where ServiceName is the canonical shortname without spaces, and ConfigurationKey refers to an unnested configuration key for that client library.

⛔️ DO NOT use non-alphanumeric characters in your environment variable names, with the exception of the underscore. This ensures broad interoperability.
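
As a rough illustration of the naming rules above, the following sketch checks a candidate name against the `AZURE_<ServiceName>_<ConfigurationKey>` shape. The class `AzureEnvironmentNames` and its methods are hypothetical, and the upper-case requirement is an assumption drawn from the examples rather than a stated rule.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Azure library.
final class AzureEnvironmentNames {
    // AZURE_ prefix, a service short name, then one or more underscore-separated key segments.
    // Assumption: names are upper-case, as in the examples above.
    private static final Pattern VALID = Pattern.compile("AZURE_[A-Z0-9]+(_[A-Z0-9]+)+");

    static boolean isValid(String name) {
        return name != null && VALID.matcher(name).matches();
    }

    /** Builds a name from a service short name (spaces removed) and an unnested configuration key. */
    static String of(String serviceName, String configurationKey) {
        String name = "AZURE_" + serviceName.replace(" ", "").toUpperCase(Locale.ROOT)
            + "_" + configurationKey.toUpperCase(Locale.ROOT);
        if (!isValid(name)) {
            throw new IllegalArgumentException("'" + name + "' is not a valid environment variable name.");
        }
        return name;
    }
}
```

For instance, `AzureEnvironmentNames.of("Service Bus", "connection_string")` yields `AZURE_SERVICEBUS_CONNECTION_STRING`, while a name containing a hyphen is rejected.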

Parameter validation

The service client will have several methods that perform requests on the service. Service parameters are directly passed across the wire to an Azure service. Client parameters are not passed directly to the service, but used within the client library to fulfill the request. Examples of client parameters include values that are used to construct a URI, or a file that needs to be uploaded to storage.

✅ DO validate client parameters.

⛔️ DO NOT validate service parameters. This includes null checks, empty strings, and other common validating conditions. Let the service validate any request parameters.

✅ DO validate the developer experience when the service parameters are invalid to ensure appropriate error messages are generated by the service. If the developer experience is compromised due to service-side error messages, work with the service team to correct prior to release.
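
The distinction between client and service parameters can be sketched as follows. `ExampleRequest`, `ExampleRequestFactory`, and the parameter names are hypothetical; the assumption is that `endpoint` and `resourceName` are used to build the URI (client parameters) while `description` is sent to the service unchanged (a service parameter).

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical request holder for illustration.
final class ExampleRequest {
    final URI uri;
    final String description;

    ExampleRequest(URI uri, String description) {
        this.uri = uri;
        this.description = description;
    }
}

final class ExampleRequestFactory {
    static ExampleRequest create(String endpoint, String resourceName, String description) {
        // Client parameters: used locally to construct the URI, so validate them here.
        Objects.requireNonNull(endpoint, "'endpoint' cannot be null.");
        Objects.requireNonNull(resourceName, "'resourceName' cannot be null.");
        if (resourceName.isEmpty()) {
            throw new IllegalArgumentException("'resourceName' cannot be empty.");
        }
        URI base = URI.create(endpoint); // throws IllegalArgumentException if malformed
        if (!base.isAbsolute()) {
            throw new IllegalArgumentException("'endpoint' must include a scheme.");
        }
        // Service parameter: passed through untouched, even when null or empty.
        // The service is responsible for rejecting invalid values.
        return new ExampleRequest(base.resolve("/" + resourceName), description);
    }
}
```

A call such as `ExampleRequestFactory.create("https://example.invalid", "items", null)` succeeds locally; whether a null `description` is acceptable is left for the service to decide.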

Network requests

Each supported language has an Azure Core library that contains common mechanisms for cross-cutting concerns such as configuration and making HTTP requests.

✅ DO use the HTTP pipeline component within the azure-core library for communicating with service REST endpoints.

The HTTP pipeline consists of an HTTP transport that is wrapped by multiple policies. Each policy is a control point at which the pipeline can modify the request, the response, or both. We prescribe a default set of policies to standardize how client libraries interact with Azure services. The order in the list is the most sensible order for implementation.

✅ DO implement the following policies in the HTTP pipeline:

Telemetry
Unique Request ID
Retry
Authentication
Response downloader
Distributed tracing
Logging

☑️ YOU SHOULD use the policy implementations in Azure Core whenever possible. Do not try to “write your own” policy unless it is doing something unique to your service. If you need another option to an existing policy, engage with the Architecture Board to add the option.

✅ DO make all custom policies (HTTP or otherwise) available as public API. This enables developers who choose to implement their own pipeline to reuse the policy rather than write it themselves.
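
To make the "transport wrapped by policies" idea concrete, here is a deliberately simplified model that uses plain strings for requests and responses. `PipelineSketch` and its `Policy` interface are hypothetical and are not the azure-core `HttpPipeline` or `HttpPipelinePolicy` types; real client libraries should use the azure-core implementations.

```java
import java.util.List;
import java.util.function.Function;

// Simplified, hypothetical model of a policy pipeline; not the azure-core types.
final class PipelineSketch {
    /** A policy may change the request before calling next, and the response afterwards. */
    interface Policy {
        String process(String request, Function<String, String> next);
    }

    /** Wraps the transport so that the first policy in the list runs first on the way out. */
    static Function<String, String> build(List<Policy> policies, Function<String, String> transport) {
        Function<String, String> next = transport;
        // Wrap from the last policy (closest to the wire) back to the first (closest to the API).
        for (int i = policies.size() - 1; i >= 0; i--) {
            Policy policy = policies.get(i);
            Function<String, String> downstream = next;
            next = request -> policy.process(request, downstream);
        }
        return next;
    }
}
```

In this model, a request passes through the policies in list order, reaches the transport, and the response travels back through the same policies in reverse order.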

Authentication

When implementing authentication, don’t open up the consumer to security holes like PII (personally identifiable information) leakage or credential leakage. Credentials are generally issued with a time limit, and must be refreshed periodically to ensure that the service connection continues to function as expected. Ensure your client library follows all current security recommendations and consider an independent security review of the client library to ensure you’re not introducing potential security problems for the consumer.

⛔️ DO NOT persist, cache, or reuse security credentials. Security credentials should be considered short lived to cover both security concerns and credential refresh situations.

If your service implements a non-standard credential system (that is, a credential system that is not supported by Azure Core), then you need to produce an authentication policy for the HTTP pipeline that can authenticate requests given the alternative credential types provided by the client library.

✅ DO provide a suitable authentication policy that authenticates the HTTP request in the HTTP pipeline when using non-standard credentials. This includes custom connection strings, if supported.

Native code

Native code plugins cause compatibility issues and require additional scrutiny. Certain languages compile to a machine-native format (for example, C or C++), whereas most modern languages opt to compile to an intermediary format to aid in cross-platform support.

⛔️ DO NOT write platform-specific / native code.

Error handling

Error handling is an important aspect of implementing a client library. It is the primary method by which problems are communicated to the consumer. There are two methods by which errors are reported to the consumer. Either the method throws an exception, or the method returns an error code (or value) as its return value, which the consumer must then check. In this section we refer to “producing an error” to mean returning an error value or throwing an exception, and “an error” to be the error value or exception object.

☑️ YOU SHOULD prefer the use of exceptions over returning an error value when producing an error.

✅ DO produce an error when any HTTP request fails with an HTTP status code that is not defined by the service/Swagger as a successful status code. These errors should also be logged as errors.

✅ DO use unchecked exceptions for HTTP requests. Java offers checked and unchecked exceptions, where checked exceptions force the user to introduce verbose try .. catch code blocks and handle each specified exception. Unchecked exceptions avoid this verbosity and the scalability issues that checked exceptions cause in large apps.

✅ DO ensure that the error produced contains the HTTP response (including status code and headers) and originating request (including URL, query parameters, and headers).

In the case of a higher-level method that produces multiple HTTP requests, either the last exception or an aggregate exception of all failures should be produced.
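
One possible way to produce an aggregate exception is to attach each failure to a single unchecked exception with `Throwable.addSuppressed`. This is only a sketch: `AggregateFailures` is a hypothetical name, each `Runnable` stands in for one HTTP request, and a real client library would use an Azure Core exception type rather than a bare `RuntimeException`.

```java
import java.util.List;

// Hypothetical sketch of aggregating failures from several requests.
final class AggregateFailures {
    /** Runs every request, then throws one exception that carries all failures. */
    static void runAll(List<Runnable> requests) {
        RuntimeException aggregate = null;
        for (Runnable request : requests) {
            try {
                request.run();
            } catch (RuntimeException failure) {
                if (aggregate == null) {
                    aggregate = new RuntimeException("One or more requests failed.");
                }
                // Keep each individual failure available to the caller.
                aggregate.addSuppressed(failure);
            }
        }
        if (aggregate != null) {
            throw aggregate;
        }
    }
}
```

Callers can inspect the individual failures through `getSuppressed()` on the exception they catch.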

✅ DO ensure that if the service returns rich error information (via the response headers or body), that information is available via the error produced, in service-specific properties/fields.

⛔️ DO NOT create a new error type when a language-specific error type will suffice. Use system-provided error types for validation.

✅ DO use the following standard Java exceptions for pre-condition checking:

Exception	When to use
`IllegalArgumentException`	When a method argument is non-null, but inappropriate
`IllegalStateException`	When the object state means method invocation can’t continue
`NullPointerException`	When a method argument is `null` and `null` is unexpected
`UnsupportedOperationException`	When an object doesn’t support method invocation
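
A short sketch of how the four exception types in the table map onto pre-condition checks. `ExampleSession` and its members are hypothetical, and the exceptions are thrown directly only to keep the example dependency-free; the Logging section describes how a client library should route thrown exceptions through `ClientLogger`.

```java
import java.util.Objects;

// Hypothetical class used only to illustrate the pre-condition checks.
final class ExampleSession {
    private final boolean readOnly;
    private boolean closed;

    ExampleSession(boolean readOnly) {
        this.readOnly = readOnly;
    }

    void close() {
        closed = true;
    }

    int write(String name, int count) {
        // NullPointerException: the argument is null and null is unexpected.
        Objects.requireNonNull(name, "'name' cannot be null.");
        // IllegalArgumentException: the argument is non-null but inappropriate.
        if (count < 0) {
            throw new IllegalArgumentException("'count' cannot be negative.");
        }
        // IllegalStateException: the object state means the call cannot continue.
        if (closed) {
            throw new IllegalStateException("The session is closed.");
        }
        // UnsupportedOperationException: this object does not support the operation.
        if (readOnly) {
            throw new UnsupportedOperationException("This session is read-only.");
        }
        return count;
    }
}
```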

✅ DO document the errors that are produced by each method (with the exception of commonly thrown errors that are generally not documented in the target language).

✅ DO specify all checked and unchecked exceptions thrown in a method within the JavaDoc documentation on the method as @throws statements.

✅ DO use the existing exception types present in the Azure core library for service request failures. Avoid creating new exception types. The following list outlines all available exception types (with indentation indicating exception type hierarchy):

AzureException: Never use directly. Throw a more specific subtype.
- HttpResponseException: Thrown when the service returns an unsuccessful response with an HTTP status code (for example, 3XX, 4XX, 5XX).
  - ClientAuthenticationException: Thrown when there’s a failure to authenticate against the service.
  - DecodeException: Thrown when there’s an error during response deserialization.
  - ResourceExistsException: Thrown when an HTTP request tried to create an already existing resource.
  - ResourceModifiedException: Thrown for invalid resource modification with status code of 4XX, typically 409 Conflict.
  - ResourceNotFoundException: Thrown when a resource is not found, typically triggered by a 412 response (for PUT) or 404 (for GET/POST).
  - TooManyRedirectsException: Thrown when an HTTP request has reached the maximum number of redirect attempts.
- ServiceResponseException: Thrown when the request was sent to the service, but the client library wasn’t able to understand the response.
- ServiceRequestException: Thrown for an invalid response with custom error information.

Logging

Client libraries must support robust logging mechanisms so that the consumer can adequately diagnose issues with the method calls and quickly determine whether the issue is in the consumer code, client library code, or service.

✅ DO use the ClientLogger API provided within Azure Core as the sole logging API throughout all client libraries. Internally, ClientLogger wraps SLF4J, so all external configuration that is offered through SLF4J is valid. We encourage you to expose the SLF4J configuration to end users. For more information, see the SLF4J user manual.

✅ DO create a new instance of a ClientLogger per instance of all relevant classes, except in situations where performance is critical, the instances are short-lived (and therefore the cost of unique loggers is excessive), or in static-only classes (where there is no instantiation of the class allowed). In these cases, it is acceptable to have a shared (or static) logger instance. For example, the code below will create a ClientLogger instance for the ConfigurationAsyncClient:

public final class ConfigurationAsyncClient {
    private final ClientLogger logger = new ClientLogger(ConfigurationAsyncClient.class);

    // example async call to a service that uses the Project Reactor APIs to log request, success, and error
    // information out to the service logger instance
    public Mono<Response<ConfigurationSetting>> setSetting(ConfigurationSetting setting) {
        return service.setKey(serviceEndpoint, setting.key(), setting.label(), setting, getETagValue(setting.etag()), null)
            .doOnRequest(ignoredValue -> logger.info("Setting ConfigurationSetting - {}", setting))
            .doOnSuccess(response -> logger.info("Set ConfigurationSetting - {}", response.value()))
            .doOnError(error -> logger.warning("Failed to set ConfigurationSetting - {}", setting, error));
    }
}

Note that static loggers are shared among all client library instances running in a JVM instance. Static loggers should be used carefully and in short-lived cases only.

✅ DO use one of the following log levels when emitting logs: Verbose (details), Informational (things happened), Warning (might be a problem or not), and Error.

✅ DO use the Error logging level for failures that the application is unlikely to recover from (out of memory, etc.).

✅ DO use the Warning logging level when a function fails to perform its intended task. This generally means that the function will raise an exception. Do not include occurrences of self-healing events (for example, when a request will be automatically retried).

✔️ YOU MAY log the request and response (see below) at the Warning logging level when a request/response cycle (to the start of the response body) exceeds a service-defined threshold. The threshold should be chosen to minimize false-positives and identify service issues.

✅ DO use the Informational logging level when a function operates normally.

✅ DO use the Verbose logging level for detailed troubleshooting scenarios. This is primarily intended for developers or system administrators to diagnose specific failures.

✅ DO only log headers and query parameters that are in a service-provided “allow-list” of approved headers and query parameters. All other headers and query parameters must have their values redacted.
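
Allow-list redaction can be sketched with JDK collections. `HeaderRedactor`, the `REDACTED` placeholder text, and the expectation that the allow-list holds lower-case names are assumptions made for this example; in practice azure-core performs this redaction for HTTP logging.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; azure-core handles this for HTTP logging.
final class HeaderRedactor {
    static final String REDACTED = "REDACTED"; // placeholder text is an assumption

    /**
     * Returns a copy that is safe to log: values are kept only for allow-listed names.
     * Header names are compared case-insensitively; the allow-list must hold lower-case names.
     */
    static Map<String, String> redact(Map<String, String> headers, Set<String> allowedLowerCaseNames) {
        Map<String, String> safe = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : headers.entrySet()) {
            String lookupName = header.getKey().toLowerCase(Locale.ROOT);
            boolean allowed = allowedLowerCaseNames.contains(lookupName);
            safe.put(header.getKey(), allowed ? header.getValue() : REDACTED);
        }
        return safe;
    }
}
```

The header name is always logged; only the value is replaced, so a reader of the log can still see which headers were present.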

✅ DO log request line and headers as an Informational message. The log should include the following information:

The HTTP method.
The URL.
The query parameters (redacted if not in the allow-list).
The request headers (redacted if not in the allow-list).
An SDK provided request ID for correlation purposes.
The number of times this request has been attempted.

This happens within azure-core by default, but users can configure this through the builder httpLogOptions configuration setting.

✅ DO log response line and headers as an Informational message. The log should include the following information:

The SDK provided request ID (see above).
The status code.
Any message provided with the status code.
The response headers (redacted if not in the allow-list).
The time period between the first attempt of the request and the first byte of the body.

✅ DO log an Informational message if a service call is cancelled. The log should include:

The SDK provided request ID (see above).
The reason for the cancellation (if available).

✅ DO log exceptions thrown as a Warning level message. If the log level is set to Verbose, append stack trace information to the message.

✅ DO throw all exceptions created within the client library code through the ClientLogger.logAndThrow() API.

For example:

// NO!!!!
if (priority != null && priority < 0) {
    throw new IllegalArgumentException("'priority' cannot be a negative value. Please specify a zero or positive long value.");
}

// Good
if (priority != null && priority < 0) {
    logger.logAndThrow(new IllegalArgumentException("'priority' cannot be a negative value. Please specify a zero or positive long value."));
}

Distributed tracing

Distributed tracing mechanisms allow the consumer to trace their code from frontend to backend. The distributed tracing library creates spans - units of unique work. Each span is in a parent-child relationship. As you go deeper into the hierarchy of code, you create more spans. These spans can then be exported to a suitable receiver as needed. To keep track of the spans, a distributed tracing context (called a context in the remainder of this section) is passed into each successive layer. For more information on this topic, visit the OpenTelemetry topic on tracing.

The Azure core library provides a service provider interface (SPI) for adding pipeline policies at runtime. The pipeline policy is used to enable tracing on consumer deployments. Pluggable pipeline policies must be supported in all client libraries to enable distributed tracing. Additional metadata can be specified on a per-service-method basis to provide a richer tracing experience for consumers.

✅ DO support pluggable pipeline policies as part of the HTTP pipeline instantiation.

Review the code sample below, in which a service client builder creates an HttpPipeline from its set of policies. At the same time, the builder allows plugins to add ‘before retry’ and ‘after retry’ policies with the lines HttpPolicyProviders.addBeforeRetryPolicies(policies) and HttpPolicyProviders.addAfterRetryPolicies(policies):

public ConfigurationAsyncClient build() {
    ...

    // Closest to API goes first, closest to wire goes last.
    final List<HttpPipelinePolicy> policies = new ArrayList<>();
    policies.add(new UserAgentPolicy(AzureConfiguration.NAME, AzureConfiguration.VERSION, buildConfiguration));
    policies.add(new RequestIdPolicy());
    policies.add(new AddHeadersPolicy(headers));
    policies.add(new AddDatePolicy());
    policies.add(new ConfigurationCredentialsPolicy(buildCredentials));
    HttpPolicyProviders.addBeforeRetryPolicies(policies);
    policies.add(retryPolicy);
    policies.addAll(this.policies);
    HttpPolicyProviders.addAfterRetryPolicies(policies);
    policies.add(new HttpLoggingPolicy(httpLogDetailLevel));

    ...
}

✅ DO accept a context from calling code to establish a parent span.

✅ DO pass the context to the backend service through the appropriate headers (traceparent, tracestate, etc.) to support Azure Monitor. This is generally done with the HTTP pipeline.

✅ DO create a new span for each method that user code calls. New spans must be children of the context that was passed in. If no context was passed in, a new root span must be created.

✅ DO create a new span (which must be a child of the per-method span) for each REST call that the client library makes. This is generally done with the HTTP pipeline.

✅ DO use the Azure core TracerProxy API to set additional metadata that should be supplied along with the tracing span. In particular, use the setAttribute(String key, String value, Context context) method to set a new key/value pair on the tracing context.

Some of these requirements will be handled by the HTTP pipeline. However, as a client library writer, you must handle the incoming context appropriately.
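
The parent-child span relationships described in this section can be modeled in a few lines. `SpanSketch` is a hypothetical class, not the OpenTelemetry or Azure Core tracing API; it assumes the W3C Trace Context `traceparent` layout (version, 32-hex-digit trace ID, 16-hex-digit parent ID, flags) and always marks the trace as sampled. Production code should rely on the tracing plugin rather than generating IDs itself.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical model of spans; not the OpenTelemetry or azure-core tracing API.
final class SpanSketch {
    final String traceId;    // shared by every span in the same trace
    final String spanId;     // unique to this span
    final SpanSketch parent; // null for a root span

    private SpanSketch(String traceId, String spanId, SpanSketch parent) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parent = parent;
    }

    /** Starts a child of the given span, or a new root span when parent is null. */
    static SpanSketch start(SpanSketch parent) {
        String traceId = parent == null ? randomHex(32) : parent.traceId;
        return new SpanSketch(traceId, randomHex(16), parent);
    }

    /** Value for the traceparent header; "00" is the version and "01" means sampled. */
    String traceparent() {
        return "00-" + traceId + "-" + spanId + "-01";
    }

    private static String randomHex(int length) {
        StringBuilder hex = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            hex.append(Character.forDigit(ThreadLocalRandom.current().nextInt(16), 16));
        }
        return hex.toString();
    }
}
```

In this model, a span created for a client method is a child of the caller's span (or a root if none was supplied), and the span created for each REST call is a child of the per-method span; all three share one trace ID.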

Dependencies

Dependencies bring in many considerations that are often most easily avoided by not taking the dependency at all.

Versioning - Many programming languages do not allow a consumer to load multiple versions of the same package. So, if we have a client library that requires v3 of package Foo and the consumer wants to use v5 of package Foo, then the consumer cannot build their application. This means that client libraries should not have dependencies by default.
Size - Consumer applications must be able to deploy as fast as possible into the cloud and move in various ways across networks. Removing additional code (like dependencies) improves deployment performance.
Licensing - You must be conscious of the licensing restrictions of a dependency and often provide proper attribution and notices when using them.
Compatibility - Often you do not control a dependency, and it may choose to evolve in a direction that is incompatible with your original use.
Security - If a security vulnerability is discovered in a dependency, it may be difficult or time consuming to get the vulnerability corrected if Microsoft does not control the dependency’s code base.

✅ DO depend on the com.azure.core library for functionality that is common across all client libraries. This library includes APIs for HTTP connectivity, global configuration, logging, and credential handling.

⛔️ DO NOT be dependent on any other packages within the client library distribution package, with the exception of the following:

Name	Role	Allowed in public API	Notes
Apache Avro	Avro parser	No	Used only in azure-core-serializer-avro-apache.
Jackson	JSON parser	No	Use azure-core `JsonSerializer` abstractions rather than Jackson directly.
JNA	Native access to OS	No	Used only in azure-identity.
Netty	HTTP client	No	Used only in azure-core-http-netty.
OkHttp	HTTP client	No	Used only in azure-core-http-okhttp.
OpenTelemetry	Telemetry library	No	Used only in azure-core-tracing-opentelemetry.
Reactor	Reactive library	Yes
slf4j	Logging framework	No	Use the azure core `ClientLogger` API rather than `slf4j` directly.
Apache Qpid Proton-J	AMQP messaging	No	Used only in azure-core-amqp.

Dependency versions are purposefully not specified in this table. The definitive source for the dependency versions being used in all client libraries is published in a separate document that is generated from the azure-sdk-for-java code repository. Transitive dependencies of these libraries, or dependencies that are part of a family of dependencies, are allowed. For example, reactor-netty is a child project of reactor.

⛔️ DO NOT introduce new dependencies on third-party libraries that are already referenced from the parent POM, without first discussing with the Architecture Board.

⛔️ DO NOT specify or change dependency versions in your client library POM file. All dependency versioning must be centralized through existing tooling.

⛔️ DO NOT include dependencies on external libraries that are -SNAPSHOT versions. All dependencies must be released versions.

☑️ YOU SHOULD consider copying or linking required code into the client library in order to avoid taking a dependency on another package that could conflict with the ecosystem. Make sure that you are not violating any licensing agreements and consider the maintenance that will be required of the duplicated code. “A little copying is better than a little dependency” (YouTube).

⛔️ DO NOT depend on concrete logging, dependency injection, or configuration technologies (except as implemented in the com.azure.core library). The client library will be used in applications that might be using the logging, DI, and configuration technologies of their choice.

Service-specific common library code

There are occasions when common code needs to be shared between several client libraries. For example, a set of cooperating client libraries may wish to share a set of exceptions or models.

✅ DO gain Architecture Board approval prior to implementing a common library.

✅ DO minimize the code within a common library. Code within the common library is available to the consumer of the client library and shared by multiple client libraries within the same namespace.

✅ DO store the common library in the same namespace as the associated client libraries.

A common library will only be approved if:

The consumer of the non-shared library will consume the objects within the common library directly, AND
The information will be shared between multiple client libraries.

Let’s take two examples:

Implementing two Cognitive Services client libraries, we find a model is required that is produced by one Cognitive Services client library and consumed by another Cognitive Services client library, or the same model is produced by two client libraries. The consumer is required to do the passing of the model in their code, or may need to compare the model produced by one client library vs. that produced by another client library. This is a good candidate for choosing a common library.
Two Cognitive Services client libraries throw an ObjectNotFound exception to indicate that an object was not detected in an image. The user might trap the exception, but otherwise will not operate on the exception. There is no linkage between the ObjectNotFound exception in each client library. This is not a good candidate for creation of a common library (although you may wish to place this exception in a common library if one exists for the namespace already). Instead, produce two different exceptions - one in each client library.

Testing

One of the key things we want to support is to allow consumers of the library to easily write repeatable unit tests for their applications without activating a service. This allows them to reliably and quickly test their code without worrying about the vagaries of the underlying service implementation (including, for example, network conditions or service outages). Mocking is also helpful to simulate failures, edge cases, and hard-to-reproduce situations (for example: does code work on February 29th).

✅ DO support mocking of network operations.

✅ DO parameterize all applicable unit tests to make use of all available HTTP clients and service versions. Parameterized runs of all tests must occur as part of live tests. Shorter runs, consisting of just Netty and the latest service version, can be run whenever PR validation occurs.
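
Mocking of network operations becomes straightforward when the transport can be injected. The sketch below is hypothetical throughout: `ExampleClient`, its `exists` method, and the transport modeled as a function from URL to status code are stand-ins, not the azure-core `HttpClient` interface.

```java
import java.util.function.Function;

// Hypothetical client whose transport is injected so that tests can replace it.
final class ExampleClient {
    private final Function<String, Integer> transport; // maps a URL to an HTTP status code

    ExampleClient(Function<String, Integer> transport) {
        this.transport = transport;
    }

    /** Returns true for 200, false for 404, and fails for any other status code. */
    boolean exists(String name) {
        int status = transport.apply("https://example.invalid/items/" + name);
        if (status == 200) {
            return true;
        }
        if (status == 404) {
            return false;
        }
        throw new IllegalStateException("Unexpected status code: " + status);
    }
}
```

A unit test can then pass a stub such as `url -> 404` or `url -> 500` to exercise the not-found and failure paths without contacting a service.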