Capturing volatile: a few words about volatile variables in Java

Radosław Kondziołka
16 November 2020

volatile is one of the keywords in Java. Its meaning is often misunderstood, not least because the keyword also exists in C/C++, where its purpose is quite different. Perhaps you have never used it, or have not even come across it in your daily work with code.

The reason is that correct programs can be written without even knowing that this construct exists. In a sense, volatile can therefore be considered redundant: any program (including a concurrent one, and in fact volatile is only worth considering in concurrent programs) can be built correctly without it. Every volatile variable can be replaced by an ordinary variable whose accesses are all guarded by a synchronized block on the same monitor, and the correctness of the program can rest on that. The reverse does not hold: not every program based on synchronized can be rewritten using volatile variables. For this reason it is worth looking not only at what the construct means, but also at where it is useful.
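As a minimal sketch of the first direction, the hypothetical class below (FlagHolder and its method names are illustrative, not taken from the article) exposes the same kind of flag twice: once as a volatile field and once as an ordinary field whose every access goes through a synchronized method. Under the Java memory model both variants make the latest write visible to readers.

```java
// Hypothetical illustration: two ways to share a flag between threads.
class FlagHolder {
    // Variant 1: visibility is guaranteed by volatile.
    private volatile boolean volatileFlag;

    // Variant 2: an ordinary field; visibility is guaranteed because every
    // access happens while holding the same monitor (this).
    private boolean guardedFlag;

    void setVolatileFlag(boolean value) { volatileFlag = value; }

    boolean getVolatileFlag() { return volatileFlag; }

    synchronized void setGuardedFlag(boolean value) { guardedFlag = value; }

    synchronized boolean getGuardedFlag() { return guardedFlag; }

    public static void main(String[] args) {
        FlagHolder holder = new FlagHolder();
        holder.setVolatileFlag(true);
        holder.setGuardedFlag(true);
        System.out.println(holder.getVolatileFlag() + " " + holder.getGuardedFlag()); // prints "true true"
    }
}
```

Note that the synchronized variant only works if both the getter and the setter take the lock; guarding the write alone would not give readers the same guarantee.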

Operations on volatile variables

A volatile variable is essentially no different from its "normal" counterpart. The difference lies in how reads and writes of such a variable behave. Let's imagine for a moment that we have a variable v declared as follows:

volatile [Type] v;

The Java specification guarantees that a thread reading the value of variable v always sees the last write to that variable, possibly performed in another thread. Moreover, the thread reading v also observes the results of all writes that the other thread made before it wrote to v.

[Figure: obraz1.webp]

The figure above illustrates this guarantee graphically.
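The guarantee can be sketched in a few lines of Java. The class below is a hypothetical example (the names Publication, payload and ready are assumptions made for illustration): an ordinary field is written first, then a volatile flag. A reader that sees the flag set is guaranteed to also see the earlier write to the ordinary field.

```java
// Hypothetical sketch: an ordinary field published through a volatile flag.
class Publication {
    static int payload;             // ordinary (non-volatile) field
    static volatile boolean ready;  // volatile flag that publishes payload

    // Starts a writer thread, waits for the flag and returns the payload seen.
    static int publishAndRead() {
        payload = 0;
        ready = false;
        Thread writer = new Thread(() -> {
            payload = 42;  // ordinary write ...
            ready = true;  // ... published by the subsequent volatile write
        });
        writer.start();
        while (!ready) {
            Thread.onSpinWait(); // busy-wait until the volatile write is observed
        }
        return payload; // the write of 42 is visible once ready was read as true
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

If ready were not volatile, the reader could spin forever or return 0, because nothing would order the two writes with respect to the reads.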

In a nutshell, we can say that volatile signals to the compiler and the virtual machine that the variable so marked can be shared by threads. Therefore, the compiler and the runtime (JVM) should refrain from performing:

  • changing the order of memory operations performed,
  • various optimizations, such as those involving caching the value of a variable.

In order to better understand the issue, let's look at the following program:

boolean x = false;

T1:                  T2:
while (!x) {}        x = true;

To keep the message clear, the programs are simplified and written as pseudo-code. Column T1 contains the code executed by thread T1, and column T2 the code executed by another thread, T2. At first glance it may seem that this program must terminate. Nothing could be further from the truth. Because x is an ordinary variable, the compiler or the JVM may decide that the write to x never has to become visible to T1, for example by reading x only once before the loop. Both the JVM and the compiler are allowed to do this because the language specification permits it. A program written this way is simply incorrect. Once x is declared volatile, the code is correct, because we are guaranteed that a read in thread T1 will observe the write performed in thread T2.
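A runnable version of the corrected program might look like the sketch below. The class name StopFlag, the timeout and the use of a daemon thread are assumptions added for illustration; the calling thread plays the role of T2.

```java
// Hypothetical runnable version of the T1/T2 pseudo-code with a volatile flag.
class StopFlag {
    static volatile boolean x = false;

    // Returns true if the spinning thread (T1) finished within the timeout.
    static boolean spinnerTerminates(long timeoutMillis) {
        x = false;
        Thread t1 = new Thread(() -> {
            while (!x) {
                // busy-wait; x is re-read from memory on every iteration
            }
        });
        t1.setDaemon(true); // do not keep the JVM alive if T1 never stops
        t1.start();

        x = true; // the role of T2, played here by the calling thread

        try {
            t1.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t1.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(spinnerTerminates(5_000));
    }
}
```

Removing the volatile modifier does not necessarily make the loop hang on a given machine, which is exactly why such bugs are hard to find: the program is incorrect even when it happens to terminate.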

Now let's look at another program:

int a = 0, b = 0;

T1:                  T2:
int r1 = b;          a = 1;
int r2 = a;          b = 1;

Suppose we run this program many times and record the values read in thread T1. Such a test can produce the following (r1, r2) pairs:

[0,0; 1,1; 0,1; 1,0]

While the first three results are unsurprising and easy to explain, the result 1,0 seems impossible. After all, thread T1 observed the write to b, and in T2 that write comes after the write to a, so the read of a, which follows the read of b, should return 1. The result 1,0 contradicts this reasoning. The only explanation is that the order of operations was changed. Such an execution is permitted by the JLS (Java Language Specification). Declaring the variables a and b as volatile guarantees that the order of these operations is not changed. Consequently, the only possible results are:

[0,0; 1,1; 0,1]

This follows directly from the semantics of volatile reads and writes. If thread T1 has read the value 1 from b, a read of a that comes later in program order cannot return 0: the language standard guarantees that a thread which reads a volatile variable also observes the writes that preceded the write it has read, and in particular the write to a.
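The experiment can be sketched as a small test harness. The class OrderingLitmus and its methods are hypothetical names introduced here; the harness runs the two-thread program repeatedly with volatile a and b and reports whether the pair (1, 0) was ever read. A loop like this only illustrates the rule and cannot prove it, since the guarantee comes from the specification and not from the test.

```java
// Hypothetical harness for the a/b example with both variables volatile.
class OrderingLitmus {
    static volatile int a, b;

    // Runs the T1/T2 program once and returns the pair {r1, r2} read by T1.
    static int[] runOnce() {
        a = 0;
        b = 0;
        int[] result = new int[2];
        Thread t1 = new Thread(() -> {
            result[0] = b; // r1
            result[1] = a; // r2
        });
        Thread t2 = new Thread(() -> {
            a = 1;
            b = 1;
        });
        t1.start();
        t2.start();
        join(t1);
        join(t2);
        return result; // writes made by t1 are visible after join
    }

    // Returns true if the forbidden pair (r1, r2) = (1, 0) was ever observed.
    static boolean forbiddenOutcomeSeen(int iterations) {
        for (int i = 0; i < iterations; i++) {
            int[] r = runOnce();
            if (r[0] == 1 && r[1] == 0) {
                return true;
            }
        }
        return false;
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(forbiddenOutcomeSeen(1_000)); // prints false
    }
}
```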

It is worth mentioning that volatile has an additional meaning for variables of type long and double. In general, reads and writes of variables of primitive and reference types are atomic. The exceptions are long and double: reads and writes of such variables are not guaranteed to be atomic. Declaring them volatile helps here, because reads and writes of volatile long and double variables are atomic.
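A minimal sketch of what atomicity means here, using hypothetical names (LongTearing, tornReadSeen): a writer alternates between two 64-bit patterns while a reader checks that it never sees a mixture of the two halves. With the field declared volatile, a mixed ("torn") value must never appear. On many 64-bit JVMs the same holds in practice for ordinary long fields as well, but the specification does not promise it.

```java
// Hypothetical check that reads of a volatile long are never torn.
class LongTearing {
    static volatile long value;

    // A writer alternates between all-ones and all-zeros; returns true if the
    // reader ever sees a value that is neither pattern.
    static boolean tornReadSeen(int writes) {
        value = 0L;
        Thread writer = new Thread(() -> {
            for (int i = 0; i < writes; i++) {
                value = (i % 2 == 0) ? -1L : 0L; // all bits set / all bits clear
            }
        });
        writer.start();

        boolean torn = false;
        while (writer.isAlive()) {
            long seen = value; // a single atomic 64-bit read
            if (seen != 0L && seen != -1L) {
                torn = true;
            }
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println(tornReadSeen(1_000_000)); // prints false
    }
}
```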

When is it worth using volatile?

The first reason to reach for this mechanism may be performance. Not every case in which threads share a resource requires a synchronized section, that is, locking and unlocking a monitor. Sometimes the guarantees provided by a volatile variable, discussed above, are sufficient. Let's look at the following simple example:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.infra.Blackhole;

public class CompareSynchronizedAndVolatileRead {
    static volatile int v;
    static int i;

    // Reads the ordinary field i while holding the class monitor.
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public static void takeLock(Blackhole blackhole) {
        synchronized (CompareSynchronizedAndVolatileRead.class) {
            blackhole.consume(i);
        }
    }

    // Reads the volatile field v without any locking.
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public static void volatileRead(Blackhole blackhole) {
        blackhole.consume(v);
    }

    // Delegates to the JMH command-line runner, which declares checked exceptions.
    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }
}

The program uses the Java Microbenchmark Harness (JMH) to measure how long it takes to execute the two methods. Both do the same thing: they read an integer variable in a thread-safe way. The first uses a synchronized block; the second reads a volatile variable. For our purposes the Blackhole.consume call can be ignored; it only keeps the JIT compiler from eliminating the read as dead code. The results are as follows:

[Figure: obraz2.webp, benchmark results]

The result is, of course, what one would expect. Suffice it to say that operations on volatile variables are non-blocking, unlike acquiring a monitor. This follows from the previously mentioned fact that operations on a volatile variable are a lighter synchronization mechanism.

The second argument in favor of volatile is that in some cases it makes the implementation easier and, consequently, simpler to read. In general, except in some special cases, this second argument should prevail over the first. Care is needed, however, because misusing volatile can turn out to be a double-edged sword here.
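A typical misuse is treating a volatile variable as if it made compound operations atomic. The sketch below is a hypothetical example (CounterMisuse and its fields are names invented for illustration): several threads increment a volatile counter and a counter guarded by synchronized. The increment v++ is a read followed by a write, so updates to the volatile counter can be lost, while the guarded counter always reaches the expected total.

```java
// Hypothetical example: volatile does not make ++ atomic, synchronized does.
class CounterMisuse {
    static volatile int volatileCounter;
    static int guardedCounter;
    static final Object LOCK = new Object();

    // Increments both counters from several threads and returns
    // {volatileCounter, guardedCounter} after all threads have finished.
    static int[] run(int threads, int incrementsPerThread) {
        volatileCounter = 0;
        guardedCounter = 0;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCounter++;     // read-modify-write: not atomic
                    synchronized (LOCK) {
                        guardedCounter++;  // atomic with respect to LOCK
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { volatileCounter, guardedCounter };
    }

    public static void main(String[] args) {
        int[] counts = run(4, 100_000);
        // The guarded counter is always 400000; the volatile one may be lower.
        System.out.println("volatile: " + counts[0] + ", synchronized: " + counts[1]);
    }
}
```

This is also a concrete case of a synchronized-based program that cannot be rewritten with volatile alone.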

Summary

This short article has presented a simplified and intuitive view of volatile variables. If the topic interests you, a more formal description can be found in the language specification. Before using a volatile variable, always ask yourself why you want to do so and whether the guarantees Java provides for such variables are in fact sufficient in your case. The issue of volatile is discussed in detail in the training course Multithreading in Java.
