In this article, we will discuss the architecture of JVM specification, the various components involved, and their roles.
Before we dive deep into the internals of JVM Architecture, we need to understand certain terminologies.
Compiler vs Interpreter
A compiler is software that takes the entire program written in High-Level Language as input and converts it into platform-specific binary code (0s and 1s) that the CPU can understand.
The Compiler passes through multiple phases such as Tokenization, Abstract Syntax Tree Generation, Semantic Analysis, and Code Optimization before generating the target binary code.
Consequently, the generated binary code is highly optimized and executes quickly, and many errors are caught during the compilation phase itself.
An Interpreter, on the other hand, is software that reads the program line by line or instruction by instruction, converts them into platform-specific binary code (0s and 1s), and then executes it.
As a result, interpreted programs execute comparatively slower than compiled programs. Also, errors in interpreted programs surface during execution rather than during a separate compilation phase.
JDK, JRE, and JVM
Java is a high-level, Object-Oriented programming language that is considered both a compiled and interpreted language.
Java source code, stored in files with a .java extension, is first compiled by the Java compiler (javac) into a platform-agnostic intermediate code. This intermediate code is referred to as bytecode and is stored in files with a .class extension.
Bytecode is an instruction set designed for efficient execution by an interpreter. Each instruction consists of a one-byte opcode followed by zero or more operands, which is why the contents of a .class file are usually inspected as a hexadecimal dump.
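As a small illustration of the .class format, the sketch below reads the first four bytes of a compiled class. Every class file starts with the magic number 0xCAFEBABE. The class name ClassFileMagic and the helper magicOf are made up for this example, and the sketch assumes the class file of java.lang.String is readable as a resource, which is the case on standard JDKs.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileMagic {
    // Returns the first four bytes of the given top-level class's .class file as hex.
    // (Hypothetical helper; getSimpleName() only matches the file name for top-level classes.)
    static String magicOf(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            // readInt() consumes exactly four bytes in big-endian order.
            int magic = new DataInputStream(in).readInt();
            return Integer.toHexString(magic).toUpperCase();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(magicOf(String.class)); // prints CAFEBABE
    }
}
```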
These bytecodes are then used as input to the Java Runtime environment which interprets the bytecode and executes it.
For this reason, Java is often described as a WORA (Write Once, Run Anywhere) language, since the generated bytecode can be executed on any platform that hosts an appropriate runtime.
JVM (Java Virtual Machine) is an abstract computing machine that includes specifications of the runtime environment with which the bytecodes will be executed.
JRE (Java Runtime Environment) bundles an implementation of the JVM specification with the core class libraries, providing an actual runtime environment to execute bytecode.
JDK (Java Development Kit) is a collection of software tools and libraries required for developing Java programs, including a compiler (javac) that converts source code into bytecode and a JRE that executes the bytecode.
Components of JVM
Following are the major components of a JVM specification.
- Classloader subsystem
- Runtime Data Area
- Execution Engine
- Java Native Interface (JNI)
The classloader subsystem is a primary component in the JVM specification responsible for loading the bytecodes (.class files) from the filesystem or network into the main memory and preparing those bytecodes for execution.
Every class loader other than the Bootstrap ClassLoader (which is implemented natively) extends the “java.lang.ClassLoader” parent class.
The classloader subsystem performs three major functionalities namely Loading, Linking, and Initialization.
Loading is the process of bringing the compiled bytecodes (.class files) into RAM and storing the type metadata in the Runtime Data Area (discussed below) for later execution.
Linking is the process of verifying the authenticity and integrity of bytecodes that were loaded and preparing them for execution.
Initialization is the process of assigning the static variables of the class their declared values and running the static blocks inside the class after linking.
Loading is the process of bringing the compiled bytecodes (.class files) into the main memory.
Loading is hierarchical, and classes are lazily loaded when they are referenced for the first time. Loading generates the corresponding binary data, which is stored in the Method Area of the Runtime Data Area (discussed below).
Class loading starts from the main class of the program, and subsequent classes are loaded as they are referenced, in the following scenarios.
- When the bytecode makes a static reference to a class (Ex: System.out.println)
- When the bytecode creates an object of a particular class.
- When the class is referenced using Reflection.
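The on-first-use behaviour can be observed from Java code through a static block, as in the minimal sketch below. Strictly speaking, what the static block makes visible is the moment of initialization rather than loading, but it shows that the nested Helper class is not touched until it is first used. The names LazyInitDemo, Helper, and EVENTS are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class LazyInitDemo {
    // Records the order in which things happen.
    static final List<String> EVENTS = new ArrayList<>();

    static class Helper {
        // Runs once, the first time Helper is actively used.
        static {
            EVENTS.add("Helper initialized");
        }

        static int answer() {
            return 42;
        }
    }

    static int run() {
        EVENTS.add("before first use");
        int value = Helper.answer(); // first static reference to Helper
        EVENTS.add("after first use");
        return value;
    }

    public static void main(String[] args) {
        run();
        // On the first call: [before first use, Helper initialized, after first use]
        System.out.println(EVENTS);
    }
}
```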
Types of ClassLoaders
There are three different types of Classloaders.
- Bootstrap/Primordial ClassLoader: Bootstrap ClassLoader is the root of all class loaders and loads the core JDK classes, which up to Java 8 are packaged in rt.jar on the bootstrap classpath. The jar includes the packages that support the runtime, such as java.lang, java.applet, java.util, java.io, java.net, and the like. The Bootstrap ClassLoader itself is written in native, platform-dependent code.
- Extension ClassLoader: Extension ClassLoader is responsible for loading core Java extensions from the JDK extension library present in the “jre/lib/ext” folder, such as locales, security providers, and other libraries. These can also include libraries present in the directories specified by the system property “java.ext.dirs”.
- System / Application ClassLoader: System / Application ClassLoader is responsible for loading the application-specific classes mentioned in the system classpath or the jars mentioned in -cp while running the application.
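The hierarchy can be inspected at runtime by walking the parent chain of a class's loader. The sketch below is only illustrative and LoaderChain and chainOf are made-up names. The Bootstrap ClassLoader is represented by null in the Java API, and the concrete loader class names printed depend on the Java version (on Java 9 and later, a Platform ClassLoader takes the place of the Extension ClassLoader).

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Lists the class loaders responsible for a type, from its own loader up to the root.
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader has no Java object; the API reports it as null.
        chain.add("bootstrap (reported as null)");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(chainOf(LoaderChain.class)); // application class: several loaders
        System.out.println(chainOf(String.class));      // core class: bootstrap only
    }
}
```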
We can also incorporate a user-defined ClassLoader to define the order of class loading and isolation of different containers.
These ClassLoaders follow certain design principles that make the dynamic class loading mechanism consistent.
JRE follows a Delegation Principle to implement the class loading mechanism.
The System ClassLoader delegates every class loading request it receives to the Extension ClassLoader, which in turn delegates the request to the Bootstrap ClassLoader.
If the Bootstrap ClassLoader can locate the class, it is loaded and returned to the Extension ClassLoader and in turn to the System ClassLoader which stores the class metadata in Method Area.
If the Bootstrap ClassLoader is not able to locate the class, the request is passed to Extension ClassLoader which in turn tries to locate the class.
If the Extension ClassLoader can locate the class, it is loaded and returned to the System ClassLoader which stores the class metadata in Method Area.
If the Extension ClassLoader is not able to locate the class, the request is passed to System ClassLoader.
If the System ClassLoader can locate the class, it is loaded and the class metadata is stored in Method Area. Otherwise, a ClassNotFoundException is thrown.
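The delegation steps above can be observed with a small user-defined loader. This is a sketch under stated assumptions, not how the built-in loaders are implemented: RecordingClassLoader is a hypothetical class that only records each request and then relies on the default parent-first behaviour of java.lang.ClassLoader.loadClass.

```java
import java.util.ArrayList;
import java.util.List;

class RecordingClassLoader extends ClassLoader {
    // Names of the classes this loader was asked for.
    final List<String> requested = new ArrayList<>();

    RecordingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        requested.add(name);
        // The default implementation asks the parent chain first.
        return super.loadClass(name, resolve);
    }

    // True if a core class requested through this loader was actually
    // supplied by the bootstrap loader further up the chain.
    static boolean delegatesToBootstrap() {
        RecordingClassLoader loader = new RecordingClassLoader(ClassLoader.getSystemClassLoader());
        try {
            Class<?> loaded = loader.loadClass("java.util.ArrayList");
            return loaded == ArrayList.class
                    && loaded.getClassLoader() == null
                    && loader.requested.contains("java.util.ArrayList");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(delegatesToBootstrap()); // prints true
    }
}
```

Because the request travels up to the Bootstrap ClassLoader, the returned Class object is the very same one the rest of the program sees, which is how duplicates are avoided.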
According to this, the Child ClassLoader in the tree hierarchy can see the classes loaded by the Parent ClassLoader, but a Parent ClassLoader in the tree hierarchy cannot see the classes loaded by the Child ClassLoader.
The classes loaded by Parent ClassLoader should not be loaded by the Child ClassLoader again thus avoiding duplicates.
The class loading subsystem can be presented with either a .class file or a jar. In the case of a jar, the .class files are read from inside the archive.
If multiple classes with the same name are available, the conflict resolution strategy is simple: the first appropriate class wins. The class loader subsystem will traverse the directories in the order in which they appear in the classpath and load the first class that matches the request.
The same principle applies if there are multiple jars.
Linking follows the Loading phase. It is the process of verifying the loaded classes and interfaces, together with their parent classes and interfaces, and preparing them for execution.
The process of Linking goes through three phases namely Verify, Prepare and Resolve.
The Verifier checks whether the bytecode is well-formed, as a valid compiler would generate it, and follows compatibility rules. Some of the checks performed by the Verifier include:
- Ensure variables are initialized before usage.
- Ensure final classes are not sub-classed and final methods are not overridden.
- Ensure methods and classes respect access rules and methods are invoked with the correct number and type of parameters and return values.
- Ensure variables and classes are assigned and initialized with values of the correct type.
If any of these checks fails, the Linker throws a VerifyError and the entire process comes to a halt.
The Prepare phase runs after the Verify phase. It allocates memory for the static fields of the class and initializes them to their default values.
For example, int is initialized to 0, object references are initialized to null, and so on. The Prepare phase doesn’t execute static blocks; that is the role of the Initialization phase.
The Resolve phase replaces all symbolic references present in the bytecode with actual references by doing a lookup in the Runtime Constant Pool (described below) of the Method Area.
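The default values assigned during preparation can be made visible by reading a static field before its initializer has run. In the sketch below (PrepareDefaults and peek are names chosen for this example), the initializer of early runs first in textual order and therefore still sees the prepared default of value.

```java
class PrepareDefaults {
    // Initializers run in textual order, so peek() executes
    // before 'value' has been assigned 7.
    static int early = peek();
    static int value = 7;
    static String label; // never assigned: keeps its default

    static int peek() {
        return value; // still the default 0 at this point
    }

    public static void main(String[] args) {
        System.out.println(early); // prints 0
        System.out.println(value); // prints 7
        System.out.println(label); // prints null
    }
}
```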
Initialization is the phase where the initialization logic of each class is executed: all static variables are set to their respective assigned values and the static blocks of the class are run.
Runtime Data Area
Runtime Data Area is essentially the memory region of JVM that acts as a storage area for class and class instance data.
The entire memory region of JVM is divided into three main areas namely Stack Area, Heap Area, and Non-Heap Area.
A thread is a single flow of control in a Java program that has a direct mapping to an OS native thread. The Operating System schedules the execution of the native thread onto the CPU.
Once the native thread is initialized, it invokes the run() method of the Java thread. When the run() method of the Java thread returns successfully or is aborted (due to uncaught exceptions), the native thread is also terminated.
The JVM specification allows multiple threads to be running concurrently at any given point in time.
There are multiple types of Java threads each catering to a different use case.
- Main Thread: Created as part of invoking the public static void main(String[] args) method.
- VM Thread: Responsible for performing VM operations such as thread stack dumps, thread suspension, etc…
- Periodic Task Thread: Responsible for executing periodic operations within the JVM such as timer interrupts.
- GC Thread: Responsible for supporting different types of Garbage collections
- Compiler Thread: Responsible for performing runtime compilation of bytecode to machine code.
- Signal Dispatcher Thread: Responsible for receiving process-directed signals and dispatching them to the Java-level signal handling method.
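Some of these threads can be seen from within a running program. The sketch below lists the live threads that the JVM exposes through the Thread API. ThreadListing is a made-up name, the exact set of names is implementation-specific, and internal threads such as GC threads are typically not reported this way.

```java
import java.util.Set;
import java.util.TreeSet;

class ThreadListing {
    // Names of the live threads visible through the Java API, sorted.
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            names.add(thread.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Output varies by JVM; it usually includes "main" and a few daemon threads.
        for (String name : liveThreadNames()) {
            System.out.println(name);
        }
    }
}
```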
The Stack component of the Runtime Data Area includes three data structures namely Program Counter Register, Stack, and Native Stack.
Each thread of execution has its own Program Counter Register, a Stack, and an optional Native Stack making it thread-safe.
Program Counter (PC) Register
PC Register holds the address of the currently executing instruction (opcode) of a non-native method for the given Thread. It is updated with a new address after the instruction is executed.
These instructions include symbolic references that will be pointing to the Runtime Constant Pool of the Method Area (described below).
Each thread has its own Stack, a LIFO (Last In, First Out) data structure that holds one Frame for each method being executed by that thread.
A new Frame is pushed to the top of the Stack for a method invocation and the Frame is popped out of the Stack when the method returns. This way the current execution method will be the Frame at the top of the Stack.
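The push-per-invocation behaviour can be checked by counting the frames recorded in a stack trace. In this minimal sketch (FrameDepth and its methods are invented for illustration), calling through one extra method adds exactly one frame.

```java
class FrameDepth {
    // Number of frames on the current thread's stack, including this method's own frame.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // Same measurement, taken from one method call deeper.
    static int oneCallDeeper() {
        return depth();
    }

    static int difference() {
        int here = depth();
        int deeper = oneCallDeeper();
        return deeper - here;
    }

    public static void main(String[] args) {
        System.out.println(difference()); // prints 1
    }
}
```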
The JVM specification permits Frames to be allocated on the Heap, and the memory backing a Stack does not need to be contiguous.
Each frame contains the following components
- Local Variable Array: Holds the local variables of the current method and their values. For an instance method, index 0 holds the reference to the object on which the method was invoked (this), followed by the method parameters and then the local variables declared in the method. For a static method, the parameters start at index 0.
- Operand Stack: Operand Stack is used as an intermediate LIFO workspace during the execution of bytecode instructions for the current Stack Frame that can be used for computation, loading/unloading, swapping, and executing instructions.
- Constant Pool Pointer: A reference to the Runtime Constant Pool in the Method Area, used to resolve symbolic references to actual memory addresses during Dynamic Linking.
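To make the Local Variable Array and the Operand Stack concrete, the sketch below shows two trivial methods with the bytecode that javac typically emits for them written as comments (it can be viewed with javap -c). OperandStackDemo is a made-up class. Note how the instance method starts at slot 1 because slot 0 holds this.

```java
class OperandStackDemo {
    // Static method: slot 0 = a, slot 1 = b (there is no 'this').
    //   iload_0   push a onto the operand stack
    //   iload_1   push b onto the operand stack
    //   iadd      pop both, push a + b
    //   ireturn   pop the result and return it
    static int addStatic(int a, int b) {
        return a + b;
    }

    // Instance method: slot 0 = this, slot 1 = a, slot 2 = b.
    //   iload_1
    //   iload_2
    //   iadd
    //   ireturn
    int addInstance(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(addStatic(2, 3));                          // prints 5
        System.out.println(new OperandStackDemo().addInstance(2, 3)); // prints 5
    }
}
```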
A Native Stack is created for each thread when the native methods are invoked. It stores information about the native methods.
If a JVM has been implemented using a C-linkage model for Java Native Invocation (JNI) then the native stack will be a C stack and the functionality of push and pop are similar to C-Stack push and pop.
Code Cache Region of the Non-Heap is used for storing the binary code which was compiled from the bytecode by the JIT Compiler. This is frequently referred to as JIT code cache.
The Method Area of the Non-Heap stores class-level metadata and the code for methods and constructors.
When the JRE loads a class, it uses the ClassLoader Subsystem to locate the appropriate class file and read the bytecode, then extracts the type metadata and stores it in the Method Area.
All threads share a single Method Area, so unlike the per-thread Stack it is not inherently thread-safe.
For each type, JRE stores the following basic information in the Method Area.
- The fully qualified name of the class and its superclass and/or super interfaces.
- Whether the type is a class or interface
- Modifiers of that particular type.
A fully qualified name is nothing but the package name followed by a dot ending with a class or interface name. (Ex: java.util.List)
In addition to the basic information, each type in Method Area also includes
- The Constant Pool
- Field information
- Method information
- All static variables declared in the class
- A pointer reference to classLoader.
Every type has a Constant Pool which is similar to a Symbol Table that includes a table of symbolic references generated by the javac compiler that maps symbols to references for variables, classes, interfaces, literals, etc… These symbolic references are resolved to actual memory address during Dynamic Linking Phase. The actual binary information associated with each of the resolved references is stored on Heap.
Field Information includes the field’s name, its type, and modifiers.
Method Information includes the method’s name, return type, arguments, their types, and modifiers.
ClassLoader Reference keeps track of the ClassLoader responsible for loading the type which is used for Dynamic Linking. When a type is referenced from another type, the same ClassLoader is used to load the new type.
Heap is a chunk of memory in the JRE where all class instances and arrays are allocated. Unlike values on the Stack, objects allocated on the Heap are not reclaimed when the method returns. Instead, reclaiming the space they occupy is delegated to a daemon Garbage Collector.
All threads share the same Heap region, so the data stored in it is not inherently thread-safe.
A Heap is divided into two regions namely Young Generation and the Old (Tenured) Generation.
Young Generation includes an Eden Space and two Survivor Spaces (S1 and S2).
Objects are first allocated in the Eden Space of the Young Generation. They are subsequently moved to the Survivor Spaces of the Young Generation and finally to the Old (Tenured) Generation if they remain alive across multiple Garbage Collection cycles.
The size of the Old (Tenured) Generation is generally larger than the Young Generation.
The mechanics of Garbage Collection are explained in subsequent sections.
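The Heap and Non-Heap regions of a running JVM can be listed through the standard java.lang.management API. The sketch below is illustrative only; MemoryPools is a made-up class name, and the pool names printed (for example Eden, Survivor, Old Gen, Metaspace, or code cache segments) depend on the JVM implementation and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

class MemoryPools {
    // Counts the memory pools of the given kind (HEAP or NON_HEAP).
    static long count(MemoryType type) {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Prints one line per pool; names vary between JVMs and collectors.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + "  " + pool.getName());
        }
    }
}
```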
The Execution Engine is where the bytecodes are executed using an Interpreter or JIT Compiler using the data stored in Runtime Data Area.
The Execution Engine is composed of three main components: Interpreter, JIT Compiler and Garbage Collector.
The Interpreter reads the bytecode generated by the Java compiler and executes it one instruction (opcode) at a time.
Before executing the instructions, a sequence of steps is carried out as mentioned below.
- Types are loaded on-demand using ClassLoader Subsystem storing metadata in Method Area and binary data on Heap.
- Types are dynamically linked using the Linking module verifying the bytecodes and resolving symbolic references.
- Classes are initialized with static variables and static blocks.
Once these steps are complete, the opcodes are ready to be executed one at a time. The major shortcoming of interpreting every instruction is that when a method is called multiple times, it has to be interpreted every time, resulting in slower execution.
(Just in Time) JIT Compiler
JIT Compiler solves the shortcomings of the Interpreter by compiling the frequently executed instructions from bytecode into native code and executing the native code thereby improving the performance.
JIT Compiler performs various optimizations to improve the performance of the execution.
- Inlining is the process by which smaller methods are merged and inlined in the caller method thereby reducing the call graph, branching of methods, and stack allocation.
- Various local optimizations are performed such as tail recursion elimination into the iterative process, simplification of statements, replacing variables with literals to reduce lookups, etc…
- Control flow optimizations that rearrange code paths to improve efficiency, such as loop reduction and inversion, loop unrolling, branching reduction, etc…
The JIT Compiler analyzes the hotspots, i.e., the instructions that are frequently executed, and marks them as eligible for compilation.
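The sketch below calls a small method often enough that a typical JIT compiler would treat it as a hotspot, then asks the JVM which JIT compiler it uses through the standard CompilationMXBean. HotLoop, sumOfSquares, and the iteration count are arbitrary choices for this example. Whether and when the method is actually compiled is decided by the JVM, so no compilation is guaranteed or asserted here.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class HotLoop {
    // A small, frequently called method: a typical candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Name of the JIT compiler, or "none" if the JVM runs without one.
    static String jitName() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "none" : jit.getName();
    }

    public static void main(String[] args) {
        long checksum = 0;
        for (int call = 0; call < 20_000; call++) {
            checksum += sumOfSquares(100);
        }
        System.out.println("checksum: " + checksum);
        System.out.println("JIT compiler: " + jitName());

        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("time spent compiling (ms): " + jit.getTotalCompilationTime());
        }
    }
}
```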
JIT Compiler includes the following components
- Intermediate Code Generator producing intermediate code
- Code Optimizer for optimizing the intermediate code generated.
- Profiler for analyzing hotspots based on frequently executing instructions with their thresholds. This data is stored in Code Cache for subsequent analysis.
- Target Code Generator for generating the target binary code and storing them in the Code Cache.
There are also advancements in Ahead of Time (AOT) compilation, which compiles the code into native code partially or completely before the program runs, instead of leaving all compilation to run time. It can be used in conjunction with the JIT Compiler.
Local primitive variables are allocated on the Stack, whereas all objects and arrays, whether they are local to a method or belong to a class, are allocated on the Heap.
The primitive variables allocated on Stack are freed up when the method returns.
The objects and arrays allocated on the Heap are not freed up when the method returns. This is where Garbage Collectors play a prominent role.
The primary responsibility of the Garbage Collector is to free up Heap memory occupied by objects and arrays that are no longer referenced. An object is considered alive as long as it is reachable through a chain of references from a GC root. Once it is no longer reachable, it is considered eligible for Garbage Collection.
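Reachability can be probed with a WeakReference, which does not keep its target alive. In the sketch below (ReachabilityDemo and Node are hypothetical names), an object that is still strongly referenced survives a collection request. The second method drops two objects that only reference each other; on typical JVMs they are collected despite the cycle, but System.gc() is only a hint, so that outcome is printed rather than relied upon.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    static class Node {
        Node other;
    }

    // An object that is still strongly referenced must survive a GC request.
    static boolean survivesWhileReachable() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();
        return weak.get() == strong; // 'strong' is still in use here
    }

    // Two objects referencing each other, but unreachable from any root.
    // Usually returns true, but collection is not guaranteed.
    static boolean cycleWasCollected() {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        WeakReference<Node> weak = new WeakReference<>(a);
        a = null;
        b = null;
        System.gc(); // a request, not a command
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("reachable object survived: " + survivesWhileReachable());
        System.out.println("unreachable cycle collected: " + cycleWasCollected());
    }
}
```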
There are two types of GC: Minor GC and Major GC
New objects are allocated in the Eden Space of the Young Generation. Once the Eden Space gets full, Minor GC is triggered. Unreachable objects in Eden Space are freed, live objects are moved from Eden Space to one of the empty Survivor Spaces (S1 and S2) and Eden Space is cleared.
This process continues every time the Eden Space is full. When the Minor GC is triggered again, unreachable objects in Eden Space and non-empty Survivor Space are freed. Live objects from Eden Space and non-empty Survivor Space are moved to empty Survivor Space.
Eden Space and previously occupied Survivor Space are cleared.
Minor GC alternates between Survivor Spaces S1 and S2 on every trigger, and each trigger increments an age counter on the surviving objects, indicating how many collections they have lived through in the Young Generation.
Minor GC usually takes less time as the size of the Young Generation is generally small.
If the age of the objects exceeds a certain threshold in the Young Generation, they are moved to the Old (Tenured) Generation during the Minor GC phase.
Long-lived objects are eventually moved to the Old Generation and remain there for a significant period. When the occupancy of the Old Generation exceeds a certain threshold dictated by the Heap Occupancy Percentage, Major GC is triggered, which frees up space in the Old Generation.
Major GC usually takes a lot of time, as the size of the Old Generation is generally large. It is usually referred to as Stop the World because it pauses the application threads.
Minor GC is also Stop the World but the pause time is so minimal in the Young Generation that it is not noticeable.
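The collectors active in a running JVM, along with how often they have run, can be read through the standard GarbageCollectorMXBean interface. The sketch below is illustrative; GcStats is a made-up name, and the collector names reported (often one for the Young Generation and one for the Old Generation) depend on the JVM and the chosen GC algorithm.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    // Number of garbage collectors the JVM reports.
    static int collectorCount() {
        return ManagementFactory.getGarbageCollectorMXBeans().size();
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so a young collection is likely to have run.
        for (int i = 0; i < 1_000_000; i++) {
            byte[] garbage = new byte[128];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time (ms)=" + gc.getCollectionTime());
        }
    }
}
```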
Most of the GC algorithms follow the following three steps to reclaim space.
The Mark Phase is responsible for identifying all live (reachable) objects starting from the Garbage Collection Roots (GC Roots). Examples of GC Roots include active threads, JNI references, static fields of loaded classes, etc… The algorithm traverses the object graph from the GC Roots to every object reachable from them. Once the Mark Phase is complete, every live object has been identified and marked.
In the Sweep Phase, the space occupied by objects that were not visited in the Mark Phase is considered unreachable and becomes available for allocating new objects. The freed memory chunks need not be contiguous, however. Hence a request to allocate a large object can fail even when enough memory is free in total, because no single chunk is large enough.
In the Compact Phase, all the live objects are moved to the beginning of the memory region, and the pointers to them, including those held by GC Roots, are rewritten. This makes the free memory contiguous, solving the shortcomings of the Sweep Phase.
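The Mark and Sweep phases can be sketched on a toy object graph. This is a simplified model for illustration, not how any real collector is implemented: the heap is a map from hypothetical object ids to the ids they reference, and MarkSweepSketch, mark, sweep, and demo are invented names. Compaction is left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MarkSweepSketch {
    // Mark: traverse the graph from the roots and collect every reachable id.
    static Set<Integer> mark(Map<Integer, List<Integer>> heap, Collection<Integer> roots) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int id = pending.pop();
            if (marked.add(id)) { // first visit only
                pending.addAll(heap.getOrDefault(id, Collections.<Integer>emptyList()));
            }
        }
        return marked;
    }

    // Sweep: remove every object that was not marked and report what was freed.
    static Set<Integer> sweep(Map<Integer, List<Integer>> heap, Set<Integer> marked) {
        Set<Integer> freed = new TreeSet<>(heap.keySet());
        freed.removeAll(marked);
        heap.keySet().removeAll(freed);
        return freed;
    }

    static String demo() {
        Map<Integer, List<Integer>> heap = new HashMap<>();
        heap.put(1, Arrays.asList(2));                   // root -> 2
        heap.put(2, Arrays.asList(3));                   // 2 -> 3
        heap.put(3, Collections.<Integer>emptyList());
        heap.put(4, Arrays.asList(5));                   // 4 and 5 reference each other
        heap.put(5, Arrays.asList(4));                   // but nothing reaches them
        Set<Integer> marked = mark(heap, Arrays.asList(1));
        return sweep(heap, marked).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [4, 5]
    }
}
```

Objects 4 and 5 are freed even though they reference each other, because liveness is decided by reachability from the roots.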
The goal of every GC algorithm is to find an optimal tradeoff to reduce the pause time of the application and achieve maximum Garbage collection. The details of different types of GC algorithms are outside the scope of this article.
Java Native Interface (JNI)
JNI is used to connect with Native Method Libraries (typically written in other programming languages such as C/C++) necessary for the execution and to provide the features of such Native Libraries.
This allows JVM to call C/C++ libraries and be called by C/C++ libraries that are hardware-specific.
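Methods implemented in native code are declared in Java with the native modifier, which can be detected through reflection. The sketch below checks two standard library methods; NativeMethodCheck and isNative are made-up names, and the assumption that System.currentTimeMillis is declared native holds for OpenJDK-based class libraries but is an implementation detail.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodCheck {
    // True if the public no-argument method 'name' of 'owner' is declared native.
    static boolean isNative(Class<?> owner, String name) {
        try {
            Method method = owner.getMethod(name);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Implemented outside Java in OpenJDK-based libraries (assumption noted above).
        System.out.println("System.currentTimeMillis native: "
                + isNative(System.class, "currentTimeMillis"));
        // Implemented in plain Java.
        System.out.println("String.length native: " + isNative(String.class, "length"));
    }
}
```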
This completes the high-level architecture of the JVM and its components. The Java Virtual Machine Specification published by Oracle includes a more in-depth explanation.