Kotlin Compiler Plugin

Say Hello to the Kotlin Compiler Plugin

Start developing Kotlin compiler plugins, which are even more powerful than KSP, and learn why you might want to

Ji Sungbin
Published in Better Programming
12 min read · May 16, 2023



This is the first article in a three-part series about the Kotlin compiler plugin. In this article, we’ll take a quick look at the structure of the Kotlin Compiler and create a “Hello, World!” with the Kotlin compiler plugin.

What is the Kotlin Compiler Plugin?

I’m guessing that at least half of the people reading this are new to the Kotlin compiler plugin. It doesn’t matter if you don’t know what it is yet. Most of its API is undocumented, but I’m introducing it anyway because it allows you to do things that aren’t possible at the language level.

A Kotlin compiler plugin is exactly what it sounds like — a technology that adds a plugin feature to the Kotlin Compiler.

I decided to use the Kotlin compiler plugin to overcome the limitations of KSP. The feature I wanted to implement was the ability to look up the default value of a function argument, but I couldn’t access the default value of a function argument using the KSP API. (Using the KSP Internal API is possible but doesn’t achieve the feature I wanted).

I was able to implement the feature I wanted after three days of studying the Kotlin compiler plugin. There’s very little documentation, but once you understand the core flow, writing the code is easier than you might think, so take the time to learn it if you are interested.

Kotlin Compiler Plugin vs. Kotlin Symbol Processing

If you’ve used the KSP API before, the Kotlin compiler plugin is relatively easy to understand. The official KSP documentation describes KSP as follows.

Kotlin Symbol Processing (KSP) is an API that you can use to develop lightweight compiler plugins. KSP provides a simplified compiler plugin API that leverages the power of Kotlin while keeping the learning curve at a minimum. Compared to kapt, annotation processors that use KSP can run up to two times faster.

This means that if you’ve used the KSP API, you’ve already used parts of the Kotlin compiler plugin. So what’s the difference between KSP and the Kotlin compiler plugin?

This time, let’s look at the limitations of KSP as described in the official KSP documentation.

  • Examining expression-level information of source code.
  • Modifying source code.
  • 100% compatibility with the Java Annotation Processing API.

Only the first and second items are relevant here. Because a Kotlin compiler plugin runs inside the compiler itself, neither limitation applies to it.

In other words, with a Kotlin compiler plugin, you can see your code from the compiler’s point of view, generate code during compilation, modify existing code, and even change some of the behavior of the Kotlin language. KSP, by contrast, can only read declarations and generate new files; it cannot inspect expressions or modify existing code.
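To make “expression-level information” concrete, here is a minimal sketch. The function greet and its default value are made up for illustration. As far as I know, KSP can tell you that the parameter has a default (through KSValueParameter.hasDefault), but it does not expose the expression that produces it, which is the kind of detail a compiler plugin can reach.

```kotlin
// Hypothetical function, used only to illustrate the limitation.
// The default value of `name` is an expression: "Hello, " + "World".
// A declaration-level API can report THAT a default exists,
// but reading the expression itself needs expression-level access.
fun greet(name: String = "Hello, " + "World"): String = name
```

Calling greet() returns the value produced by the default expression, while greet("Hi") bypasses it.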

A Kotlin compiler plugin is a good fit when:

  • You have enough time to implement the feature. (Most of the API is undocumented, so learning it can take a long time.)
  • You need access to language information at a low level of abstraction.
  • You need to modify existing code or generate new code from the compiler’s point of view.

KSP is the better choice when:

  • You have little time to implement the feature. (It is well documented, so it is easy to learn.)
  • You want easy and fast access to language information at a high level of abstraction.
  • You only need to generate new code, not modify existing code.

So far, we’ve learned what the Kotlin compiler plugin means and why we use it.

Kotlin Compiler Under the Hood

From now on, we will look at the inner workings of the Kotlin compiler, which is the background knowledge needed to develop a Kotlin compiler plugin. (For those who already know the Kotlin compiler: this article is based on the new IR backend, which is the default backend at the time of writing.)

The Kotlin compiler is divided into frontend and backend stages according to their roles.

  • frontend: Building PSI Tree and configuring BindingContext
  • backend: IR generation and target/machine code generation

Let’s start with the frontend stage. In the frontend stage, the Kotlin compiler first builds the PSI tree. PSI stands for Program Structure Interface and represents the result of parsing the source code. It is important to note that PSI is only the result of syntactic analysis and does not contain semantic information.

Semantic information describes what each element in the code actually refers to. It answers questions like “Where does this function come from?”, “Do these variables all refer to the same value?” and “What is this type?”.

For example, here’s some simple code:

fun main() {
    if (pet is Dog) {
        pet.woof()
    } else {
        println("*")
    }
}

In the code above, the PSI Tree is built as shown below.

In the tree above, the leaf nodes 'pet', 'Dog', 'pet', 'woof', 'println', and '*' are nothing more than strings; the tree does not record what each of them means. In other words, if the PSI tree above were converted back into code, it would look like this:

fun main() {
    if ("pet" is "Dog") {
        "pet"."woof()"
    } else {
        "println"("*")
    }
}

The semantic information each node represents is stored in a special map called the BindingContext.

In the frontend stage of the Kotlin compiler, the Kotlin source code is analyzed, the PSI tree is built, and the semantic information for each node is stored in the BindingContext.
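The split between a syntax-only tree and a separate table of semantic facts can be sketched with a toy model. Everything below (ToyNameNode, ToyResolved, ToyBindingContext) is a made-up stand-in for illustration, not the real PSI or BindingContext API.

```kotlin
// Toy stand-ins for illustration; these are NOT the real PSI or BindingContext classes.

// A syntax-only leaf node: it knows its text and nothing else.
// A plain class (not a data class), so two occurrences of "pet" stay distinct keys.
class ToyNameNode(val text: String)

// Semantic facts that a frontend would resolve for one node.
data class ToyResolved(val declaration: String, val type: String)

// A toy "binding context": a side table from syntax node to semantic information.
class ToyBindingContext {
    private val bindings = mutableMapOf<ToyNameNode, ToyResolved>()

    fun record(node: ToyNameNode, info: ToyResolved) {
        bindings[node] = info
    }

    operator fun get(node: ToyNameNode): ToyResolved? = bindings[node]
}
```

The tree alone cannot tell two nodes with the text "pet" apart from any other string; only the side table says what a particular occurrence resolves to.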

When all the frontend stages are completed, the backend stage proceeds with the frontend result.

  • JS IR Backend → IR generator → JavaScript (js file)
  • JVM IR Backend → IR generator → JVM Bytecode (class file)
  • Native Backend → IR generator → LLVM Bitcode (so file)

Looking at the process above, you can see that there are three backend engines used: js, jvm, and native, and the process called IR generator in the middle is common to all three engines.

Since Kotlin is a multi-platform language, each platform has its own backend engine, and IR is used to share logic between platform engines.

IR stands for Intermediate Representation: a form that sits between Kotlin source code and target code (.js, .class, .so). Because the IR is common to all target platforms, logic implemented once against the IR is shared by every backend, which avoids a lot of duplicated code.

In the backend stage of the Kotlin compiler, an IR is created based on the PSI Tree and BindingContext prepared in the frontend stage, and the process of generating the target code using the generated IR proceeds.
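The benefit of a shared IR can be shown with a toy sketch: one piece of logic (here, constant folding) is written once against the IR, and two made-up “backends” simply print the result in different shapes. ToyIr, fold, renderInfix, and renderStack are hypothetical names and have nothing to do with the real Kotlin IR classes.

```kotlin
// Toy IR for illustration only; NOT the Kotlin compiler's IR.
sealed interface ToyIr
data class Const(val value: Int) : ToyIr
data class Add(val left: ToyIr, val right: ToyIr) : ToyIr

// Shared logic: written once against the IR, reusable by every backend.
fun fold(ir: ToyIr): ToyIr = when (ir) {
    is Const -> ir
    is Add -> {
        val left = fold(ir.left)
        val right = fold(ir.right)
        if (left is Const && right is Const) Const(left.value + right.value) else Add(left, right)
    }
}

// Made-up backend 1: prints an infix expression.
fun renderInfix(ir: ToyIr): String = when (ir) {
    is Const -> ir.value.toString()
    is Add -> "(${renderInfix(ir.left)} + ${renderInfix(ir.right)})"
}

// Made-up backend 2: prints stack-machine style instructions.
fun renderStack(ir: ToyIr): String = when (ir) {
    is Const -> "push ${ir.value}"
    is Add -> "${renderStack(ir.left)}; ${renderStack(ir.right)}; add"
}
```

Both renderers benefit from fold without either of them implementing it.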

So far, we have briefly looked at the inner workings of the Kotlin compiler.

Hello, World!

We now have all the basic concepts needed to create a Kotlin compiler plugin. A Kotlin compiler plugin has the following structure:

CommandLineProcessor handles the compiler arguments passed to the plugin, and CompilerPluginRegistrar is the integration point for the plugin. A Kotlin compiler plugin is configured by registering an Extension, which holds the actual plugin logic, with CompilerPluginRegistrar.

Now let’s create our own Kotlin compiler plugin. We will create a simple plugin that prints the signatures of all functions defined in a module.

First, we need the Kotlin compiler and the autoservice dependencies.

dependencies {
    compileOnly("org.jetbrains.kotlin:kotlin-compiler-embeddable:1.8.20")
    compileOnly("com.google.auto.service:auto-service-annotations:1.0.1")
    kapt("com.google.auto.service:auto-service:1.0.1")
}

The plugin runs inside the Kotlin compiler, which already provides these classes at runtime, so I added the compiler as compileOnly. The CommandLineProcessor and CompilerPluginRegistrar introduced earlier are discovered through Java’s ServiceLoader, so I added AutoService to make service registration easy.

Let’s implement the CommandLineProcessor. Since the CommandLineProcessor handles compiler arguments, we need two items:

  1. Compiler plugin ID
  2. Compiler argument information

I will use land.sungbin.function.printer as the compiler plugin ID and add a String type tag as a compiler argument. To preview the result, we provide compiler arguments like this:

import org.jetbrains.kotlin.gradle.tasks.KotlinCompile

tasks.withType<KotlinCompile> {
    val functionPrinterPluginId = "land.sungbin.function.printer"
    kotlinOptions {
        freeCompilerArgs = freeCompilerArgs + listOf(
            "-P",
            "plugin:$functionPrinterPluginId:tag=FP",
        )
    }
}
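The argument string above follows the shape plugin:<plugin ID>:<option name>=<value>. The sketch below splits such a string into its parts to make the shape explicit. PluginOption and parsePluginOption are hypothetical helpers written for this illustration; this is not the compiler’s own parser.

```kotlin
// Illustration only: this is NOT how the Kotlin compiler parses -P arguments internally.
data class PluginOption(val pluginId: String, val name: String, val value: String)

fun parsePluginOption(raw: String): PluginOption {
    val body = raw.removePrefix("plugin:")
    require(body != raw) { "Expected a 'plugin:' prefix: $raw" }

    // The plugin ID ends at the first ':'; the option name ends at the first '=' after it.
    val idEnd = body.indexOf(':')
    val nameEnd = body.indexOf('=', startIndex = idEnd + 1)
    require(idEnd > 0 && nameEnd > idEnd + 1) { "Expected <id>:<name>=<value>: $raw" }

    return PluginOption(
        pluginId = body.substring(0, idEnd),
        name = body.substring(idEnd + 1, nameEnd),
        value = body.substring(nameEnd + 1),
    )
}
```

For the argument used in this article, the parts are the plugin ID land.sungbin.function.printer, the option name tag, and the value FP.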

Let’s create a variable for each item.

import org.jetbrains.kotlin.compiler.plugin.CliOption
import org.jetbrains.kotlin.config.CompilerConfigurationKey

const val PluginId = "land.sungbin.function.printer"

val KEY_TAG = CompilerConfigurationKey<String>("Tags to use for logging")
val OPTION_TAG = CliOption(
    optionName = "tag",
    valueDescription = "String",
    description = KEY_TAG.toString(),
)

A compiler argument key is defined as CompilerConfigurationKey<argument type>("argument description"), and the option for that key is defined as CliOption(optionName = "argument name", valueDescription = "argument value description", description = "argument description").

If you look at OPTION_TAG, KEY_TAG.toString() is given as the CliOption#description value. We can get the argument description given to the argument key with CompilerConfigurationKey#toString.

Now, let’s provide each variable to the CommandLineProcessor.

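import com.google.auto.service.AutoService
import org.jetbrains.kotlin.compiler.plugin.AbstractCliOption
import org.jetbrains.kotlin.compiler.plugin.CommandLineProcessor
import org.jetbrains.kotlin.config.CompilerConfiguration

@AutoService(CommandLineProcessor::class)
class FPCommandLineProcessor : CommandLineProcessor {
    override val pluginId = PluginId

    override val pluginOptions = listOf(OPTION_TAG)

    override fun processOption(
        option: AbstractCliOption,
        value: String,
        configuration: CompilerConfiguration,
    ) {
        when (val optionName = option.optionName) {
            OPTION_TAG.optionName -> configuration.put(KEY_TAG, value)
            else -> error("Unknown plugin option: $optionName")
        }
    }
}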

The values provided as compiler arguments are passed to the CommandLineProcessor’s processOption callback. The option and value arguments of processOption represent the compiler argument option and the supplied value. The last argument, configuration, is a map holding configuration that is available globally within the compiler.

If the option passed to processOption corresponds to OPTION_TAG, the supplied value is stored in the configuration map under the KEY_TAG key.

If an unknown compiler argument is provided, error() throws an IllegalStateException.
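The idea of a typed key that stores and retrieves a value from a shared configuration map can be sketched without the compiler on the classpath. ToyKey, ToyConfiguration, TOY_KEY_TAG, and processToyOption are hypothetical stand-ins that only mimic the shape of the real classes described above.

```kotlin
// Toy stand-ins for illustration; NOT CompilerConfigurationKey or CompilerConfiguration.
class ToyKey<T : Any>(private val description: String) {
    // Mirrors the article's use of toString() to recover the description.
    override fun toString(): String = description
}

class ToyConfiguration {
    private val values = mutableMapOf<ToyKey<*>, Any>()

    fun <T : Any> put(key: ToyKey<T>, value: T) {
        values[key] = value
    }

    // The key's type parameter makes the cast safe as long as values are only stored via put().
    @Suppress("UNCHECKED_CAST")
    fun <T : Any> get(key: ToyKey<T>): T? = values[key] as T?
}

val TOY_KEY_TAG = ToyKey<String>("Tags to use for logging")

fun processToyOption(optionName: String, value: String, configuration: ToyConfiguration) {
    when (optionName) {
        "tag" -> configuration.put(TOY_KEY_TAG, value)
        // error() is the Kotlin stdlib function that throws IllegalStateException.
        else -> error("Unknown plugin option: $optionName")
    }
}
```

Because the key carries the value type, the reader of the configuration gets a String back without casting at the call site.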

Next, let’s look at CompilerPluginRegistrar. As mentioned, CompilerPluginRegistrar is the integration point for compiler plugins, so this is where the Extension containing the plugin logic is registered.

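import com.google.auto.service.AutoService
import org.jetbrains.kotlin.cli.common.CLIConfigurationKeys
import org.jetbrains.kotlin.cli.common.messages.MessageCollector
import org.jetbrains.kotlin.compiler.plugin.CompilerPluginRegistrar
import org.jetbrains.kotlin.config.CompilerConfiguration

@AutoService(CompilerPluginRegistrar::class)
class FPCompilerPluginRegistrar : CompilerPluginRegistrar() {
    override val supportsK2 = false

    override fun ExtensionStorage.registerExtensions(configuration: CompilerConfiguration) {
        // configuration.get(key: CompilerConfigurationKey<T>, defaultValue: T (optional))

        val logger = configuration.get(CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY, MessageCollector.NONE)
        val loggingTag = requireNotNull(configuration.get(KEY_TAG))
    }
}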

CompilerPluginRegistrar is an abstract class with a supportsK2 property and an ExtensionStorage.registerExtensions extension function.

The supportsK2 property indicates whether the plugin supports K2, the new Kotlin compiler. For simplicity, this article does not support K2. The ExtensionStorage.registerExtensions extension function provides the scope in which Extensions are registered.

In the body of ExtensionStorage.registerExtensions, logger and loggingTag are read from the configuration argument using the CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY and KEY_TAG keys. KEY_TAG is familiar because it was populated by the CommandLineProcessor, but CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY appears here for the first time. This key is provided by the Kotlin compiler and retrieves the logger used in the compiler environment.

Now it’s time to register the Extension. There are many types of Extensions. Two representative ones are ExpressionCodegenExtension, which hooks into bytecode generation, and IrGenerationExtension, which hooks into IR generation.

This article aims to print the signatures of all functions defined in a module. To look up all the functions in a module, the best place to hook in is the IR generation step, at which point all Kotlin source code has been analyzed and semantic information has been resolved. So I will use IrGenerationExtension.

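import org.jetbrains.kotlin.backend.common.extensions.IrGenerationExtension
import org.jetbrains.kotlin.backend.common.extensions.IrPluginContext
import org.jetbrains.kotlin.cli.common.messages.MessageCollector
import org.jetbrains.kotlin.ir.declarations.IrModuleFragment

class FPIrExtension(
    private val logger: MessageCollector,
    private val loggingTag: String,
) : IrGenerationExtension {
    override fun generate(
        moduleFragment: IrModuleFragment,
        pluginContext: IrPluginContext,
    ) {
        moduleFragment.accept(FPIrVisitor(logger, loggingTag), null)
    }
}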

IrGenerationExtension is an interface with a generate function, and implementing generate gives us access to the IR generation process. The generate function receives an IrModuleFragment, which holds the IR of the module being compiled, and an IrPluginContext, which provides context that helps when working with the IR.

Like IrModuleFragment, all IR elements have accept and transform functions. accept corresponds to the case of only visiting the IR without modifying it, and transform corresponds to the case of modifying at the same time as visiting the IR. In the case of this article, we only need to visit the IR, so let’s use the accept function.

As arguments to the accept function, we must provide an IrElementVisitor implementation and an object to pass to that implementation. In this article, a class called FPIrVisitor is created and used as an IrElementVisitor implementation, and since there is no additionally passed object, null is passed.
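The accept-with-data shape described above can be sketched with a toy model. ToyVisitor, ToyFile, ToyFunction, and NameCollector are hypothetical names; the real IR classes are much richer, and the real transform takes a transformer object rather than the lambda used here for brevity.

```kotlin
// Toy model of the visitor shape; NOT the real IR classes.
interface ToyVisitor<D> {
    fun visitFile(file: ToyFile, data: D)
    fun visitFunction(function: ToyFunction, data: D)
}

sealed interface ToyElement {
    // accept() dispatches to the visitor callback that matches the element's own type.
    fun <D> accept(visitor: ToyVisitor<D>, data: D)
}

data class ToyFunction(val name: String) : ToyElement {
    override fun <D> accept(visitor: ToyVisitor<D>, data: D) = visitor.visitFunction(this, data)
}

data class ToyFile(val functions: List<ToyFunction>) : ToyElement {
    override fun <D> accept(visitor: ToyVisitor<D>, data: D) = visitor.visitFile(this, data)

    // Simplified "transform" counterpart: returns a rewritten copy instead of only reading.
    fun transform(rewrite: (ToyFunction) -> ToyFunction): ToyFile = ToyFile(functions.map(rewrite))
}

// A read-only visitor that needs no extra data, so it is called with null.
class NameCollector : ToyVisitor<Nothing?> {
    val names = mutableListOf<String>()

    override fun visitFile(file: ToyFile, data: Nothing?) {
        // Children are not visited automatically; the visitor descends explicitly.
        file.functions.forEach { it.accept(this, null) }
    }

    override fun visitFunction(function: ToyFunction, data: Nothing?) {
        names += function.name
    }
}
```

Visiting leaves the tree untouched and only collects information, while transform produces a modified copy.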

FPIrVisitor is a class that extends from IrElementVisitorVoid.

class FPIrVisitor(
    private val logger: MessageCollector,
    private val loggingTag: String,
) : IrElementVisitorVoid {
    override fun visitModuleFragment(declaration: IrModuleFragment) {
        TODO()
    }

    override fun visitFile(declaration: IrFile) {
        TODO()
    }

    override fun visitFunction(declaration: IrFunction) {
        TODO()
    }
}

Since we don’t have an object to pass, FPIrVisitor extends from IrElementVisitorVoid, which is an IrElementVisitor designed to receive nothing.

IrElementVisitor provides fine-grained visit callbacks for every kind of IR element. To access the IRs of all functions in a module, we should first access the module IR, traverse all files included in that module, and visit the IRs of the functions defined in those files. To do this, FPIrVisitor uses the visitModuleFragment, visitFile, and visitFunction callbacks.

Let’s implement visitModuleFragment callback to visit all Kotlin files included in the module given as an argument.

override fun visitModuleFragment(declaration: IrModuleFragment) {
    declaration.files.forEach { file ->
        file.accept(this, null)
    }
}

The files property of IrModuleFragment returns a list of IrFile objects, each representing the IR of one file. We iterate over that list and call accept on each file with the FPIrVisitor itself, so that the visitor is invoked again, this time through the visitFile callback with the given IrFile.

override fun visitFile(declaration: IrFile) {
    declaration.declarations.forEach { item ->
        item.accept(this, null)
    }
}

The visitFile callback uses the declarations property of the given IrFile to look up all the top-level elements declared in that file. It iterates over the resulting list of IrDeclaration and calls accept on each one with the FPIrVisitor itself; if the IrDeclaration is a function, the visitFunction callback is invoked.

override fun visitFunction(declaration: IrFunction) {
    val render = buildString {
        append(declaration.fqNameWhenAvailable!!.asString() + "(")
        val parameters = declaration.valueParameters.iterator()
        while (parameters.hasNext()) {
            val parameter = parameters.next()
            append(parameter.name.asString())
            append(": ${parameter.type.classFqName!!.shortName().asString()}")
            if (parameters.hasNext()) append(", ")
        }
        append("): " + declaration.returnType.classFqName!!.shortName().asString())
    }
    logger.report(CompilerMessageSeverity.WARNING, "[$loggingTag] $render")
}

The visitFunction callback is the last stop: it does not call accept again, but instead reads the signature of the IrFunction given as an argument. Let’s write the logic inside buildString to build the function signature as a string.

First, write the fully qualified name of the function and an opening parenthesis for the arguments.

append(declaration.fqNameWhenAvailable!!.asString() + "(")

To write the arguments, we look up all the value parameters of the function. We use an iterator so that a comma is appended after every argument except the last.

Function parameters are divided into type parameters and value parameters. The generic part is called the type parameter, and the ordinary argument part is called the value parameter.

val parameters = declaration.valueParameters.iterator()
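As a quick illustration of the distinction between type parameters and value parameters mentioned above, consider the hypothetical function below (repeatValue is not part of the plugin).

```kotlin
// Hypothetical function for illustration:
// `T` is a type parameter; `value` and `times` are value parameters.
fun <T> repeatValue(value: T, times: Int): List<T> = List(times) { value }
```

For a function like this, a lookup of value parameters would yield value and times, but not T.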

Now, let’s iterate over the parameters and write the name and type of the argument.

while (parameters.hasNext()) {
    val parameter = parameters.next()
    append(parameter.name.asString())
    append(": ${parameter.type.classFqName!!.shortName().asString()}")
    if (parameters.hasNext()) append(", ")
}

Finally, close the argument parentheses and write the return type of the function.

append("): " + declaration.returnType.classFqName!!.shortName().asString())

The function’s signature is now expressed as a string, so it’s time to print it. We can log it with the report function of the MessageCollector passed to the FPIrVisitor constructor. The report function takes a log level and a log message.

logger.report(CompilerMessageSeverity.WARNING, "[$loggingTag] $render")
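The rendering logic can be tried out without the compiler on the classpath. The sketch below uses made-up stand-ins (ToyParameter, ToySignature, renderSignature) and swaps the manual iterator for joinToString, which handles the “comma after every argument except the last” rule for us. It is an alternative formulation, not the code used in the plugin.

```kotlin
// Toy stand-ins for the IR function and its value parameters; names are hypothetical.
data class ToyParameter(val name: String, val type: String)

data class ToySignature(
    val fqName: String,
    val parameters: List<ToyParameter>,
    val returnType: String,
)

// Produces the same output shape as the iterator loop, with joinToString placing the commas.
fun renderSignature(signature: ToySignature): String {
    val parameters = signature.parameters.joinToString(", ") { "${it.name}: ${it.type}" }
    return "${signature.fqName}($parameters): ${signature.returnType}"
}
```

The same substitution would work inside visitFunction, although the explicit loop makes each step of the signature easier to follow in a tutorial.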

FPIrVisitor is complete!

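import org.jetbrains.kotlin.cli.common.messages.CompilerMessageSeverity
import org.jetbrains.kotlin.cli.common.messages.MessageCollector
import org.jetbrains.kotlin.ir.declarations.IrFile
import org.jetbrains.kotlin.ir.declarations.IrFunction
import org.jetbrains.kotlin.ir.declarations.IrModuleFragment
import org.jetbrains.kotlin.ir.types.classFqName
import org.jetbrains.kotlin.ir.util.fqNameWhenAvailable
import org.jetbrains.kotlin.ir.visitors.IrElementVisitorVoid

class FPIrVisitor(
    private val logger: MessageCollector,
    private val loggingTag: String,
) : IrElementVisitorVoid {
    // Module -> files
    override fun visitModuleFragment(declaration: IrModuleFragment) {
        declaration.files.forEach { file ->
            file.accept(this, null)
        }
    }

    // File -> top-level declarations
    override fun visitFile(declaration: IrFile) {
        declaration.declarations.forEach { item ->
            item.accept(this, null)
        }
    }

    // Function -> rendered signature, reported as a compiler warning
    override fun visitFunction(declaration: IrFunction) {
        val render = buildString {
            append(declaration.fqNameWhenAvailable!!.asString() + "(")
            val parameters = declaration.valueParameters.iterator()
            while (parameters.hasNext()) {
                val parameter = parameters.next()
                append(parameter.name.asString())
                append(": ${parameter.type.classFqName!!.shortName().asString()}")
                if (parameters.hasNext()) append(", ")
            }
            append("): " + declaration.returnType.classFqName!!.shortName().asString())
        }
        logger.report(CompilerMessageSeverity.WARNING, "[$loggingTag] $render")
    }
}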

Now, let’s register FPIrExtension that accepts FPIrVisitor with CompilerPluginRegistrar.

override fun ExtensionStorage.registerExtensions(configuration: CompilerConfiguration) {
    val logger = configuration.get(CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY, MessageCollector.NONE)
    val loggingTag = requireNotNull(configuration[KEY_TAG])

    // new!
    IrGenerationExtension.registerExtension(FPIrExtension(logger, loggingTag))
}

Since FPIrExtension implements IrGenerationExtension, it can be registered with IrGenerationExtension.registerExtension.

That completes the development of the Kotlin compiler plugin. All that’s left is to run it.

Kotlin compiler plugins can be added with the kotlinCompilerPluginClasspath configuration. If you want to avoid hardcoding you can use the constant PLUGIN_CLASSPATH_CONFIGURATION_NAME provided by the Kotlin Gradle plugin.

import org.jetbrains.kotlin.gradle.plugin.PLUGIN_CLASSPATH_CONFIGURATION_NAME

dependencies {
    PLUGIN_CLASSPATH_CONFIGURATION_NAME(project(":function-printer-plugin"))
}

Don’t forget to pass the compiler arguments registered in the CommandLineProcessor.

tasks.withType<KotlinCompile> {
    val functionPrinterPluginId = "land.sungbin.function.printer"
    kotlinOptions {
        freeCompilerArgs = freeCompilerArgs + listOf(
            "-P",
            "plugin:$functionPrinterPluginId:tag=FP",
        )
    }
}

Next, let’s add a function to test.

package land.sungbin.sample

fun helloWorld() = Unit
fun helloWorld2(arg: Any) = arg
fun helloWorld3(arg: Int, arg2: Float) = arg2

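Now, if you run ./gradlew build, the compiler plugin reports the signature of each function as a compiler warning. Given the rendering logic above, the message for helloWorld3 should read [FP] land.sungbin.sample.helloWorld3(arg: Int, arg2: Float): Float.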

The projects used in this article can be found on GitHub below:

Thanks for reading this long article to the end. In the next part, we will modify the IR.

You can also read this article in Korean. (Actually, I’m Korean, so my English is immature)

