Kotlin Compiler Plugin
Say Hello to the Kotlin Compiler Plugin
Start developing a Kotlin compiler plugin, a tool even more powerful than KSP, and see why you might want one.
This is the first article in a three-part series about the Kotlin compiler plugin. In this article, we’ll take a quick look at the structure of the Kotlin Compiler and create a “Hello, World!” with the Kotlin compiler plugin.
What is the Kotlin Compiler Plugin?
I’m guessing that at least half of the people reading this are new to the Kotlin compiler plugin. It doesn’t matter if you don’t know what it is yet. Most of its API is undocumented, but I’m introducing it anyway because it lets you do things that aren’t possible at the language level.
A Kotlin compiler plugin is exactly what it sounds like — a technology that adds a plugin feature to the Kotlin Compiler.
I decided to use the Kotlin compiler plugin to overcome the limitations of KSP. The feature I wanted to implement was the ability to look up the default value of a function argument, but I couldn’t access the default value of a function argument using the KSP API. (Using the KSP Internal API is possible but doesn’t achieve the feature I wanted).
I was able to implement the feature I wanted within three days of starting to study the Kotlin compiler plugin. There is very little documentation, but once you understand the core flow, writing the code is easier than you might think, so it is worth taking the time to learn if you are interested.
Kotlin Compiler Plugin vs. Kotlin Symbol Processing
If you’ve used the KSP API before, the Kotlin compiler plugin is relatively easy to understand. The official KSP documentation describes KSP as follows.
Kotlin Symbol Processing (KSP) is an API that you can use to develop lightweight compiler plugins. KSP provides a simplified compiler plugin API that leverages the power of Kotlin while keeping the learning curve at a minimum. Compared to kapt, annotation processors that use KSP can run up to two times faster.
This means that if you’ve used the KSP API, you’ve already used parts of the Kotlin compiler plugin. So what’s the difference between KSP and the Kotlin compiler plugin?
This time, let’s look at the limitations of KSP as described in the official KSP documentation.
- Examining expression-level information of source code.
- Modifying source code.
- 100% compatibility with the Java Annotation Processing API.
Only the first and second items are relevant to the Kotlin compiler plugin. Because a Kotlin compiler plugin runs inside the language compiler itself, neither of these limitations applies to it.
In other words, with a Kotlin compiler plugin, you can see your code from the compiler’s point of view, generate code during compilation, modify existing code, and change some of the features of the Kotlin language at will. KSP, by contrast, cannot inspect expressions or modify existing code; it can only generate new code.
A Kotlin compiler plugin is a good fit in these situations:
- You have enough time to implement the feature. (Most API documentation is missing, so learning can take a long time.)
- You need to access language information at a low level of abstraction.
- You need to modify existing code or generate new code from the compiler’s point of view.
KSP is the better choice in these situations:
- You have limited time to implement the feature. (There is plenty of documentation, so it is easy to learn.)
- You want easy and fast access to language information at a high level of abstraction.
- You only need to generate new code, not modify existing code.
So far, we’ve learned what the Kotlin compiler plugin means and why we use it.
Kotlin Compiler Under the Hood
From now on, we will look at the inner workings of the Kotlin compiler, which is the background knowledge needed to develop a Kotlin compiler plugin. (For those who already know the Kotlin compiler: this article is based on the new IR backend, which is the default backend at the time of writing.)
The Kotlin compiler is divided into frontend and backend stages according to their roles.
- frontend: Building PSI Tree and configuring BindingContext
- backend: IR generation and target/machine code generation
Let’s start with the frontend stage. The Kotlin compiler starts by building the PSI tree in the frontend stage. PSI stands for Program Structure Interface and represents the result of parsing the source code. It is important to note that PSI is only the result of syntactic analysis and does not contain semantic information.
Semantic information describes what the names in the code actually refer to. It answers questions like “Where does this function come from?”, “Do these variables all refer to the same value?” and “What is this type?”.
For example, here’s some simple code:
fun main() {
    if (pet is Dog) {
        pet.woof()
    } else {
        println("*")
    }
}
In the code above, the PSI Tree is built as shown below.
In the tree above, the leaf nodes 'pet', 'Dog', 'pet', 'woof', 'println', and '*' are just strings; the tree does not say what each of them means semantically. In other words, if the above PSI Tree is converted back into code, it looks as follows:
fun main() {
    if ("pet" is "Dog") {
        "pet"."woof()"
    } else {
        "println"("*")
    }
}
The semantic information each node represents is stored in a special map called the BindingContext.
In the frontend stage of the Kotlin compiler, the Kotlin source code is analyzed, the PSI tree is built, and the semantic information for each node is stored in the BindingContext.
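To make the split between syntax and semantics concrete, here is a minimal sketch in plain Kotlin. SyntaxNode, Meaning, and analyze are names I made up for illustration; they are not the compiler’s real PSI or BindingContext classes, which are far richer. The only point is that the tree holds text, while meaning lives in a separate table keyed by node.

```kotlin
// Toy model only: SyntaxNode and Meaning are hypothetical, not real compiler classes.

// A syntax node knows its text and its children, but nothing about meaning.
// It is a plain class on purpose, so two nodes with the same text stay distinct map keys.
class SyntaxNode(val text: String, val children: List<SyntaxNode> = emptyList())

// Semantic facts that a frontend could attach to a node after analysis.
data class Meaning(val resolvesTo: String, val type: String)

// Builds the tree for `pet.woof()` plus a side table that plays the role of a binding context.
fun analyze(): Pair<SyntaxNode, Map<SyntaxNode, Meaning>> {
    val receiver = SyntaxNode("pet")
    val callee = SyntaxNode("woof")
    val call = SyntaxNode("pet.woof()", listOf(receiver, callee))

    val bindings = mutableMapOf<SyntaxNode, Meaning>()
    bindings[receiver] = Meaning(resolvesTo = "variable pet", type = "Dog")
    bindings[callee] = Meaning(resolvesTo = "function Dog.woof", type = "Unit")
    return call to bindings
}

fun main() {
    val (call, bindings) = analyze()
    // The tree alone only gives us text; the side table supplies the meaning.
    for (node in call.children) {
        println("${node.text} -> ${bindings[node]}")
    }
}
```

Because the table is keyed by node identity, a second occurrence of pet elsewhere in the tree could resolve to something else without any conflict.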
When all the frontend stages are completed, the backend stage proceeds with the frontend result.
- JS IR Backend → IR generator → JavaScript (js file)
- JVM IR Backend → IR generator → JVM Bytecode (class file)
- Native Backend → IR generator → LLVM Bitcode (so file)
Looking at the process above, you can see that three backend engines are used (js, jvm, and native) and that the IR generator step in the middle is common to all three.
Since Kotlin is a multi-platform language, each platform has its own backend engine, and IR is used to share logic between platform engines.
IR stands for Intermediate Representation: a form that sits between Kotlin source code and target code (.js, .class, .so). Because the IR is common to all target platforms, logic implemented once against the IR is shared by every backend, which avoids a lot of duplicated code.
In the backend stage of the Kotlin compiler, an IR is created based on the PSI Tree and BindingContext prepared in the frontend stage, and the process of generating the target code using the generated IR proceeds.
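As a rough illustration of why a shared IR saves work, here is a toy sketch. Everything in it (ToyIr, ToyConst, ToyPlus, fold, emitJs, emitStack) is invented for this example and has nothing to do with the real Kotlin IR classes. The fold step is written once against the toy IR, and both pretend backends benefit from it.

```kotlin
// Toy IR for a tiny expression language; NOT the Kotlin compiler's IR.
sealed interface ToyIr
data class ToyConst(val value: Int) : ToyIr
data class ToyPlus(val left: ToyIr, val right: ToyIr) : ToyIr

// Platform-independent logic, implemented once: fold additions of constants.
fun fold(ir: ToyIr): ToyIr = when (ir) {
    is ToyConst -> ir
    is ToyPlus -> {
        val left = fold(ir.left)
        val right = fold(ir.right)
        if (left is ToyConst && right is ToyConst) ToyConst(left.value + right.value)
        else ToyPlus(left, right)
    }
}

// Hypothetical "JS backend": emits an expression string.
fun emitJs(ir: ToyIr): String = when (ir) {
    is ToyConst -> ir.value.toString()
    is ToyPlus -> "(${emitJs(ir.left)} + ${emitJs(ir.right)})"
}

// Hypothetical "stack machine backend": emits a list of instructions.
fun emitStack(ir: ToyIr): List<String> = when (ir) {
    is ToyConst -> listOf("push ${ir.value}")
    is ToyPlus -> emitStack(ir.left) + emitStack(ir.right) + "add"
}

fun main() {
    val program = ToyPlus(ToyConst(1), ToyConst(2))
    val optimized = fold(program)
    // Both backends consume the same folded IR.
    println(emitJs(optimized))
    println(emitStack(optimized))
}
```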
So far, we have briefly looked at the inner workings of the Kotlin compiler.
Hello, World!
Now we have all the basic concepts needed to create a Kotlin compiler plugin. A Kotlin compiler plugin has the following structure:
CommandLineProcessor can be seen as the handler for compiler arguments, and CompilerPluginRegistrar can be seen as the integration point for compiler plugins. A Kotlin compiler plugin is configured by registering an Extension, which represents the plugin itself, with CompilerPluginRegistrar.
Now let’s create our own Kotlin compiler plugin. We will create a simple plugin that prints the signatures of all functions defined in a module.
First, we need the Kotlin compiler and the autoservice dependencies.
dependencies {
    compileOnly("org.jetbrains.kotlin:kotlin-compiler-embeddable:1.8.20")
    compileOnly("com.google.auto.service:auto-service-annotations:1.0.1")
    kapt("com.google.auto.service:auto-service:1.0.1")
}
Since the Kotlin compiler is only needed during Kotlin compilation, I added it as compileOnly. The CommandLineProcessor and CompilerPluginRegistrar introduced earlier are discovered through Java’s ServiceLoader mechanism, so I added AutoService to make service registration easy.
Let’s implement the CommandLineProcessor. I said that the CommandLineProcessor handles compiler arguments, so we need two items:
- Compiler plugin ID
- Compiler argument information
I will use land.sungbin.function.printer as the compiler plugin ID and add a tag of type String as a compiler argument. To preview the result, we provide compiler arguments like this:
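import org.jetbrains.kotlin.gradle.tasks.KotlinCompile

tasks.withType<KotlinCompile> {
    val functionPrinterPluginId = "land.sungbin.function.printer"
    kotlinOptions {
        // Format: plugin:<plugin id>:<option name>=<value>
        freeCompilerArgs = freeCompilerArgs + listOf(
            "-P",
            "plugin:$functionPrinterPluginId:tag=FP",
        )
    }
}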
Let’s create a variable for each item.
import org.jetbrains.kotlin.compiler.plugin.CliOption
import org.jetbrains.kotlin.config.CompilerConfigurationKey

const val PluginId = "land.sungbin.function.printer"
val KEY_TAG = CompilerConfigurationKey<String>("Tags to use for logging")
val OPTION_TAG = CliOption(
    optionName = "tag",
    valueDescription = "String",
    description = KEY_TAG.toString(),
)
A compiler argument key is defined as CompilerConfigurationKey<argument type>("argument description"), and the matching option is defined as CliOption(optionName = "argument name", valueDescription = "argument value description", description = "argument description").
In OPTION_TAG, KEY_TAG.toString() is passed as the CliOption#description value. CompilerConfigurationKey#toString returns the description that was given to the key, so the text does not have to be written twice.
Now, let’s provide each variable to the CommandLineProcessor.
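import com.google.auto.service.AutoService
import org.jetbrains.kotlin.compiler.plugin.AbstractCliOption
import org.jetbrains.kotlin.compiler.plugin.CommandLineProcessor
import org.jetbrains.kotlin.config.CompilerConfiguration

@AutoService(CommandLineProcessor::class)
class FPCommandLineProcessor : CommandLineProcessor {
    override val pluginId = PluginId
    override val pluginOptions = listOf(OPTION_TAG)

    override fun processOption(
        option: AbstractCliOption,
        value: String,
        configuration: CompilerConfiguration,
    ) {
        when (val optionName = option.optionName) {
            // Store the supplied value so the registrar can read it later.
            OPTION_TAG.optionName -> configuration.put(KEY_TAG, value)
            else -> error("Unknown plugin option: $optionName")
        }
    }
}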
The values provided as compiler arguments are passed to the CommandLineProcessor’s processOption callback. The option and value arguments of processOption represent the compiler argument option and the supplied value. The last argument, configuration, is a map containing the configuration used globally by the compiler.
If the option passed to processOption corresponds to OPTION_TAG, the supplied value is saved in the configuration map under the KEY_TAG key.
An IllegalStateException is thrown if an unknown compiler argument is provided.
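The flow from a raw -P argument to a typed configuration entry can be modelled without the compiler on the classpath. The sketch below is only a simplified model: ConfigKey, Configuration, PluginArg, and parsePluginArg are stand-ins I wrote for this example, and the real compiler does the parsing for you before calling processOption.

```kotlin
// Toy stand-ins, NOT the real CompilerConfigurationKey / CompilerConfiguration classes.
class ConfigKey<T>(private val description: String) {
    // Like the real key in the article, toString() gives back the description.
    override fun toString(): String = description
}

class Configuration {
    private val values = mutableMapOf<ConfigKey<*>, Any?>()

    fun <T> put(key: ConfigKey<T>, value: T) {
        values[key] = value
    }

    @Suppress("UNCHECKED_CAST")
    fun <T> get(key: ConfigKey<T>): T? = values[key] as T?
}

data class PluginArg(val pluginId: String, val optionName: String, val value: String)

// Splits "plugin:<plugin id>:<option name>=<value>" into its three parts.
fun parsePluginArg(raw: String): PluginArg {
    require(raw.startsWith("plugin:")) { "Not a plugin argument: $raw" }
    val rest = raw.removePrefix("plugin:")
    val option = rest.substringAfter(':')
    return PluginArg(
        pluginId = rest.substringBefore(':'),
        optionName = option.substringBefore('='),
        value = option.substringAfter('='),
    )
}

val KEY_TAG = ConfigKey<String>("Tags to use for logging")

// Mirrors the shape of processOption: known options are stored, unknown ones fail fast.
fun processOption(arg: PluginArg, configuration: Configuration) {
    when (arg.optionName) {
        "tag" -> configuration.put(KEY_TAG, arg.value)
        else -> error("Unknown plugin option: ${arg.optionName}")
    }
}

fun main() {
    val configuration = Configuration()
    processOption(parsePluginArg("plugin:land.sungbin.function.printer:tag=FP"), configuration)
    println(configuration.get(KEY_TAG)) // prints "FP"
}
```

The typed key is what lets the reader of the configuration get a String back without casting at the call site.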
Next, let’s look at CompilerPluginRegistrar. I said that the CompilerPluginRegistrar is the integration point for compiler plugins. Therefore, this is where the Extension, which represents the compiler plugin, is registered.
@AutoService(CompilerPluginRegistrar::class)
class FPCompilerPluginRegistrar : CompilerPluginRegistrar() {
    override val supportsK2 = false

    override fun ExtensionStorage.registerExtensions(configuration: CompilerConfiguration) {
        // configuration.get(key: CompilerConfigurationKey<T>, defaultValue: T (optional))
        val logger = configuration.get(CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY, MessageCollector.NONE)
        val loggingTag = requireNotNull(configuration.get(KEY_TAG))
    }
}
CompilerPluginRegistrar is an abstract class consisting of the supportsK2 property and the ExtensionStorage.registerExtensions extension function.
The supportsK2 property indicates whether the plugin supports K2, the new version of the Kotlin compiler. In this article, K2 support will not be provided for simplicity. The ExtensionStorage.registerExtensions extension function provides the environment for registering Extensions. In other words, Extension registration happens inside ExtensionStorage.registerExtensions.
Looking at the body of ExtensionStorage.registerExtensions, logger and loggingTag are read from configuration, the function argument, using the CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY and KEY_TAG keys. KEY_TAG is familiar because it was registered by CommandLineProcessor, but CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY appears here for the first time. This key is provided by the Kotlin compiler by default and retrieves the logger used in the Kotlin compiler environment.
Now it’s time to register the Extension. There are many different types of Extensions. Two representative ones are ExpressionCodegenExtension, which can access the bytecode generation process, and IrGenerationExtension, which can access the IR generation process.
This article aims to print the signatures of all functions defined in a module. To look up all the functions in a module, the best place to hook in seems to be the IR generator step, the point at which all Kotlin source code has been analyzed and semantic information lookup is finished. So I will use IrGenerationExtension.
class FPIrExtension(
    private val logger: MessageCollector,
    private val loggingTag: String,
) : IrGenerationExtension {
    override fun generate(
        moduleFragment: IrModuleFragment,
        pluginContext: IrPluginContext,
    ) {
        moduleFragment.accept(FPIrVisitor(logger, loggingTag), null)
    }
}
IrGenerationExtension is an interface that has a generate function, and we can access the IR generation process by implementing generate. The generate function’s arguments are IrModuleFragment and IrPluginContext, which provide the IR of the module the compiler is working on and a context that helps with IR work, respectively.
Like IrModuleFragment, all IR elements have accept and transform functions. accept is for visiting the IR without modifying it, and transform is for modifying the IR while visiting it. In this article, we only need to visit the IR, so let’s use the accept function.
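The difference between the two styles can be sketched on a toy tree. The Node, Leaf, and Branch types and the visit and rewrite helpers below are hypothetical names for this example only; the real IR API differs in detail, and I am only illustrating the read-only versus rewriting distinction.

```kotlin
// Toy tree, NOT the real IR element hierarchy.
sealed interface Node
data class Leaf(val name: String) : Node
data class Branch(val name: String, val children: List<Node>) : Node

// "accept"-style traversal: looks at every node, returns nothing, changes nothing.
fun Node.visit(onNode: (Node) -> Unit) {
    onNode(this)
    if (this is Branch) this.children.forEach { it.visit(onNode) }
}

// "transform"-style traversal: walks the tree and lets the callback replace nodes.
// This toy version builds a new tree instead of mutating the old one.
fun Node.rewrite(replace: (Node) -> Node): Node {
    val rebuilt: Node = when (this) {
        is Branch -> Branch(this.name, this.children.map { it.rewrite(replace) })
        is Leaf -> this
    }
    return replace(rebuilt)
}

fun main() {
    val tree = Branch("file", listOf(Leaf("a"), Leaf("b")))

    // Read-only pass: collect names.
    val seen = mutableListOf<String>()
    tree.visit { node ->
        when (node) {
            is Leaf -> seen.add(node.name)
            is Branch -> seen.add(node.name)
        }
    }
    println(seen)

    // Rewriting pass: upper-case every leaf.
    println(tree.rewrite { if (it is Leaf) Leaf(it.name.uppercase()) else it })
}
```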
As arguments to the accept function, we must provide an IrElementVisitor implementation and an object to pass to that implementation. In this article, a class called FPIrVisitor is created and used as the IrElementVisitor implementation, and since there is no additional object to pass, null is passed.
FPIrVisitor is a class that extends IrElementVisitorVoid.
class FPIrVisitor(
    private val logger: MessageCollector,
    private val loggingTag: String,
) : IrElementVisitorVoid {
    override fun visitModuleFragment(declaration: IrModuleFragment) {
        TODO()
    }

    override fun visitFile(declaration: IrFile) {
        TODO()
    }

    override fun visitFunction(declaration: IrFunction) {
        TODO()
    }
}
Since we don’t have an object to pass, FPIrVisitor extends IrElementVisitorVoid, an IrElementVisitor designed to receive no extra data.
IrElementVisitor provides very fine-grained visit callbacks to suit all situations. To access the IR of every function in a module, we first visit the module IR, traverse all files in that module, and then visit the IR of the functions defined in those files. To do this, FPIrVisitor uses the visitModuleFragment, visitFile, and visitFunction callbacks.
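The module → file → function walk described above follows the classic visitor pattern, which can be sketched with a toy hierarchy. All names below (Element, ModuleDecl, FileDecl, FunctionDecl, PropertyDecl, Visitor, FunctionNameCollector) are made up for this illustration and are much simpler than the real IR classes; the assumption I am modelling is that every callback falls back to a do-nothing default unless it is overridden.

```kotlin
// Toy element hierarchy, NOT the real IR classes.
sealed interface Element
class ModuleDecl(val files: List<FileDecl>) : Element
class FileDecl(val name: String, val declarations: List<Element>) : Element
class FunctionDecl(val name: String) : Element
class PropertyDecl(val name: String) : Element

// One callback per element kind; each falls back to a catch-all that does nothing.
interface Visitor {
    fun visitElement(element: Element) {}
    fun visitModule(module: ModuleDecl) { visitElement(module) }
    fun visitFile(file: FileDecl) { visitElement(file) }
    fun visitFunction(function: FunctionDecl) { visitElement(function) }
    fun visitProperty(property: PropertyDecl) { visitElement(property) }
}

// Dispatches to the callback that matches the element's kind.
fun Element.accept(visitor: Visitor): Unit = when (this) {
    is ModuleDecl -> visitor.visitModule(this)
    is FileDecl -> visitor.visitFile(this)
    is FunctionDecl -> visitor.visitFunction(this)
    is PropertyDecl -> visitor.visitProperty(this)
}

// Overrides only the three callbacks it cares about; properties hit the empty default.
class FunctionNameCollector : Visitor {
    val names = mutableListOf<String>()

    override fun visitModule(module: ModuleDecl) {
        module.files.forEach { it.accept(this) }
    }

    override fun visitFile(file: FileDecl) {
        file.declarations.forEach { it.accept(this) }
    }

    override fun visitFunction(function: FunctionDecl) {
        names.add(function.name)
    }
}

fun main() {
    val module = ModuleDecl(
        listOf(FileDecl("Sample.kt", listOf(FunctionDecl("f"), PropertyDecl("p"), FunctionDecl("g"))))
    )
    val collector = FunctionNameCollector()
    module.accept(collector)
    println(collector.names)
}
```

Because the visitor itself decides when to descend, forgetting to call accept on the children in a callback silently stops the traversal at that level.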
Let’s implement the visitModuleFragment callback to visit all Kotlin files included in the module given as an argument.
override fun visitModuleFragment(declaration: IrModuleFragment) {
    declaration.files.forEach { file ->
        file.accept(this, null)
    }
}
We can use the files property on the IrModuleFragment to get a list of IrFile objects, each representing the IR of one file. The callback traverses the IrFile list and passes the FPIrVisitor itself to accept again so that it can visit each file’s IR. The FPIrVisitor then runs again, and its visitFile callback is called with the given IrFile.
override fun visitFile(declaration: IrFile) {
    declaration.declarations.forEach { item ->
        item.accept(this, null)
    }
}
The visitFile callback can use the declarations property of the IrFile given as an argument to look up all the elements declared at the top level of that file. It traverses the list of IrDeclaration obtained from declarations and passes the FPIrVisitor itself to accept again, so that the visitFunction callback is called whenever the given IrDeclaration is a function. Note that only top-level declarations are reached this way; functions declared inside classes would need further traversal.
override fun visitFunction(declaration: IrFunction) {
    val render = buildString {
        append(declaration.fqNameWhenAvailable!!.asString() + "(")
        val parameters = declaration.valueParameters.iterator()
        while (parameters.hasNext()) {
            val parameter = parameters.next()
            append(parameter.name.asString())
            append(": ${parameter.type.classFqName!!.shortName().asString()}")
            if (parameters.hasNext()) append(", ")
        }
        append("): " + declaration.returnType.classFqName!!.shortName().asString())
    }
    logger.report(CompilerMessageSeverity.WARNING, "[$loggingTag] $render")
}
The visitFunction callback is the last one to be visited, so it does not call accept again. Instead, it reads the signature of the IrFunction given as an argument. Let’s write the logic inside buildString to retrieve the function signature and render it as a string.
First, write the fully-qualified name of the function and open parentheses to write the arguments.
append(declaration.fqNameWhenAvailable!!.asString() + "(")
To write the arguments, we look up all the value parameters of the function. We use an iterator so that a comma is appended after every argument except the last.
Function parameters are divided into type parameters and value parameters. The generic part is called the type parameter, and the argument part is called the value parameter.
val parameters = declaration.valueParameters.iterator()
Now, let’s iterate over the parameters and write the name and type of each argument.
while (parameters.hasNext()) {
    val parameter = parameters.next()
    append(parameter.name.asString())
    append(": ${parameter.type.classFqName!!.shortName().asString()}")
    if (parameters.hasNext()) append(", ")
}
Finally, close the argument parentheses and write the return type of the function.
append("): " + declaration.returnType.classFqName!!.shortName().asString())
The function’s signature is now expressed as a string, so it’s time to print it. Logs can be written with the report function of the MessageCollector passed to the FPIrVisitor constructor. The report function takes a log level and a log message as arguments.
logger.report(CompilerMessageSeverity.WARNING, "[$loggingTag] $render")
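The rendering logic can be tried out on its own with plain data classes. In the sketch below, Param and FunctionInfo are hypothetical stand-ins for the IR objects, and the "<anonymous>" fallback is my own assumption for the case where no fully-qualified name is available (the article’s code uses !! there instead). It also shows joinToString as an alternative to the manual iterator.

```kotlin
// Hypothetical stand-ins for the IR function and its value parameters.
data class Param(val name: String, val type: String)
data class FunctionInfo(val fqName: String?, val params: List<Param>, val returnType: String)

// Same approach as the article: an iterator, with a comma after every parameter but the last.
fun renderWithIterator(function: FunctionInfo): String = buildString {
    append(function.fqName ?: "<anonymous>")
    append("(")
    val parameters = function.params.iterator()
    while (parameters.hasNext()) {
        val parameter = parameters.next()
        append("${parameter.name}: ${parameter.type}")
        if (parameters.hasNext()) append(", ")
    }
    append("): ${function.returnType}")
}

// Alternative: let joinToString place the separators.
fun renderWithJoin(function: FunctionInfo): String {
    val name = function.fqName ?: "<anonymous>"
    val parameters = function.params.joinToString(", ") { "${it.name}: ${it.type}" }
    return "$name($parameters): ${function.returnType}"
}

fun main() {
    val add = FunctionInfo("demo.add", listOf(Param("a", "Int"), Param("b", "Int")), "Int")
    println(renderWithIterator(add)) // prints "demo.add(a: Int, b: Int): Int"
    println(renderWithJoin(add))     // prints "demo.add(a: Int, b: Int): Int"
}
```

Both versions produce the same string; the nullable fqName with a fallback simply avoids a crash for functions that have no name to report.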
FPIrVisitor is complete!
class FPIrVisitor(
    private val logger: MessageCollector,
    private val loggingTag: String,
) : IrElementVisitorVoid {
    override fun visitModuleFragment(declaration: IrModuleFragment) {
        declaration.files.forEach { file ->
            file.accept(this, null)
        }
    }

    override fun visitFile(declaration: IrFile) {
        declaration.declarations.forEach { item ->
            item.accept(this, null)
        }
    }

    override fun visitFunction(declaration: IrFunction) {
        val render = buildString {
            append(declaration.fqNameWhenAvailable!!.asString() + "(")
            val parameters = declaration.valueParameters.iterator()
            while (parameters.hasNext()) {
                val parameter = parameters.next()
                append(parameter.name.asString())
                append(": ${parameter.type.classFqName!!.shortName().asString()}")
                if (parameters.hasNext()) append(", ")
            }
            append("): " + declaration.returnType.classFqName!!.shortName().asString())
        }
        logger.report(CompilerMessageSeverity.WARNING, "[$loggingTag] $render")
    }
}
Now, let’s register FPIrExtension, which passes FPIrVisitor to accept, with CompilerPluginRegistrar.
override fun ExtensionStorage.registerExtensions(configuration: CompilerConfiguration) {
    val logger = configuration.get(CLIConfigurationKeys.MESSAGE_COLLECTOR_KEY, MessageCollector.NONE)
    val loggingTag = requireNotNull(configuration[KEY_TAG])
    // new!
    IrGenerationExtension.registerExtension(FPIrExtension(logger, loggingTag))
}
Since FPIrExtension implements IrGenerationExtension, it can be registered with IrGenerationExtension.registerExtension.
The Kotlin compiler plugin is now fully developed. All that’s left is to run it.
Kotlin compiler plugins can be added with the kotlinCompilerPluginClasspath configuration. If you want to avoid hardcoding the name, you can use the constant PLUGIN_CLASSPATH_CONFIGURATION_NAME provided by the Kotlin Gradle plugin.
import org.jetbrains.kotlin.gradle.plugin.PLUGIN_CLASSPATH_CONFIGURATION_NAME

dependencies {
    PLUGIN_CLASSPATH_CONFIGURATION_NAME(project(":function-printer-plugin"))
}
Don’t forget to add the compiler arguments registered in CommandLineProcessor.
tasks.withType<KotlinCompile> {
    val functionPrinterPluginId = "land.sungbin.function.printer"
    kotlinOptions {
        freeCompilerArgs = freeCompilerArgs + listOf(
            "-P",
            "plugin:$functionPrinterPluginId:tag=FP",
        )
    }
}
Next, let’s add a function to test.
package land.sungbin.sample
fun helloWorld() = Unit
fun helloWorld2(arg: Any) = arg
fun helloWorld3(arg: Int, arg2: Float) = arg2
Now, if you run ./gradlew build, you can see that the Kotlin compiler plugin we have built is working: the signature of each function is reported as a compiler warning prefixed with the tag.
The projects used in this article can be found on GitHub below:
Thanks for reading this long article to the end. In the next part, we will modify the IR.
You can also read this article in Korean. (Actually, I’m Korean, so my English is immature)