Overcoming macro annotations

Jun 14, 2024

Early adoption of experimentals as a blocker for improvements

Scala 2 accumulated many great features that improve the readability of the code, make it easier to write and maintain, or reduce the amount of boilerplate. Even though most of them have become an integral part of the language, some were only ever treated as experimental: their goal was to explore new language capabilities, and they were considered temporary solutions that were prone to change.

A great example of this is meta-programming in Scala. Scala 2.10 introduced experimental macros, soon accompanied by quasi-quotes, which let us manipulate code quickly and relatively easily to make our lives easier. However, this solution had multiple rough edges at the time and was inherently unsafe. In theory, macros were hidden behind a magical incantation - import scala.language.experimental.macros.

It already hinted at possible trouble. I believe everyone can agree that we should not depend heavily on experimental features, whether they come from external dependencies or from the language itself. Such utilities can break at any time without warning, leaving us with significant technical debt and the need to rewrite large parts of the codebase.

Unfortunately, the benefits of introducing macros outweighed the perceived risks. This led to the massive adoption of macros in both open-source and commercial projects, which effectively blocked the Scala compiler engineers from introducing breaking changes to their behavior. Even though the Scala team knew well how to improve this beloved feature of the language, they had their hands tied for the rest of Scala 2 development. All they could do was collect feedback, prepare prototypes, and introduce a completely different, type-safe, and sound solution in Scala 3 many years later.

> Because of the problems caused by the overly wide adoption of experimental features, as mentioned in the Scala 3 Roadmap for 2024, Scala 3 restricts experimental language features to unstable (nightly) compiler versions, which limits their usage before stabilisation.

The fallout of the breaking changes

Scala 3 brought a small revolution to the meta-programming capabilities available to developers - they were completely rewritten. Instead of unverified quasi-quote strings, developers got access to the type-safe Quotes API and its extensions for compile-time reflection and multi-stage programming. Being fully typed, these prevent engineers from shooting themselves in the foot through common mistakes or by abusing the compiler in unsafe ways. What’s more, many tasks no longer require advanced meta-programming at all - they can now be handled with simple inline expressions.
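As a small illustration of the last point, here is a minimal sketch of an inline definition that is fully expanded at compile time without any macro. The power method and the cube value are illustrative names, not taken from any of the libraries discussed here.

```scala
// `inline` forces the call to be expanded at the call site.
// Because `n` is an inline parameter, both `inline if` conditions
// are decided by the compiler and the recursion unrolls completely.
inline def power(x: Double, inline n: Int): Double =
  inline if n == 0 then 1.0
  else inline if n % 2 == 1 then x * power(x, n - 1)
  else power(x * x, n / 2)

// Expands to plain multiplications; no loop or recursion remains at runtime.
val cube: Double = power(2.0, 3)
```

Calling power with an exponent that is not known at compile time is rejected by the compiler, which is the kind of static guarantee that previously required a macro.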

Unfortunately, along with all the good parts of breaking compatibility came a number of problems. We already knew the old, quasi-quote-powered macros were heavily used across every major part of the Scala ecosystem. They had become a fundamental building block for many popular libraries and closed-source projects, and replacing them would be hard work, taking tens of work hours. In fact, to this day, 3 years after the initial Scala 3 release, some projects have not fully migrated to the new macros, which makes them unusable in Scala 3 projects.

This is not always due to a lack of effort from the maintainers. The new macros became safer and easier to use, but only at the cost of some limitations, and not every gap has been filled with a replacement yet. This means that the public API of a tool might need to change, or the tool might require a completely different approach.

Macro-annotations - the major migration blocker?

One of the features that existed in Scala 2 but is missing in Scala 3 is macro annotations. Standard macros are used like normal expressions that are interpreted during compilation, and they can only produce new code. Macro annotations, on the other hand, annotate the code that should be transformed. The possible transformations were very powerful, even allowing the creation of new types and structures at compile time.

Such a powerful set of tools required a careful redesign before making it a part of Scala 3 meta-programming. Currently, macro-annotations in Scala 3 are available only as an experimental feature and have limited capabilities when compared to their Scala 2 counterpart.

At first glance, it might seem that previous users of macro annotations are not able to migrate from Scala 2. For some use cases that might be true, but for many others we can use a combination of other stable features or tools. At VirtusLab, we recently took part in migrating a client project that was a heavy user of macro annotations. During the initial 10 days of our free Scala 3 migration help package, we came up with easily maintained replacements for 3 libraries that depend on macro annotations:

circe/circe-generic-extras - macro-generated JSON codecs configured using annotations
zio/zio-mock - macro-generated mocks for services
zio/zio - macro-generated accessors for service methods

Exploring the alternative solutions

Our cooperation started with testing how we could replace macro annotations without making complicated changes to the code base. Each candidate approach had unique features and drawbacks we needed to consider when choosing a replacement.

The advanced ones: macros, structural types and dynamic access

One of our first thoughts was to use the new, advanced features offered by Scala 3. One of these is structural types: a special kind of type refinement that lets us describe the structure of a given type - the methods and types it needs to contain - without the need to inherit from some exact common type.

Used directly, structural types might not be the best fit for performance-critical applications, as they rely on reflection to extract values from the underlying objects.

However, we can mitigate this issue by combining them with Selectable - a special trait that tells the compiler we provide our own way to extract values or apply methods. For that, we implement the selectDynamic and applyDynamic methods. By combining the 2 features, we get both type safety, because we cannot extract a non-existing value, and near-optimal performance.
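A minimal sketch of this combination, following the pattern from the Scala 3 reference documentation. The Record class, the Person alias and the field names are illustrative assumptions; only selectDynamic is implemented here, and method calls would be routed through applyDynamic in a similar way.

```scala
// A map-backed record. Extending Selectable tells the compiler to route
// member selections on refined types through selectDynamic, not reflection.
class Record(elems: (String, Any)*) extends Selectable:
  private val fields = elems.toMap
  def selectDynamic(name: String): Any = fields(name)

// The refinement lists the members that the compiler will accept.
type Person = Record { val name: String; val age: Int }

// The cast is the unsafe part; a macro could generate the refinement instead.
val person: Person = Record("name" -> "Emma", "age" -> 42).asInstanceOf[Person]

// Rewritten by the compiler to person.selectDynamic("name").asInstanceOf[String]
val personName: String = person.name
val personAge: Int = person.age
```

Selecting a member that is not part of the refinement, such as person.email, does not compile, which is where the type safety comes from.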

However, to make the structural typing useful, we would like to create the refinements describing available values automatically. One way of doing this is by implementing their derivation using transparent inline macros - these are similar to whitebox macros from Scala 2, where the final type of a macro expression would be narrowed to the result of macro execution. This allows us to collect additional information about the types at compile time using the compile-time reflection API and use them to produce refinements of a type, thus turning it into a structural type.
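A minimal sketch of such a transparent inline method is shown here. The names surprise, EmptyBox and BoxOfCandies come from the text; their definitions and the candies parameter are assumptions made for illustration, not the original implementation.

```scala
case object EmptyBox
final case class BoxOfCandies(count: Int)

// The declared result type is as wide as possible, but `transparent`
// lets the compiler narrow it to the type of the branch that was selected.
transparent inline def surprise(inline candies: Int): Any =
  inline if candies <= 0 then EmptyBox
  else BoxOfCandies(candies)

// Both ascriptions compile only because the result type was narrowed.
val empty: EmptyBox.type = surprise(0)
val full: BoxOfCandies = surprise(3)

// Members of the narrowed type are available without a cast.
val candiesInside: Int = full.count
```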

In this example, we can see the power of transparent inlines. Even though the declared result type of surprise is very wide and can cover any unrelated result type, the final type is resolved only when the call is expanded, so at compile time it can be narrowed to a very specific type, either EmptyBox or BoxOfCandies, depending on the inputs.

We tried to use this approach to model and implement the accessors for ZIO and mocks. When doing so, the only change in the existing code base could be the replacement of macro annotations with a method call to a dedicated macro returning a structural type while keeping the rest of the code backwards source-compatible.

However, this approach came with 2 major problems.

The first one, and the most important for us, was tooling support. Our client’s preferred IDE is IntelliJ IDEA, which reimplements the compiler and its type system. Because a transparent inline needs to interpret the expression before inferring the final type, reliably matching its output outside the Scala compiler would require an enormous amount of work. An alternative would be to use an IDE powered directly by compiler outputs, like Metals. However, such a transition might be difficult to impose on a team of developers.

> We’ve recently worked on prototypes for the IntelliJ Scala Plugin that might allow us to get rid of these limitations in the future. Contact us if you’re interested in helping to bridge the Scala 3 features with the most popular IDE choice of developers.

Additionally, for the nested types required to correctly implement ZIO Mocks, we would need to use an experimental part of the Quotes API that allows new symbols to be created dynamically within the macro expression.

The wild one: Compiler plugin

Another approach we could have tried would be to create a plugin for the compiler to replace macro annotations with direct operations on Abstract Syntax Trees in one of the early compilation phases. By having access to all the power of the compiler, we would be able to relatively easily create new methods, classes and symbols without the limitations of the Quotes API. Using stubs to existing annotations, we would also be able to fully replicate applications of macro-annotations.

Our team already has experience in this area from creating and maintaining compiler plugins for projects such as Besom, Scala Native, Pekko Serialization Helpers, and others, as well as from our work on the Scala compiler codebase itself. To some extent, this approach might work, but it’s probably the least efficient solution:

compiler plugins need to be updated and released for every version of the Scala compiler, as there are no guarantees about the compiler’s internal APIs;
transforming or creating new statements in a plugin is verbose and comes without helpers like quasi-quotes - it requires constructing the AST from scratch, which is both difficult and error-prone;
compiler plugins cannot be applied before the typer phase. This means we would not be able to use newly generated methods and symbols in the same compilation unit, as they would not exist yet during the compiler’s initial type checking. It might require creating stub methods that would be replaced at compile time.

The simplest one: Source code generation

All of the previous solutions had one thing in common: they were difficult to inspect and debug. Combined with the complexity of the macro annotations we wanted to replace, they could quickly become overengineered and too hard to maintain. That’s why our next idea was to make the replacement easy to understand, extend, or even adapt manually in case of small bugs. Instead of sophisticated techniques, it relied on a simple, automatic code-generation tool that creates the boilerplate parts of the code.

We, software engineers, typically tend to remove as much low-significance code as possible. That’s why in Scala we prefer built-in, compiler-generated methods of case classes or tend to use generics or macros to reduce the amount of similar code patterns. However, sometimes taking a step back is the easiest way to overcome the problems.

Our experiments in this area started with scalafix, a tool that lets us define code rewrites. It can be used for simple syntactic rewrites based only on the source code, which makes it perfect for linting. It can also be combined with compiler output in the form of SemanticDB, which contains limited information about symbols and types and enables more advanced use cases.

The prototype was based on the idea of a one-time rewrite in which we would search for usages of annotations and methods specific to circe/circe-generic-extras and replace them by generating their counterparts in the source code. The initial work didn’t go as smoothly as we thought it would:

the information about symbols was sometimes incorrect or placed in the wrong trees, especially information about the annotation;
there were no utilities for working with types obtained from the SemanticDB which made de-aliasing or comparing types difficult;
the created patches didn’t always compose correctly, leading to missing top-level imports or ugly formatting;

However, after the initial struggles, this solution worked surprisingly well! By paying a small cost in boilerplate, we obtained a code generator that was relatively easy to create and maintain. When implementing the logic, we could either build a typed AST or iterate quickly with unsafe string concatenation, which could easily be copied from the quasi-quotes in the original macro implementation. This part was crucial, as we wanted the generated code to behave as closely as possible to the macros it replaced.
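To make the string-concatenation variant more concrete, here is a minimal, dependency-free sketch of a generator that renders ZIO 2 style accessors as source text. It is not the code of the actual scalafix rules: the Param and Method models, the function names and the UserService example are all hypothetical, and a real rule would read this information from the syntax tree and SemanticDB.

```scala
// Hypothetical model of a service method (an assumption for this sketch).
final case class Param(name: String, tpe: String)
final case class Method(name: String, params: List[Param], error: String, result: String)

// Renders a single accessor as plain text, much like a quasi-quote template.
// Nothing checks that the output is valid Scala; that is the "unsafe" part.
def renderAccessor(service: String, m: Method): String =
  val paramList = m.params.map(p => s"${p.name}: ${p.tpe}").mkString(", ")
  val args = m.params.map(_.name).mkString(", ")
  s"""  def ${m.name}($paramList): ZIO[$service, ${m.error}, ${m.result}] =
     |    ZIO.serviceWithZIO[$service](_.${m.name}($args))""".stripMargin

// Wraps all accessors in the companion object of the service.
def renderCompanion(service: String, methods: List[Method]): String =
  methods
    .map(renderAccessor(service, _))
    .mkString(s"object $service {\n", "\n\n", "\n}")

val getUser = Method("getUser", List(Param("id", "Int")), "Throwable", "User")
val generated: String = renderCompanion("UserService", List(getUser))
```

The generated text ends up as ordinary source code in the repository, so it can be read, reviewed and adjusted by hand like any other file.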

We quickly decided that, for our use case, source generators were the best way to go.

Migrating a Scala project with macro annotations

Based on the chosen strategy to use code generators, we’ve started to prepare for migration of the first project - a microservice with ~15k lines of Scala 2.13 code. The migration of the project to Scala 3 on its own, excluding the macro-annotations part, was easy:

there were no macros to replace - all of them came from external dependencies;
the set of internal libraries used by the client was already cross-compiled to Scala 3;
the code was free from language-related deprecations and warnings, e.g. it already used explicit result types for implicits.
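The last point can be illustrated with a short sketch. Scala 3 requires an explicit type on non-local implicit definitions, so code written like this in Scala 2.13 compiles unchanged after the upgrade. The User class and the ordering are hypothetical examples.

```scala
final case class User(name: String, age: Int)

object User {
  // Explicit result type: accepted by both Scala 2.13 and Scala 3.
  // Leaving out `: Ordering[User]` is rejected by Scala 3.
  implicit val byAge: Ordering[User] = Ordering.by[User, Int](_.age)
}

// The implicit is found in the companion object of User.
val youngestFirst: List[User] = List(User("Ann", 41), User("Bob", 29)).sorted
```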

All of this made scala3-migration-plugin and other migration tools unnecessary - if not for the macro annotations, the team would have been able to simply upgrade the Scala version to finish the migration.

We analyzed the use cases that required rewrites and prepared scalafix rules, together with an extensive set of test cases based on real-world usages found in the code we were going to migrate:

VirtuslabRnD/scalafix-migrate-circe-generic-extras handled all usages of circe, including the generation of an explicit Configuration and the translation of annotations into Scala 3 native derivation clauses used to obtain codec instances. It also handled small quirks in some of the circe utilities, e.g. the json string interpolator, which now requires a slightly different import
VirtuslabRnD/scalafix-migrate-zio-macros defined rewrites for both the ZIO @mockable and @accessible annotations - the two shared parts of the common logic, which we were able to reuse to bootstrap the rewrites quickly

Feel free to use them if you're preparing to migrate any of your projects!
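For readers unfamiliar with the derivation clauses mentioned above for circe, here is a minimal sketch of how a derives clause works in Scala 3. FieldNames is a hypothetical type class standing in for a codec, so this shows the language mechanism only and not circe's API.

```scala
import scala.compiletime.{constValue, erasedValue}
import scala.deriving.Mirror

// Hypothetical stand-in for a codec: it only records the field names.
trait FieldNames[A]:
  def names: List[String]

object FieldNames:
  // Walks a tuple of singleton string types at compile time.
  inline def labels[T <: Tuple]: List[String] =
    inline erasedValue[T] match
      case _: EmptyTuple => Nil
      case _: (h *: t)   => constValue[h].toString :: labels[t]

  def instance[A](fields: List[String]): FieldNames[A] =
    new FieldNames[A]:
      def names: List[String] = fields

  // The compiler calls `derived` for every `derives FieldNames` clause.
  inline def derived[A](using m: Mirror.ProductOf[A]): FieldNames[A] =
    instance(labels[m.MirroredElemLabels])

// Replaces what an annotation on the class used to trigger in Scala 2.
final case class User(firstName: String, age: Int) derives FieldNames

val userFields: List[String] = summon[FieldNames[User]].names
```

The derives clause places a given instance in the companion object of User, which is the same place where an annotation-generated instance would typically live.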

Finally, after a few iterations of fixing small bugs in our code generators, we got to the point where we could proceed with the migration in a few simple steps:

Compile the project using Scala 2.13 to create a SemanticDB
Apply the migration rules using scalafixAll in sbt - including both our custom rewrite rules and optionally the external ones
Audit the rewrites and adjust them if needed
Switch to the next project and repeat the steps!
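A build setup for the steps above might look roughly like the following sbt fragment. The sbt-scalafix plugin and the semanticdbEnabled, semanticdbVersion and scalafixDependencies settings are taken from the scalafix documentation; the versions, the rule coordinates and the rule name are placeholders that need to be replaced with the values published by the rule repositories.

```scala
// project/plugins.sbt
// addSbtPlugin("ch.epfl.scala" % "sbt-scalafix" % "<sbt-scalafix version>")

// build.sbt
// Step 1: emit SemanticDB files while compiling with Scala 2.13.
ThisBuild / semanticdbEnabled := true
ThisBuild / semanticdbVersion := scalafixSemanticdb.revision

// Step 2: make the custom rewrite rules available to scalafix.
// Placeholder coordinates: check the rule repository for the real ones.
ThisBuild / scalafixDependencies +=
  "<organization>" %% "<rules artifact>" % "<version>"

// Then, from the sbt shell:
//   compile
//   scalafixAll <RuleName>
```

Running the rules on a clean working tree makes the audit step easier, because the whole rewrite then shows up as a single diff.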

At that point, there might be some remaining compilation errors caused by changes in the type system between Scala 2 and Scala 3, e.g. Nothing might be inferred instead of Any. Luckily, these are typically easy to fix.

Conclusion

Sometimes the objections to migration seem scarier and more problematic than they really are. Even though upgrades come with risks and costs, they usually pay off in the long term. In the case of Scala 3, for the cost of a single, final migration we receive new compatibility guarantees that make future upgrades easy and manageable.

What’s more, we also gain access to new, powerful tools that allow engineers to create safer, more concise, or more performant solutions.

We have also learned we don’t always need to use the most complex solutions to reach our goals. Sometimes, the simpler, more verbose solutions might just be enough to overcome temporary problems. Taking a step back might be all you need to take two steps forward!

A guest post by

Wojciech Mazur

Scala 3 compiler engineer, mostly focused on preventing new regressions in the compiler. Involved in multiple Scala ecosystem OSS initiatives, especially as a team leader of the Scala Native project.

VirtusLab Developer Insights
