Writing a Maven plugin with Scala

When you write Maven plugins in Scala, you have to consider two things:

  1. Javadoc-tag (XDoclet-style) plugin descriptors won’t work with Scala, so you have to use Java 5 annotations (in org.apache.maven.plugin-tools:maven-plugin-annotations).
  2. You have to make sure that the scala:compile goal is executed before plugin:descriptor, because the annotations are read from the compiled classes; otherwise the maven-plugin-plugin cannot detect them. So either you change the phase of plugin:descriptor (generate-resources by default) to compile or later, or you compile the Scala files before generate-resources, e.g. in the process-sources phase (which is, BTW, the same thing you would do in a mixed Java/Scala environment; see here).

Here is an example Mojo written in Scala:

import java.io.File
import org.apache.maven.plugin.AbstractMojo
import org.apache.maven.plugins.annotations.Mojo
import org.apache.maven.plugins.annotations.Parameter

@Mojo(name = "test")
class TestMojo extends AbstractMojo {

  @Parameter(required = true)
  var myParam1: String = _

  @Parameter(required = false)
  var myParam2: String = _
  
  @Parameter(defaultValue = "${project.basedir}")
  var baseDirectory: File = _

  // Entry point called by Maven when the goal runs
  def execute(): Unit = {
    getLog.debug("Running Mojo test")
    getLog.debug("Base dir: " + baseDirectory)
    
    //TODO
  }
}

And the corresponding pom.xml:

<dependencies>
	<dependency>
		<groupId>org.apache.maven</groupId>
		<artifactId>maven-plugin-api</artifactId>
		<version>3.0.4</version>
	</dependency>
	<dependency>
		<groupId>org.apache.maven.plugin-tools</groupId>
		<artifactId>maven-plugin-annotations</artifactId>
		<version>3.2</version>
		<scope>provided</scope>
	</dependency>
	<dependency>
		<groupId>org.scala-lang</groupId>
		<artifactId>scala-library</artifactId>
		<version>${scala.version}</version>
	</dependency>
</dependencies>

<build>
	<plugins>
		<plugin>
			<groupId>org.scala-tools</groupId>
			<artifactId>maven-scala-plugin</artifactId>
			<version>2.15.2</version>
			<executions>
				<execution>
					<id>scala-compile-first</id>
					<phase>process-sources</phase>
					<goals>
						<goal>compile</goal>
					</goals>
				</execution>
				<execution>
					<id>scala-test-compile</id>
					<phase>process-test-sources</phase>
					<goals>
						<goal>testCompile</goal>
					</goals>
				</execution>
			</executions>
		</plugin>

		<plugin>
			<artifactId>maven-plugin-plugin</artifactId>
			<version>3.2</version>
			<configuration>
				<goalPrefix>myPlugin</goalPrefix>
			</configuration>
		</plugin>
	
		
		<plugin>
			<groupId>org.codehaus.mojo</groupId>
			<artifactId>build-helper-maven-plugin</artifactId>
			<version>1.8</version>
			<executions>
				<execution>
					<id>add-source</id>
					<phase>generate-sources</phase>
					<goals>
						<goal>add-source</goal>
					</goals>
					<configuration>
						<sources>
							<source>src/main/scala</source>
						</sources>
					</configuration>
				</execution>

				<execution>
					<id>add-test-source</id>
					<phase>generate-test-sources</phase>
					<goals>
						<goal>add-test-source</goal>
					</goals>
					<configuration>
						<sources>
							<source>src/test/scala</source>
						</sources>
					</configuration>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

Structural Types

Something I sadly miss in statically typed languages such as Java is duck typing. It is incredibly useful, especially for cross-cutting concerns and for situations where you want to interact with library classes in a way not foreseen by their authors.

Of course, if you are willing to sacrifice type checking at compile time, you could always use reflection (or something like dynamic member lookup in C# 4, which is essentially the same).
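
To make the trade-off concrete, here is a minimal Java sketch of the reflection route. The class and method names (ReflectiveDuckTyping, Duck, Robot, makeNoise) are made up for illustration. The lookup of quack() happens at runtime, so passing an object without such a method compiles fine and only fails when the code runs:

```java
import java.lang.reflect.Method;

class ReflectiveDuckTyping {

    static class Duck {
        public String quack() { return "Quack"; }
    }

    static class Robot {
        public String beep() { return "Beep"; }
    }

    // Looks up a public no-argument method named "quack" at runtime and invokes it.
    // The compiler cannot tell whether the lookup will succeed.
    static String makeNoise(Object candidate) {
        try {
            Method quack = candidate.getClass().getMethod("quack");
            return String.valueOf(quack.invoke(candidate));
        } catch (ReflectiveOperationException e) {
            return "cannot quack: " + candidate.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(makeNoise(new Duck()));  // Quack
        System.out.println(makeNoise(new Robot())); // cannot quack: Robot
    }
}
```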

If you don’t want to do that, you need a way to tell the compiler what you expect from a duck (probably a method void quack()). In other words, you need to define a type by its method signatures: this is called a Structural Type.

Structural types have some interesting advantages:

  • They shift the focus from abstractions to the actual implementation, since you can introduce any number of interfaces afterwards without touching the code.
  • Decoupling of interface declaration and implementation (physically and temporally). Even for legacy library classes new interfaces can be introduced.
  • A type (interface) can be seen as a View. A concept we are used to from relational databases, where we also almost always define the concrete data objects (tables) first.
  • Think of WebServices: It would be quite handy to provide multiple interfaces for the same service without changing anything in the existing service code.
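
Java has no structural types, but the idea of a type as a view that is introduced afterwards can be approximated with a dynamic proxy. The following is only a sketch under that assumption: Quackable, LegacyDuck and viewAs are hypothetical names, and the match between interface and class is checked at runtime, not by the compiler as a real structural type system would do.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class StructuralView {

    // An interface introduced after the fact; LegacyDuck does not implement it.
    interface Quackable {
        String quack();
    }

    // Stands in for a library class we cannot modify (hypothetical).
    static class LegacyDuck {
        public String quack() { return "Quack"; }
    }

    // Creates a proxy that forwards each interface call to the public method
    // with the same name and parameter types on the target object.
    static <T> T viewAs(Class<T> view, Object target) {
        InvocationHandler handler = (self, method, arguments) -> {
            Method match = target.getClass()
                    .getMethod(method.getName(), method.getParameterTypes());
            return match.invoke(target, arguments);
        };
        Object proxy = Proxy.newProxyInstance(
                view.getClassLoader(), new Class<?>[] { view }, handler);
        return view.cast(proxy);
    }

    public static void main(String[] args) {
        Quackable duck = viewAs(Quackable.class, new LegacyDuck());
        System.out.println(duck.quack()); // Quack
    }
}
```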

Unfortunately, very few mainstream programming languages implement such a type system. Google’s Go language is one of them. Go goes so far as to omit classic inheritance (sub-classing) altogether: there is no way to explicitly declare that a type implements an interface, or to extend an existing class as in Java or C#. That’s a bold design decision, but I really like it. In most projects I’ve come across, class inheritance caused a bunch of problems, and there really seems to be no good reason to use it, except maybe for domain models. For the rare exception where you might need it, Go comes with the concept of embedding, the automatic delegation to the methods of an embedded type, which is a pretty cool and unique feature.

In Go you can summarize the type concept like this: “If something can do this, then it can be used here” (from the official documentation). The following example (also from the official documentation) shows how to use the library function sort.Sort(sort.Interface) with any custom type. It only has to implement three methods (Len, Less and Swap):

type Sequence []int

// Methods required by sort.Interface.
func (s Sequence) Len() int {
    return len(s)
}
func (s Sequence) Less(i, j int) bool {
    return s[i] < s[j]
}
func (s Sequence) Swap(i, j int) {
    s[i], s[j] = s[j], s[i]
}

// Method for printing - sorts the elements before printing.
func (s Sequence) Print() string {
    sort.Sort(s)
    //...
}

(Note the elegant swap implementation!)

Scala also has limited support for Structural Types. It is possible to use them as bounds of generic type parameters, as here in the generic method makeNoise():

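// Structural member access is a language feature that has to be enabled (Scala 2.10+)
import scala.language.reflectiveCalls

object StructureTypeTest extends App {

  class Duck {
    def quack() = { println("Quack") }
  }

  class Frog {
    def quack() = { println("Quack") }
  }

  // T may be any type that has a method quack(): Unit
  def makeNoise[T <: { def quack(): Unit }](quackable: T) = {
    quackable.quack()
  }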

  val duck = new Duck
  val frog = new Frog

  makeNoise(duck)
  makeNoise(frog)
}

The difference from duck typing is that if you pass an object to makeNoise() which doesn’t implement quack(), Scala complains at compile time (and an IDE immediately marks the code as erroneous)!
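
For comparison, Java 8 method references give a somewhat similar compile-time check without structural types. This is just a sketch with made-up names (QuackAdapter, Quackable): neither class implements the interface, but the compiler verifies that the referenced quack() method exists and fits its signature.

```java
class QuackAdapter {

    // Single-method interface describing what we expect from a "duck".
    interface Quackable {
        String quack();
    }

    // Neither class implements Quackable.
    static class Duck {
        String quack() { return "Quack"; }
    }

    static class Frog {
        String quack() { return "Ribbit-quack"; }
    }

    static String makeNoise(Quackable quackable) {
        return quackable.quack();
    }

    public static void main(String[] args) {
        Duck duck = new Duck();
        Frog frog = new Frog();

        // The compiler checks that quack() exists and has a compatible signature.
        System.out.println(makeNoise(duck::quack)); // Quack
        System.out.println(makeNoise(frog::quack)); // Ribbit-quack
        // makeNoise("text"::quack); // would not compile: String has no quack()
    }
}
```

Unlike the Scala version, the adaptation has to be spelled out at every call site (duck::quack, not just duck).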

Below is an interesting real-world example: to simulate the automatic resource management of Java 7, we define a method using() that takes any resource with a close() method, which is called after the given closure has been executed.

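import java.io.{FileInputStream, FileOutputStream}
import scala.language.reflectiveCalls // enables structural member access (Scala 2.10+)

trait ResourceManagement {
  // Runs f with the resource and closes the resource afterwards, even if f throws
  def using[A <: { def close(): Unit }, B](resource: A)(f: A => B): B =
    try {
      f(resource)
    } finally {
      try {
        resource.close()
      } catch { case _: Exception => () } // ignore failures while closing
    }
}

object file extends ResourceManagement {
  def copy(source: String, dest: String) = {
    using(new FileInputStream(source)) { in =>
      using(new FileOutputStream(dest)) { out =>
        out.getChannel()
          .transferFrom(in.getChannel(), 0, Long.MaxValue)
      }
    }
  }
}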

//Usage
file copy ("test1.txt", "test2.txt")

This way you avoid cluttering the code with try/catch clauses.
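
For reference, this is roughly what the same copy looks like with the Java 7 feature being simulated, try-with-resources. It is a sketch, not a drop-in equivalent: the roundTrip helper and the temp file names are only there for the demo, and unlike using() above, try-with-resources does not silently swallow exceptions thrown by close().

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileCopy {

    // Both streams are closed automatically, in reverse order of declaration,
    // even if transferFrom throws.
    static void copy(String source, String dest) throws IOException {
        try (FileInputStream in = new FileInputStream(source);
             FileOutputStream out = new FileOutputStream(dest)) {
            out.getChannel().transferFrom(in.getChannel(), 0, Long.MAX_VALUE);
        }
    }

    // Demo helper: writes content to a temp file, copies it, reads the copy back.
    static String roundTrip(String content) {
        try {
            Path source = Files.createTempFile("copy-source", ".txt");
            Path dest = Files.createTempFile("copy-dest", ".txt");
            Files.write(source, content.getBytes(StandardCharsets.UTF_8));
            copy(source.toString(), dest.toString());
            return new String(Files.readAllBytes(dest), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // hello
    }
}
```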

Java 8 closures compared to C# and Scala

One of the longest awaited and most urgently demanded features for Java is Closures, also called Lambda Expressions or Anonymous Functions. C# has them, as do Scala and most other modern languages, so it’s about time 😉 Since early access releases of Java 8 are already available, it is time to have a look and compare them with other languages.

Closures can be used to pass functions as method arguments, or even as return values. One particularity of such anonymous methods is, that they have access to the local context (scope) they have been created in, e.g. to local variables. They “enclose” this context – and that’s the origin of the name.
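
A small Java 8 sketch of that "enclosing" behaviour, using a hypothetical single-method interface IntTransformer: the function returned by adderOf still remembers the value of the local variable amount after adderOf itself has returned.

```java
class ClosureScope {

    // Hypothetical single-method interface, used only for this sketch.
    interface IntTransformer {
        int apply(int value);
    }

    // Returns a function that captures the local variable 'amount'.
    static IntTransformer adderOf(int amount) {
        return value -> value + amount;
    }

    public static void main(String[] args) {
        IntTransformer addFive = adderOf(5);
        IntTransformer addTen = adderOf(10);

        // Each closure carries its own captured value.
        System.out.println(addFive.apply(1)); // 6
        System.out.println(addTen.apply(1));  // 11
    }
}
```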

Closures are actually a pretty old concept: Lisp dialects such as Scheme have had them since the 1970s. They have always been basic building blocks of functional languages (as so-called first-class objects) and are also widely used in hybrid languages such as Ruby, Python or Scala. Even C# has supported Closures since version 3.0, and they show their strength especially in LINQ, where you can write strongly typed queries that we can only dream of in JPA:

var boyNames = names
  .Where((n) => n.Gender == Gender.Boy)
  .Select((n) => new { n.Name });

Java 8

So, let’s have a look at how a Closure or Lambda Expression is going to look in Java 8. A lot of proposals have been made and discussed, but in the end the result looks quite similar to Scala and C#:

Calculator adder =
 (double summand1, double summand2) -> summand1 + summand2;

The basic syntax is (arg1, arg2, …) -> { <body> }, where the braces can be dropped if the body is a single expression. Please note the single dash arrow instead of the equals sign arrow used in Scala and C#. But what is on the left-hand side of the assignment, i.e. how do we hold references to Closures? Instead of introducing a new kind of type for functions (as was also proposed), interfaces with a single method are used, and they also got a fancy name: SAM, Single Abstract Method. The Calculator SAM from the example above is defined like this:

//Single Abstract Method (SAM) interface to hold Closures
public static interface Calculator {
  double execute(double arg1, double arg2);
}

One good thing is, there are already plenty of existing SAMs in Java and all methods which take one of them as an argument are now automatically higher order functions (functions that take functions). E.g. Runnable:

new Thread(
 () -> { for (int i = 0; i < 10; i++) System.out.println(i); } 
).start(); 
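
Comparator is another such pre-existing SAM. As a small sketch (the class and method names are made up), Collections.sort was written long before lambdas existed, yet it accepts one directly:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ExistingSams {

    // Collections.sort predates lambdas, yet accepts one because
    // Comparator has a single abstract method, compare(a, b).
    static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, (a, b) -> Integer.compare(a.length(), b.length()));
        return copy;
    }

    public static void main(String[] args) {
        // prints [SAM, java, closure]
        System.out.println(sortedByLength(Arrays.asList("closure", "SAM", "java")));
    }
}
```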

I mean, the first time I saw that, I thought: wow, that’s smart. And all the books which state that in Java interfaces are used to pass “methods” or some logic to other methods are still correct ;-). But on the other hand, it makes calling Closures quite clumsy:

 
Calculator adder =    
  (double summand1, double summand2) -> summand1 + summand2;

double result = adder.execute(10, 5);
//How it should look:
//double result = adder(10, 5);

Why do I have to call execute() here? It seems to me that Closures in Java 8 are mainly convenient syntactic sugar for anonymous inner classes, so why not introduce a shortcut for calling the single method of a SAM?

Well, in fact Closures are a bit more than anonymous inner classes. For example, local variables that are not declared final can be part of the Closure expression, as long as they are effectively final, i.e. never reassigned:

int localNonFinalVar = 1;

Calculator multiplier =
  (factor1, factor2) -> factor1 * factor2 * localNonFinalVar;

The second thing you can see in this example is that the argument types can be omitted, since the compiler can infer them from the interface Calculator.

A last interesting feature is that existing methods can be used as Closures or, more precisely, that SAMs can refer to existing methods. For this purpose the double colon operator will be introduced in Java 8 (older drafts used the hash symbol):

public class Test {
  private static double substractor
    (double minuend, double subtrahend) {
      return minuend - subtrahend;
  }
  
  public static void main(String[] args) {
    Calculator substractor = Test::substractor;
		
    //Closure from a built-in method!
    Calculator maximum = Math::max;
  }
}

So we can just pass java.lang.Math::max as an argument to a higher order function – pretty neat!
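
As a sketch of what that looks like in practice, here is a small higher order function (the name combine is made up for this example) that takes a Calculator and is called with Math::max, Math::min and an inline lambda:

```java
class HigherOrder {

    // Same shape as the Calculator interface from the text.
    interface Calculator {
        double execute(double arg1, double arg2);
    }

    // Higher order function: folds the values from left to right
    // with the given operation.
    static double combine(Calculator operation, double first, double... rest) {
        double result = first;
        for (double value : rest) {
            result = operation.execute(result, value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(combine(Math::max, 3.0, 9.0, 4.0));        // 9.0
        System.out.println(combine(Math::min, 3.0, 9.0, 4.0));        // 3.0
        System.out.println(combine((a, b) -> a + b, 3.0, 9.0, 4.0));  // 16.0
    }
}
```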

C#

In C#, delegates are used to hold references to Closures. This decision was somewhat obvious, since a delegate already represented a function signature:

delegate double Calculator(double arg1, double arg2);

Calculator adder =
 (double summand1, double summand2) => summand1 + summand2;

The argument types can also be omitted and it is also possible to include local variables:

int localVariable = 1;

Calculator multiplier = 
  (factor1, factor2) => factor1 * factor2 * localVariable;

Referring to existing methods is of course possible too; that was the original purpose of delegates. The syntax feels a bit odd compared to Java 8, though:

class Test
{
  private static double Substractor
  (double minuend, double substrahend)
  {
    return minuend - substrahend;
  }
  
  static void Main(string[] args)
  {
    Calculator substractor = new Calculator(Substractor);
  
    //Closure from a built-in method
    Calculator maximum = new Calculator(Math.Max);
  }
}

Calling C# Closures works just as you would expect, which is an advantage over the Java 8 approach with SAMs:

var result = adder(10, 5);

Short and simple. By the way, implicitly typed variables (var) would also be handy in Java… which brings me to Scala.

Scala

In Scala neither SAMs nor something like a delegate is needed and thanks to type inference we can just write:

val adder =
 (summand1: Double, summand2: Double) => summand1 + summand2

Here, of course, you cannot simply omit the argument types, since there is nothing to infer them from. But defining a function type in Scala is straightforward: (Double, Double) => Double. Or in a full example:

val multiplier: (Double, Double) => Double
  = (factor1, factor2) => factor1 * factor2

Referring to existing methods is concise and clean, as to be expected from Scala:

def substractor(minuend: Double, substrahend: Double) =
  minuend - substrahend

val substractorRef: (Double, Double) => Double = substractor

//Closure from a built-in method
val maximum : (Double, Double) => Double = math.max

And executing a Closure is as simple as in C#:

val result = adder(10, 5)

Summary

Java 8 Closures will greatly enhance readability, especially in GUI development, where a lot of anonymous inner classes for event handling clutter the code. They will also give a boost to internal DSLs (see the LINQ example at the top of the post) and allow a functional coding style to some extent (e.g. map/reduce on collections).
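
To give an idea of the map/reduce style, here is a minimal sketch. It assumes the stream API (java.util.stream) as it shipped with the final Java 8 release, which may differ from the early access builds discussed in this post; the class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;

class MapReduce {

    // Sums the squares of all even numbers in the list.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // keep even numbers
                .map(n -> n * n)           // map: square each one
                .reduce(0, Integer::sum);  // reduce: add them up
    }

    public static void main(String[] args) {
        // 2*2 + 4*4 + 6*6 = 56
        System.out.println(sumOfEvenSquares(Arrays.asList(1, 2, 3, 4, 5, 6)));
    }
}
```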

I’m glad that they’ve reused the existing syntax of Scala and C#; the dash arrow is probably an attempt to appear distinct.

There is not much bad to say. I personally don’t like the approach with SAMs, because it will lead to confusion. I would have preferred an approach that separates Closures and the concept of anonymous inner classes more strictly. It feels too much like mere syntactic sugar, albeit very useful sugar. Drawing the equation “anonymous inner class = anonymous function, when the interface is a SAM” is maybe too simple.

A closer look at the generated bytecode reinforces the impression of syntactic sugar: Closures are just compiled to anonymous inner classes; currently it is even possible to run the generated bytecode with a Java 7 JRE. Captured local variables are simply passed to the constructor and stored in fields of the inner class. Here is the multiplier example from above at the bytecode level:

//Calculator multiplier =
//  (factor1, factor2) -> factor1 * factor2 * localNonFinalVar;

class Test$2 implements Test$Calculator {
  int cap$0; // localNonFinalVar

  Test$2(int);
       0: aload_0
       1: invokespecial #1 // Super constructor call
       4: aload_0
       5: iload_1
       6: putfield      #2 // Field cap$0:I
       9: return

  public double execute(double, double);
       0: dload_1
       1: dload_3
       2: dmul
       3: aload_0
       4: getfield      #2 // Field cap$0:I
       7: i2d
       8: dmul
       9: dreturn
}
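
Translated back by hand into Java source, the listing corresponds roughly to the class below. This is my own reconstruction for illustration (the names Multiplier and captured are made up, the compiler generates Test$2 and cap$0), shown side by side with the lambda it replaces:

```java
class CapturedVariable {

    interface Calculator {
        double execute(double arg1, double arg2);
    }

    // Hand-written counterpart of the generated class: the captured
    // local variable is copied into a field by the constructor.
    static class Multiplier implements Calculator {
        private final int captured; // plays the role of cap$0

        Multiplier(int captured) {
            this.captured = captured;
        }

        @Override
        public double execute(double factor1, double factor2) {
            return factor1 * factor2 * captured;
        }
    }

    public static void main(String[] args) {
        int localNonFinalVar = 2;

        Calculator viaLambda =
            (factor1, factor2) -> factor1 * factor2 * localNonFinalVar;
        Calculator viaClass = new Multiplier(localNonFinalVar);

        System.out.println(viaLambda.execute(3, 4)); // 24.0
        System.out.println(viaClass.execute(3, 4));  // 24.0
    }
}
```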

To sum up, the implementation is not as clean and concise as in Scala, but it is comparable to C# and still a big improvement. Having to call the single method of the SAM explicitly when executing a Closure remains quite confusing, though.

So, I’m really looking forward to Java 8.