Friday, January 16, 2009

Extending Maven with Ant

Of late, I have been working with frameworks that seem to be doing an awful lot of bytecode manipulation, annotation processing and the like. Readers of past posts will already know about my experiments with Kilim and ActorFoundry, and I've recently started using JiBX, which also does bytecode manipulation, for an application at work. If you've been reading my blog for a while, you'll also know that I'm a big Maven2 fan. I've been using Maven2 for a couple of years (or more) now, and until recently, I hadn't really missed Ant that much.

I find that Maven2 makes standard build tasks trivial to non-existent, but non-standard tasks (for which there isn't already a plugin available) incredibly hard. With Ant, the level of effort is similar for a standard versus a non-standard task. This is because the design of Ant is imperative in nature, while Maven2's is declarative. With Ant, you tell it how to do a particular task using an XML based scripting language. With Maven2, you provide a standard project structure, and it knows how to do the standard tasks (called "goals" in Maven-speak). Not that I think this was a bad design decision, by the way - the declarative nature of Maven2 has served me (and I suspect most Maven2 users) quite well, with its automatic dependency management, standard goals, etc. And there are a huge number of plugins available - it's only when you come across a situation where you need to roll your own that you run into problems.
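
To make the contrast concrete, here is a minimal POM sketch (the coordinates are made up): given the standard directory layout, this is all Maven2 needs in order to compile, test and package a project, with no build logic spelled out anywhere.

  <!-- Minimal hypothetical POM: with sources under src/main/java and tests
       under src/test/java, "mvn package" will compile, test and jar the
       project without any explicit build instructions. -->
  <project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.mycompany</groupId>
    <artifactId>myapp</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
  </project>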

Because I didn't know how to handle this sort of thing, my approach so far has been to build an Ant build.xml file from my Maven2 POM (using mvn ant:ant), and then add the non-standard tasks into the build.xml file. Of course, now any time I need to add a new dependency to my POM, I have to add it by hand to the build.xml as well. I still want to be able to generate IDE descriptors, build the project on another machine, etc., so dispensing with the POM altogether is not an option.

One such example from the recent past (2 blog posts ago) is this little monster, refactored a bit for external access, detailing the steps to build my ActorFoundry and Kilim client code.

  <target name="compile" depends="get-deps" 
      description="Compile the code">
    <mkdir dir="${maven.build.output}"/>
    <javac srcdir="${maven.src.dir}"
           destdir="${maven.build.output}" 
           excludes="**/package.html" 
           debug="true" 
           deprecation="true" 
           optimize="false">
      <classpath refid="build.classpath"/>
    </javac>
    <ant target="check-local-constraints"/>
    <ant target="generate-af-executors"/>
    <ant target="compile-af-executors"/>
    <ant target="weave-classes"/>
  </target>
  <target name="check-local-constraints" 
      depends="_init" 
      description="Check local constraints">
    <!-- check local constraints happens for af only -->
    <apt 
         srcdir="${maven.src.dir}/${af.path.prefix}"
         compile="false"
         classpathref="build.classpath"
         debug="true"
         factory="osl.foundry.preprocessor.LocalSynchConstAPF"
         factorypathref="build.classpath"/>
  </target>
  <target name="generate-af-executors" 
      depends="_init" 
      description="Generate ActorFoundry Executors">
    <!-- code generation for af only -->
    <delete dir="${maven.src-gen.dir}"/>
    <mkdir dir="${maven.src-gen.dir}"/>
    <javadoc private="true"
         doclet="osl.foundry.preprocessor.ExecutorCodeGen"
         docletpathref="build.classpath"
         classpathref="build.classpath"
         sourcepath="${maven.src.dir}"
         packagenames="${af.pkg.prefix}">
      <arg line="-outdir ${maven.src-gen.dir}"/>
    </javadoc>
  </target>
  <target name="compile-af-executors" 
      depends="_init" 
      description="Compile ActorFoundry Executors">
    <!-- compile generated code: for af only -->
    <javac srcdir="${maven.src-gen.dir}"
           destdir="${maven.build.output}"
           debug="on"
           fork="on">
      <classpath refid="build.classpath"/>
    </javac>
  </target>
  <target name="weave-classes" 
      depends="_init" 
      description="Enhance classes using Kilim Weaver">
    <!-- weaving happens for kilim and af files -->
    <java classname="kilim.tools.Weaver" fork="yes">
      <classpath refid="weave.classpath"/>
      <assertions>
        <enable/>
      </assertions>
      <arg value="-x"/>
      <arg value="ExInvalid|test"/>
      <arg value="-d"/>
      <arg value="${maven.build.output}"/>
      <arg line="${kilim.pkg.prefix}.ActorManager 
                    ${kilim.pkg.prefix}.Actor 
                    ${kilim.pkg.prefix}.DownloadActor 
                    ${kilim.pkg.prefix}.IndexActor 
                    ${kilim.pkg.prefix}.WriteActor 
                    ${af.pkg.prefix}.ActorManagerExecutor
                    ${af.pkg.prefix}.DownloadActorExecutor
                    ${af.pkg.prefix}.IndexActorExecutor
                    ${af.pkg.prefix}.WriteActorExecutor"/>
    </java>
  </target>

As you can see, my compile target calls four other custom targets following the compilation phase. In Maven2's Default Build Lifecycle, these four targets would be called in the process-classes phase.

Approach #1: Build Custom Mojo(s)

The "pure" Maven way to address this is to build one or more MOJO (Maven pOJO) classes in Java that fires in the process-classes phase. This was my initial approach, which I later abandoned. However, in the process I learned some useful things, which I would like to describe here before going to my final solution.

Building a MOJO requires you to first build a Maven2 plugin project. The Plugin Developer's Guide page has quite a bit of information if you are interested. There is an archetype available for this, so you run:

prompt$ mvn archetype:create \
          -DgroupId=com.mycompany.plugin \
          -DartifactId=maven-mycompany-plugin \
          -DarchetypeGroupId=org.apache.maven.archetypes \
          -DarchetypeArtifactId=maven-archetype-mojo

This will create your project. Remove the url field from the POM, since this is going to be a local plugin. Since my MOJOs would need to walk directories and such, I needed commons-io (the recommended IO library for Maven) and plexus-utils (to access Plexus, the IoC container used by Maven), so I added them to the POM as shown below. I also made it Java 1.5 source/target compatible. Then I ran mvn eclipse:eclipse to generate the descriptors for Eclipse. The plugin project can then be opened in Eclipse as a standard Java project.

  <dependencies>
    ...
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>1.4</version>
    </dependency>
    <dependency>
      <groupId>org.codehaus.plexus</groupId>
      <artifactId>plexus-utils</artifactId>
      <version>1.5.6</version>
    </dependency>
  </dependencies>
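
For reference, the Java 1.5 source/target setting mentioned above is just the standard maven-compiler-plugin configuration, something along these lines:

  <!-- Standard maven-compiler-plugin configuration for Java 1.5
       source/target compatibility (sketch). -->
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.5</source>
          <target>1.5</target>
        </configuration>
      </plugin>
    </plugins>
  </build>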

I started writing some code for a plugin that wrapped the Kilim Weaver. Essentially, it takes three parameters - classpath, includes and excludes - and uses them to run the Weaver by calling java directly. I don't like this approach too much, but the alternative was to call the Exec plugin from the command line, which seemed much too heavyweight.

There are two popular books available on Maven, Better Builds with Maven and Maven: The Definitive Guide (both free to download), and both have some information on how to build MOJOs, but you may have to peek at the sources of a similar plugin to figure out how to build your own. In my case, the sources for the Exec plugin were very helpful. Here is the code for the WeaverMojo.

// Project: maven-mycompany-plugin
// Source: src/main/java/com/mycompany/plugin/WeaverMojo.java
package com.mycompany.plugin;

import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.commons.io.DirectoryWalker;
import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;
import org.codehaus.plexus.util.StringUtils;
import org.codehaus.plexus.util.cli.CommandLineException;
import org.codehaus.plexus.util.cli.CommandLineUtils;
import org.codehaus.plexus.util.cli.Commandline;
import org.codehaus.plexus.util.cli.StreamConsumer;

/**
 * Maven Mojo for Kilim's Weaver.
 * 
 * @goal weave
 * @phase process-classes
 */
public class WeaverMojo extends AbstractMojo {
  
  /**
   * Input directory
   * @parameter default-value="${project.build.directory}"
   * @required
   * @readonly
   */
  private File inputDirectory;
  
  /**
   * Output directory
   * @parameter default-value="${project.build.directory}"
   * @required
   * @readonly
   */
  private File outputDirectory;
  
  /**
   * The Maven classpath, must be supplied manually in the config.
   * @parameter alias="classpath"
   * @required
   */
  private String mavenClassPath;
  
  /**
   * Specifies patterns to exclude. Multiple patterns can be specified
   * and are treated as exclude ${exclude[0]} OR ${exclude[1]} OR ...
   * @parameter alias="excludes"
   */
  private String[] excludes;

  /**
   * Specifies patterns to include. Multiple patterns can be specified
   * and are treated as include ${include[0]} OR ${include[1]} OR ...
   * @parameter alias="includes"
   */
  private String[] includes;
  
  public void execute() throws MojoExecutionException {
    getLog().info("Weaving classes...");
    WeaverDirectoryWalker walker = new WeaverDirectoryWalker();
    List<File> inputFiles = new ArrayList<File>();
    try {
      walker.walk(inputFiles);
    } catch (IOException e) {
      throw new MojoExecutionException("Problem walking directory", e);
    }
    // convert these from file name to package name notation
    List<String> classnames = new ArrayList<String>();
    String inputDirectoryPrefix = inputDirectory.getAbsolutePath(); 
    for (File inputFile : inputFiles) {
      classnames.add(inputFile.getAbsolutePath().
        replaceFirst(inputDirectoryPrefix, ""). // get rid of absolute path
        replaceFirst(".classes.", "").          // get rid of /classes/
        replaceAll("/", ".").                   // convert / to .
        replaceAll(".class", ""));              // remove trailing .class
    }
    // Call using java from command line
    Commandline commandline = new Commandline();
    commandline.setExecutable("java"); // assume that java is in PATH
    commandline.addArguments(new String[] {
      "-cp",
      mavenClassPath,
      "kilim.tools.weaver",
      "-x",
      "ExInvalid|Test",
      "-d",
      outputDirectory.getAbsolutePath(),
      StringUtils.join(classnames.iterator(), " ")
    });
    StreamConsumer stdout = new StreamConsumer() {
      public void consumeLine(String line) {
        getLog().info(line);
      }
    };
    StreamConsumer stderr = new StreamConsumer() {
      public void consumeLine(String line) {
        getLog().info( line );
      }
    };
    getLog().info("Running command: java " + 
      StringUtils.join(commandline.getArguments(), " "));  
    try {
      CommandLineUtils.executeCommandLine(commandline, stdout, stderr);
    } catch (CommandLineException e) {
      throw new MojoExecutionException("Java execution failed", e);
    }
  }
  
  private class WeaverDirectoryWalker extends DirectoryWalker {
    
    public WeaverDirectoryWalker() {
      super(new FileFilter() {
        public boolean accept(File f) {
          if (f.isDirectory()) {
            // accept directories so the walker descends into them;
            // only handleFile() below adds entries to the results
            return true;
          }
          if (! f.getName().endsWith("class")) {
            // we only want .class files in our list
            return false;
          }
          if (f.getName().contains("$")) {
            // don't include inner class class files
            return false;
          }
          boolean included = false;
          boolean excluded = false;
          String filename = f.getAbsolutePath();
          if (includes != null) {
            for (int i = 0; i < includes.length; i++) {
              if (filename.matches(includes[i])) {
                getLog().info(filename + " == " + includes[i]);
                included = true;
                break;
              }
            }
          }
          if (excludes != null) {
            for (int i = 0; i < excludes.length; i++) {
              if (filename.matches(excludes[i])) {
                excluded = true;
                break;
              }
            }
          }
          return included && (! excluded);
        }
      }, -1);
    }

    public void walk(List<File> filenames) throws IOException {
      walk(inputDirectory, filenames);
    }
    
    @Override
    protected void handleFile(File file, int depth, Collection results) 
        throws IOException {
      results.add(file);
    }
  }
}

The work that the MOJO does is defined in its execute() method. The private member variables are annotated with Javadoc tags such as @parameter; getting/setting the variables is handled by Plexus. The same annotations are also used to generate the plugin.xml (plugin descriptor) file.
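
For the curious, the generated plugin.xml ends up describing the goal, its default phase and its parameters, roughly along these lines (an abridged sketch, not the literal output):

  <plugin>
    <groupId>com.mycompany.plugin</groupId>
    <artifactId>maven-mycompany-plugin</artifactId>
    <version>1.0-SNAPSHOT</version>
    <goalPrefix>mycompany</goalPrefix>
    <mojos>
      <mojo>
        <goal>weave</goal>
        <phase>process-classes</phase>
        <implementation>com.mycompany.plugin.WeaverMojo</implementation>
        <parameters>
          <parameter>
            <name>mavenClassPath</name>
            <alias>classpath</alias>
            <type>java.lang.String</type>
            <required>true</required>
          </parameter>
          ...
        </parameters>
      </mojo>
    </mojos>
  </plugin>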

Incidentally, commons-io has a set of ready made FileFilters, which can be ANDed and ORed. It also has a RegexFileFilter, which uses Java regular expressions similar to my implementation. However, it applies the regular expression to the file name alone (not the entire path), so it did not work for me. It would be nicer to build up a composite filter from the AND/OR/NOT filters, using my includes and excludes arrays as the inputs, and then pass it into the DirectoryWalker. It would perhaps also be nicer to be able to use an Ant style file filter in order to make the configuration easier to read, but I guess Java developers should be equally at home with either style.
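
For example, a composite version of the filter above could be put together from the commons-io 1.4 building blocks like this (a hypothetical sketch; note again that RegexFileFilter matches only the file name, so the patterns passed in would have to be name-based rather than full-path regular expressions):

// Hypothetical sketch: building the include/exclude filter from commons-io
// composite filters instead of rolling our own anonymous FileFilter.
import org.apache.commons.io.filefilter.AndFileFilter;
import org.apache.commons.io.filefilter.DirectoryFileFilter;
import org.apache.commons.io.filefilter.FileFilterUtils;
import org.apache.commons.io.filefilter.IOFileFilter;
import org.apache.commons.io.filefilter.NotFileFilter;
import org.apache.commons.io.filefilter.OrFileFilter;
import org.apache.commons.io.filefilter.RegexFileFilter;

public class CompositeFilterSketch {

  public static IOFileFilter buildFilter(String[] includes, String[] excludes) {
    // only .class files, and no inner classes (names containing '$')
    IOFileFilter fileFilter = new AndFileFilter(
        FileFilterUtils.suffixFileFilter(".class"),
        new NotFileFilter(new RegexFileFilter(".*\\$.*")));
    if (includes != null && includes.length > 0) {
      // keep a file if ANY of the include patterns match its name...
      OrFileFilter included = new OrFileFilter();
      for (String include : includes) {
        included.addFileFilter(new RegexFileFilter(include));
      }
      fileFilter = new AndFileFilter(fileFilter, included);
    }
    if (excludes != null && excludes.length > 0) {
      // ...and NONE of the exclude patterns match it
      OrFileFilter excluded = new OrFileFilter();
      for (String exclude : excludes) {
        excluded.addFileFilter(new RegexFileFilter(exclude));
      }
      fileFilter = new AndFileFilter(fileFilter, new NotFileFilter(excluded));
    }
    // directories must be accepted too, or DirectoryWalker will not descend
    return new OrFileFilter(DirectoryFileFilter.DIRECTORY, fileFilter);
  }
}

Since IOFileFilter implements java.io.FileFilter, the result could be passed straight into the DirectoryWalker constructor in place of the anonymous filter shown earlier.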

To compile the MOJO, generate the plugin descriptor (plugin.xml) and install to your local repository, run mvn install.

On the client side (where you want to run the new plugin), you need to configure the build section with the plugin's configuration information. Here is the snippet from my client POM.

  <build>
    ...
    <plugins>
      ...
      <!-- WeaverMojo configuration -->
      <plugin>
        <groupId>com.mycompany.plugin</groupId>
        <artifactId>maven-mycompany-plugin</artifactId>
        <version>1.0-SNAPSHOT</version>
        <executions>
          <execution>
            <phase>process-classes</phase>
            <goals>
              <goal>weave</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <classpath>full runtime classpath here</classpath>
          <includes>
            <param>^.*kilim.*$</param>
            <param>^.*?actorfoundry.*?Executor.*$</param>
          </includes>
        </configuration>
      </plugin>
      ...
    </plugins>
    ...
  </build>

You can run this plugin using either mvn process-classes or mvn mycompany:weave (according to the docs, the second option is automatic if the plugin project's artifactId is either mycompany-maven-plugin or maven-mycompany-plugin). However, I had to add the plugin's groupId to my settings.xml file.

<settings>
  ...
  <pluginGroups>
    <pluginGroup>com.mycompany.plugin</pluginGroup>
  </pluginGroups>
</settings>

However, as mentioned before, I ultimately abandoned this approach in favor of the one described below. The plugin as described does fire at the appropriate place in the client's build lifecycle, but because the other components are missing, running it by itself does not accomplish much.

Approach #2: Call Ant with AntRun

Looking through various Maven2 plugin sites for their source code, I came across the AntRun plugin. I had heard of it in the past, but did not like the idea of having to depend on Ant. However, given my newly found knowledge of "pure" Maven2 plugins, AntRun seemed to be a ready-made solution to my problem.

The first step was refactoring the extra steps in the compile target into separate Ant targets so they could be called individually, as shown in the build.xml snippet above. The second step was simply to add the AntRun plugin descriptor and its configuration to the client POM. There were two gotchas here, however:

  1. Ant's properties were not getting initialized from build.xml, in spite of setting the inheritRefs attribute to true.
  2. Apt was not getting recognized as a valid Ant task, even though it's a core task in Ant 1.7.1.

The first problem is easily solved. When mvn ant:ant is used to generate the build.xml, the properties are stored as globals (i.e. within the scope of the project tag), which don't get initialized when a target is called with <ant target="..."/>. The workaround was to create a separate _init target which wraps the property initialization, and to make all targets depend on _init at the lowest level (i.e., if a target previously had no dependencies, it should now depend on _init). You will notice that all the targets have depends="_init" set in the build.xml snippet above; a sketch of such an _init target follows.
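
For illustration, the _init target looks roughly like this (the property values shown are guesses at what mvn ant:ant generates; your generated build.xml may use different names and values):

  <target name="_init" description="Initialize properties and classpaths">
    <!-- values shown are illustrative; use the ones from your generated build.xml -->
    <property name="maven.src.dir" value="src/main/java"/>
    <property name="maven.src-gen.dir" value="target/generated-sources"/>
    <property name="maven.build.output" value="target/classes"/>
    <path id="build.classpath">
      ...
    </path>
    <path id="weave.classpath">
      ...
    </path>
  </target>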

The second problem took me a while to figure out. Apparently, AntRun uses an internal version of Ant (the default version is 1.6.5), and Apt was not a core Ant task in that version. To get the right version, you have to inject the correct Ant jars as dependencies of the plugin - many thanks to Jason Lee for this post, which I reached through this JIRA page.

The plugin descriptor for AntRun to run the tasks in the process-classes phase is quite simple and self-explanatory, and is shown below. To run this, you need to do mvn process-classes (no fancy aliases here).

  <build>
    ...
    <plugins>
      ...
      <!-- Antrun plugin -->
      <plugin>
        <artifactId>maven-antrun-plugin</artifactId>
        <executions>
          <execution>
            <phase>process-classes</phase>
            <configuration>
              <tasks>
                <property name="user.home" value="${user.home}"/>
                <ant antfile="build.xml" 
                  target="check-local-constraints" 
                  inheritRefs="true"/>
                <ant antfile="build.xml" 
                  target="generate-af-executors" 
                  inheritRefs="true"/>
                <ant antfile="build.xml" 
                  target="compile-af-executors" 
                  inheritRefs="true"/>
                <ant antfile="build.xml" 
                  target="weave-classes" 
                  inheritRefs="true"/>
              </tasks>
            </configuration>
            <goals>
              <goal>run</goal>
            </goals>
          </execution>
        </executions>
        <dependencies>
          <dependency>
            <groupId>org.apache.ant</groupId>
            <artifactId>ant</artifactId>
            <version>1.7.1</version>
          </dependency>
          <dependency>
            <groupId>org.apache.ant</groupId>
            <artifactId>ant-launcher</artifactId>
            <version>1.7.1</version>
          </dependency>
          <dependency>
            <groupId>org.apache.ant</groupId>
            <artifactId>ant-nodeps</artifactId>
            <version>1.7.1</version>
          </dependency>
          <dependency>
            <groupId>org.apache.ant</groupId>
            <artifactId>ant-apache-bsf</artifactId>
            <version>1.7.1</version>
          </dependency>
          <dependency>
            <groupId>org.apache.bsf</groupId>
            <artifactId>bsf-all</artifactId>
            <version>3.0-beta2</version>
          </dependency>
          <dependency>
            <groupId>rhino</groupId>
            <artifactId>js</artifactId>
            <version>1.7R1</version>
          </dependency>
        </dependencies>
      </plugin>
      ...
    </plugins>
    ...
  </build>

One document that you may find useful if you go the AntRun route is this list of properties accessible from the POM, which you can pass on to the Ant tasks.
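
For example, AntRun also makes the Maven classpaths available to Ant as path references (maven.compile.classpath, maven.runtime.classpath, maven.test.classpath and maven.plugin.classpath), which you can turn into plain properties inside the tasks element before delegating to your own targets - a sketch:

  <tasks>
    <!-- turn Maven's runtime classpath reference into an Ant property -->
    <property name="runtime_classpath" refid="maven.runtime.classpath"/>
    <ant antfile="build.xml" target="weave-classes" inheritRefs="true"/>
  </tasks>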

There is a lot of XML here (more than if I had gone the MOJO route), but no extra plugin code to write. I also don't have to maintain both Ant and Maven2 XML files simultaneously, which was my original gripe. The ant tasks here call targets in build.xml, but I could just as easily have built a standalone XML file containing only the scripts for the various custom tasks, without the standard stuff such as compile, jar, etc.

Conclusions

When building goals involving third party components, where you either don't have visibility into or control over the source code, it may be preferable to use the AntRun plugin. Ant, notwithstanding its limitations (some of which Maven2 addresses), is likely to be with us for the foreseeable future, so it makes sense to leverage it where appropriate.

However, for goals involving internal components, building MOJO based custom Maven2 plugins would probably provide more flexibility and remove the need to have two build frameworks in place in an organization. That is really why I went as far into developing the WeaverMojo as I did: to give myself an understanding of how to build a custom Maven2 plugin should I need to at some point in the future.

It's paradoxical that Maven offers a built-in feature to facilitate project documentation (mvn site), yet it is harder to find information for Maven than for Ant, which offers no such feature. In all fairness, all the Maven plugin projects I have looked at are, without exception, very well documented. However, there is no one-stop shop such as the Ant manual.

6 comments (moderated to prevent spam):

Julia O said...

Great article! Thanks for sharing this. Can you elaborate on how you forced ant to load maven's references? I set "inheritRefs" flag to true when calling the ant file from maven script, but its not doing much.

Thnx

Sujit Pal said...

Hi Julia, thanks and glad you liked it. So, when you build a build.xml file using mvn ant:ant, all the maven.* properties are created as globals. When antrun calls the specific ant target, these global properties don't get set. However, if you change the build.xml file a bit so the properties are created inside an _init target, then make the antrun task dependent on _init, then you have your properties set. Thats basically all I did to work around this problem.

Chandra Mohan said...

good info.. Have you tried using Custom Ant plugin. any idea of how to refer scripts in the plugin itself..

Sujit Pal said...

Hi Chandra, by "custom ant plugin" do you mean "antrun"? If not, then the answer is no. Not sure what you mean by "referring to scripts in the plugin itself" - can you provide an example or elaborate?

Chandra Mohan said...

Hi Sujith,
thanks for the reply. I'm referring to ant mojo plugin as described in the link '
http://maven.apache.org/guides/plugin/guide-ant-plugin-development.html '-- I've developed a plugin as described above ,but gotcha some technical issues as below:
1) suppose you place some scripts/ files in 'src/main/resources/...' [say delete_files.sh ] folder of plugin ,then the plugin cannot refer to this files.
2)Not able to call targets from different ant script files [say -- A.build.xml file calling target in B.build.xml' ] ..Any help is appreciated. .Thanks again .

Sujit Pal said...

Hi Chandra, I am probably not the right person to advise you since my knowledge of Ant plugins for maven is limited - however, you may want to check out the exec plugin, that may have what you want for your first problem. For the second one, you should probably ensure that you can call an ant job from within ant first (ie call B_build.xml from A_build.xml, then try to call A_build.xml from mvn).