Saturday, June 20, 2015

Not as 'final' as you think

The final keyword is used on methods to prevent subclasses from overriding them, and on classes to prevent subclassing altogether. The idea is to prevent a class's behavior from being modified. In many cases this requirement exists to enforce security, e.g. for String objects passed to Class.forName() or used as keys of a HashMap. But how 'final' are these classes? This post shows they are not as 'final' as you think. More precisely, I show how you can:
  • add methods to final classes;
  • override/redefine methods of final classes;
  • override/redefine final methods.
To achieve these goals I follow two different approaches: a) exploiting the meta-programming features of Groovy and b) using the Byte Buddy tool for bytecode manipulation in Java.

The Groovy Way

The code presented in this section is on GitHub. With Groovy you can add or redefine methods on existing classes. As an example, I am going to add methods to the final java.lang.String class and override its startsWith method.

Let's take a closer look at the addMethod method. String.metaClass is equivalent to String.class.getMetaClass(). In Groovy, classes are first-class citizens, so you don't need to write the .class suffix. String.class is an object of type Class, and each Groovy object has an associated MetaClass object. All Groovy classes implement the GroovyObject interface, which exposes a getMetaClass() method for each object.

With String.metaClass.someMethodName = {...} you add a new method called someMethodName to the MetaClass object of the String class. Its implementation is given by the closure (the '=' symbol is optional). Groovy also allows String variables to be used as method names, e.g. if I have a String s and another String name="length", then s.length() and s."$name"() are equivalent. So addMethod adds a method to the String class that
  • is named after the name parameter;
  • has no parameters;
  • returns the String "funny method: String."+ name (the return keyword is optional in the last statement of a method/closure).
From Groovy for Domain-Specific Languages:
"A method invocation on an object is always dispatched in the first place to the GroovyObject.invokeMethod() of the object. In the default case, this is relayed onto the MetaClass.invokeMethod() for the class and the MetaClass is responsible for looking up the actual method."

The same mechanism is also used in the overriddeStartsWith method. Since a method startsWith: String => boolean already exists in the String class, the existing method is hidden, and MetaClass.invokeMethod() calls the implementation we set in the closure instead.

You can exercise the above code with Spock:

Run the tests with Gradle:  gradlew test

When people from the Groovy community present this sort of power in public, they often get the question: "does Groovy open a security hole in Java?". The question is justified, since Groovy code can be called from Java code: both compile to JVM bytecode. The usual response is that Groovy is written entirely in Java, so it does not add a security issue but only exposes what already existed in Java. A clever answer, and a true one, but not the whole truth, as the following section shows.

Leverage Groovy Magic to Redefine Methods when Running Java?

The previous section was about how to redefine methods of final Java classes when you launch the JVM with the groovy command. This section is about the same goal when you launch the JVM with the java command. In this case the MetaClass mechanism is not involved, so a method invocation calls the class's method directly.

I have prepared another test project to show this, which you can clone from GitHub. It uses the Groovy code presented in the previous section as a dependency, so you need to install that code in your local Maven repository first. To do this, run the following from the command line (assuming you have cloned the Groovy project):
  • cd not-as-final-as-you-think
  • gradlew publishToMavenLocal
The following Java test calls the addMethod of the StringModifier class which was written in Groovy.

The test passes because a NoSuchMethodException is thrown when getDeclaredMethod is called on String.class: the method added through the Groovy MetaClass is not visible to Java reflection.
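The reflection half of that test can be sketched in plain Java. This is only an illustration, not the test from the project: the class name StringReflectionCheck and the method name "funnyMethod" are hypothetical, and the call to the Groovy StringModifier is left out so the snippet needs nothing beyond the JDK.

```java
import java.lang.reflect.Method;

class StringReflectionCheck {

    // Returns true if java.lang.String itself declares a no-argument
    // method with the given name, false otherwise.
    static boolean stringDeclares(String methodName) {
        try {
            Method m = String.class.getDeclaredMethod(methodName);
            return m != null;
        } catch (NoSuchMethodException e) {
            // Reflection only sees methods compiled into the class,
            // not methods registered in a Groovy MetaClass.
            return false;
        }
    }

    public static void main(String[] args) {
        // "length" is a real String method.
        System.out.println(stringDeclares("length"));
        // "funnyMethod" is a hypothetical name that String does not declare.
        System.out.println(stringDeclares("funnyMethod"));
    }
}
```

Whatever a Groovy MetaClass holds for String, the second lookup keeps failing under the java command, because the class file of String is unchanged.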

Even though Groovy code can be called from Java, the Groovy meta-programming "magic" is lost. Groovy can't help us achieve our goal as long as we launch the JVM with the java command. Instead we need a bytecode manipulation tool. I will use Byte Buddy, as it is the easiest and most flexible bytecode manipulation tool I am aware of.

With Byte Buddy

Let's say we have a final Java class with a final returnFalse method as follows:
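A minimal sketch of such a class, assuming the hypothetical name AFinalClass, together with a small reflection check confirming that both the class and its method carry the final modifier:

```java
import java.lang.reflect.Modifier;

class FinalModifierCheck {

    // True if the class itself is declared final.
    static boolean isFinalClass(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    // True if the class declares a no-argument method with this name
    // and that method is declared final.
    static boolean isFinalMethod(Class<?> type, String name) {
        try {
            return Modifier.isFinal(type.getDeclaredMethod(name).getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isFinalClass(AFinalClass.class));
        System.out.println(isFinalMethod(AFinalClass.class, "returnFalse"));
        System.out.println(new AFinalClass().returnFalse());
    }
}

// Hypothetical stand-in for the final class discussed in the text.
final class AFinalClass {
    public final boolean returnFalse() {
        return false;
    }
}
```

The compiler rejects any attempt to extend AFinalClass or override returnFalse, which is why changing its behavior requires working at the bytecode level.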

Then we can use Byte Buddy to redefine the implementation of returnFalse, e.g. to make it always return true. This can be done as follows:

The next question is whether we can do the same for a method of String or any other final JDK class. E.g. would the following work?

The above code attempts to redefine the isEmpty method of the String class so that it always returns true. Compare redefineMethodOfStringClass with redefineFinalMethodOfAclass. They are almost identical, but redefineMethodOfStringClass throws a NullPointerException on the line:
.load(String.class.getClassLoader(), classReloadingStrategy);

This happens because JDK classes are loaded by the bootstrap class loader, which is not exposed to the developer. Therefore String.class.getClassLoader() returns null. Byte Buddy reloads the class whose method is redefined with a custom class loader. To do this it needs the parent class loader, so that the custom class loader can resolve all the dependencies of the class. Notice that this is not a limitation specific to Byte Buddy; it applies to any tool that reloads classes through a custom class loader.
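The null class loader is easy to observe without Byte Buddy. The following sketch, with the hypothetical class name ClassLoaderCheck, compares String with a class loaded from the application:

```java
class ClassLoaderCheck {

    // Classes loaded by the bootstrap class loader report a null
    // class loader through Class.getClassLoader().
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) {
        // java.lang.String is a JDK core class.
        System.out.println(loadedByBootstrap(String.class));
        // This class is loaded by an ordinary, non-null class loader.
        System.out.println(loadedByBootstrap(ClassLoaderCheck.class));
    }
}
```

Any code that passes the result of String.class.getClassLoader() on as a parent loader therefore has to handle null explicitly.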

We conclude that in Java it is possible to change the behavior of final methods, or methods of final classes, for all classes except for those included in the JDK.
