Wednesday, November 27, 2013

Date formatting in Velocity templates

Here's how to format a date inside a Velocity template. First, add the velocity-tools library to your dependencies:

<dependency>
    <groupId>org.apache.velocity</groupId>
    <artifactId>velocity-tools</artifactId>
    <version>2.0</version>
</dependency>
 
Import the DateTool class:
import org.apache.velocity.tools.generic.DateTool;
Add an instance of this class to the VelocityContext:
VelocityContext context = new VelocityContext();
context.put("date", new DateTool());
Add your date object to the context:
context.put("some_date", new Date());
Use the DateTool object in the template to format the date:
$date.format('dd.MM.yyyy', $some_date)
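Under the hood, DateTool applies the pattern via java.text.SimpleDateFormat, so the template line above produces the same result as this plain-Java sketch (class and method names here are illustrative, not part of the Velocity API):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateToolEquivalent {

    // What $date.format('dd.MM.yyyy', $some_date) evaluates to:
    // DateTool hands the pattern and the date to SimpleDateFormat
    static String format(String pattern, Date date) {
        return new SimpleDateFormat(pattern).format(date);
    }

    public static void main(String[] args) {
        System.out.println(format("dd.MM.yyyy", new Date()));
    }
}
```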

Thursday, July 4, 2013

Sending large attachments via SOAP and MTOM in Java

     Sometimes you need to pass a large chunk of unstructured (possibly even binary) data via the SOAP protocol — for instance, you wish to attach a file to a message. The default way to do this is to pass the data in an XML element of the base64Binary type, which means your data will be Base64-encoded and embedded in the message body. Not only does your data grow by about 30%, but any client or server that sends or receives such a message has to parse it entirely, which may be time- and memory-consuming for large volumes of data.

     To solve this problem, the MTOM standard was defined. Basically, it allows you to pass the content of a base64Binary block outside of the SOAP message, leaving a simple reference element in its place. As for the corresponding HTTP binding, the message is transferred as SOAP with attachments with a multipart/related content type. I won't go into the details here; you may learn it all straight from the above-mentioned standards and RFCs.

     The tricky part is that although we've disposed of the 30% volume overhead by passing the data outside of the message, the standards themselves don't specify how client and server implementations should process such messages — whether a message and all its attachments should be read completely into memory during sending and receiving, or offloaded to external storage. By default, implementations (including Java's SAAJ) usually read the attachments completely into memory, creating the possibility of running out of memory on large files or heavily loaded systems. In Java, this usually manifests as a "java.lang.OutOfMemoryError: Java heap space" error.

     In this post, I will demonstrate a simple client-server application that can transfer SOAP attachments of arbitrary size with disk offloading, using Apache CXF on the client and Oracle's SAAJ implementation (part of JDK 6+) on the server. This will require some tuning of the mentioned frameworks. The complete code of the application is available on GitHub.

     First, we will place the common files (XSD and WSDL) in a separate project, as they will be used by both client and server. The WSDL definition of the service is relatively straightforward: we have a port with a single operation that takes a SampleRequest and returns a SampleResponse from the server. The file is transferred to the server in the request. The XSD schema of the request and response is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<s:schema elementFormDefault="qualified"
          targetNamespace="http://forketyfork.ru/mtomsoap/schema"
          xmlns:s="http://www.w3.org/2001/XMLSchema"
          xmlns:xmime="http://www.w3.org/2005/05/xmlmime">

    <s:element name="SampleRequest">
        <s:annotation>
            <s:documentation>Service request</s:documentation>
        </s:annotation>
        <s:complexType>
            <s:sequence>
                <s:element name="text" type="s:string" />
                <s:element name="file" type="s:base64Binary" xmime:expectedContentTypes="*/*" />
            </s:sequence>
        </s:complexType>
    </s:element>

    <s:element name="SampleResponse">
        <s:annotation>
            <s:documentation>Service response</s:documentation>
        </s:annotation>
        <s:complexType>
            <s:attribute name="text" type="s:string" />
        </s:complexType>
    </s:element>

</s:schema>
     Take note of the imported xmime schema and the use of the xmime:expectedContentTypes="*/*" attribute on the binary data element. This enables us to generate correct JAXB code from this schema: by default, a base64Binary element corresponds to a byte[] field in the JAXB-mapped class, but as we'll see, the expectedContentTypes attribute alters the generated class:

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "", propOrder = {
    "text",
    "file"
})
@XmlRootElement(name = "SampleRequest")
public class SampleRequest {

    @XmlElement(required = true)
    protected String text;
    @XmlElement(required = true)
    @XmlMimeType("*/*")
    protected DataHandler file;

    ...
     Note that the file field is of type DataHandler, which allows for streaming processing of the data.

     We generate the JAXB classes for both client and server, and a service class for the client, using the Apache CXF cxf-codegen-plugin for Maven at build time. The configuration is as follows:

<plugin>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-codegen-plugin</artifactId>
    <version>${cxf.version}</version>
    <executions>
        <execution>
            <id>generate-sources</id>
            <phase>generate-sources</phase>
            <configuration>
                <sourceRoot>${project.build.directory}/generated-sources/cxf</sourceRoot>
                <wsdlOptions>
                    <wsdlOption>
                        <wsdl>${basedir}/src/main/resources/service.wsdl</wsdl>
                        <wsdlLocation>classpath:service.wsdl</wsdlLocation>
                    </wsdlOption>
                </wsdlOptions>
            </configuration>
            <goals>
                <goal>wsdl2java</goal>
            </goals>
        </execution>
    </executions>
</plugin>
     In this Maven plugin configuration, we explicitly specify the wsdlLocation property that will be included in the generated service class. Without it, the generated path to the WSDL file would be a local path on the developer's machine, which we obviously don't want.

     The client (module mtom-soap-client) is quite simple, as it is based on Apache CXF and the generated SampleService class. We only enable MTOM for the underlying SOAP binding and specify an infinite timeout, as transferring large files may take time:


        // Creating a CXF-generated service
        Sample sampleClient = new SampleService().getSampleSoap12();

        // Setting infinite HTTP timeouts
        HTTPClientPolicy httpClientPolicy = new HTTPClientPolicy();
        httpClientPolicy.setConnectionTimeout(0);
        httpClientPolicy.setReceiveTimeout(0);
        HTTPConduit httpConduit = (HTTPConduit) ClientProxy.getClient(sampleClient).getConduit();
        httpConduit.setClient(httpClientPolicy);

        // Enabling MTOM for the SOAP binding provider
        BindingProvider bindingProvider = (BindingProvider) sampleClient;
        SOAPBinding binding = (SOAPBinding) bindingProvider.getBinding();
        binding.setMTOMEnabled(true);

        // Creating request object
        SampleRequest request = new SampleRequest();
        request.setText("Hello");
        request.setFile(new DataHandler(new FileDataSource(args[0])));

        // Sending request
        SampleResponse response = sampleClient.sample(request);

        System.out.println(String.format("Server responded: \"%s\"", response.getText()));

     The server is based on the Spring WS framework. However, we won't use the typical default <annotation-config /> configuration here; instead, we specify a custom DefaultMethodEndpointAdapter configuration, because we need Spring WS to use our custom-configured jaxb2Marshaller bean:

<!-- The service bean -->
<bean class="ru.forketyfork.mtomsoap.server.SampleServiceEndpoint" p:uploadPath="/tmp"/>

<!-- SAAJ message factory configured for SOAP v1.2 -->
<bean id="messageFactory" class="org.springframework.ws.soap.saaj.SaajSoapMessageFactory"
      p:soapVersion="#{T(org.springframework.ws.soap.SoapVersion).SOAP_12}"/>

<!-- JAXB2 Marshaller configured for MTOM -->
<bean id="jaxb2Marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller"
      p:contextPath="ru.forketyfork.mtomsoap.schema"
      p:mtomEnabled="true"/>

<!-- Endpoint mapping for the @PayloadRoot annotation -->
<bean class="org.springframework.ws.server.endpoint.mapping.PayloadRootAnnotationMethodEndpointMapping" />

<!-- Endpoint adapter to marshal endpoint method arguments and return values as JAXB2 objects -->
<bean class="org.springframework.ws.server.endpoint.adapter.DefaultMethodEndpointAdapter">
    <property name="methodArgumentResolvers">
        <list>
            <ref bean="marshallingPayloadMethodProcessor" />
        </list>
    </property>
    <property name="methodReturnValueHandlers">
        <list>
            <ref bean="marshallingPayloadMethodProcessor" />
        </list>
    </property>
</bean>

<!-- JAXB2 Marshaller/Unmarshaller for method arguments and return values -->
<bean id="marshallingPayloadMethodProcessor" class="org.springframework.ws.server.endpoint.adapter.method.MarshallingPayloadMethodProcessor">
    <constructor-arg ref="jaxb2Marshaller" />
</bean>
     The important thing to notice here is the mtomEnabled property of the jaxb2Marshaller bean; the rest of the configuration is quite typical.

     The SampleServiceEndpoint class is a service that is bound via the @PayloadRoot annotation to process our SampleRequest requests:

    @PayloadRoot(namespace = "http://forketyfork.ru/mtomsoap/schema", localPart = "SampleRequest")
    @ResponsePayload
    public SampleResponse serve(@RequestPayload SampleRequest request) throws IOException {

        // randomly generating file name as a UUID
        String fileName = UUID.randomUUID().toString();
        File file = new File(uploadPath + File.separator + fileName);

        // writing attachment to file
        try(FileOutputStream fos = new FileOutputStream(file)) {
            request.getFile().writeTo(fos);
        }

        // constructing the response
        SampleResponse response = new SampleResponse();
        response.setText(String.format("Hi, just received a %d byte file from ya, saved with id = %s",
                file.length(), fileName));

        return response;
    }

     Notice how we work with the request.getFile() field of the request. Remember, the type of the field is DataHandler. What actually happens is that request.getFile() wraps an InputStream pointing to the attachment that SAAJ offloaded to disk when the request was received. So we may copy this file to another location or process it in any way without loading it completely into memory.
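For illustration, the buffered-copy idiom that writeTo() performs internally can be spelled out by hand (a sketch; it assumes you obtained the attachment's stream via request.getFile().getInputStream()):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class AttachmentCopy {

    // Copies the stream in fixed-size chunks, so memory usage stays
    // constant no matter how large the attachment is; this is the
    // same idea DataHandler-based streaming relies on
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```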

     A final trick is to enable attachment offloading for Oracle's SAAJ implementation, which is bundled with Oracle's JDK starting from version 6. To do that, we must run our server with the -Dsaaj.use.mimepull=true JVM argument.
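If you can't control the JVM arguments, the same flag can presumably be set programmatically, provided it happens before SAAJ parses the first message (this early-initialization requirement is an assumption worth verifying in your environment):

```java
public class ServerBootstrap {

    public static void main(String[] args) {
        // Equivalent to passing -Dsaaj.use.mimepull=true on the command
        // line; must run before any SOAP message is processed
        System.setProperty("saaj.use.mimepull", "true");
        // ... start the Spring WS server here
    }
}
```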

     Once again, the complete code for the article is available on GitHub.

Friday, June 21, 2013

How to return a file, a stream or a classpath resource from a Spring MVC controller

     You can use AbstractResource subclasses as return values from the controller methods, combining them with the @ResponseBody method annotation.

     Consequently, as soon as you know the filesystem path of the file or have its URI, returning a file from a Spring MVC controller is as easy as:
    @RequestMapping(value = "/file", method = RequestMethod.GET, 
                produces = MediaType.IMAGE_JPEG_VALUE)
    @ResponseBody
    public Resource getFile() throws FileNotFoundException {
        return new FileSystemResource("/Users/forketyfork/cat.jpg");
    }
     The code to return a classpath resource is quite similar:
    @RequestMapping(value = "/classpath", method = RequestMethod.GET, 
                produces = MediaType.IMAGE_JPEG_VALUE)
    @ResponseBody
    public Resource getFromClasspath() {
        return new ClassPathResource("cat.jpg");
    }
     But what about outputting data from a stream? Common advice is to inject HttpServletResponse as a method parameter and write directly to the output stream of the response. But this badly breaks the abstraction, not to mention testability. Technically, we can write to a Writer declared as a method parameter, like this:
    @RequestMapping(value = "/writer", method = RequestMethod.GET, 
                produces = MediaType.TEXT_PLAIN_VALUE)
    @ResponseBody
    public void getStream(Writer writer) throws IOException {
        writer.write("Hello World!");
    }
     A seemingly simple one-liner. But if you consider serving a large chunk of binary data, this approach turns out to be slow, memory-consuming, and not very handy, as it uses a Writer, which deals in chars. Moreover, Spring MVC is not able to set the Content-Length header until the output is finished. Here's a slightly more verbose solution which, however, does not break the abstraction and is fast and testable.
    @RequestMapping(value = "/stream", method = RequestMethod.GET, 
                produces = MediaType.TEXT_PLAIN_VALUE)
    @ResponseBody
    public Resource getStream() {

        // the content length must be counted in bytes, not chars
        byte[] bytes = "Hello World!".getBytes(StandardCharsets.UTF_8);
        // acquiring the stream
        InputStream stream = new ByteArrayInputStream(bytes);
        final long contentLength = bytes.length;

        return new InputStreamResource(stream){
            @Override
            public long contentLength() throws IOException {
                return contentLength;
            }
        };

    }
     First, we acquire the stream. Then we count the length of the content we need to output. This may be done in some optimized fashion so as not to process the content entirely. Spring MVC first calls the contentLength() method of the InputStreamResource, sets the Content-Length header and then pipes the stream to the client.

     Here we touch on a bit of an inconsistency in the Spring API. The InputStreamResource class extends AbstractResource, which implements the contentLength() method by reading the whole encapsulated stream to count its length. InputStreamResource does not override the contentLength() method, but it does override getInputStream(), prohibiting calling it more than once, which effectively rules out using this class directly as a controller method return value. In the example above, we override the contentLength() method to provide the correct behavior.
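For small in-memory payloads, there is also a simpler option: ByteArrayResource. Its stream can be opened repeatedly, so the default contentLength() implementation works for it correctly (a sketch in the style of the controllers above; the mapping path is made up):

```java
    @RequestMapping(value = "/bytes", method = RequestMethod.GET,
                produces = MediaType.TEXT_PLAIN_VALUE)
    @ResponseBody
    public Resource getBytes() {
        // unlike InputStreamResource, a ByteArrayResource can be re-read,
        // so Spring can safely count its length before writing the response
        return new ByteArrayResource("Hello World!".getBytes());
    }
```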

Tuesday, May 28, 2013

"stack shape inconsistent" error during Spring/Jackson application initialization

I have a JSON-service client implemented with Spring and Jackson and deployed on WebSphere Application Server. The client worked properly, but on one particular machine I encountered a strange classloading issue during Spring initialization:
java.lang.VerifyError: JVMVRFY012 stack shape inconsistent; class=org/codehaus/jackson/map/ObjectMapper
The reason was two incompatible dependencies in the effective POM of the project:

<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-asl</artifactId>
    <version>1.4.2</version>
</dependency>

<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-lgpl</artifactId>
    <version>1.9.12</version>
</dependency>
Both of those jars had the ObjectMapper class defined, and both ended up in the WEB-INF/lib directory. The error was intermittent, because on some machines the correct (latest) versions of the libraries took precedence during classloading.
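One way to resolve such a conflict is to exclude the stale artifact from whichever dependency drags it in (the group and artifact names of the offending dependency below are placeholders):

```xml
<dependency>
    <groupId>some.group</groupId>
    <artifactId>artifact-that-depends-on-old-jackson</artifactId>
    <version>1.0</version>
    <exclusions>
        <!-- keep only jackson-mapper-lgpl 1.9.12 on the classpath -->
        <exclusion>
            <groupId>org.codehaus.jackson</groupId>
            <artifactId>jackson-mapper-asl</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

Running mvn dependency:tree helps to find out which dependency pulls in the old jar.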

Friday, February 1, 2013

Of Domain Modeling, Separation of Concerns, and How the JPA Annotations Fail at Both

        Not all of the JPA annotations are actually about mapping the entities to the database, or even about persistence at all. Some of them are intended to instrument the Java language for better modeling of the domain. Here's a simple example of two domain classes that are somehow associated with each other (accessor methods are omitted for clarity):
public class User {

  private String name;

  private Set<Role> roles;

}

public class Role {

  private String name;

}
        Observing these classes, can we unambiguously determine the relationship between them? A User has a set of Roles; that much we can say for sure. But for all we know, this association may be either one-to-many or many-to-many from the domain viewpoint. There's no reverse association from the Role to the User, so we don't actually know whether the same Role can be assigned to several different Users. Well, let's suppose we wanted to model a many-to-many relationship. Let's add the reverse link to the Role class and see if it helps, though we may have no use for the reverse connection in the context of our model at all.
public class User {

  private String name;

  private Set<Role> roles;

}

public class Role {

  private String name;

  private Set<User> users;

}
        Does it feel better now? Oh, surely we have a many-to-many association between those two classes now... or do we? Actually, we made quite an assumption that the User.roles set and the Role.users set point at each other, i.e., that they model the same association, but that may well not be the case. For example, User.roles may be the set of Roles a User has as a user of the system, while Role.users may be a completely unrelated set in the context of the domain model.

        For example, a Role may have a set of Users that have the authority to grant that role. Surely, in this case, we could do a better job of naming those two fields differently, so that no one would mistake them for representing the same association. But now we have two different associations, and we still have no clue as to what their actual relationship is. We're still missing the point of modeling the domain with the Java programming language, which is considered to be an object-oriented language — seemingly the right choice for the job!

        That's where the association annotations come in. Here's the annotated version of the first case — unidirectional association:
public class User {

  private String name;

  @ManyToMany
  private Set<Role> roles;

}

public class Role {

  private String name;

}
        Now we see clearly that the association between those entities is many-to-many, and there's no ambiguity in it. As for the second case:
public class User {

  private String name;

  @ManyToMany
  private Set<Role> roles;

}

public class Role {

  private String name;

  @ManyToMany(mappedBy = "roles")
  private Set<User> users;

}
        The mappedBy attribute of the @ManyToMany annotation in the Role class is what makes those two sets "click" together. The "roles" string is the name of the User class field (not a database field!). OMG, is that a String pointer to a Java class field? Yes, we should probably have a more obvious, compile-checked pointer to the field from the other side, but, alas, the Java programming language does not leave us any options. Some IDEs may help you by highlighting the value of this attribute if you misspell it, or even navigating you to the connected field with something+click, but still, I'd argue that referring to a Java bean field by its name in a String is quite a poor (yet inevitable) way of binding the two sides of a bidirectional association together.

        But wait! "mappedBy"?! It seems we have another fallacy here. The word "mapping" is surely from another story. What mapping is this all about? We haven't said a word yet about mapping the entities to a relational data source; all we did was model the domain. But let's blame this poor choice of attribute name (and breach of separation of concerns) on the developers of the JPA standard.

        Another weirdness here is the "fetch" attribute that every association annotation has. "Fetch" is actually a concept of data source query optimization that allows us to lazily load heavily packed associations that may not always be of use. For instance, if we only want to show the User's name, why should the data source fetch the collection of roles for us? That's where the "fetch" attribute comes in:
public class User {

  private String name;

  @ManyToMany(fetch = FetchType.LAZY)
  private Set<Role> roles;

}
        But wait, now we find ourselves even deeper into the data source and mapping concerns. Why do those "fetch" types even matter if all we want for now is simply to model the domain?

        To conclude, I believe that these four annotations — @OneToOne, @OneToMany, @ManyToOne and @ManyToMany — are intended for the developer to model the domain, and they would surely be better off in another package, or even another API that has nothing to do with "persistence". Maybe even somewhere in Java SE. But we have them only in the Enterprise Edition — as if domain modeling has to be done only in enterprise applications, and only in connection with an underlying relational data source. But that's not always the case. These annotations would be useful in single-user desktop applications as well, or even in applications that have no persistence whatsoever but still need a domain model. And these annotations are no good place to specify fetching attributes, either.