JBossWS and Apache CXF collaboration

I’ve just pubblished a post on the JBosWS blog regarding the JBossWS involvement in the Apache CXF project. In few words, the JBossWS team is increasing its collaboration with the CXF developers, the target being to improve both projects.

It’s not that simple to achieve an active bi-directional collaboration, with both parties’ needs being considered, but this is working quite well now. For instance, read what Daniel Kulp (CXF lead) writes about the collaboration. Needless to say I like this, that’s a nice example of what open source can make possible.

Ant 1.7.1 and package-info.java compilation problem of JAX-WS generated classes

Today I have spent a lot of time with a very strange issue compiling JAX-WS generated classes. I have been using jbossws wsconsume to generate some classes from a .NET wsdl and I had a very strange behaviour:

  1. Generate class calling wsconsume
  2. compile them and its client using ant and run my test perfectly working
  3. then calling my clean task to remove .class files and recompile them and run my tests doesn’t work!

IOW JAX-WS client have been working only the first time I compile them. I couldn’t figure out why it have been working in that manner, but after a lot of google search I got this commit. In practice what have been happening is better described in “Note on package-info.java” paragraph of javac target in ant manual. Starting from version 1.7.1 ant compile package-info.java only in these 3 case:

  1. If a package-info.class file exists and is older than the package-info.java file.
  2. If the directory for the package-info.class file does not exist.
  3. If the directory for the package-info.class file exists, and has an older modification time than the the package-info.java file. In this case <javac> will touch the corresponding .class directory on successful compilation.

In practice if you havea ant task like mine:

<target name=”compile” depends=”init” description=”Compile the Java source code”>
<javac destdir=”${classes.dir}” classpathref=”build.classpath” debug=”${javac.debug}” deprecation=”${javac.deprecation}” target=”1.5″>
<src path=”${src.java.dir}” />
</javac>

Here the compilation target compile all files and so create directory where package-info.class will be contained during other generated files compilation. In this case the compilation target never re-generate package-info.class because no one of the 3 conditions is true. My workaround have been to change my build.xml file and have this compile target:

<target name=”compile” depends=”init” description=”Compile the Java source code”>
<touch>
<fileset dir=”${src.java.dir}” includes=”**/package-info.java”/>
</touch>
<javac destdir=”${classes.dir}” classpathref=”build.classpath” debug=”${javac.debug}” deprecation=”${javac.deprecation}” target=”1.5″>
<src path=”${src.java.dir}” />
</javac>

Hoping this post would be useful for someone, let me remark that is a problem of ant javac task, not of jbossws and you will get the same problem with any other jaxws stack.

Let me remark also that Wise perfectly work in this case regenerating it’s classes on the fly :P

JBossESB and Wise to implement ETL phase for a big DataWareHouse

As I wrote in some previous posts me and my fine team are working from a while to a project using JBossESB Wise action in a real world enterprise application. We are using it for the ETL (Extract Transfor Load) phase for a big DWH (Data Ware House) with an incremental loading of data.

In a nutshell we trace logical changes on an OLTP database (it’s a financial DB where all changes can be associated logically to a single company or at least to a network of company related for various reasons). Then we use JBossESB (and in particular SQLGateway) to periodically treat modified companies and extracting and enriching information to be loaded on the DWH instance. Where wise have its place? Well a lot of information and business rule to extract or enrich data have been implemented as webservices in last 3/4 years. So it’s pretty natural to reuse them to implement this last application.

Ok, it’s the bird eye view of the problem and the solution. On the rest of the post I’ll go in more formal details, starting with requirement and environment description

Requirement and environment description

The main requirement have been to collect a set of data regarding a large set of company (about 5 million) in a DWH for a marketing analysis. This data comes from different systems: 3 different OLTP relational database, and legacy host based system, an external provider. The good news is that both host system and external provider are accessible using webservices. Moreover OLTP databases have some webservices extracting data applying complex business rules; they doesn’t cover all requirements, but these DBs are completely under control of our development team, and dedicated jdbc and/or EJB3 access could be developed for new goals.

The final users would update it’s DWH with daily frequency. The large amount of data made impossible to extract transform and load the whole data every night. We have decided to keep track of changes on the main OLTP DB, and reload completely companies changed (some thousands a day).

Of course this approach isn’t totally new, incremental ETL are pretty common in DWH world, and all vendors have its own proprietary solution. While these proprietary system have its place and its plus, isn’t IMHO sufficient flexible to support an heterogeneous environment as one described. I thought it’s better to track with proprietary triggers logical significative changes (not a lot in fact) and adopt a SOA solution for ETL. It would be better in terms of flexibility and would permit us to reuse much more easily a lot of already written services containig complex business rules.

So the solution adopted have been based on JBossESB ant its composed by these macro steps:

  1. A set of triggers on 2 of 3 named OLTP DB collect changes and write a unique identifier of the company in a dedicated table
  2. A SQLGateway consume this table (the frequency of wake up and filters of the query are designed to avoid excessive and and not useful double treatment of companies due to double linked changes)
  3. Any company is processed by a set of action chains. This actions could be locally defined actions reading relational database or Wise based web services invocations. A content based router policy route messages from an action chain to the next one.
  4. Finally data extracted and transformed are written on the DWH.

Point 3 is of course the core of the system. The SQLGateway create a message containing a pojo object called Company and any successive action trasform or enrich this object with data collected and business rules applied. Wise’s based action calls webservices and use smooks to transform and enrich input object with ws returned values. Using CBR and continuous enrichment of the same object we get at last action (writeOnDWH) an object with all data needed t be written on the DWH.

Focus on Wise

A lot of actions are simply webservices calls implemented with a zero-code approach using Wise. We had just to write jboss-esb.xml fragment for webservice call and smooks config files to get a lot of business rules reused. It have been really GREAT!

I need to add some patch to current integration in ESB to obtain the max response from wise, but results have been really impressive: we had something like 90K company processed in an hour. What does it mean in finer details? Well from wise point of view about 300K web services calls in an hour!
Well also performance and numbers of ESB have been impressive: we are running on a single Linux64 machine (AMD64 double dual core) with 10 jms-listener processing 10 different chains  (200 concurrent 3ad for any jms-listener) for a total of 1.7M (wise and not) of actions called in an hour.

Isn’t it impressive numbers?

There is a list of patches I applied to wise/esb integration to support my requirement. All the code are committed on my workspace (maeste) in ESB svn:

Feature Request JBESB-2019 wise should pass to smooks response mapper also input data to permit continuos enrichement of message Major
Bug JBESB-2020 wise have a bug for which it may download too many wsdls and store them in a temporary dir Major
Feature Request JBESB-2021 add configurability for location where wise store smooks reports for its transformation Major
Bug JBESB-2022 wise doesn’t clean its internal smooks cache Major
Bug JBESB-2023 Wise is failed to consume a wsdl which contains two schema element with same name and different namespace . Major
Bug JBESB-2036 wise’s sample have problem because targetPackage not specified in properties files Major
Feature Request JBESB-2037 Avoid excessive reflective inspection of wise classes for better performance Major

I can’t go in more detail of the implementation or put here configs files because I cna’t reveal any business details of the application. I’ll try in next future to arrange an example totally equivalent in technology content, but without any link to real business content. If you are interested let me know, but be patients…it’s not a joke and I’m very very busy these days.

Thanks to my team (special thanks to Paolo and Luca)  and all contributors of Wise and ESB to make it possible :)

PS: what about huge split and route qs included in ESB 4.4. Well they cover different problems, even if not far each other. The main difference is that here we haven’t a huge message to split and route, but a lot of little message to enrich and then route (content based) to next enrichment phases.

JBossESB 4.4 have a new zero-code webservice invoker

We are proud to announce that recently released JBossESB 4.4 contain a wise based implementation of webservice client invoker.

In a nutshell it is a zero-code webservice caller supporting smooks based mapping, and pluggable JAX-WS handler. Here is an abstract of the message with which I presented it to ESB community (here you find original message and related discussion):

It uses wsconsume API to dynamically generate client object and invoke web service, delagating to JBossWS JAX-WS implementation the dirty job.
It use smooks under the hood to transform user defined object into JAX-WS generated ones.

It support also standard JAX-WS handler and a generic smooks transformation handler to apply transformation to generated soap messages.

You can find it in my workspace under product/services/soap/src/main/java/org/jboss/soa/esb/actions/soap/wise/
I also wrote javadoc for the action class explaining how to use it and e example demonstrating 3 common use case:

* Direct call of a simple service without any mapping is needed
* Call of a service using a smooks mapper java-to-java
* Call a simple webservices without mapping, but with an handler
modifying header with smooks and an handler logging on System.out
request and response
In this 3 examples don’t forget to have a look to wise-core.properties for some important configs. Of course they could be integrated in action’s config in jboss-esb.xml in next future, but this first implementation leave them there.

On wise roadmap I have the implementation of webservices’ call receiving different resources (CSV, XML and so on) using smooks to map it on JAX-WS generated client objects, giving another interesting opportunity in ESB environment.

It is an initial implementation, and I need to integrate wise objects generation with new smooks configgenerator ( http://milyn.codehaus.org/Smooks+User+Guide#SmooksUserGuide-GeneratingtheSmooksBindingConfiguration ) to make user experience easier.

Moreover we are working on wise-core to improve it and make it more configurable an pluggable and support much more stuffs. I’ll post a roadmap soon.

Stay tuned!