I’m participating in the JBoss Community Leadership Awards

I’m participating in this poll as a candidate for New features contribution.

As you probably remember, I’ve contributed to DNA, JBossWS (some time ago), and JBossESB, integrating Wise into ESB 4.4. Moreover, Wise is now a JBoss.org project: I donated it some months ago and I’m leading the project there, and we have already released versions 0.9 and 1.0.

If you like my efforts and would like to support me, and/or you are using Wise (within JBossESB or not), may I kindly ask for your vote there (you need a JBoss.org account, but it’s quite easy to register one):

http://www.jboss.org/community/poll.jspa?poll=1003

More info from the JBoss.org homepage:

Voting will end on January 30th 2009 and winners will be announced at the JBoss Virtual Experience, a web-based JBoss technology conference which will be held February 11th 2009. There is no charge for admission, but please advance register if you’d like to attend.
Terms and conditions are here. Please join us in giving these community members the recognition they deserve.

As said in this post, much of what drives an open source developer is narcissism. Help mine grow :P

JBossESB and Wise to implement the ETL phase for a big data warehouse

As I wrote in some previous posts, my fine team and I have been working for a while on a project using the JBossESB Wise action in a real-world enterprise application. We are using it for the ETL (Extract, Transform, Load) phase of a big DWH (data warehouse) with incremental loading of data.

In a nutshell, we trace logical changes on an OLTP database (it’s a financial DB where every change can be associated logically with a single company, or at least with a network of companies related for various reasons). Then we use JBossESB (and in particular SQLGateway) to periodically process the modified companies, extracting and enriching the information to be loaded into the DWH instance. Where does Wise have its place? Well, a lot of the information and business rules needed to extract or enrich data have been implemented as web services over the last 3 or 4 years. So it’s pretty natural to reuse them to implement this latest application.

OK, that’s the bird’s-eye view of the problem and the solution. In the rest of the post I’ll go into more formal detail, starting with the requirement and environment description.

Requirement and environment description

The main requirement was to collect a set of data about a large set of companies (about 5 million) in a DWH for marketing analysis. This data comes from different systems: 3 different OLTP relational databases, a legacy host-based system, and an external provider. The good news is that both the host system and the external provider are accessible through web services. Moreover, the OLTP databases have some web services that extract data by applying complex business rules; they don’t cover all the requirements, but these DBs are completely under the control of our development team, and dedicated JDBC and/or EJB3 access could be developed for new goals.

The final users want to update their DWH daily. The large amount of data made it impossible to extract, transform and load the whole data set every night. We decided to keep track of changes on the main OLTP DB and completely reload the companies that changed (some thousands a day).

Of course this approach isn’t totally new: incremental ETL is pretty common in the DWH world, and every vendor has its own proprietary solution. While these proprietary systems have their place and their advantages, they aren’t, IMHO, flexible enough to support a heterogeneous environment like the one described. I thought it better to track the logically significant changes (not many, in fact) with proprietary triggers and adopt a SOA solution for ETL. It would be better in terms of flexibility and would let us reuse much more easily a lot of already written services containing complex business rules.

So the solution we adopted is based on JBossESB and consists of these macro steps:

  1. A set of triggers on 2 of the 3 OLTP DBs mentioned above collects changes and writes a unique identifier of the company into a dedicated table.
  2. A SQLGateway consumes this table (the wake-up frequency and the query filters are designed to avoid excessive and useless double processing of companies due to doubly linked changes).
  3. Each company is processed by a set of action chains. These actions can be locally defined actions reading relational databases or Wise-based web service invocations. A content-based router policy routes messages from one action chain to the next.
  4. Finally, the extracted and transformed data is written to the DWH.
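
Steps 1 and 2 boil down to "many change rows in, each company out once". Here is a minimal sketch of that idea in plain Java. It is only an illustration: in the real system the de-duplication lives in the SQLGateway query filters, and every name below (ChangeTableSketch, ChangeRow, sourceDb, distinctCompanies) is hypothetical, not JBossESB API and not our application code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: these names are illustrative, not JBossESB or application code.
class ChangeTableSketch {

    // One row written by a trigger: the company touched by a logical change.
    static final class ChangeRow {
        final long companyId;
        final String sourceDb; // which OLTP DB fired the trigger (assumed column)

        ChangeRow(long companyId, String sourceDb) {
            this.companyId = companyId;
            this.sourceDb = sourceDb;
        }
    }

    // Collapse the pending rows into distinct company ids, keeping first-seen order,
    // so a company changed several times is processed once per polling cycle.
    static List<Long> distinctCompanies(List<ChangeRow> pending) {
        Set<Long> seen = new LinkedHashSet<>();
        for (ChangeRow row : pending) {
            seen.add(row.companyId);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<ChangeRow> pending = Arrays.asList(
                new ChangeRow(42L, "oltp-a"),
                new ChangeRow(7L, "oltp-b"),
                new ChangeRow(42L, "oltp-b"));
        System.out.println(distinctCompanies(pending)); // prints [42, 7]
    }
}
```

Company 42 was changed on two databases but comes out only once, which is the "double treatment" the gateway query is designed to avoid.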

Point 3 is of course the core of the system. The SQLGateway creates a message containing a POJO called Company, and each successive action transforms or enriches this object with the data collected and the business rules applied. The Wise-based actions call web services and use Smooks to transform and enrich the input object with the values returned by the web service. Using CBR and continuous enrichment of the same object, we reach the last action (writeOnDWH) with an object carrying all the data that needs to be written to the DWH.
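
To make the "enrich the same object, then route on its content" idea concrete, here is a small sketch in plain Java. It is not JBossESB code: the chains are ordinary functions, and the chain names (apart from writeOnDWH), the fields and the routing rule are all made up for illustration. Each chain enriches the Company and returns the name of the next chain, chosen from what is already in the object.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of continuous enrichment plus content-based routing.
class EnrichmentChainSketch {

    // The POJO travelling through the chains; "data" stands in for its real fields.
    static final class Company {
        final long id;
        final Map<String, Object> data = new LinkedHashMap<>();

        Company(long id) {
            this.id = id;
        }
    }

    // Chain name -> chain body. Each body enriches the company and returns
    // the name of the next chain, or null when processing is finished.
    static final Map<String, Function<Company, String>> CHAINS = new HashMap<>();

    static {
        CHAINS.put("loadBaseData", c -> {
            // Stand-in for a local action reading a relational DB (invented rule).
            c.data.put("country", c.id % 2 == 0 ? "IT" : "OTHER");
            // Content-based routing: the next chain depends on the enriched content.
            return "IT".equals(c.data.get("country")) ? "callHostService" : "callExternalProvider";
        });
        CHAINS.put("callHostService", c -> {
            c.data.put("rating", "A"); // stand-in for a web service call
            return "writeOnDWH";
        });
        CHAINS.put("callExternalProvider", c -> {
            c.data.put("rating", "unknown"); // stand-in for another web service call
            return "writeOnDWH";
        });
        CHAINS.put("writeOnDWH", c -> {
            c.data.put("written", Boolean.TRUE); // stand-in for the final DWH write
            return null;
        });
    }

    // Run the company through the chains until one of them ends the flow.
    static Company process(Company company) {
        String next = "loadBaseData";
        while (next != null) {
            next = CHAINS.get(next).apply(company);
        }
        return company;
    }

    public static void main(String[] args) {
        System.out.println(process(new Company(2L)).data);
        // prints {country=IT, rating=A, written=true}
    }
}
```

The point of the design is that no chain needs to know the whole flow: it only adds its piece to the object and names the next step.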

Focus on Wise

Many actions are simply web service calls implemented with a zero-code approach using Wise. We just had to write a jboss-esb.xml fragment for the web service call and the Smooks config files to reuse a lot of business rules. It has been really GREAT!

I needed to add some patches to the current integration in ESB to get the maximum throughput from Wise, but the results have been really impressive: we had something like 90K companies processed in an hour. What does that mean in finer detail? Well, from Wise’s point of view, about 300K web service calls in an hour!
The performance and numbers of ESB have been impressive too: we are running on a single Linux64 machine (AMD64, dual dual-core) with 10 jms-listeners processing 10 different chains (200 concurrent threads for each jms-listener), for a total of 1.7M actions (Wise and not) called in an hour.
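
For a feel of what those hourly figures mean per second and per company, here is a quick back-of-the-envelope calculation in Java. The inputs are the rounded numbers quoted above, so the results are approximations too; the class and method names are just for this sketch.

```java
import java.util.Locale;

// Back-of-the-envelope arithmetic on the rounded hourly figures quoted in the post.
class ThroughputSketch {

    static double perSecond(long perHour) {
        return perHour / 3600.0;
    }

    static double perCompany(long total, long companies) {
        return (double) total / companies;
    }

    public static void main(String[] args) {
        long companies = 90_000L;   // companies processed in an hour
        long wsCalls = 300_000L;    // web service calls made through Wise in an hour
        long actions = 1_700_000L;  // all ESB actions (Wise and not) in an hour

        System.out.println(String.format(Locale.ROOT, "companies/s: %.1f", perSecond(companies)));
        System.out.println(String.format(Locale.ROOT, "ws calls/s: %.1f", perSecond(wsCalls)));
        System.out.println(String.format(Locale.ROOT, "actions/s: %.1f", perSecond(actions)));
        System.out.println(String.format(Locale.ROOT, "ws calls per company: %.1f", perCompany(wsCalls, companies)));
        System.out.println(String.format(Locale.ROOT, "actions per company: %.1f", perCompany(actions, companies)));
    }
}
```

That works out to roughly 25 companies, 83 web service calls and 472 actions per second, or about 3.3 web service calls and 18.9 actions for each company.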

Aren’t those impressive numbers?

Here is the list of patches I applied to the Wise/ESB integration to support my requirements. All the code is committed to my workspace (maeste) in the ESB svn:

Feature Request JBESB-2019 wise should pass to smooks response mapper also input data to permit continuos enrichement of message Major
Bug JBESB-2020 wise have a bug for which it may download too many wsdls and store them in a temporary dir Major
Feature Request JBESB-2021 add configurability for location where wise store smooks reports for its transformation Major
Bug JBESB-2022 wise doesn’t clean its internal smooks cache Major
Bug JBESB-2023 Wise is failed to consume a wsdl which contains two schema element with same name and different namespace . Major
Bug JBESB-2036 wise’s sample have problem because targetPackage not specified in properties files Major
Feature Request JBESB-2037 Avoid excessive reflective inspection of wise classes for better performance Major

I can’t go into more detail about the implementation or post config files here because I can’t reveal any business details of the application. In the near future I’ll try to put together an example fully equivalent in technology content, but without any link to the real business content. If you are interested let me know, but be patient… it’s not a joke, and I’m very, very busy these days.

Thanks to my team (special thanks to Paolo and Luca) and to all contributors to Wise and ESB for making it possible :)

PS: what about the huge split and route quickstarts included in ESB 4.4? Well, they cover different problems, even if not far from each other. The main difference is that here we don’t have a huge message to split and route, but a lot of small messages to enrich and then route (content-based) to the next enrichment phases.

JBossESB 4.4 has a new zero-code webservice invoker

We are proud to announce that the recently released JBossESB 4.4 contains a Wise-based implementation of a web service client invoker.

In a nutshell, it is a zero-code web service caller supporting Smooks-based mapping and pluggable JAX-WS handlers. Here is an excerpt of the message with which I presented it to the ESB community (here you find the original message and related discussion):

It uses the wsconsume API to dynamically generate client objects and invoke the web service, delegating the dirty job to the JBossWS JAX-WS implementation.
It uses Smooks under the hood to transform user-defined objects into the JAX-WS generated ones.

It also supports standard JAX-WS handlers and a generic Smooks transformation handler to apply transformations to the generated SOAP messages.

You can find it in my workspace under product/services/soap/src/main/java/org/jboss/soa/esb/actions/soap/wise/
I also wrote javadoc for the action class explaining how to use it, and an example demonstrating 3 common use cases:

* Direct call of a simple service without any mapping
* Call of a service using a Smooks java-to-java mapper
* Call of a simple web service without mapping, but with a handler
modifying the header with Smooks and a handler logging the request
and response on System.out
In these 3 examples, don’t forget to have a look at wise-core.properties for some important configs. Of course they could be integrated into the action’s config in jboss-esb.xml in the near future, but this first implementation leaves them there.

On the Wise roadmap I have the implementation of web service calls that receive different resources (CSV, XML and so on), using Smooks to map them onto JAX-WS generated client objects, giving another interesting opportunity in an ESB environment.

It is an initial implementation, and I need to integrate Wise object generation with the new Smooks configgenerator ( http://milyn.codehaus.org/Smooks+User+Guide#SmooksUserGuide-GeneratingtheSmooksBindingConfiguration ) to make the user experience easier.

Moreover, we are working on wise-core to improve it, make it more configurable and pluggable, and support much more. I’ll post a roadmap soon.

Stay tuned!

SOA and heterogeneous technology environment: chicken and egg problem

One of the use cases for which a SOA (ESB) solution is recommended is when you have to manage a complex, “technology heterogeneous” environment.

Well, I’m thinking about a good design for some important new features to be added to our complex environment. Our environment is indeed complex, with wide impact and heterogeneous needs, but it is quite homogeneous in technology. OK, it isn’t a monolithic system, it is built from a lot of parts, but many of these parts are Java (J2EE)/Oracle based.

But the question is: do I want to keep my system so homogeneous? IOW, if I invest a lot of money adding these new features to my system, which involves using/reviewing most of the developed software, is it really the right choice to keep it all based on Java?

I’m a Java guru and fan, having used it as my main development language for the last 10 years, but my answer is

NO

Why NO? Because if I look back at the past I can see a lot of system architects answering “yes!” to the same question 20 years ago, substituting “COBOL” for “Java”. And a shiver runs down my spine… would I really sentence my system to be so tightly coupled to a single technology, and lose the flexibility and cool features of newer technologies? I’m not sure Java will become the next COBOL, turning static and legacy, but for sure, if I answered yes I would be disowning my ideas of an “open system”.

There are so many good languages and technologies around, which probably solve some kinds of problems better. Groovy, Scala and Ruby are the most famous, but we also have Erlang, Factor (with good ideas and a friend of mine behind it), and even more legacy languages like Perl could have their place in some specific use cases. In general, if something could be more productive or more flexible than Java for some specific problem, I’d like to keep the doors open. Randall wrote an interesting post saying Java developers should learn other languages; I go a step further and say Java developers should USE other languages.

I’ve always been open to new technologies and solutions; would I give up my freedom of choice in favour of my beloved language? No, my freedom is much more important than Java :)

In designing my new system I would use the best technology and language for each part of the system. That’s always a good decision; the good news is that the integration of these parts can be seamless and painless: we have SOA/ESB solutions.

My conclusion is that it isn’t necessary to have a heterogeneous system to go for SOA; probably the contrary is true: nowadays we need heterogeneous systems to improve time to market and to have easier maintenance, and so we need SOA to build and manage them.

SOA and heterogeneous technology environments seem to be a chicken and egg problem :)

Thoughts?

wise-core in JBossESB: first implementation

As said in this post, one possible use of wise-core (the new core we made independent from Wise) is to integrate it into JBossESB to build a generic SOAP client that invokes web services, using Smooks transformations to hide from the final user the gap between their own object model and the one generated dynamically by the JAX-WS tools.

Well, I contributed some code to JBossESB, providing an action which does what I described in a nutshell here. My efforts and possible improvements are described in this post on the ESB developer forum.

Give your feedback there.

BTW, I’m developing a real-world application based on ESB and this Wise action: it takes some data from a DB, enriches the message by calling a set of web services using Wise, conditioning these calls with a content-based routing approach, and then writes the data back into the DB. I’m planning a post about this application as soon as my team and I finish it… stay tuned!