Thursday, May 13, 2010

Parallel Querying using Oracle

Perhaps a long-awaited feature, and clearly the one that I like most.

Using pre-built packages, it's now possible to break a single (relatively heavy) operation into multiple smaller operations that run in parallel. The benefits are obvious and can't be ignored.
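To make the idea concrete, here is a minimal sketch in Python rather than PL/SQL. It is not the Oracle package itself, just an illustration of the same pattern: split a range of row ids into chunks and hand each chunk to a worker. The names make_chunks, process_chunk and run_in_parallel are made up for this example, and the "work" is a simple sum standing in for something like an update on that slice of a table.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(first_id, last_id, chunk_size):
    """Split the inclusive id range [first_id, last_id] into (start, end) pairs."""
    return [
        (start, min(start + chunk_size - 1, last_id))
        for start in range(first_id, last_id + 1, chunk_size)
    ]


def process_chunk(bounds):
    """Stand-in for the small operation: here it just sums the ids in one chunk."""
    start, end = bounds
    return sum(range(start, end + 1))


def run_in_parallel(first_id, last_id, chunk_size, workers=4):
    """Run process_chunk over every chunk using a pool of worker threads."""
    chunks = make_chunks(first_id, last_id, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the same order as the chunks
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    print(make_chunks(1, 10, 4))
    print(sum(run_in_parallel(1, 1000, 100)))
```

Each chunk is independent of the others, which is what makes this kind of split safe to run in parallel.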

I have yet to ascertain how usable such a feature is with tools like Informatica or Pentaho, but I am quite sure a lot can be achieved, especially for updates to huge tables in data warehouses.

And looking at the implementation, it's genuinely simple and straightforward.

It would be nice to see more applications utilizing the benefits of such features.

Read more in Oracle Magazine, in the article written by Steven Feuerstein.

In reference to: PL/SQL Practices: On Working in Parallel

Thursday, May 6, 2010

Ubuntu 10.04 is here.... Upgrade today...

It's only been a few hours since I had the pleasure of upgrading to the latest version of Ubuntu.

And without any doubt, or any detailed analysis/investigation, I can say this much for sure:

- The upgrade process was super easy. No hassles whatsoever: you just need a decent internet connection, a few times you need to confirm the suggested decision, and you're done.

- The new system looks cleaner, much much cleaner. It's neat, to some extent beautiful. It's not about the background or themes or anything like that, but the overall look and feel is genuinely cool. Perhaps it has to do with one thing that I have done on my part: my system font is "Lucida Grande" all across. And, for some reason, the shapes and looks of this font are far better than any other I have encountered.

- The boot time has come down a good notch.

Well, I guess this should be a good starting point for my review of the new version. Let's see how it comes across further.


Saturday, May 1, 2010

Implementing Cartesian Product in Informatica Mapping

Unlike Pentaho, Informatica doesn't provide a ready-made transformation for implementing a Cartesian product in a mapping. Most of us would agree, though, that it's not often that we go for Cartesian product joins. [The instinct is generally to do enough to avoid a Cartesian product, because it's a performance killer.]

However, when your requirements need this, there is no direct way to do it in the Informatica Joiner transformation. One option is to do it on the database side, by overriding your Source Qualifier SQL statement and building the Cartesian product in there.
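As a rough illustration of the database-side option, the sketch below runs a CROSS JOIN from Python, with an in-memory SQLite database standing in for the real source. The tables colors and sizes and the function cross_join_rows are made up for this example; in a real mapping, a query of this shape would go into the SQL override.

```python
import sqlite3


def cross_join_rows():
    """Build two tiny tables and return their Cartesian product."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(
            """
            CREATE TABLE colors (color TEXT);
            CREATE TABLE sizes (size TEXT);
            INSERT INTO colors VALUES ('red'), ('blue');
            INSERT INTO sizes VALUES ('S'), ('M'), ('L');
            """
        )
        # Every row of colors is paired with every row of sizes
        query = (
            "SELECT c.color, s.size "
            "FROM colors c CROSS JOIN sizes s "
            "ORDER BY c.color, s.size"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in cross_join_rows():
        print(row)
```

Two rows joined against three rows gives six rows back, which is exactly why this gets expensive on large tables.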

However, I have seen that some designers don't like to override SQL statements; in such cases you'd have to implement it inside the mapping itself. Here's a workaround for achieving that:

  1. Read both the sources using their own source qualifiers, normally.
  2. For both of them, put in an Expression transformation after the source qualifier.
  3. In both the Expression transformations, create an output port with a constant value. For example, call it dummy1 for stream 1 and assign it the value -1. Similarly, create a port in the second pipeline, call it dummy2, and assign it the same value, -1.
  4. Now create a Joiner transformation. Link the ports [including the one that we created with a constant value] to the joiner from both the expressions.
  5. In the join condition, choose to compare the dummy columns (dummy1 = dummy2). Since both are always -1, every row matches every row.
  6. The rest of the joiner configuration is like any other joiner. Nothing specific.
You might want to keep the smaller source as the master in the joiner, since it would save on the caching.
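
The steps above can be sketched outside Informatica too. This is only a small Python model of the idea, not how the Joiner is actually implemented: add_dummy plays the role of the Expression transformation, and joiner is a plain equi-join that caches the master rows by key. Both function names and the sample data are made up for this example.

```python
def add_dummy(rows, port_name, value=-1):
    """Expression step: add a constant-valued port to every row."""
    return [{**row, port_name: value} for row in rows]


def joiner(master, detail, master_key, detail_key):
    """Equi-join: cache the master rows by key, then stream the detail rows."""
    cache = {}
    for m in master:
        cache.setdefault(m[master_key], []).append(m)
    joined = []
    for d in detail:
        # With a constant key, every detail row finds all the master rows
        for m in cache.get(d[detail_key], []):
            joined.append({**m, **d})
    return joined


if __name__ == "__main__":
    colors = [{"color": "red"}, {"color": "blue"}]       # smaller source: master
    sizes = [{"size": "S"}, {"size": "M"}, {"size": "L"}]  # larger source: detail
    result = joiner(
        add_dummy(colors, "dummy1"), add_dummy(sizes, "dummy2"), "dummy1", "dummy2"
    )
    for row in result:
        print(row["color"], row["size"])
```

Only the master side is held in the cache here, which is the same reason for picking the smaller source as the master.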

Before implementing the above solution, be sure to go back and check whether your application actually requires a Cartesian product!