Showing posts with the label Pentaho

Pentaho Data Integration - Multi-part Form submission with file upload using the User Defined Java Class Step

I recently needed to use Pentaho Data Integration (PDI) to send a file to a server for processing using HTTP POST. I spent several hours trying the existing HTTP Post, HTTP Client and REST Client steps, but I couldn't get any of them to work. After some more research I came across the issue PDI-10120 - Support for Multi-part Form Submittal In Web Service Steps and thought I was out of luck. I had previously written a small Java client for a similar use case and remembered that PDI has a step called User Defined Java Class (UDJC). After reading this great tutorial I created the following basic transformation. I have a dataset with the URL and the full file path and use the UDJC to make the HTTP call. [Figure: HTTP Post using User Defined Java Class] The Java class handles the actual HTTP POST. It uses two input variables: the URL (url), which is used for the call, and the file name (longFileName). The HTTP call then contains the file (line 30) and the file name (line 31). I included some...
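The class from the post is not reproduced in this excerpt. As a rough illustration of what a multi-part form submission with a file and a file name involves, here is a minimal standalone Java sketch. It is not the author's UDJC code: the part names `fileName` and `file`, the class name and the helper methods are assumptions for illustration only, and the code would likely need adapting before it runs inside a UDJC step.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MultipartSketch {
    private static final String CRLF = "\r\n";

    // Builds a multipart/form-data body with one text part and one file part.
    // The part names "fileName" and "file" are assumptions, not taken from the post.
    public static byte[] buildBody(String boundary, String longFileName, byte[] fileContent) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Text part carrying the file name.
        write(out, "--" + boundary + CRLF);
        write(out, "Content-Disposition: form-data; name=\"fileName\"" + CRLF + CRLF);
        write(out, longFileName + CRLF);
        // File part carrying the raw bytes of the file.
        write(out, "--" + boundary + CRLF);
        write(out, "Content-Disposition: form-data; name=\"file\"; filename=\""
                + longFileName + "\"" + CRLF);
        write(out, "Content-Type: application/octet-stream" + CRLF + CRLF);
        out.write(fileContent, 0, fileContent.length);
        write(out, CRLF);
        // Closing delimiter that ends the multipart body.
        write(out, "--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    // Sends the body to the given URL and returns the HTTP status code.
    public static int post(String url, String boundary, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body);
        }
        return conn.getResponseCode();
    }
}
```

The boundary string must not occur inside the file content; a long random token is a common choice.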

Using the Community Build Framework for Pentaho

Recently I had to prepare an installation of the Pentaho BI Server (CE) and decided to try the Community Build Framework (CBF) from Pedro Alves. I had to install the server on a test and a production environment, so it seemed to fit my requirements perfectly. It is working fine now, and the clean structure helps a lot when applying changes to the installation, but it took me quite a few hours until I had it working (probably because I'm not an expert when it comes to using ant & Co.). Here are some issues you should be aware of: You'll need Java 1.6. Make sure the paths to ant and java, and especially to the project folder, don't contain any spaces. Spaces will only cause problems. Tomcat 6 is not supported yet. I recommend setting the solution paths to the folder "C:/...../project-client/solution" until you have figured out how CBF works in detail. You will have your CBF ready to run a lot faster than I did if you keep these issues in mind. I'm sure I...

Delta Generation with Kettle

In one of my current projects I have to do a lot of delta generation to figure out whether any data has changed, and to be able to treat the data differently depending on whether it is unchanged, new, changed, or deleted. I came up with the following transformation:
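The transformation itself is not reproduced in this excerpt. As a rough illustration of the four-way classification described above, here is a minimal Java sketch that compares an old and a new snapshot keyed by an ID. It is not the Kettle transformation from the post: the class name, the method, the flag strings and the use of a single string value per key are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class DeltaSketch {
    // Compares two snapshots (key -> row value) and returns key -> flag,
    // where the flag is "identical", "new", "changed" or "deleted".
    public static Map<String, String> classify(Map<String, String> oldRows,
                                               Map<String, String> newRows) {
        Map<String, String> flags = new TreeMap<>();
        // Keys in the new snapshot are new, identical or changed.
        for (Map.Entry<String, String> entry : newRows.entrySet()) {
            String key = entry.getKey();
            if (!oldRows.containsKey(key)) {
                flags.put(key, "new");
            } else if (Objects.equals(oldRows.get(key), entry.getValue())) {
                flags.put(key, "identical");
            } else {
                flags.put(key, "changed");
            }
        }
        // Keys that exist only in the old snapshot were deleted.
        for (String key : oldRows.keySet()) {
            if (!newRows.containsKey(key)) {
                flags.put(key, "deleted");
            }
        }
        return flags;
    }
}
```

Downstream logic can then branch on the flag, for example inserting "new" rows, updating "changed" rows and ignoring "identical" ones.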

Mondrian Schema design with Metadata Editor and Power*Architect

After not working with Mondrian for a few months, I had to design a Mondrian schema again. As usual, I first used the Schema Workbench. You can do everything with it, but I think the user interface is a pain, especially if you are used to the other open source BI tools that are available for other tasks. I remembered that at least version 2 of the Pentaho Metadata Editor has a hidden Schema Editor; you have to press CTRL-ALT-O to activate it. If you check out the newest version (Version 3 RC1), you will be surprised how much you can already do with it! After building your relational model in the Metadata Editor you can quickly design your cube and reuse your relational model, which can save you a lot of time. But remember, it's not supported by Pentaho yet and you have to copy and paste your schema to test it. I'm sure Pentaho will have a solution for it in the near future. Another open source tool I found (I started working for the company recently) is Power*Architect from SQLPo...

What is my thesis all about

Last weekend I discussed in a forum what my thesis is about. I spent some time on the answer, so why not post it on my blog to give people a better idea of what I'm working on: When users first see Palo and Pentaho, they don't really know that there is such a big difference, since both claim to be a BI suite. It was the same for me when I first started my thesis. In my thesis I'm working with Palo OLAP Server, Palo Worksheet Server and also Palo ETL Server, compared to the Pentaho BI Suite (without Weka); otherwise it wouldn't be a good comparison. I created a test scenario and try to provide a solution with both Palo and Pentaho. The focus of my thesis is not to say that Palo is good and Pentaho isn't, or the other way around. The result might be "if you focus on planning, you better use Palo, if you do lots of reporting, Pentaho provides more options,... The scenario is more the practical side of an implementation: How to get data into the system (ETL), how can you de...

Pentaho BI Server: Using action sequences as a web service with PHP

For my master's thesis I had to figure out how to use action sequences as a web service with PHP. According to the documentation you can receive SOAP messages, but the action sequences don't offer a WSDL that would help you build your client. I also had problems with the HTTP basic authentication that Pentaho uses. After a couple of hours of research and trial and error, I found a solution. I doubt that's the best way to go, but at least it works. All you need is the PEAR HTTP Request class. Here is the code: //PEAR Request require_once 'Request.php'; $response = $req->sendRequest(); if (PEAR::isError($response)) { echo $response->getMessage(); } else { $req->clearPostData(); $req->setURL("localhost:8080/pentaho/ServiceAction"); $req->addQueryString("solution", "bi-developers"); $req->addQueryString("path", "reporting"); $req->addQueryString("action", "Testreport.xact...
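The PHP listing is cut off in this excerpt. To illustrate the two ingredients the post mentions, a ServiceAction URL with `solution`, `path` and `action` query parameters and an HTTP basic authentication header, here is a minimal Java sketch. It is not a port of the author's code: the class and method names are hypothetical, and the action name passed in by the caller is left open because the original value is truncated.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ServiceActionSketch {
    // Builds the request URL from a base URL and the three query parameters
    // named in the post (solution, path, action).
    public static String buildUrl(String base, String solution, String path, String action) {
        return base
                + "?solution=" + encode(solution)
                + "&path=" + encode(path)
                + "&action=" + encode(action);
    }

    // URL-encodes a single query parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Builds the value of the Authorization header for HTTP basic authentication:
    // "Basic " followed by base64("user:password").
    public static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }
}
```

A client would set the returned header value as the `Authorization` request header before sending the request to the built URL.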