Pentaho Data Integration - Multi-part Form submission with file upload using the User Defined Java Class Step
I recently needed to use Pentaho Data Integration (PDI) to send a file to a server for processing using HTTP Post. I spent several hours trying the existing HTTP Post, HTTP Client and REST Client steps, but I couldn't get any of them to work. After some more research I came across the issue PDI-10120 - Support for Multi-part Form Submittal In Web Service Steps, and I thought I was out of luck.
I had previously written a small Java client for a similar use case and remembered that PDI has a step called User Defined Java Class (UDJC). After reading this great tutorial I created the following basic transformation. I have a dataset with the URL and the full file path, and I use the UDJC step to make the HTTP call.
HTTP Post using User Defined Java Class
The Java class handles the actual HTTP Post. It reads two input fields: the URL (url) to call and the full path of the file (longFileName). The request body contains the file itself (added with addBinaryBody) and the file name (added with addTextBody). I included some basic error handling based on the HTTP status code: any status other than 200 sends the row to the step's error handling, and an exception during the call sets http_status to -1.

```java
import java.io.File;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.*;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Object[] r = getRow();

    // No more input rows: tell PDI this step is finished
    if (r == null) {
        setOutputDone();
        return false;
    }

    r = createOutputRow(r, data.outputRowMeta.size());

    // Get the target URL from the input row
    String urlString = get(Fields.In, "url").getString(r);

    // Get the full path of the file to upload
    String longFileNameString = get(Fields.In, "longFileName").getString(r);

    // Load the file
    File file = new File(longFileNameString);

    // Build the multi-part request: the file itself plus its name as a text field
    HttpPost post = new HttpPost(urlString);
    MultipartEntityBuilder entity = MultipartEntityBuilder.create();
    entity.setMode(HttpMultipartMode.BROWSER_COMPATIBLE);
    entity.addBinaryBody("file", file);
    entity.addTextBody("fileName", file.getName());
    post.setEntity(entity.build());

    CloseableHttpClient httpClient = HttpClients.createDefault();

    try {
        // Make the HTTP call
        HttpResponse response = httpClient.execute(post);
        int statusCode = response.getStatusLine().getStatusCode();

        // Discard the response body so the connection is released
        EntityUtils.consumeQuietly(response.getEntity());

        // Check if the response code is as expected
        if (statusCode != 200) {
            putError(data.outputRowMeta, r, 1, "Error returned", "HTTP Status Code", "HTTP Status: " + statusCode);
            return true;
        }

        // Set value of HTTP Status, integer values need conversion to Long
        get(Fields.Out, "http_status").setValue(r, Long.valueOf(statusCode));
    } catch (Exception e) {
        // Set HTTP Status to -1 since the HTTP Post caused an exception.
        // The row is still passed on below so the -1 is visible downstream.
        get(Fields.Out, "http_status").setValue(r, Long.valueOf(-1));
    } finally {
        // Close the client created for this row
        try { httpClient.close(); } catch (Exception ignored) { }
    }

    // Send the row on to the next step
    putRow(data.outputRowMeta, r);
    return true;
}
```
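To make it clearer what the server receives, here is a rough sketch of a multi-part body built by hand with only the JDK. It is not what the UDJC step runs (that uses Apache HttpClient) and it won't match Apache's output byte for byte. The class name MultipartBodySketch, the buildBody helper, the boundary string and the sample file content are all made up for illustration; only the part names file and fileName come from the code above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: builds a multipart/form-data body by hand to show the wire format.
final class MultipartBodySketch {

    private static final String CRLF = "\r\n";

    // Builds a body with one file part ("file") and one text part ("fileName"),
    // mirroring the two parts added in the UDJC code.
    static byte[] buildBody(String boundary, String fileName, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // File part: part headers, a blank line, then the raw file bytes
        writeText(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: application/octet-stream" + CRLF
                + CRLF);
        out.write(fileBytes, 0, fileBytes.length);
        writeText(out, CRLF);

        // Text part carrying the file name
        writeText(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"fileName\"" + CRLF
                + CRLF
                + fileName + CRLF);

        // Closing delimiter: the boundary followed by two dashes
        writeText(out, "--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    private static void writeText(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        // Hypothetical sample data; no network call is made
        byte[] file = "id,name\n1,test\n".getBytes(StandardCharsets.UTF_8);
        byte[] body = buildBody("example-boundary", "report.csv", file);

        System.out.println("Content-Type: multipart/form-data; boundary=example-boundary");
        System.out.println();
        System.out.print(new String(body, StandardCharsets.UTF_8));
    }
}
```

Running main prints the request header and body to the console. This can be handy for comparing against what a server logs when an upload is rejected.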
To get this example to work, you will have to download the Apache HttpClient libraries and add the JAR files to PDI's lib folder.
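If the step fails to compile or run after copying the JARs, one way to check what the JVM can actually see is to ask where each class was loaded from. The following is a minimal sketch, assuming it is compiled and run as a standalone program with PDI's lib folder on the classpath. The class name ClasspathCheckSketch and the locate helper are hypothetical; the three class names come from the httpclient, httpmime and httpcore libraries.

```java
import java.security.CodeSource;

// Illustrative only: reports which JAR (if any) provides each required class.
final class ClasspathCheckSketch {

    // Returns the location a class was loaded from, or "NOT FOUND" if it cannot be loaded.
    static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK typically have no code source
            return source == null ? "(no code source, likely a JDK class)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.http.HttpEntity",                         // httpcore
            "org.apache.http.client.HttpClient",                  // httpclient
            "org.apache.http.entity.mime.MultipartEntityBuilder"  // httpmime
        };
        for (String name : required) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

A line ending in NOT FOUND suggests the corresponding JAR is missing from the classpath the program was started with. If a class resolves to an unexpected JAR, a different copy of the library may be taking precedence.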
The sample transformation is available for download. I hope it will be helpful for others; use at your own risk.
Comments
httpclient-4.3.3.jar
httpmime-4.3.3.jar
httpcore-4.3.2.jar
The response field is also empty, so I can't figure out what the error is.
ERROR (version 9.0.0.0-423, build 9.0.0.0-423 from 2020-01-31 04.53.04 by buildguy) : java.lang.LinkageError: loader constraint violation: when resolving method "org.apache.http.client.methods.HttpEntityEnclosingRequestBase.setEntity(Lorg/apache/http/HttpEntity;)V" the class loader (instance of org/codehaus/janino/ByteArrayClassLoader) of the current class, Processor, and the class loader (instance of org/apache/catalina/loader/ParallelWebappClassLoader) for the method's defining class, org/apache/http/client/methods/HttpEntityEnclosingRequestBase, have different Class objects for the type org/apache/http/HttpEntity used in the signature
This solved the issue for me