Parquet is a binary file format designed for big data workloads, where data must be read frequently and efficiently. It stores data on disk differently from most other file formats. It is usually described as a column-based format, but in practice it combines row-based and column-based layouts to get the best of both worlds: rows are split into row groups, and within each row group the values are stored column by column. The data is encoded before it is written, which keeps the file small compared to the raw data, and it is then compressed to remove the remaining redundancy.

Other properties worth noting:

  • Reads and queries are much faster than with many other file formats.
  • Nested data is handled efficiently, which is cumbersome to achieve in other file formats.
  • A reader does not need to parse the entire file to find data, because of the way the data is laid out. This makes reads efficient.
  • It works well with data processing frameworks.
  • Schema information is stored automatically with the data.
  • SQL querying is possible with the proper tools.
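
To make the layout idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Parquet's actual on-disk format: the names `ColumnChunk`, `to_row_groups` and `read_column_where` are made up for illustration. It assumes only two things that real Parquet files also do, namely that rows are split into row groups stored column by column, and that each column chunk carries min/max statistics a reader can use to skip data.

```python
from dataclasses import dataclass


@dataclass
class ColumnChunk:
    """Values of one column within one row group, plus simple statistics."""
    values: list
    min_value: object
    max_value: object


def to_row_groups(rows, columns, group_size):
    """Split rows into groups, then store each group column by column."""
    groups = []
    for start in range(0, len(rows), group_size):
        batch = rows[start:start + group_size]
        group = {}
        for name in columns:
            values = [row[name] for row in batch]
            group[name] = ColumnChunk(values, min(values), max(values))
        groups.append(group)
    return groups


def read_column_where(groups, select, filter_column, lower_bound):
    """Return `select` values for rows where filter_column >= lower_bound.

    Row groups whose maximum is below the bound are skipped entirely,
    and columns other than the two involved are never touched.
    """
    selected = []
    groups_scanned = 0
    for group in groups:
        stats = group[filter_column]
        if stats.max_value < lower_bound:
            continue  # the statistics prove no row in this group can match
        groups_scanned += 1
        for wanted, candidate in zip(group[select].values, stats.values):
            if candidate >= lower_bound:
                selected.append(wanted)
    return selected, groups_scanned
```

With six rows split into groups of two, a query such as `read_column_where(groups, "city", "id", 4)` only scans the last group. This is the sense in which a reader does not have to parse the whole file.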

Data formats:

Data formats can be:

  • Unstructured – There is no specific structure, e.g. text, CSV
  • Semi-structured – e.g. XML, JSON
  • Structured – Has records (rows) and a well-defined schema, with very predictable locations where the data can be found, e.g. SQL, Parquet

In this scenario, what I had was a folder repository in a network location to which users commit their changes. I now want to move it to a Bitbucket server so that I can use a proper process to get changes merged. The challenge is to retain the history of changes while porting. Let’s see how this can be achieved. For this, what we need is ...


It seems to be difficult to check the read permission on a directory in Delphi 2009. Here we will see how to check the read permission in an alternative way. Later versions of Delphi offer a more straightforward way of achieving this.
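
The post's own Delphi approach is not shown in this excerpt. As an illustration only, one common alternative when no direct permission API is available is simply to try listing the directory and treat a failure as "not readable". The sketch below shows that idea in Python rather than Delphi; the function name `can_read_directory` is hypothetical.

```python
import os


def can_read_directory(path):
    """Return True if the directory at `path` can be listed by this process."""
    try:
        # Opening the directory for listing is the actual permission test;
        # fetching one entry forces the operating system to do the read.
        with os.scandir(path) as entries:
            next(entries, None)
        return True
    except (PermissionError, FileNotFoundError, NotADirectoryError):
        return False
```

This checks what the current process can do right now, which is usually what the caller cares about, but it does not inspect the access control list itself.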


In this post we will look at a scenario where we execute a process and try to collect its output, but the calling process fails to collect it. If you have landed on this page, you have probably hit the same issue: collecting the output works most of the time but fails intermittently. You may have done everything right by collecting the output asynchronously, and yet some of the output still goes missing. If you are using process.WaitForExit with a timeout, that is the culprit. process.WaitForExit with a timeout is known to cause problems when parallelism is involved, for example when the same process is executed several times in parallel or many different processes run in parallel. The fix is to wait for the process without a timeout. This raises a practical difficulty: without a timeout, we may end up waiting forever for the process to exit. The way around this is to run the process and wait for it without a timeout, but to enforce the timeout on the thread that is executing it. The example below shows how to implement this.
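
The original example is not part of this excerpt. As a rough analogue of the same pattern, here is a minimal sketch in Python rather than .NET: the wait on the process itself has no timeout, and the time limit is applied when joining the worker thread. The function name `run_with_thread_timeout` and its return shape are assumptions made for this illustration.

```python
import subprocess
import threading


def run_with_thread_timeout(cmd, timeout_seconds):
    """Run `cmd`, collect its output, and enforce the timeout on the thread.

    Returns (returncode, output, timed_out).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    result = {}

    def wait_and_collect():
        # No timeout here: communicate() reads until the process closes
        # its output and exits, so nothing is cut off early.
        output, _ = proc.communicate()
        result["output"] = output

    worker = threading.Thread(target=wait_and_collect, daemon=True)
    worker.start()

    # The timeout lives here, on the thread, not on the process wait.
    worker.join(timeout_seconds)
    timed_out = worker.is_alive()
    if timed_out:
        proc.kill()    # stop the runaway process
        worker.join()  # communicate() returns once the process is gone

    return proc.returncode, result.get("output", ""), timed_out
```

A call such as `run_with_thread_timeout([sys.executable, "-c", "print('hello')"], 10)` (with `sys` imported) returns the exit code, the collected text and a flag saying whether the limit was hit. One caveat of this sketch: if the killed process has children that keep the output pipe open, the final join can still block.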


How to compile Delphi without a license on a CI build server

In this post I will walk you through the steps required to build Delphi on build servers without needing a costly CodeGear license on every build server. A typical reason why Delphi is still in use is the cost involved in converting an existing code base to a newer technology. That is not the topic here; instead we will look at how such a code base can be maintained and integrated into an existing Continuous Integration (CI) system. CI systems today run on clusters, and when it comes to builds we must keep the cost under control, either by opting for open-source versions of dependent tools or by using methods that need only a limited license or no license at all. Here we will see how to build Delphi on the build server with no license requirement for CodeGear. The environment I am using is Bamboo and PowerShell.

How to build without a license
