5 November 2018 (updated 6 November 2018) Pradip Khakurel

Discovering Kubernetes

Kubernetes is a distributed cluster management application that makes it easy to deploy, manage and scale applications in a public or private cloud. It hides the complexity of the infrastructure by abstracting the cluster into a single gigantic machine with plenty of resources. Core Concepts. Pod: unit of deployment. A pod is a collection of one … Continue reading Discovering Kubernetes

16 January 2018 (updated 21 June 2020) Pradip Khakurel

Understanding Spark

Spark is a cluster computing framework for building parallel and distributed processing on massive amounts of data. It replaces MapReduce to some extent, yet it is not as simple. A better Hadoop. Hadoop MapReduce is effective at processing huge amounts of data. It divides data into many tiny parts, processes them locally in parallel on a cluster, and produces an output. … Continue reading Understanding Spark

30 September 2017 Pradip Khakurel

Using Zookeeper together with a distributed system

Zookeeper is a distributed service that helps coordinate distributed applications. It can be used to manage configuration, synchronization or naming. Let's see how Zookeeper is architected, along with an example in Java. Architecture. Zookeeper's core concept is a hierarchical structure where each node contains data as well as other nodes. Each node is … Continue reading Using Zookeeper together with a distributed system

14 September 2017 Pradip Khakurel

Kafka core concepts

Kafka is a messaging framework for building real-time streaming applications. It lets you build distributed publish-subscribe systems. In this article we will present the core concepts of the framework. APIs. One way to start with Kafka is to understand its APIs. There are four of them: The Producer API allows applications to publish records to a … Continue reading Kafka core concepts

4 June 2017 (updated 21 June 2020) Pradip Khakurel

Things to know when switching from an RDBMS to MongoDB

Before switching from an RDBMS such as Oracle or SQL Server to MongoDB, one should be familiar with some key concepts of the NoSQL database. Translation needed. Before starting, we should translate some traditional concepts of the RDBMS world: A collection is like a table. A document is like a row. A field is … Continue reading Things to know when switching from an RDBMS to MongoDB

30 April 2017 Pradip Khakurel

A word on HDFS (Hadoop Distributed File System)

Last time we gave an introduction to Hadoop MapReduce. We saw that it relies on its file system: the Hadoop Distributed File System (HDFS). Today we are going to take a look at it. What is HDFS? The goal of HDFS is to store large amounts of data in a distributed and fault-tolerant way. A … Continue reading A word on HDFS (Hadoop Distributed File System)

22 April 2017 (updated 26 April 2017) Pradip Khakurel

An introduction to Hadoop MapReduce

Today we are going to talk about a famous Big Data framework called Hadoop MapReduce. In this article, after presenting the framework, we will build a small example using Java and Hadoop MapReduce on Raspbian Linux (yes, I am testing Hadoop on a Raspberry Pi!). What is MapReduce? MapReduce is a framework to process very large … Continue reading An introduction to Hadoop MapReduce

2 April 2017 Pradip Khakurel

A brief look at MongoDB

In this article, we are going to look at a NoSQL database named MongoDB. MongoDB is a document-oriented database that targets high performance and high volume. In this article, I am going to install a mongod server on 64-bit Windows 10 and use the C# client, but other operating systems for the server, as well … Continue reading A brief look at MongoDB

19 March 2017 Pradip Khakurel

Introducing ZeroMQ with C# and Java native port

ZeroMQ is a messaging library written in C++ for building distributed applications. The library can be used when performance and stability both matter. In addition to the original library, there are bindings in languages such as C# or Java that wrap the C++ DLL. Moreover, 100% native ports exist in these same languages. In this post, … Continue reading Introducing ZeroMQ with C# and Java native port

5 March 2017 (updated 15 September 2017) Pradip Khakurel

A quick review of the Concurrency Visualizer in Visual Studio 2015

The Concurrency Visualizer is a free extension available for Visual Studio 2015 that can be used to analyse the performance of a concurrent application. I am going to give a simple overview of this extension. The tested code. This is the code I am going to profile with the extension. It basically creates four non … Continue reading A quick review of the Concurrency Visualizer in Visual Studio 2015