Kubernetes is a distributed cluster management application that makes it easy to deploy, manage and scale applications in a public or private cloud. It hides the complexity of the infrastructure by abstracting the cluster into a single gigantic machine with plenty of resources. Core Concepts Pod: unit of deployment A pod is a collection of one … Continue reading Discovering Kubernetes
Understanding Spark
Spark is a cluster computing framework for parallel and distributed processing of massive amounts of data. To some extent it replaces MapReduce, yet it is not as simple. A better Hadoop Hadoop MapReduce is effective at processing huge amounts of data. It divides data into many tiny parts, processes them locally in parallel on a cluster, and produces an output. … Continue reading Understanding Spark
Using Zookeeper together with a distributed system
Zookeeper is a distributed service that helps coordinate distributed applications. It can be used to manage configuration, synchronization or naming. Let's see how Zookeeper is architected, along with an example in Java. Architecture Zookeeper's core concept is a hierarchical structure where each node contains data as well as other nodes. Each node is … Continue reading Using Zookeeper together with a distributed system
Kafka core concepts
Kafka is a messaging framework for building real-time streaming applications. It lets you build distributed publish-subscribe systems. In this article we will present the core concepts of the framework. APIs One way to start with Kafka is to understand its APIs. There are four of them: The Producer API allows applications to publish records to a … Continue reading Kafka core concepts
Things to know when switching from an RDBMS to MongoDB
Before switching from an RDBMS such as Oracle or SQL Server to MongoDB, one should be familiar with some key concepts of the NoSQL database. Translation needed Before starting, we should translate some traditional concepts of the RDBMS world: A collection is like a table. A document is like a row. A field is … Continue reading Things to know when switching from an RDBMS to MongoDB
A word on HDFS (Hadoop Distributed File System)
Last time we introduced Hadoop MapReduce. We saw that it relies on its file system, the Hadoop Distributed File System (HDFS). Today we are going to take a look at it. What is HDFS? The goal of HDFS is to store large amounts of data in a distributed and fault-tolerant way. A … Continue reading A word on HDFS (Hadoop Distributed File System)
An introduction to Hadoop MapReduce
Today we are going to talk about a famous Big Data framework called Hadoop MapReduce. In this article, after presenting the framework, we will build a small example using Java and Hadoop MapReduce on Raspbian Linux (yes, I am testing Hadoop on a Raspberry Pi!). What is MapReduce? MapReduce is a framework to process very large … Continue reading An introduction to Hadoop MapReduce
A brief look at MongoDB
In this article, we are going to look at a NoSQL database named MongoDB. MongoDB is a document-oriented database that targets high performance and high volume. In this article, I am going to install a mongod server on 64-bit Windows 10 and use the C# client, but other operating systems for the server, as well … Continue reading A brief look at MongoDB
Introducing ZeroMQ with C# and Java native port
ZeroMQ is a messaging library written in C++ for building distributed applications. The library can be used when performance and stability both matter. In addition to the original library, there are bindings in languages such as C# or Java that wrap the C++ DLL. Moreover, 100% native ports exist in these same languages. In this post, … Continue reading Introducing ZeroMQ with C# and Java native port
A quick review of the Concurrency Visualizer in Visual Studio 2015
The Concurrency Visualizer is a free extension for Visual Studio 2015 that can be used to analyse the performance of a concurrent application. I am going to give a simple overview of this extension. The tested code This is the code I am going to profile with the extension. It basically creates four non … Continue reading A quick review of the Concurrency Visualizer in Visual Studio 2015