Knowledge Hub

How to choose your ETL solution | Data Integration

ETL, which stands for Extract, Transform, Load, is a common concept in data engineering. As the name implies, it comprises three types of operations: Extract, which is the process of extracting data ...

Apache Kafka and Apache Spark Integration | Apache Kafka | Apache Spark

Introduction: Apache Kafka is a scalable, high-performance, low-latency platform for reading and writing streams of data, much like a messaging system. Writing Kafka applications in Java is fairly easy; see our previous article on how to design a Kafka pipeline ...

Create a Kafka Pipeline using Java Application | Apache Kafka

Introduction: This article is about programming an Apache Kafka producer and consumer in Java. As we’ll see, Java lets us reproduce what the CLI does and more. Prerequisites: Kafka installation and configuration article (to set up ...

Setup Apache Flink Environment Standalone on Windows | Apache Flink

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. For an introduction to Apache Flink components, please check our previous article. In this article we will learn together how to set up and ...

Introduction to Apache Flink | Apache Flink

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale. Apache Flink ...

Azure Storage Account | Microsoft Azure

Storage Account: A storage account is a container that groups a set of Azure Storage services together. Only data services from Azure Storage can be included in a storage account (Azure Blobs, Azure Files, Azure Queues, and Azure Tables). Storage ...