Knowledge Hub

Containerization – Docker | Concepts | Docker Installation

In this article, we discuss containerization and virtualization concepts. We look at solutions such as Docker, install it, and try out some functionality on the Linux operating system. Before we start talking about Docker, we first have to know the ...

Dimensional Modeling | Part 1: Introduction and Fact Types

Dimensional Modeling Dimensional modeling is one of the data modeling techniques used for designing data warehouses. It is also considered a suitable technique for representing analytic data, because it delivers data to users in an understandable form and is optimized for query performance ...

ETL vs ELT | Differences and Use Cases

1. What is ETL? ETL stands for Extract, Transform, and Load. The ETL process starts by extracting data from one or more sources, then transforms this data to match the data warehouse schema, and finally loads the transformed data into the ...

DNA Sequencing with Machine Learning

Introduction What if a small sample of each baby’s saliva were sent out to a lab, where, for just a few dollars, the baby’s DNA was analyzed and a multitude of “risk scores” returned? These would not be diagnoses but prognostications: ...

Denormalization: When, Why, and How?

What is de-normalization? De-normalization is an optimization technique that makes a database respond faster to queries by reducing the number of joins needed to answer them. In de-normalization, we mainly aim to reduce the number of tables that are ...

Introduction to Apache Airflow – Powerful and Dynamic Orchestrator

What is Apache Airflow? Apache Airflow is a platform that helps you programmatically design, schedule, and monitor big data pipelines. With a rich set of tasks that you can execute and link together, you can design almost any pipeline ...