Containerization – Docker | Concepts | Docker Installation

In this article, we will talk about containerization and virtualization concepts. We will go through solutions like Docker, install it, and write some functionality on the Linux operating system. Before we start talking about Docker, we first have to understand the concepts of containerization and virtualization. We begin with traditional deployment to see how is […]

Dimensional Modeling | Part 1: Introduction and Fact Types

Dimensional modeling is one of the data modeling techniques used for designing data warehouses. It is also considered a suitable technique for representing analytic data, because it delivers data in a form that users can understand and is optimized for query performance, which increases data retrieval speed. Normalized databases are very useful in transactional processing because […]

ETL vs ELT | Differences and Use Cases

1. What is ETL? ETL stands for Extract, Transform, and Load. The ETL process starts by extracting data from one or multiple sources, then transforms this data to match the data warehouse schema, and finally loads the transformed data into the data warehouse. An ETL system should enforce data quality and consistency standards, and ensure that separated data […]
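The three stages described in this excerpt can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the article's own implementation: the inline CSV source, the `sales` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from files, APIs, or databases.
SOURCE_CSV = "id,name,amount\n1, alice ,10.50\n2,BOB,3\n"


def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cast types and normalize names to match the target schema."""
    return [
        (int(row["id"]), row["name"].strip().title(), float(row["amount"]))
        for row in rows
    ]


def load(conn, records):
    """Load: write the transformed records into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(source_text, conn):
    """Run the stages in order: extract, then transform, then load."""
    load(conn, transform(extract(source_text)))


# In-memory SQLite is an assumption used here in place of a real data warehouse.
conn = sqlite3.connect(":memory:")
run_etl(SOURCE_CSV, conn)
rows = conn.execute("SELECT id, customer, amount FROM sales ORDER BY id").fetchall()
```

Keeping each stage in its own function makes the order of operations explicit: the data is reshaped before it reaches the target table, which is the defining trait of ETL.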

Denormalization: When, Why, and How?

What is de-normalization? De-normalization is an optimization technique that makes a database respond faster to queries by reducing the number of joins needed to satisfy user needs. In de-normalization, we mainly aim to reduce the number of tables needed by joining these tables back together and adding redundant data. De-normalization is commonly used with […]
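As a rough illustration of the idea in this excerpt, the sketch below builds two normalized tables, then a de-normalized copy that stores the customer name redundantly so that reads no longer need a join. The table names, column names, and sample rows are hypothetical, and an in-memory SQLite database is assumed purely for demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized design (hypothetical schema): customer names live only in `customers`.
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 5.0);
    """
)

# Reading orders together with customer names requires a join.
normalized = conn.execute(
    "SELECT o.id, c.name, o.total "
    "FROM orders o JOIN customers c ON c.id = o.customer_id "
    "ORDER BY o.id"
).fetchall()

# De-normalized design: do the join once and store the name redundantly per order.
conn.execute(
    "CREATE TABLE orders_denorm AS "
    "SELECT o.id AS order_id, c.name AS customer_name, o.total "
    "FROM orders o JOIN customers c ON c.id = o.customer_id"
)

# The same question is now answered from a single table, with no join.
denormalized = conn.execute(
    "SELECT order_id, customer_name, total FROM orders_denorm ORDER BY order_id"
).fetchall()
```

The trade-off is visible in the data itself: the name 'Alice' is now stored once per order, so a rename must update every matching row in `orders_denorm`, while reads become simpler and avoid the join.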

Introduction to Apache Airflow – Powerful and Dynamic Orchestrator

What is Apache Airflow? Apache Airflow is a platform that helps you programmatically design, schedule, and monitor big data pipelines. With a rich set of tasks that you can execute and link together, you can design almost any pipeline, no matter how complicated it is. In this article, we discover what are […]

Normalization in Depth

Designing and understanding a data model is all about understanding the concepts and the options available for your use case, and which use case each design option suits best. In this article, we will go through the normalization types and understand how to implement each option and pros and cons […]

Data Lake Concept and Solutions on GCP using Cloud Storage | GCP Cloud Storage

Introduction to Data Lakes Let’s start with a discussion about what data lakes are, and then where they fit in as a critical component of your overall data engineering ecosystem. So what is a data lake? Well, it’s a fairly broad term, but it generally describes a place where you can securely store various types […]

NoSQL Database Services | Cloud Datastore, Cloud Firestore, and Cloud Bigtable

Introduction The relational database (RDBMS) model completely dominated database technology for over 20 years. Today this “one size fits all” stability has been disrupted by a relatively recent explosion of new database technologies. These paradigm-busting technologies are powering the “Big Data” and “NoSQL” revolutions, as well as forcing fundamental changes in databases across the board. […]

Building a data warehouse solution using BigQuery | GCP BigQuery

An enterprise data warehouse brings data together and makes it available for querying and data processing. It should consolidate data from many sources. All data in a data warehouse should be available for querying, and it’s important to ensure that those queries are quick. Another reason to consolidate all of your data besides standardizing […]

Introduction to Impala: Architecture and Components | Impala

Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Cloudera Impala query UI in Hue) as Apache Hive. This provides a familiar […]

Docker Commands | Dockers

In our last blog, we talked about Docker architecture, how to install Docker, and the main differences between containerization and virtualization. Here, we are going to dive in and see how to use Docker in action. Let’s recap what we need here, which is the difference between a Docker image and a Docker container, […]