• Courses
  • Knowledge Hub
  • Cheat Sheets
  • Market Place
  • Plans and Pricing
  • Contact Us
  • Become an Instructor
DataValley
Category
Cloud Computing
Data Engineering
Data Modeling
ETL (Data Integration)
Data Science
Python
Webinars & Events

Containerization – Docker | Concepts | Docker Installation

June 11, 2021 | Sayed Ibrahem | Data Engineering

In this article, we will talk about containerization and virtualization concepts. We will go through solutions like Docker, install it, and try out some of its functionality on a Linux operating system. Before we start talking about Docker, we first have to understand the concepts of containerization and virtualization, so we will start with traditional deployment to see how is […]

Dimensional Modeling | Part 1: Introduction and Fact Types

April 12, 2021 | Seifalden Hany | Data Engineering, Data Modeling

Dimensional Modeling: Dimensional modeling is one of the data modeling techniques used for designing data warehouses. It is also considered a suitable technique for representing analytic data, because it delivers data in a form users can understand and is optimized for query performance, which increases data retrieval speed. Normalized databases are very useful in transactional processing because […]

ETL vs ELT | Differences and Use Cases

April 5, 2021 | Seifalden Hany | Data Engineering, Data Integration

1. What is ETL? ETL stands for Extract, Transform, and Load. The ETL process starts by extracting data from one or multiple sources, then transforms this data to match the data warehouse schema, and finally loads the transformed data into the data warehouse. An ETL system should enforce data quality and consistency standards, and ensure that separated data […]

Denormalization: When, Why, and How?

March 25, 2021 | Seifalden Hany | Data Engineering, Data Modeling

What is de-normalization? De-normalization is an optimization technique that makes our database respond faster to queries by reducing the number of joins needed to satisfy user needs. In de-normalization, we mainly aim to reduce the number of tables needed by joining these tables back together and adding redundant data. De-normalization is commonly used with […]

Introduction to Apache Airflow – Powerful and Dynamic Orchestrator

March 20, 2021 | Ahmed Ibrahem | Data Engineering

What is Apache Airflow? Apache Airflow is a platform that helps you programmatically design, schedule, and monitor big data pipelines. With a rich set of tasks that you can execute and link together, you can design almost any pipeline, no matter how complicated it is. In this article, we discover what are […]

Normalization in Depth

March 17, 2021 | Seifalden Hany | Data Engineering, Data Modeling

Designing and understanding a data model is all about understanding the concepts and options available for your use case, and which use case each design option suits best. In this article, we will go through the normalization types and understand how to implement each option and pros and cons […]

Data Lake Concept and Solutions on GCP using Cloud Storage | GCP Cloud Storage.

October 7, 2020 | aliaa.amr | Cloud Storage, Data Engineering, Data Lake, Google Cloud Platform

Introduction to Data Lakes: Let’s start with a discussion about what data lakes are, and then where they fit in as a critical component of your overall data engineering ecosystem. So what is a data lake? Well, it’s a fairly broad term, but it generally describes a place where you can securely store various types […]

NoSQL Database Services | Cloud Datastore, Cloud Firestore, and Cloud Bigtable

September 28, 2020 | aliaa.amr | Bigtable, Data Engineering, Databases, Datastore, Firestore, Google Cloud Platform, NoSQL

Introduction: The relational database (RDBMS) model completely dominated database technology for over 20 years. Today this “one size fits all” stability has been disrupted by a relatively recent explosion of new database technologies. These paradigm-busting technologies are powering the “Big Data” and “NoSQL” revolutions, as well as forcing fundamental changes in databases across the board. […]

Building a data warehouse solution using BigQuery | GCP BigQuery

September 20, 2020 | aliaa.amr | Big Query, Data Engineering, Data Warehouse, Google Cloud Platform

An enterprise data warehouse brings data together and makes it available for querying and data processing, so it should consolidate data from many sources. All data in a data warehouse should be available for querying, and it’s important to ensure that those queries are quick. Another reason to consolidate all of your data besides standardizing […]

Introduction to Impala .. Architecture and Components | Impala

September 10, 2020 | mtarek | Big Data, Data Engineering, Databases

Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Cloudera Impala query UI in Hue) as Apache Hive. This provides a familiar […]

Dimensional Modeling … Design Methodology for Analytics Oriented Data Warehouse | Data Warehouse

August 30, 2020 | radwa.ali | Data Engineering, Data Modeling, Data Warehouse

Data warehouses have been around since the 1980s. Throughout these years, they have proven their ability to support decision making and business analysis. Data warehouses allow integrating many source systems such as databases, spreadsheets, and flat files. Cleansing and transformation can be applied to this data after integration, then it is organized in a way that […]

Docker Commands | Dockers

August 22, 2020 | mahmoud.feteha | Containerization, Data Engineering, Docker

In our last blog, we talked about Docker architecture, how to install Docker, and the main differences between containerization and virtualization. Here, we are going to dive in and see how to use Docker in action. Let’s recap what we need here, which is the difference between a Docker image and a Docker container, […]

Your Guide to NoSQL Databases | Data Engineering

August 19, 2020 | Ahmed Ibrahem | Concepts and Technologies, Data Engineering, NoSQL

One of the major reasons that the era of big data started was the increase in the number of data sources and the variety of data types that each organization has. Nowadays, almost any organization has different types of data: not only structured data but also unstructured or semi-structured data, and each type […]

Getting Started with Containers & Dockers | Dockers

August 17, 2020 | mahmoud.feteha | Containerization, Data Engineering, Docker

Introduction: Containerization revolutionized software development and has become a common building block in today’s architecture. Applications, big data environments, and data engineering applications can be deployed and developed inside containers. In this article, we will learn more about containers and their advantages, and we will discuss Docker, which is a container image that packages all […]

Aggregation Queries in Apache Hive | Apache Hive

August 13, 2020 | mtarek | Apache Hive, Data Engineering

Introduction: Data aggregation is the process of gathering and expressing data in a summary to get more information about particular groups based on specific conditions. HiveQL offers several built-in aggregate functions, such as max, min, avg, etc. It also supports advanced aggregation using functions such as variance and standard deviation, and different types of window functions. […]


About

DataValley is the e-learning platform for everything data science. From beginners to gurus, data geeks of all levels can find something at DataValley to help them enhance their skills.

Contact

DataValley Technologies.

Copyright © 2021 DataValley Technologies.