Big Data Workshop
Big Data - Hadoop, Kafka and Spark
In our workshop “Big Data – Hadoop, Kafka and Spark”, administrators and big data developers who are new to the field receive an introduction to Hadoop, Kafka and Spark, the most popular big data frameworks for fast analysis of large data sets.
- Sound understanding of Hadoop, Kafka and Spark
- Hands-on experience for real big data projects
- Ability to process large amounts of data efficiently
Our big data training series
Big data made easy: getting started with Hadoop, Kafka and Spark
Hadoop is the de facto standard in the big data field and forms the basis for the efficient storage and analysis of large volumes of multi-structured data from a wide variety of sources. The Hadoop ecosystem makes it possible to process this data quickly and at scale.
This workshop offers you a structured introduction to the technologies Hadoop, Kafka and Spark and makes it easier for you to enter the world of big data. You will gain in-depth background knowledge and practical tools that will enable you to contribute directly to a big data project. Through hands-on exercises, you will learn about the most important components of the big data ecosystem and how to use them to successfully master data-driven challenges.
Learn the basics and practical applications of modern big data technologies
This workshop gives you a comprehensive overview of the most important big data technologies such as Hadoop, Spark and Kafka. In addition to the theoretical basics, the focus is on practical implementation, so that you are well prepared for the challenges of data-driven projects:
- Basics: introduction to big data, benefits of big data solutions, overview of current big data technologies
- Big data architectures: data integration, data storage, data access and processing, lambda and kappa architectures
- The Hadoop Distributed File System (HDFS): basics, command line interface and REST API, Java API (see the HDFS sketch below), deployment
- Hadoop configuration, file formats (Parquet and others), NoSQL databases, in-memory databases, column-oriented databases
- Cluster resource management: basics, YARN, command line interface, Java API, analysis of log files
- Execution options and engines: horizontal and vertical scaling, MapReduce (see the WordCount sketch below), Tez
- Apache Spark: architecture, command line interface, horizontal and vertical scaling, log file analysis (see the Spark sketch below)
- Apache Hive: analysis of structured data using Hive SQL (see the JDBC sketch below), integration of business analytics solutions, optimizations using Apache Hive LLAP
- Overview of Apache Kafka (see the producer sketch below), Apache Storm, Apache NiFi and other tools
- Best practices and open discussion
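To give a first impression of the HDFS Java API mentioned above, here is a minimal sketch that writes a small file to HDFS and reads back its size. It assumes a Hadoop client with the cluster's core-site.xml and hdfs-site.xml on the classpath; the target path is a placeholder, not part of the workshop material.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.nio.charset.StandardCharsets;

public class HdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath;
        // without them, this falls back to the local file system.
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            Path path = new Path("/tmp/hello.txt"); // placeholder path
            try (FSDataOutputStream out = fs.create(path, true)) { // true = overwrite
                out.write("Hello, HDFS!".getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("Wrote " + fs.getFileStatus(path).getLen() + " bytes to " + path);
        }
    }
}
```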
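For the MapReduce module, the classic WordCount job illustrates the map and reduce phases. This follows the well-known example from the Hadoop documentation and expects input and output paths as command line arguments.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.util.StringTokenizer;

public class WordCount {
    // Map phase: emit (word, 1) for every token in a line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts per word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```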
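A Spark log file analysis could, in its simplest form, look like the following sketch. The local[*] master (all cores of one machine, i.e. vertical scaling; a cluster master URL would scale horizontally) and the log path are assumptions for illustration, not the workshop's actual exercise.

```java
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.SparkSession;

public class LogAnalysisSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LogAnalysis")
                .master("local[*]") // assumed: run locally on all cores
                .getOrCreate();

        // Read a plain-text log file line by line (the path is a placeholder).
        Dataset<String> lines = spark.read().textFile("/data/app.log");

        // Count the lines that contain the marker "ERROR".
        long errors = lines
                .filter((FilterFunction<String>) line -> line.contains("ERROR"))
                .count();
        System.out.println("ERROR lines: " + errors);

        spark.stop();
    }
}
```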
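For the Hive module, structured data can be queried from Java via the HiveServer2 JDBC driver, as in the sketch below. The connection URL, the credentials and the access_logs table are assumptions that depend on the cluster setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQuerySketch {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC URL; host, port, database and credentials are assumed.
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     // Aggregate over a hypothetical access_logs table.
                     "SELECT status, COUNT(*) AS cnt FROM access_logs GROUP BY status")) {
            while (rs.next()) {
                System.out.println(rs.getString("status") + ": " + rs.getLong("cnt"));
            }
        }
    }
}
```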
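And as a taste of the Kafka overview, a minimal producer sketch; the broker address and the topic name "events" are assumptions for illustration.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class KafkaProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // try-with-resources flushes and closes the producer on exit.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "key-1", "hello kafka")); // assumed topic
        }
    }
}
```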
Methodology:
The fundamentals are taught in seminar form and consolidated through practical exercises. Participants administer a big data system, analyze and optimize its runtime behavior, and build their own small applications. We are happy to use sample or demo data from your company for these exercises.
Location:
This training course takes place at our Münster location in a small group with a maximum of three participants. Alternatively, we offer this training course as in-house training.
Workshop
Big Data - Hadoop, Kafka, Spark
- Duration: 2 days
- Prerequisite: basic knowledge of IT systems
- On-site / online
- Dates available at short notice
- Practice-oriented and tailored to you
Target group: administrators and big data developers who are new to big data projects
We design the content individually according to your requirements and respond flexibly to your questions, including by working with your own sample data. We combine the topic modules into a training course that is perfectly tailored to your needs.
Contact us
How can we provide support?
Would you like to receive further information or are you interested in an individual consultation? Simply let us know in a short message how we can help you and we will get back to you as soon as possible.
Let us begin.