[[https://www.tutorialspoint.com/apache_kafka/apache_kafka_introduction.htm| Apache Kafka - Introduction]]

Big Data applications work with enormous volumes of data, which raises two main challenges: the first is how to collect a large volume of data, and the second is how to analyze the collected data. To overcome these challenges, you need a messaging system. Kafka is designed for distributed, high-throughput systems and tends to work very well as a replacement for a more traditional message broker. In comparison to other messaging systems, Kafka has better throughput, built-in partitioning, replication, and inherent fault tolerance, which makes it a good fit for large-scale message processing applications.

What is a Messaging System?

A messaging system is responsible for transferring data from one application to another, so the applications can focus on the data rather than on how to share it. Distributed messaging is based on the concept of reliable message queuing: messages are queued asynchronously between client applications and the messaging system. Two messaging patterns are available: point-to-point and publish-subscribe (pub-sub). In point-to-point messaging, each message is consumed by exactly one receiver; in pub-sub messaging, every subscriber to a topic receives each message. Most messaging systems follow the pub-sub pattern.
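
The difference between the two messaging patterns can be illustrated with a minimal in-memory sketch. This is not Kafka code: the class names `PointToPointQueue` and `PubSubTopic` and their methods are hypothetical, chosen only for illustration, and the sketch ignores networking, persistence, and concurrency.

```python
from collections import deque


class PointToPointQueue:
    """Hypothetical point-to-point queue: each message goes to one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # Sending never waits for a receiver (asynchronous queuing).
        self._messages.append(message)

    def receive(self):
        # Remove and return the oldest message, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Hypothetical pub-sub topic: every subscriber gets its own copy."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        # Each subscriber has a separate inbox, so reads are independent.
        self._inboxes[name] = deque()

    def publish(self, message):
        # Deliver one copy of the message to every current subscriber.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def poll(self, name):
        # Return the subscriber's next unread message, or None if none is left.
        inbox = self._inboxes[name]
        return inbox.popleft() if inbox else None
```

With the queue, a message that one receiver has taken is gone for everyone else. With the topic, a subscriber named `billing` and a subscriber named `shipping` would each see the same published message once.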
[[https://www.redhat.com/architect/apache-kafka-java| Using Apache Kafka with Java: What architects need to know ]]
[[restricted-area-for-courses:databases:no-sql:mongo_db|Mongo DB]]
[[https://kafka.apache.org/intro|Introduction Everything you need to know about Kafka in 10 minutes]]
[[https://cwiki.apache.org/confluence/display/KAFKA/Clients|KAFKA Clients]]
[[https://www.youtube.com/watch?v=FKgi3n-FyNU|What is Apache Kafka®?]]
[[https://www.youtube.com/watch?v=xa0Yia1jdu8|Asynchronous Processing with Go using Kafka and MongoDB]]
[[https://www.youtube.com/watch?v=tcaPzIXwj8A|Golang Live I How to develop Kafka based Go services in 2020 with Dino Omanovic]]

[[https://kafka.apache.org/| Apache Kafka ]]

More than 80% of all Fortune 100 companies trust and use Kafka.

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

Core Capabilities

High Throughput: Deliver messages at network-limited throughput using a cluster of machines, with latencies as low as 2 ms.

Scalable: Scale production clusters up to a thousand brokers, trillions of messages per day, petabytes of data, and hundreds of thousands of partitions. Elastically expand and contract storage and processing.

Permanent Storage: Store streams of data safely in a distributed, durable, fault-tolerant cluster.

High Availability: Stretch clusters efficiently over availability zones or connect separate clusters across geographic regions.

Ecosystem

Built-in Stream Processing: Process streams of events with joins, aggregations, filters, transformations, and more, using event-time and exactly-once processing.

Connect To Almost Anything: Kafka’s out-of-the-box Connect interface integrates with hundreds of event sources and event sinks, including Postgres, JMS, Elasticsearch, AWS S3, and more.
Client Libraries: Read, write, and process streams of events in a vast array of programming languages.

Large Ecosystem of Open Source Tools: Leverage a vast array of community-driven tooling.

Trust & Ease Of Use

Mission Critical: Support mission-critical use cases with guaranteed ordering, zero message loss, and efficient exactly-once processing.

Trusted By Thousands of Orgs: Thousands of organizations use Kafka, from internet giants to car manufacturers to stock exchanges. More than 5 million unique lifetime downloads.

Vast User Community: Kafka is one of the five most active projects of the Apache Software Foundation, with hundreds of meetups around the world.

Rich Online Resources: Rich documentation, online training, guided tutorials, videos, sample projects, Stack Overflow, etc.
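
The "partitions" and "guaranteed ordering" points above are related: Kafka's ordering guarantee applies within a single partition, and records that share a key are normally routed to the same partition. The following is a rough in-memory sketch of that idea, not Kafka's implementation. The class `PartitionedTopic` and its methods are hypothetical, and `zlib.crc32` is used here only as a stand-in hash (Kafka's default partitioner uses a murmur2 hash of the key).

```python
import zlib


class PartitionedTopic:
    """Hypothetical topic split into append-only partitions."""

    def __init__(self, num_partitions):
        # Each partition is an ordered, append-only list of (key, value) records.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: crc32 stands in for the real partitioner's hash function.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        # The same key always maps to the same partition, so its records
        # are stored in the order they were appended.
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def read(self, partition, offset):
        # A consumer reads from a chosen offset onward; tracking the offset
        # lets it resume where it left off.
        return self.partitions[partition][offset:]
```

Appending `"login"` and then `"click"` under the key `"user-42"` places both records in one partition at consecutive offsets, so a consumer of that partition reads them in the original order. Ordering across different partitions is not guaranteed by this scheme.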