The rise of dashcams in recent years has been nothing short of phenomenal, especially in industries that rely heavily on transportation and logistics. With the ability to capture invaluable data in real-time, these cameras have become a game-changer for safety analysis, operational efficiency, and much more.
However, with great power comes great responsibility. The amount of data generated by dashcams can quickly become overwhelming, making it crucial to have an efficient and scalable data routing and streaming system in place.
In this blog post, we will take a deep dive into the world of event routing and discover how RabbitMQ can work in tandem with Kafka to deliver an unparalleled experience in data streaming and routing for dashcams. We'll explore the challenges, the proposed solution, and the benefits of this high-performance infrastructure.
About video telematics and dashcams
Video telematics and dashcams are innovative technologies that have revolutionized fleet management and road safety.
Dashcams, or dashboard cameras, are compact video cameras typically mounted on the dashboard or windshield of a vehicle. They continuously record the road ahead, capturing both audio and video footage of the vehicle’s surroundings.
Video telematics involves the integration of video recording and telematics systems to provide valuable insights into vehicle and driver behavior. It combines GPS tracking, accelerometer data, and real-time video footage to provide a comprehensive view of fleet operations.
Organizations seeking to stream data captured by numerous dashcams worldwide to their existing Amazon Web Services (AWS) infrastructure encounter challenges in establishing an efficient routing mechanism.
The problem at hand involves two primary aspects. Firstly, there is a need to design and implement a reliable and scalable routing solution capable of handling the incoming data streams from a large number of globally distributed dashcams. This solution must ensure seamless and uninterrupted data transmission, taking into account potential latency and network limitations across different geographical locations.
Secondly, the routing mechanism must seamlessly integrate with the organization’s existing AWS infrastructure. Key considerations include data ingestion protocols, network connectivity options, security measures, scalability requirements, and fault tolerance mechanisms.
Addressing these challenges will enable organizations to effectively stream data from dashcams to their AWS infrastructure, facilitating real-time insights, improved fleet management, and enhanced operational efficiency.
- Data volume: One of the biggest challenges when dealing with thousands of dashcams is managing the sheer volume of data. Capturing, storing, processing, and streaming footage at this scale is a complex task. The AWS infrastructure has to be able to capture millions of events coming from these devices and process them efficiently for fleet monitoring and dashboarding purposes.
- Real-time data streaming: Another challenge is ensuring that the video data is captured and processed in the correct sequence. When dealing with data from thousands of devices, it is essential that the data is processed in real-time and that its order is maintained. This requires a powerful and efficient system that can store and manage the data effectively and ensure that it is processed in the correct order.
- Reliable and secure transmission: Video footage shared from dashcams to the cloud infrastructure is sensitive in nature and data security is paramount.
- High availability and scalability: The data streaming for large fleets of vehicles spread across different geographies has to be resilient and scalable.
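The ordering requirement called out above maps naturally onto Kafka's model: order is guaranteed only within a partition, so a common pattern (an assumption here, not something the architecture prescribes) is to key every event by its device ID so that all events from one dashcam land on the same partition. A minimal Python sketch of the idea, using a stable hash rather than Kafka's actual partitioner:

```python
import hashlib

def partition_for(device_id: str, num_partitions: int) -> int:
    """Map a device ID to a partition deterministically.

    Illustrative stand-in for a keyed Kafka producer: real clients
    hash the record key (the Java client's default uses murmur2),
    but any stable hash gives the property we need -- all events
    from one dashcam land on the same partition, so their relative
    order is preserved.
    """
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All events from the same dashcam map to the same partition.
p1 = partition_for("dashcam-0042", 12)
p2 = partition_for("dashcam-0042", 12)
assert p1 == p2
```

In practice a producer would simply pass the device ID as the record key and let the Kafka client do this hashing; the point is that one dashcam's events never interleave across partitions.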
Use of AWS Managed Apache Kafka
Due to its capability to handle massive volumes of real-time streaming data, Amazon MSK (Managed Streaming for Apache Kafka) is the ideal choice for processing the continuous stream of footage and events generated by the dashcams and consumed by the fleet dashboards.
Amazon MSK is a fully managed service for Apache Kafka, which means that it takes care of the underlying infrastructure and scaling, allowing for more efficient and reliable data streaming. It provides features such as encryption, access control, and data retention policies, which help with data security and compliance.
It integrates with other AWS services, which can be used for data processing, deployment, and monitoring.
However, the dashcams communicate over the MQTT protocol, which MSK does not support. To overcome this hurdle, RabbitMQ, a reliable, scalable, and flexible open-source message broker, can be used. RabbitMQ lets applications communicate by sending and receiving messages over multiple messaging protocols, making it well suited to distributed systems and microservices architectures. Since RabbitMQ cannot talk to MSK directly, an Apache Camel connector can be used to bridge the two.
Requests originating from the dashcams are sent to a Network Load Balancer (NLB) before being directed to RabbitMQ through a single pipeline. A transformation step is then necessary to ensure that the data is correctly categorized and directed to the appropriate topic in MSK. With data streams originating from a variety of dashcams distributed worldwide, there is a vast amount of data that requires efficient sorting and processing; without transformation logic this task would be cumbersome and time-consuming, potentially leading to delayed or incorrect processing. By applying the transformation logic, the data is properly categorized and routed to the appropriate MSK topic, simplifying processing and analysis and enabling real-time fleet monitoring.
The transformation logic has to run for every Kafka topic, with topics organized per customer, i.e., per fleet manager. To facilitate this routing, customer metadata is attached to each message as it passes through the transformation logic.
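A minimal sketch of what such metadata-driven routing could look like, assuming a hypothetical device registry and a per-customer topic naming scheme (both illustrative, not taken from the actual implementation):

```python
import json

# Hypothetical device -> customer registry; in practice this metadata
# would come from a fleet-management database, not a hard-coded dict.
DEVICE_REGISTRY = {
    "dashcam-0042": {"customer": "acme-logistics", "fleet": "eu-west"},
    "dashcam-0077": {"customer": "northstar-freight", "fleet": "us-east"},
}

def transform(raw_mqtt_payload: bytes, device_id: str):
    """Categorize an incoming event and pick its MSK topic.

    Returns (topic, enriched_json). The per-customer topic naming
    scheme used here is an assumption for illustration.
    """
    meta = DEVICE_REGISTRY.get(device_id)
    if meta is None:
        # Unknown devices go to a catch-all topic for inspection.
        return "dashcam.unrouted", raw_mqtt_payload.decode("utf-8")
    event = json.loads(raw_mqtt_payload)
    event["customer"] = meta["customer"]   # attach customer metadata
    event["fleet"] = meta["fleet"]
    topic = f"dashcam.{meta['customer']}.events"
    return topic, json.dumps(event)

topic, payload = transform(b'{"speed_kmh": 62, "lat": 48.2}', "dashcam-0042")
# topic names the customer's MSK topic; payload now carries the metadata
```

Each fleet manager thus ends up with a dedicated topic, so downstream consumers for one customer never have to filter through another customer's events.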
Salient features of the solution:
- High volume of stream of events coming from thousands of devices from across the globe
- Use of NLB to handle tens of millions of requests per second while maintaining high throughput at ultra-low latency
- Ingestion of traffic into a managed message broker service for RabbitMQ using Amazon MQ that supports MQTT protocol used by the devices
- Data transformation and refactoring using Apache Camel Connector before pushing it to the appropriate topic into Managed Kafka (MSK) cluster
Referring to the diagram above, the data flow is explained in the steps given below:
- Step 1: The data is received from various Lightmetrics devices/dashcams mounted on thousands of vehicles across different continents
- Step 2: The high volume of data is ingested into the system with Network Load Balancer being the entry point providing high throughput and low latencies
- Step 3: Since the devices use the MQTT protocol for sending events, a managed message broker service for RabbitMQ is configured using Amazon MQ
- Step 4: The data is extracted, transformed, and loaded into an appropriate topic in MSK using Apache Camel Connector
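The four steps above can be sketched end-to-end with in-memory stand-ins for the real services (the NLB, Amazon MQ, and MSK are replaced by plain Python structures here; the topic naming and enrichment are assumptions, not the production logic):

```python
import json
from collections import deque

# In-memory stand-ins for the real services; the point is the
# shape of the flow, not the transports.
rabbitmq_queue = deque()   # Step 3: broker queue fed over MQTT
msk_topics = {}            # Step 4: per-topic Kafka log

def ingest(device_id, payload):
    """Steps 1-2: a device event arrives via the NLB and is queued."""
    rabbitmq_queue.append((device_id, json.dumps(payload)))

def drain_and_route():
    """Step 4: the connector consumes the queue, transforms each
    event, and appends it to a topic (naming scheme is assumed)."""
    while rabbitmq_queue:
        device_id, raw = rabbitmq_queue.popleft()
        event = json.loads(raw)
        event["device_id"] = device_id          # enrich with metadata
        topic = f"dashcam.{device_id}.events"
        msk_topics.setdefault(topic, []).append(json.dumps(event))

ingest("dashcam-0042", {"speed_kmh": 58})
ingest("dashcam-0042", {"speed_kmh": 61})
drain_and_route()
# msk_topics now holds both events, in arrival order, under one topic
```

The FIFO queue between ingest and routing mirrors why the broker sits in the middle: events are buffered in arrival order and transformed one at a time, so the per-device ordering established at ingestion survives into the topic.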
The key infrastructure components are:
- RabbitMQ cluster running on EC2 instances in a private subnet
- Network Load Balancer with externally accessible endpoint
- MSK cluster with SASL/SCRAM and IAM authentication enabled
- EC2 instance (connector_inst) in a private subnet running the connector logic (with Maven and Kafka installed)
Steps to set up the data streaming:
- Start by logging into the jumpbox.
- Once logged in, SSH into the “connector_inst” instance.
- Set the M2_HOME environment variable and add Maven to the $PATH.
- Use Git to clone the repository containing basic Kafka to MSK connector logic from the following URL: https://github.com/Fraser27/kafka-docker-setup.
- Build and run the Java application with the following command: "mvn clean install; java -jar <path-where-jar-file-is-present>".
- Now log in to the RabbitMQ UI using the credentials provided.
- Send messages to the specified queue.
- Finally, consume the messages in the MSK topic.
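As a quick smoke test of the RabbitMQ leg, a message can also be published over RabbitMQ's management HTTP API rather than the UI. The sketch below builds such a request with only the standard library; the host, credentials, and queue name are placeholders for your environment:

```python
import base64
import json
import urllib.request

def publish_request(host, user, password, routing_key, body):
    """Build a publish call against RabbitMQ's management HTTP API.

    The endpoint and payload shape follow the standard management API
    (POST /api/exchanges/{vhost}/{exchange}/publish on port 15672,
    where "amq.default" names the default exchange and %2F is the
    default vhost "/"). All connection details are placeholders.
    """
    url = f"http://{host}:15672/api/exchanges/%2F/amq.default/publish"
    payload = json.dumps({
        "properties": {},
        "routing_key": routing_key,  # for the default exchange: the queue name
        "payload": json.dumps(body),
        "payload_encoding": "string",
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url, data=payload, method="POST",
        headers={"Authorization": f"Basic {token}",
                 "Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Only attempted when run directly, since it needs a live broker.
    req = publish_request("rabbitmq.internal", "admin", "secret",
                          "dashcam-events", {"device_id": "dashcam-0042"})
    with urllib.request.urlopen(req) as resp:
        print(resp.read())  # the API reports whether the message was routed
```

If the message was routed to a queue, the same payload should then appear in the MSK topic once the Camel connector picks it up.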
- Scalability: Amazon MSK and RabbitMQ are highly scalable and can handle large volumes of data, making it easier to handle the data generated by many dashcams simultaneously.
- High availability: Kafka and RabbitMQ both offer high availability, ensuring that data can be transmitted and received reliably and without interruptions.
- Flexibility: RabbitMQ and Kafka are flexible and can work with various programming languages and data formats.
- Real-time processing: RabbitMQ and Kafka allow for real-time processing of data, which is crucial for dashcams.
- Fault-tolerance: RabbitMQ and Kafka both offer fault-tolerance and can recover quickly from failures, ensuring that data is always available.
In conclusion, this blog post has demonstrated how RabbitMQ and AWS MSK can be used together to handle the challenges of data routing and streaming from dashcams to AWS infrastructure. For companies dealing with similar challenges, this solution can be a game-changer.
Don’t hesitate to reach out to us if you want to delve deeper into the realm of efficient real-time data routing and streaming. Our team is always available to provide expert guidance on effectively handling data streaming at scale. Get in touch with us today to explore how we can help your business.
Bhupali is a seasoned technology leader with a passion for innovation and a deep understanding of the cloud computing industry. With extensive experience in cloud architecture and a proven track record of delivering successful AWS implementations, Bhupali is a trusted advisor to Comprinno’s clients. She is a thought leader in the industry and loves to channel her passion for technology through her insightful blogs.