KafkaSink - Kafka topic is not being created

Hello, Team

I am trying to create a Flume agent that writes to Kafka through a KafkaSink; please see the configuration below.

For the streaming data, I used the log file /opt/gen_logs/logs/access.log

> # example.conf: A single-node Flume configuration
> # Name the components on this agent
> kflume.sources = src1
> kflume.sinks = snk1
> kflume.channels = chn1
> # Describe/configure the source
> kflume.sources.src1.type = exec
> kflume.sources.src1.command = tail -F /opt/gen_logs/logs/access.log
> # Describe the sink
> kflume.sinks.snk1.type = org.apache.flume.sink.kafka.KafkaSink
> kflume.sinks.snk1.brokerList = f.cloudxlab.com:6667
> kflume.sinks.snk1.topic = cca
> # Use a channel which buffers events in memory
> kflume.channels.chn1.type = memory
> kflume.channels.chn1.capacity = 1000
> kflume.channels.chn1.transactionCapacity = 100
> #Bind the source and sink to the channel
> kflume.sources.src1.channels = chn1
> kflume.sinks.snk1.channel = chn1

flume-ng agent --conf conf --conf-file kflume.conf --name kflume -Dflume.root.logger=INFO,console

When I run the agent with the above command, it starts fine, but I don’t see the Kafka topic being created.

I checked /kafka-logs, which Ambari lists as the Kafka log directory, but the topic is not listed there.

Please help me with this. Thank you.

Why can’t I see my Kafka topic listed in the /kafka-logs directory?
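One thing worth checking: the Flume KafkaSink only gets a topic created automatically if the broker allows it (`auto.create.topics.enable=true`); otherwise the topic has to exist before the agent writes to it. A minimal sketch of creating it up front (assuming `kafka-topics.sh` is on the PATH and ZooKeeper is reachable at `localhost:2181`; adjust host, replication factor, and partitions for your cluster):

```shell
# Create the topic the Flume KafkaSink writes to, before starting the agent.
# The ZooKeeper address and the replication/partition counts are assumptions
# for this example; substitute the values for your cluster.
kafka-topics.sh --create \
  --zookeeper localhost:2181 \
  --replication-factor 1 \
  --partitions 1 \
  --topic cca
```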

Hi @Mahesh ,

Can you please check using the Kafka binaries whether the topic has been created?
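For example, a quick check might look like this (assuming `kafka-topics.sh` is on the PATH and ZooKeeper is reachable at `localhost:2181`):

```shell
# List all topics known to the cluster; if "cca" appears here,
# the topic exists even though it may not show up under /kafka-logs
# on this particular node (partitions live only on the brokers that host them).
kafka-topics.sh --list --zookeeper localhost:2181
```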

Here is the detailed video on using Kafka binaries on CloudxLab

Hope this helps.

Hi Abhinav,

With the Kafka binaries I can see that the topic is created, but I don’t know how to look at the data inside it.

Anyway, I tried with both the active Kafka server (ip-172-31-53-48:6667) and the ZooKeeper server (ip-172-31-53-48:2181), and I can see my topic listed by running kafka-topics.sh.

Actually, I would like the same streaming data that I am producing through the Flume source into the Kafka sink to be processed by Spark Streaming.

I will post here if I face any issues.

Thank you very much.

Hello again

One more thing I want to know: Spark on our cluster has no Kafka or Flume dependencies, so when I import the packages below, Spark does not accept them.

import org.apache.spark.streaming.flume._
import org.apache.spark.streaming.kafka._

Please tell me if there is another way to add those packages.

I added those packages with the following commands:

spark-shell --packages org.apache.spark:spark-streaming-flume_2.10:1.6.0 --master yarn --conf spark.ui.port=11111

spark-shell --packages org.apache.spark:spark-streaming-kafka_2.10:1.6.0 --master yarn --conf spark.ui.port=11111
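Once the Kafka package is loaded, reading the topic from Spark Streaming could look roughly like the sketch below (run inside the `spark-shell` started with the `--packages` flag above). The group id `"flume-kafka-demo"`, the 10-second batch interval, and the single receiver thread are illustrative assumptions, not values from this thread:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Reuse the shell's SparkContext (sc); 10s batches are an arbitrary choice here.
val ssc = new StreamingContext(sc, Seconds(10))

// Receiver-based stream: (ZooKeeper quorum, consumer group, Map(topic -> receiver threads)).
// Replace the ZooKeeper address with the one shown for your cluster in Ambari.
val lines = KafkaUtils.createStream(
  ssc, "ip-172-31-53-48:2181", "flume-kafka-demo", Map("cca" -> 1)
).map(_._2)  // keep only the message value, dropping the Kafka key

lines.print()  // show a few records from each batch on the console
ssc.start()
ssc.awaitTermination()
```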

Hi @Mahesh,

Good to see that you are able to include those packages yourself :slight_smile:

How can I check the data present in a Kafka topic? I can see the topic when I list topics using the Kafka binaries, but how do I check the data inside it? Please tell me, or maybe we can discuss this on a call.

You can use the Kafka console consumer for that.

Below is a sample command:

kafka-console-consumer.sh --zookeeper localhost:2181 --topic testuday123 --from-beginning

You can find the IPs of the running ZooKeeper servers in Ambari.