GraphX program query

Anubhav_Gupta · March 30, 2020, 3:24am

Please explain the commands below, which are from the GraphX tutorial example.
What is the purpose of the fields command and the case command?

val users = sc.textFile("/data/spark/graphx/users.txt").map { line =>
  val fields = line.split(",")
  (fields(0).toLong, fields(1))
}

val ranksByUsername = users.join(ranks).map {
case (id, (username, rank)) => (username, rank)
}

sgiri · March 30, 2020, 8:35am

Create an RDD from the text file. Each line of the file becomes one element of the RDD:
sc.textFile("/data/spark/graphx/users.txt")

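Convert each line into a tuple of (number, text) and assign the result to users. Here fields is not a command but an ordinary local variable holding the array that line.split(",") returns; the final expression (fields(0).toLong, fields(1)) is the value map produces for that line:
.map { line =>
  val fields = line.split(",")
  (fields(0).toLong, fields(1))
}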
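
To see what fields holds without a Spark cluster, the same function body can be applied to an ordinary Scala List. This is only a minimal sketch: the sample lines and the object name ParseUsersSketch are made up for illustration, since the real contents of users.txt are not shown in this thread.

```scala
object ParseUsersSketch {
  // Hypothetical sample lines; assumed to look like "id,username".
  val lines: List[String] = List("1,alice", "2,bob")

  // Same body as the map call above, applied to a plain List instead of an RDD.
  val users: List[(Long, String)] = lines.map { line =>
    val fields = line.split(",")     // Array("1", "alice") for the first line
    (fields(0).toLong, fields(1))    // last expression = value returned for this line
  }

  def main(args: Array[String]): Unit =
    users.foreach(println)
}
```

Each element of users is a (Long, String) pair, which is the keyed shape that join expects in the next step.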

Join the users RDD prepared in the previous step with ranks. join matches the two RDDs on their key, i.e. the first element of each tuple, so every result element has the form (id, (username, rank)):
users.join(ranks)
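
The shape of the joined data can be sketched with plain collections. This is a local stand-in for an inner join by key, not Spark's implementation, and the data and the name JoinShapeSketch are assumptions for illustration.

```scala
object JoinShapeSketch {
  // Hypothetical data standing in for the users and ranks RDDs.
  val users: List[(Long, String)] = List((1L, "alice"), (2L, "bob"), (3L, "carol"))
  val ranks: List[(Long, Double)] = List((1L, 0.5), (2L, 1.5))

  // Local stand-in for users.join(ranks): keep ids present on both sides
  // and pair the two values, giving (id, (username, rank)).
  val joined: List[(Long, (String, Double))] =
    for {
      (id, username) <- users
      (rankId, rank) <- ranks
      if id == rankId
    } yield (id, (username, rank))

  def main(args: Array[String]): Unit =
    joined.foreach(println)
}
```

In this sketch the user with id 3 has no rank, so it does not appear in joined, just as an inner join drops keys that are missing on either side.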

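Clean up the joined data. case is Scala pattern matching, not a Spark command: it takes each (id, (username, rank)) element apart and gives names to its pieces, so the function can drop id and return only (username, rank):
.map {
  case (id, (username, rank)) => (username, rank)
}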
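
As a rough illustration of what case does here, the sketch below writes the same transformation twice on a plain List: once with a pattern and once with positional tuple accessors. The data and the name CasePatternSketch are hypothetical.

```scala
object CasePatternSketch {
  // Hypothetical joined records of the form (id, (username, rank)).
  val joined: List[(Long, (String, Double))] =
    List((1L, ("alice", 0.5)), (2L, ("bob", 1.5)))

  // With pattern matching: the names bind to the parts of each tuple.
  val withCase: List[(String, Double)] = joined.map {
    case (id, (username, rank)) => (username, rank)
  }

  // Without pattern matching: the same result using positional accessors.
  val withAccessors: List[(String, Double)] =
    joined.map(t => (t._2._1, t._2._2))

  def main(args: Array[String]): Unit =
    println(withCase == withAccessors)
}
```

Both versions produce the same list; the case form is usually preferred because the names document what each position of the nested tuple means.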