I have an email address and I want to count the distinct n-grams in its alias (the part before the @). Let's say the email is xyz@gmail.com.
The email alias is xyz.
With n = 2, the count of distinct n-grams should be 2 (xy and yz).
def apply(in1: String, in2: Int): List[(Array[String], Int)] = {
  val email_alias = in1.split("@").toList
  val email_tokens = email_alias(0).split("")
  val gram = email_tokens.sliding(in2).toList
  val fin = gram.groupBy(identity).mapValues(_.size).toList
  return fin
}
apply(str,2)
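The output is below. Repeated n-grams such as Array(x, y) show up as separate entries, each with a count of 1, instead of being grouped together: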
res121: List[(Array[String], Int)] = List((Array(z, x),1), (Array(x, y),1), (Array(y, z),1), (Array(x, y),1), (Array(y, z),1))
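
A likely cause of the ungrouped entries above: a Scala Array is a plain JVM array, so its equality and hash code are based on reference identity rather than contents, and groupBy(identity) treats every Array(x, y) as a different key. Below is a minimal sketch of one way around this, assuming it is acceptable to represent each n-gram as a String instead of an Array[String]. The names DistinctNgrams, ngramCounts and distinctNgramCount are made up for illustration and are not from the original code.

```scala
object DistinctNgrams {
  // Counts how often each n-gram occurs in the alias (the text before '@').
  // Each n-gram is a String, which compares by content, so equal n-grams
  // end up under the same key in groupBy.
  def ngramCounts(email: String, n: Int): Map[String, Int] = {
    val alias = email.takeWhile(_ != '@')
    alias
      .sliding(n)              // consecutive windows of n characters
      .filter(_.length == n)   // drop the short window produced when alias.length < n
      .toList
      .groupBy(identity)
      .map { case (gram, occurrences) => gram -> occurrences.size }
  }

  // Number of distinct n-grams in the alias.
  def distinctNgramCount(email: String, n: Int): Int =
    ngramCounts(email, n).size
}
```

With this sketch, DistinctNgrams.distinctNgramCount("xyz@gmail.com", 2) gives 2, and an alias of xyzxyz gives 3 (xy, yz, zx). If the Array-based tokens need to be kept, converting each window with .toList or .mkString before grouping should have the same effect, since List and String both compare by content.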