Fibonacci sequence in parallel

How to generate fibonacci sequence in parallel using spark

1 Like

Fibonacci is basically this sequence:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …

The next number is found by adding the previous two numbers in sequence. The first two numbers are 0, 1.
Third number: 0 + 1 = 1
Forth number = 1 + 1 = 2
Fifth number = 1 + 2 = 3
Sixth number = 3 + 2 = 5
And so on…

Though it is obvious that you would not be able to parallelize this because the next number depends on previous.

There is an alternative formula of calculating the nth Fibonacci sequence:

image

where " φ " is called the golden number which is approximately 1.618034…

Using this we can parallelize the calculation of first n fibobaci numbers.

Here is a code that you can run on pyspark console in CloudxLab:

import math

nums = sc.parallelize(range(1000))

golden_ratio = 1.618034
igolden_ratio = 1 - golden_ratio
sqrt5 = math.sqrt(5)
def fib(n):
    fib_double = (golden_ratio**n - igolden_ratio**n)/sqrt5
    return int(fib_double)

fibs = nums.map(fib)
fibs.take(100)

You will get the following result.

1 Like

Since calculation using golden number formula is costly, so a better idea will be to write a function which compute the fibonacci series using a list of indexes. For first two element of partition, we will compute the fibonicci sequence numbers using the golden number formula and for the rest we will generate the numbers sequentially. Since each partition is anyway processed sequentially, it would not make any difference. Here you go.

import math

golden_ratio = 1.618034
igolden_ratio = 1 - golden_ratio
sqrt5 = math.sqrt(5)
def fib(n):
    print("GOLDEN Ration: ", n)
    fib_double = (golden_ratio**n - igolden_ratio**n)/sqrt5
    return int(fib_double)

# Test
# fib(0); fib(1); fib(2); fib(3)

def fib_seq(a):
    print("----")
    result = []
    i = 0
    for e in a:
        if i <= 1:
            result.append(fib(e))
        else:
            result.append(result[i-1] + result[i-2])
        i += 1
    return iter(result)

# Test
# list(fib_seq(range(5))) should print [0, 1, 1, 2, 3]

nums = sc.parallelize(range(1000))
fibs = nums.mapPartitions(fib_seq)
fibs.take(100)
fibs.getNumPartitions()

This is how it looks to me.

1 Like