Projects in ML, DL, AI and Big Data

Hi All,

Please suggest projects on Machine Learning, Deep Learning, Artificial Intelligence and Big Data here.

You need to mention the following:
Source Of Data:
Tags (The list of technology or terms this project belongs to):

Also, mention what kind of projects you would want to be part of.

Hi Sandeep,

I have an idea for a project that I think would suit cloudxlab.

I recently stumbled across an article offering SETI (the Search for Extraterrestrial Intelligence) data to the public, along with a general request for data-crunching assistance. This sounds fantastic, but they then lead you to sign up for an ‘IBM Bluemix’ account in order to handle the big data. From what I can see on the IBM website, they offer a free but very limited account, and then it’s pay as you go for anything more serious. They have a ‘cost estimator’ on the website, but it’s not particularly clear.

New SETI data is uploaded on a regular basis here … . They state that … “The setiQuest data archive contains 4 TB of quadrature waveform samples of interesting targets that have been collected at the Allen Telescope Array. The data and waterfall links are distributed via Amazon Web Services. The goal of the data availability is to be an educational resource and to aid setiQuest app developers”.

There is also a collection of Python starter scripts available, along with a markdown file stating the science goals of the project. They state that “The goal of this project is to help improve observations performed by SETI Institute at the Allen Telescope Array (ATA). We are aiming for citizen scientists to make significant contributions in the following ways … buried treasure (signal from ET) and improved signal detection”. There is even a dedicated Python module called ibmseti.

I also found another GitHub repository that differs slightly from the one above. This one looks like it came from a machine learning (deep learning) challenge that SETI held last year. Again, starter scripts are available … On this one they state that “We intended to make access and analysis as democratic as possible: there’s no platform or language requirements”, so it should all be possible on cloudxlab?


Objective: Get the SETI starter code working on cloudxlab, demonstrating big data handling and machine learning, and then build on their work to progress it
Source of Data: The official SETI data source
Tags: Big data, python, Spark, machine learning, deep learning, convolutional neural networks, aliens!

This looks very interesting. Let’s aim for this. Thank you, Rob, for the brilliant suggestion.

Hi, I’m glad the project is of interest. I think so too! I’ve emailed a few times now about the internship you posted about but haven’t had a reply. I proposed the above project for that, so please could you let me know if the internship is still happening, as this project is my idea for it. Many thanks.

Hi, I’ve not heard anything on this internship via the emails I’ve sent or this forum, so I’ll stop asking. I might do this project anyway and will keep you posted. Thanks.

Hi Sandeep,

I am looking to apply machine learning to embedded signal processing in the electrical domain, for example to detect whether a signal is faulty, or to compare two signals (one good and one faulty). Is there a way to do this with machine learning or deep learning?

If so, how should I prepare the signal data and train an algorithm for prediction? Could you please give me some ideas?

Thanks and Regards

Every signal is basically a sequence of numbers. You can try using an RNN. Do you have some training data?
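
To make the "sequence of numbers" idea concrete, here is a minimal sketch of one common way to prepare such data: cut each signal into fixed-length windows and label each window as good (0) or faulty (1). The function names, the window length and the toy signals are all assumptions for illustration, not something taken from your data. An RNN layer typically expects input shaped (samples, timesteps, features), which is what the nested lists below mimic.

```python
from typing import List, Sequence, Tuple


def make_windows(
    signal: Sequence[float], window: int, step: int
) -> List[List[List[float]]]:
    """Cut a 1-D signal into windows shaped (timesteps, features).

    Each window holds `window` timesteps, and each timestep is a
    one-element list because there is a single feature (the sample value).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    windows = []
    for start in range(0, len(signal) - window + 1, step):
        windows.append([[float(v)] for v in signal[start:start + window]])
    return windows


def label_windows(
    good: Sequence[float], faulty: Sequence[float], window: int, step: int
) -> Tuple[List[List[List[float]]], List[int]]:
    """Build a labelled dataset: 0 for good-signal windows, 1 for faulty."""
    good_w = make_windows(good, window, step)
    faulty_w = make_windows(faulty, window, step)
    return good_w + faulty_w, [0] * len(good_w) + [1] * len(faulty_w)


if __name__ == "__main__":
    # Toy signals, invented purely for illustration.
    good = [0, 1, 0, -1, 0, 1, 0, -1]
    faulty = [0, 1, 0, -1, 5, 5, 5, 5]
    X, y = label_windows(good, faulty, window=4, step=2)
    print(len(X), len(X[0]), len(X[0][0]))  # samples, timesteps, features
    print(y)
```

The resulting X and y can be converted to arrays and fed to whichever sequence model you choose. Window length and step are tuning choices that depend on the sampling rate and on how long a fault lasts.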

Hi Sandeep, Thanks for message.

Electrical signals are stored in COMTRADE format, and we extract the signal data from those files. Please let me know which format I need to extract the data into for training a model.

Thanks and Regards

Figure out how to create a pandas data frame from that data. Once you are able to do that, the rest is easy.
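
As a rough sketch of that step, assuming the ASCII flavour of the COMTRADE .dat file: each row holds a sample number, a timestamp, and then one raw value per analog channel, while the channel names and the per-channel multiplier and offset come from the companion .cfg file. Here those are simply passed in as arguments. The function names, channel names and sample rows below are hypothetical, and binary .dat files would need a different reader.

```python
import csv
import io
from typing import Dict, List, Sequence


def parse_comtrade_dat(
    text: str,
    analog_names: Sequence[str],
    scale: Sequence[float],
    offset: Sequence[float],
) -> List[Dict[str, float]]:
    """Parse ASCII COMTRADE .dat rows into one dict per sample.

    Assumed row layout: sample number, timestamp, then one raw value per
    analog channel (any digital channels after that are ignored here).
    """
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        row = {"sample": int(fields[0]), "time": float(fields[1])}
        for i, name in enumerate(analog_names):
            raw = float(fields[2 + i])
            row[name] = scale[i] * raw + offset[i]  # raw count -> physical value
        rows.append(row)
    return rows


def to_dataframe(rows):
    """Convert parsed rows to a pandas DataFrame (requires pandas)."""
    import pandas as pd  # imported lazily so parsing works without pandas

    return pd.DataFrame(rows).set_index("sample")


if __name__ == "__main__":
    # Invented rows and scaling values, for illustration only.
    sample_dat = "1,0,100,-50\n2,250,110,-45\n3,500,120,-40\n"
    rows = parse_comtrade_dat(
        sample_dat, analog_names=["VA", "IA"], scale=[0.5, 2.0], offset=[0.0, 0.0]
    )
    print(rows[0])
```

Calling to_dataframe(rows) then gives one row per sample and one column per channel, which is a convenient starting point for windowing and labelling.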

Hi Sandeep,

Thank you so much for the reply.

As you suggested, I am trying to apply RNN concepts to time series classification. I am trying to predict the labels for the Indoor User Movement data example. Please refer to the link below for the data files and the Jupyter notebook with the code:

implemented code

I am using early stopping that monitors val_loss, and the accuracy comes out at only 50%. When I set the monitor parameter to val_accuracy, it shows 78% accuracy. Could you please check and suggest anything I am missing? I have tried different options but am not getting the desired result.

Would you please guide me on this?

Thanks and Regards
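
One possible reading of the two numbers, offered only as a guess without seeing the notebook: early stopping keeps the epoch that is best for the monitored quantity, and the epoch with the lowest val_loss is not always the epoch with the highest val_accuracy. The sketch below is a simplified stand-in for that logic, not the Keras implementation, and the two histories are made-up values chosen only to show how the monitors can pick different epochs.

```python
from typing import Sequence


def early_stop_epoch(values: Sequence[float], mode: str, patience: int) -> int:
    """Return the index of the best epoch seen before early stopping triggers.

    mode="min" treats lower values as better (e.g. a loss);
    mode="max" treats higher values as better (e.g. an accuracy).
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    best_epoch = 0
    waited = 0
    for epoch in range(1, len(values)):
        if mode == "min":
            improved = values[epoch] < values[best_epoch]
        else:
            improved = values[epoch] > values[best_epoch]
        if improved:
            best_epoch, waited = epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # training would stop here
    return best_epoch


if __name__ == "__main__":
    # Hypothetical per-epoch validation histories, not taken from the notebook.
    val_loss = [0.70, 0.69, 0.695, 0.70, 0.71, 0.72]
    val_accuracy = [0.50, 0.50, 0.62, 0.71, 0.78, 0.77]

    by_loss = early_stop_epoch(val_loss, mode="min", patience=2)
    by_acc = early_stop_epoch(val_accuracy, mode="max", patience=2)
    print("monitor val_loss     -> epoch", by_loss, "accuracy", val_accuracy[by_loss])
    print("monitor val_accuracy -> epoch", by_acc, "accuracy", val_accuracy[by_acc])
```

If the real training curves look like this, a larger patience, or plotting both curves per epoch, would show whether training is being stopped too early. An accuracy of exactly 50% on a two-class problem can also mean the model is predicting a single class, so the class balance and the input scaling are worth checking too.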