The Centre for Machine Intelligence, in collaboration with SparkCognition, invites students to participate in its first Machine Learning Hackathon Competition. The hackathon is open to all students from all backgrounds and aims to help them:
No prior experience in Machine Learning is required but having some basic coding skills and some understanding of statistics will definitely give you a head start. Over the span of two weeks, you and your team will apply Machine Learning to solve a problem of your choice. We’ll start with “Machine Learning 101,” a short primer on the technology, what it’s good for (and not good for), and how to use it. Then you’ll pick a problem to solve and jump in!
Registration is now closed! We received over 300 registrations, with many more students on the waiting list.
But, you ask, with little or no background in Machine Learning, how can you jump in without drowning? Good question. Machine Learning has matured to the point that powerful tools have been built to enable “citizen data scientists” to succeed without specialized training. SparkCognition, a leading AI company based in Austin, Texas, has given us Darwin, an Automated Machine Learning tool. Darwin solves the hard problems of wrangling and exploring datasets, searching for a good model and evaluating its efficacy. With Darwin you’ll be empowered to solve problems that would otherwise require years of AI experience.
Throughout the event, SparkCognition will provide technical support on selecting problems and using Darwin. As described in the timeline below, your team will submit a short report on your project. The report should describe what you did: the problem you solved, why it’s interesting or important, the challenges you encountered, and the results you obtained. Three to five pages should suffice.
Everyone’s busy, especially this time of year. Nevertheless, you should strongly consider making time for this event. Machine Learning is transforming whole industries, and you need to know when it’s applicable and how to use it. This event will give you a good introduction to the technology and a state-of-the-art tool that makes you quickly productive. You’ll meet corporate leaders, work with your friends, scope out a cool project, and compete to win significant prizes.
These are pre-selected datasets and problem definitions that you may want to start your project on. These problems have been chosen based on their relevance to real-world domains and the economic and social impact they have. If you come up with a brilliant idea to solve any of these, you can be sure there will be a route to market for it. We strongly encourage you to find your own dataset and specify your own problem (see constraints on datasets below).
A telephone company is trying to retain its customers and avoid "customer churn," a jargony way of saying that a customer moves to another service provider. The company compiles a dataset that records information about each customer, including whether they "churned," as recorded in the rightmost column of the dataset. Given the training set, the challenge is to build a model that predicts whether other customers, such as those in the test set, will churn.
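Before turning to an automated tool, it can help to know what score a trivial model achieves on the churn problem. The sketch below is only an illustration, not part of the competition material: the column names and values are invented, and the only detail taken from the description above is that the label sits in the rightmost column. It predicts the most common training label for every customer and measures accuracy on the test rows.

```python
import csv
import io
from collections import Counter

# Hypothetical miniature datasets: all column names and values are invented
# for illustration. Only "label in the rightmost column" comes from the text.
TRAIN_CSV = """tenure_months,monthly_charge,support_calls,churned
2,70.5,4,yes
36,40.0,0,no
5,85.0,3,yes
48,55.0,1,no
24,60.0,0,no
"""

TEST_CSV = """tenure_months,monthly_charge,support_calls,churned
3,90.0,5,yes
40,45.0,0,no
12,65.0,1,no
"""


def load_rows(text):
    """Parse CSV text into (features, label) pairs; the label is the last column."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [(row[:-1], row[-1]) for row in reader]


def majority_label(rows):
    """Return the label that occurs most often in the given rows."""
    counts = Counter(label for _, label in rows)
    return counts.most_common(1)[0][0]


def accuracy(rows, predict):
    """Fraction of rows whose predicted label matches the recorded label."""
    correct = sum(1 for features, label in rows if predict(features) == label)
    return correct / len(rows)


train = load_rows(TRAIN_CSV)
test = load_rows(TEST_CSV)

# The baseline ignores the features and always predicts the majority class.
baseline = majority_label(train)
baseline_accuracy = accuracy(test, lambda features: baseline)

print(f"baseline label: {baseline}, test accuracy: {baseline_accuracy:.2f}")
```

Any model worth reporting should beat this baseline on the test set. On heavily imbalanced churn data, the baseline accuracy can look deceptively high, so it is worth checking how many actual churners a model catches as well.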
A power-generation company is trying to predict the power it will produce each day, based on environmental conditions such as Temperature, Pressure and Humidity. The training set records information about environmental conditions and power generation across many days. Given the training set, the challenge is to form a model that predicts the power that will be generated under various conditions given in the test set.
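To make the regression task concrete, here is a minimal sketch that fits a straight line to a single environmental variable and scores it on held-out days. Everything in it is an assumption for illustration: the numbers are synthetic, only temperature is used, and an ordinary least-squares line stands in for whatever model a tool such as Darwin would actually select.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic training data (invented): daily temperature and power output.
train_temperature = [10.0, 15.0, 20.0, 25.0, 30.0]
train_power = [480.0, 470.0, 460.0, 450.0, 440.0]

# Synthetic test data (invented): days the model has not seen.
test_temperature = [12.0, 28.0]
test_power = [478.0, 442.0]

slope, intercept = fit_line(train_temperature, train_power)
predictions = [slope * t + intercept for t in test_temperature]
mae = mean_absolute_error(test_power, predictions)

print(f"power = {slope:.2f} * temperature + {intercept:.2f}, test MAE = {mae:.2f}")
```

A real solution would use all of the available conditions (temperature, pressure, humidity) and probably a more flexible model, but the workflow is the same: fit on the training set, then report the error on the test set.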
We would like to categorise companies according to their engagement levels: do they respond to calls, and at what times of day do they tend to answer? Each company has a number of contact points, so it is important to determine who the best person to contact is and to predict when to call them. It may also be interesting to know which sectors the most engaged companies come from, or which companies eventually set up meetings with us (see the Meeting column). Note of caution: this dataset does not come with a training/test set and has not yet been tested on Darwin, so you may need to use Python and the SDK to specify a good subset.
PROBLEM 4: FOOTBALL PREDICTIONS
This dataset contains match outcomes from the NFL for the 2017 and 2018 seasons. Interesting problems to solve include predicting the point spread or the match outcome. Note of caution: this dataset does not come with a training/test set and has not yet been tested on Darwin, so you may need to use Python and the SDK to specify a good subset.
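For the datasets that ship without a training/test set, you will need to hold out some rows yourself before modelling. The following is a minimal sketch using only the Python standard library; it does not use the Darwin SDK, and the function name, the 20% default, and the fixed seed are all choices made for this example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and hold out a fraction of them for testing."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in records: integers here, but any list of rows works the same way.
records = list(range(10))
train_rows, test_rows = train_test_split(records, test_fraction=0.2, seed=42)

print(f"{len(train_rows)} training rows, {len(test_rows)} test rows")
```

A random split suits data where rows are independent of one another. For data ordered in time, such as matches across two seasons, holding out the most recent period instead is often a fairer test, because it avoids training on results that occurred after the ones being predicted.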
You can access Darwin tutorials here.
Darwin is a platform that works with numerical data suited to prediction and classification tasks. If you know Python, you can also use the SDK (software development kit) to develop your model for time series prediction.
Darwin will help you work out what model to use to run regression, classification, and prediction on your dataset. The following rules apply:
These are the criteria to score your report and your presentation. Your submission will be reviewed by a panel of experts from industry and the university.
|Points|Criterion|
|---|---|
|15 points|Problem Selection: How was Darwin used? Was the problem selected meaningful, appropriate for the tools used and scope of the project, and quantifiable? Was appropriate data used? Was the data set up appropriately to achieve meaningful results?|
|22.5 points|Outcome: Was the research complete? How well were the results analyzed to solve the problem?|
|25 points|Innovation: What innovation did the team bring to the table? (Examples: novel approaches to feature engineering, incorporating insights from a paper, integrating multiple data sets, or setting up a problem uniquely.)|
|17.5 points|Impact: What are the implications of this research, and how impactful could it be in a given field?|
|20 points|Presentation: How well did the team present the project and results? Were both the technical and business perspectives of the problem and solution explained?|
We are very grateful to have the support of SparkCognition for this event. SparkCognition have provided a prize of £2000 for the winner of the competition and are also providing free access to their platform, tutorial videos, and daily online support for all teams.
Sridhar Sudarsan is the Chief Technology Officer of SparkCognition. Sudarsan is responsible for driving SparkCognition’s product and technology strategy, leveraging next-generation artificial intelligence systems to secure and optimize assets across key industries.
With over two decades of technology leadership experience, Sudarsan has been at the helm of several complex products and projects, collaborating with global customers on cutting-edge technologies. Previously, Sudarsan was the CTO of IBM Watson Platform and Partnerships, where he led the technology strategy and architecture of the IBM Watson platform. Sudarsan is widely recognized as an expert on the business potential and application of advanced technologies. He provides thought leadership on AI solutions and patterns for clients, partners, academics, and R&D teams. He holds over 14 patents in the areas of AI and distributed computing, has published white papers and articles for a variety of outlets, and has been a featured speaker at conferences and universities.
Sari Andoni is a Senior Data Scientist at SparkCognition, Inc. He has extensive experience in machine learning, neural networks and deep learning, combined with a research background in neurobiology. With published research in leading journals, Sari currently focuses on automated model building with multivariate time-series data using artificial neural networks. He received his bachelor's degree in Computer Sciences and Mathematics from Brigham Young University, and a PhD from the Institute for Neuroscience at The University of Texas at Austin.
For his dissertation, Sari studied the auditory midbrain and how the auditory system classifies natural vocalizations into behaviorally relevant perceptions. In his postdoctoral research, he studied the visual system focusing on the interaction of spontaneous activity with stimulus-evoked responses in the thalamocortical circuit.
Anna represents SparkCognition in Europe and the UK as Regional Vice President of Business Development. She has been with the company since 2016, working closely with the executive board as the company grows in number of employees, number of clients, and global reach. She has worked in international business development throughout her career and specialises in building Japanese business, being fluent in Japanese and having worked in Japan for over 8 years in finance, government and industry.
Anna was born in Southampton but hasn't spent much of her life in the city so she is looking forward to continuing that connection at the Hackathon today.
A two-time Chair of the University of Texas Computer Science Department, Dr. Bruce Porter serves as SparkCognition's Chief Science Officer, where he leads the company's many R&D initiatives. Currently, as University Professor, Dr. Porter's research focuses on machine reading, a technology that holds tremendous potential for capturing knowledge for automated inference, question answering, explanation generation, and other AI capabilities.
Dr. Porter also directs UT’s Knowledge Systems Research Group, an AI organization with the goal to develop methods to build knowledgeable computers. He has won the Best Paper Award at the National Conference on Artificial Intelligence, the College of Natural Sciences Teaching Excellence Award, the National Science Foundation’s Presidential Young Investigator Award, and the President’s Associates Teaching Excellence Award.
Keith Moore is the Director of Product Management at SparkCognition and is responsible for the development of the IoT product line (SparkPredict®). He specializes in applying advanced data science and natural language processing algorithms to complex data sets. Moore previously worked for National Instruments as an analog-to-digital converter and vibration software product manager. Prior to that, he developed client software solutions for major oil and gas, aerospace, and semiconductor organizations. Moore has served as a board member of Pi Kappa Phi fraternity, and still volunteers on the alumni engagement committee. He graduated from the University of Tennessee with a B.A. in mechanical engineering.