Kaggle hits million member milestone

Aussie data science platform acquired by Google earlier this year

Kaggle, the community data science platform originally coded in a Bondi bedroom, this week surpassed one million members.

The site, founded by Melbourne University alumnus Anthony Goldbloom in 2009, was acquired by Google in March this year for an undisclosed sum.

To mark the membership milestone, Goldbloom shared some statistics on the platform: in the last seven years the Kaggle community has submitted more than four million machine learning models to competitions and shared 170,000 forum posts, more than 250,000 kernels and 1,000 datasets.

From rather irreverent beginnings – the first competition challenged users to forecast Eurovision Song Contest votes – the platform is now being used to help predict lung cancer, analyse imagery from government satellites and find more efficient manufacturing methods.

“Our early competitions had participants who called themselves computer scientists, statisticians, econometricians and bioinformaticians. They used a wide range of techniques, ranging from logistic regression to self organising maps,” Goldbloom said in a blogpost.

“It's been rewarding to see these once-siloed communities coming together on Kaggle: sharing different approaches and ideas through the forums and Kaggle Kernels. This sharing has helped create a common language…As well as breaking down silos, the sharing of approaches and ideas on Kaggle has made machine learning accessible to many more people.”

For Google, the acquisition is part of its effort to ‘democratise AI’.

“We must lower the barriers of entry to AI and make it available to the largest community of developers, users and enterprises, so they can apply it to their own unique needs. With Kaggle joining the Google Cloud team, we can accelerate this mission,” Google Cloud chief scientist of AI and machine learning Fei-Fei Li said in March.

Within Google Cloud, Kaggle will remain a distinct brand. The platform will remain open to all data scientists, companies, techniques and technologies, Goldbloom said at the time of the acquisition.

Crowdsourcing data science

Enterprises wanting to use the platform can run either open or closed competitions, offering prize money to Kaggle’s legion of scientists to work on their data.

In May, US real estate marketplace Zillow put up a record US$1.2m in prize money for teams that can improve the algorithm behind its proprietary home valuation tool.

A number of Australian organisations have also turned to the platform to crowdsource their data science and recruit exceptional users. Last year Telstra ran a competition to predict the severity of service disruptions on its network. Using a dataset of features from service logs, Kagglers were tasked with predicting whether a disruption was a momentary glitch or a total interruption of connectivity, and the best entrants were "considered for data science roles in Telstra's Big Data team".

In 2013, with Kaggle emerging as a disruptor to Deloitte’s consulting business and global network of 6000 analysts, Deloitte Australia forged a partnership with the platform. Where appropriate, Deloitte Australia put client problems on Kaggle in closed competitions.

Earlier this year, the University of Melbourne ran a $20,000 competition to find a prediction model for epileptic seizures based on anonymised electrical brain activity (EEG) data.