Stability AI, Hugging Face and Canva back new AI research nonprofit

Image Credits: v_alex / Getty Images

Developing cutting-edge AI systems like ChatGPT requires massive technical resources, in part because such systems are costly to build and run. While several open source efforts have attempted to reverse-engineer proprietary, closed source systems created by commercial labs such as Alphabet's DeepMind and OpenAI, they've often run into roadblocks -- mainly due to a lack of capital and domain expertise.

Hoping to avoid this fate, one community research group, EleutherAI, is forming a nonprofit foundation. The organization today announced it'll found a not-for-profit research institute, the EleutherAI Institute, funded by donations and grants from backers, including AI startups Hugging Face and Stability AI, former GitHub CEO Nat Friedman, Lambda Labs and Canva.

"Formalizing as an organization allows us to build a full time staff and engage in longer and more involved projects than would be feasible as a volunteer group," Stella Biderman, an AI researcher at Booz Allen Hamilton who will co-run the EleutherAI Institute, told TechCrunch in an email interview. "In terms of a nonprofit specifically, I think it’s a no-brainer given our focus on research and the open source space."

EleutherAI started several years ago as a grassroots collection of developers working to open source AI research. Its founding members -- Connor Leahy, Leo Gao and Sid Black -- wrote the code and collected the data needed to create a machine learning model close to OpenAI's text-generating GPT-3, which at the time was getting a lot of press.

The group curated and open sourced The Pile, a collection of datasets designed to be used to train GPT-3-like models to complete text, write code and more. And it released several models under the Apache 2.0 license, including GPT-J and GPT-NeoX, language models that for a while fueled an entirely new wave of startups.

To train its models, EleutherAI relied mostly on the TPU Research Cloud, a Google Cloud program that supports projects with the expectation that the results will be shared publicly. CoreWeave, a U.S.-based cryptocurrency miner that provides cloud services for AI workloads, also supplied compute resources to EleutherAI in exchange for models its customers can use and serve.

EleutherAI grew quickly. Today, over 20 of the community's regular contributors are working full-time, focusing mainly on research. And over the past 18 months, EleutherAI members have co-authored 28 academic papers, trained dozens of models and released ten codebases.

But the fickle nature of its cloud providers sometimes forced EleutherAI to scuttle its plans. Originally, the group had intended to release a model roughly the size of GPT-3 in terms of the number of parameters, but ended up shelving that roadmap for technical and funding reasons. (In AI, parameters are the parts of the model learned from historical training data and essentially define the skill of the model on a problem, such as generating text.)

In late 2022, EleutherAI became well-acquainted with Stability AI, the now-well-financed startup behind the image-generating AI system Stable Diffusion. Along with other collaborators, EleutherAI helped to create the initial version of Stable Diffusion. And since then, Stability AI has donated a portion of compute from its AWS cluster for EleutherAI's ongoing language model research.