Using BOINC for AI
Computing power - especially GPUs - is a bottleneck in many AI applications. As a result, state-of-the-art AI is available only to big companies.
Volunteer computing (BOINC) has the potential to provide access to millions of GPUs in home computers. This computing power could potentially be free for non-profit researchers, or cheap (using various incentive systems) for small companies.
However, compared to data-center computers, home computers have properties that present technical challenges to using them for AI:
- They are only sporadically available.
- They are heterogeneous in terms of hardware and software.
- Their interconnection (via the Internet) is slow compared to LANs.
- They are generally behind firewalls that don't allow incoming connections.
In the past few years, BOINC has added features intended to address these challenges:
- Support for apps that run in Docker containers (which potentially provide GPU access).
- A mechanism called 'sporadic apps' that makes it possible to orchestrate large sets of computers running an application simultaneously.
Other open-source projects have developed libraries for doing peer-to-peer communication in the presence of firewalls.
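As an illustration, a technique commonly used by such libraries is UDP "hole punching" through a rendezvous server: both peers first contact a publicly reachable server, which tells each one the other's public (NAT-translated) address. The following Python sketch shows the client side; the server address and message format are assumptions for illustration, not part of any real library's protocol:

```python
# Client side of UDP hole punching (illustrative sketch; real libraries
# such as libp2p or ICE/STUN implementations handle many more cases).
import socket

RENDEZVOUS = ("rendezvous.example.org", 50000)  # hypothetical server

def punch_hole():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Register with the rendezvous server; it records our public
    # (NAT-translated) address as seen from outside our firewall.
    sock.sendto(b"register", RENDEZVOUS)
    # The server replies with the peer's public "ip:port" once both
    # parties have registered (message format assumed for this sketch).
    data, _ = sock.recvfrom(1024)
    ip, port = data.decode().split(":")
    peer = (ip, int(port))
    # Both peers now send packets to each other. Each outgoing packet
    # opens a mapping in the sender's NAT, letting the peer's packets in.
    for _ in range(5):
        sock.sendto(b"hello", peer)
    sock.settimeout(5.0)
    msg, addr = sock.recvfrom(1024)
    print("connected to", addr, msg)
    return sock, peer
```

Once both sides complete this exchange, they can send UDP traffic directly without either accepting incoming connections in the usual sense.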
Thus, it may be possible to use BOINC for some AI applications, but additional research and development will be needed.
Consider first the training of models (e.g. large language models) on large datasets (e.g. web pages).
Many models have ~1 billion parameters, so they fit in RAM (or VRAM) on a typical home computer. Some models are larger (tens or hundreds of billions of parameters) and would have to be divided among multiple computers.
The amount of training data varies; e.g. Llama 2 uses 8 TB and Llama 3 uses 60 TB. This would typically have to be divided among multiple computers.
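To make this division concrete, here is a hedged sketch of how a project might assign shards of a large corpus to volunteer hosts. The shard size, corpus size, and host-numbering scheme are illustrative assumptions, not BOINC APIs:

```python
# Sketch: dividing a multi-TB training corpus into per-host shards.
# All numbers and names are illustrative assumptions.

CORPUS_BYTES = 60 * 10**12   # a Llama-3-scale corpus (~60 TB)
SHARD_BYTES = 50 * 10**9     # ~50 GB fits comfortably on a home disk

def shards_for_host(host_id: int, n_hosts: int) -> list[int]:
    """Return the indices of the corpus shards assigned to one host."""
    n_shards = CORPUS_BYTES // SHARD_BYTES   # 1200 shards in this example
    return [i for i in range(n_shards) if i % n_hosts == host_id]
```

With 1200 hosts, each host stores a single ~50 GB shard; with fewer hosts, each stores several.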
Training algorithms typically involve computing the gradient of an error function (using the full model and all the training data) and then adjusting the model parameters in that direction to reduce the error.
Such an algorithm can be parallelized across multiple computers by subdividing the problem (a data-parallel sketch follows the list):
- dividing the model (e.g. into its layers);
- dividing the training data;
... or both.
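In the data-parallel case, for example, each computer evaluates the gradient on its own data shard, and a coordinator averages the results before updating the shared model. A minimal NumPy sketch (the model and loss here are deliberately trivial; everything is illustrative):

```python
import numpy as np

def local_gradient(params, shard):
    """Gradient of a squared-error loss on one host's data shard.
    shard is a list of (x, y) pairs; the 'model' is linear for brevity."""
    grad = np.zeros_like(params)
    for x, y in shard:
        err = params @ x - y
        grad += 2 * err * x
    return grad / len(shard)

def training_step(params, shards, lr=0.01):
    """One synchronous data-parallel step: each host computes a gradient
    on its shard; the coordinator averages them and updates the model."""
    grads = [local_gradient(params, s) for s in shards]  # runs in parallel
    avg_grad = np.mean(grads, axis=0)
    return params - lr * avg_grad
```

Model parallelism replaces the per-shard gradient call with a pipeline in which each host evaluates only its layers; the two approaches can be combined.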
There is considerable research on how to do this. Existing libraries (e.g. Hive) implement parallel training in the context of data-center computers.
The problem is more complex in the context of volunteer computing. For example, the floating-point speeds of personal computers differ by orders of magnitude; with synchronous algorithms, fast computers may spend most of their time waiting for slow ones. This can be addressed by using asynchronous algorithms, or by combining computers into groups with similar performance.
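A hedged sketch of the grouping idea: cluster hosts into tiers by measured floating-point speed, and run a synchronous algorithm only within a tier. BOINC clients do run benchmarks, but the tiering policy below is an assumption for illustration:

```python
import math

def performance_tier(host_flops: float, base: float = 1e9) -> int:
    """Group hosts into tiers that each span one order of magnitude of
    measured floating-point speed; hosts in the same tier waste little
    time waiting for one another under a synchronous algorithm."""
    return int(math.log10(max(host_flops, base) / base))

# Example: a 5 GFLOPS laptop and a 2 TFLOPS gaming PC land in different
# tiers and would be placed in different synchronous groups.
assert performance_tier(5e9) != performance_tier(2e12)
```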
BOINC could potentially be used for real-time inference: computing the output of a model for a given prompt, perhaps as part of a chat system.
This would be useful only if the rate of inputs is high enough to keep thousands of computers busy; otherwise it would be easier to use data-center computers.
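A back-of-the-envelope check of that condition, using Little's law (the numbers are made up for illustration):

```python
def hosts_kept_busy(prompts_per_second: float, seconds_per_inference: float) -> float:
    """Little's law: the average number of in-flight inferences equals
    the arrival rate times the per-inference service time."""
    return prompts_per_second * seconds_per_inference

# 500 prompts/s at 10 s each keeps ~5000 hosts busy -- enough to justify
# a volunteer pool. 1 prompt/s keeps only ~10 busy, so a data center
# would be simpler.
print(hosts_kept_busy(500, 10))   # 5000.0
print(hosts_kept_busy(1, 10))     # 10.0
```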
Some AI applications involve running large numbers of queries against a model. For example, suppose a model takes a molecular structure as input and has been trained to recognize molecules that are potential drug candidates. We might want to run it against a database of hundreds of millions of molecules.
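This kind of bulk inference maps naturally onto BOINC's job model: split the database into batches and make each batch one workunit. A hedged sketch (the batch size and naming scheme are illustrative assumptions; a real project would feed these to BOINC's create_work tool):

```python
# Sketch: splitting a molecule database into per-workunit batches.

def make_batches(molecule_ids: list[str], batch_size: int = 10_000):
    """Yield (workunit_name, ids) pairs, one per BOINC workunit."""
    for i in range(0, len(molecule_ids), batch_size):
        batch = molecule_ids[i:i + batch_size]
        yield f"screen_batch_{i // batch_size:06d}", batch

# 300 million molecules at 10,000 per workunit -> 30,000 workunits,
# each an independent inference job suited to a single volunteer host.
```

Because each batch is independent, this workload avoids the synchronization problems of distributed training and is the most straightforward fit for BOINC.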