How to optimize distributed LLMs with limited communication?
Distributed Stochastic Zeroth-Order Optimization with Compressed Communication
This paper proposes Com-DSZO, an algorithm for distributed optimization in multi-agent systems when computing gradients is impractical or impossible (for example, with black-box LLM objectives or private data). Com-DSZO uses only two function evaluations per iteration and compresses the messages agents exchange, reducing communication overhead. The algorithm achieves sublinear convergence rates for both smooth and nonsmooth objective functions, and a variance-reduced variant, VR-Com-DSZO, further improves performance by using mini-batch feedback. The key contribution is a communication-efficient method for distributed optimization that requires no gradient information, which is especially valuable in privacy-preserving or black-box settings, including LLM-based multi-agent systems where communication costs are significant and gradients of the objective are difficult or impossible to obtain.
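To make the two building blocks the summary mentions concrete, here is a minimal sketch of a two-point zeroth-order gradient estimate and a compression operator, assuming Gaussian smoothing for the estimator, top-k sparsification for the compressor, and simple consensus averaging for the distributed step. The function names, the mini-batch variant, and the update rule are illustrative assumptions, not the paper's actual Com-DSZO or VR-Com-DSZO recursions.

```python
import numpy as np

def two_point_grad_estimate(f, x, mu=1e-4, rng=None):
    """Generic two-point zeroth-order gradient estimate of f at x.

    Uses exactly two function evaluations, f(x + mu*u) and f(x), along a
    random Gaussian direction u. This mirrors the "two function evaluations
    per iteration" idea, not the paper's exact estimator.
    """
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

def minibatch_grad_estimate(f, x, batch_size=8, mu=1e-4, rng=None):
    """Average several two-point estimates, loosely analogous to the
    mini-batch feedback used by the variance-reduced variant."""
    rng = rng or np.random.default_rng()
    estimates = [two_point_grad_estimate(f, x, mu, rng) for _ in range(batch_size)]
    return np.mean(estimates, axis=0)

def top_k_compress(v, k):
    """Illustrative compressor: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def distributed_zo_step(x_local, f_local, neighbor_messages, step=0.01, rng=None):
    """One illustrative per-agent iteration: average the local iterate with
    compressed messages received from neighbors, then take a zeroth-order
    descent step using only function evaluations."""
    consensus = np.mean([x_local] + list(neighbor_messages), axis=0)
    grad_est = two_point_grad_estimate(f_local, x_local, rng=rng)
    return consensus - step * grad_est

# Toy usage: each agent holds a black-box objective and only sends
# top-k-compressed iterates to its neighbors.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f_local = lambda x: np.sum((x - 1.0) ** 2)   # stand-in black-box objective
    x = rng.standard_normal(10)
    neighbor_iterates = [rng.standard_normal(10) for _ in range(3)]
    messages = [top_k_compress(xn, k=3) for xn in neighbor_iterates]
    x_next = distributed_zo_step(x, f_local, messages, rng=rng)
    print(x_next)
```

In practice, compressed-communication methods typically compress iterate differences or use error feedback rather than raw iterates, and the paper's analysis covers those details; the sketch above only conveys the overall shape of a gradient-free, compression-aware update.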