Testing the Performance of Distributed Coordinate Gradient Descent
Machine learning, distributed computing, stragglers
In this project, we implement distributed coordinate gradient descent in the master/worker setting on a local machine. The goal is to analyze the performance of this algorithm in the presence of stragglers.
The new idea here is to distribute the data redundantly to the workers so that stragglers do not degrade the performance of the overall algorithm. We also try a new coordinate-scoring strategy that reflects how often each coordinate is replicated across workers, to guarantee that the master observes the important coordinates at each iteration with high probability.
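As a minimal single-process sketch of the redundancy idea (an illustration under our own assumptions, not the project code): each coordinate of a least-squares problem is replicated on r = 2 of 4 workers, one worker per round is treated as a straggler and ignored, and the master updates only the coordinates reported by the fast workers. With two replicas and a single straggler, every coordinate is still reported each round, so the update coincides with full gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true  # least-squares target: minimize ||Xw - y||^2 / (2n)

# Redundant assignment: each coordinate lives on r = 2 of the 4 workers.
num_workers, r = 4, 2
assignment = [set() for _ in range(num_workers)]
for j in range(d):
    for k in rng.choice(num_workers, size=r, replace=False):
        assignment[k].add(j)

w = np.zeros(d)
lr = 0.1
for it in range(200):
    grad = X.T @ (X @ w - y) / n       # gradient of the quadratic loss
    straggler = it % num_workers       # simulate one slow worker per round
    received = set()
    for k in range(num_workers):
        if k != straggler:             # master does not wait for the straggler
            received |= assignment[k]
    for j in received:                 # update only the reported coordinates
        w[j] -= lr * grad[j]

print(float(np.linalg.norm(w - w_true)))
```

The printed distance to the true weights is small, since the redundancy keeps every coordinate covered despite the dropped worker.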
We test our ideas on the MNIST dataset and potentially on CIFAR-10.