

Load distribution on a cluster - liverpoolfan

Hi,<p>A computer cluster is composed of nodes that execute jobs on files stored on the same node (data locality optimization). Some files have more jobs assigned to them than others.<p>Let's say we have:
- file A with load X
- file B with load X
- file C with load 2X
- two nodes in the cluster<p>So the best distribution is: file A and file B on one node and file C on the other node.<p>How can I distribute the files across the nodes? Does a greedy algorithm solve my problem?
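
For reference, a minimal sketch of the greedy approach the question asks about: sort the files by load, heaviest first, and give each one to the node with the smallest total so far (often called longest-processing-time-first). The function name `distribute` and the numeric loads (X taken as 1) are assumptions for illustration. This kind of greedy reproduces the split described above, but in general it is a heuristic: balancing loads across nodes is an NP-hard partitioning problem, and the greedy result can be worse than the optimum on other inputs.

```python
import heapq

def distribute(file_loads, num_nodes):
    """Greedy sketch: heaviest file first, always onto the least-loaded node."""
    # Min-heap of (total load so far, node id), so the lightest node pops first.
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    placement = {node: [] for node in range(num_nodes)}

    # Visit files in order of decreasing load.
    by_load = sorted(file_loads.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in by_load:
        total, node = heapq.heappop(heap)
        placement[node].append(name)
        heapq.heappush(heap, (total + load, node))
    return placement

# The example from the post, with X = 1 (an assumed unit).
placement = distribute({"A": 1, "B": 1, "C": 2}, num_nodes=2)
print(placement)  # {0: ['C'], 1: ['A', 'B']}
```

Sorting first matters: if the files were assigned in arrival order (A, B, C), A and B would land on different nodes and C would push one node to 3X.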
======
lzw
I think maybe you are looking for Hadoop or another MapReduce-based system?
Possibly I misunderstood your question.

~~~
liverpoolfan
A job could be a map/reduce operation.

In this case, my question would be:

How does a map/reduce system redistribute the chunks to balance the load if
some chunks are accessed more often than others?
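
To make the question concrete, here is a rough sketch of one possible rebalancing rule, not a description of how any particular map/reduce system behaves: keep moving a chunk from the busiest node to the idlest node for as long as the move narrows the gap between them. The names `rebalance`, `chunk_loads`, and the access counts are hypothetical, and the sketch ignores replication and the cost of moving data.

```python
def rebalance(placement, chunk_loads):
    """Move chunks from the busiest node to the idlest while that narrows the gap.

    placement:   dict mapping node id -> list of chunk names (not modified)
    chunk_loads: dict mapping chunk name -> observed load (e.g. access count)
    Returns the new placement and the list of (chunk, source, target) moves.
    """
    placement = {node: list(chunks) for node, chunks in placement.items()}
    moves = []
    while True:
        totals = {n: sum(chunk_loads[c] for c in cs) for n, cs in placement.items()}
        busiest = max(totals, key=totals.get)
        idlest = min(totals, key=totals.get)
        gap = totals[busiest] - totals[idlest]

        # A move only helps if the chunk's load is positive and smaller than the gap.
        candidates = [c for c in placement[busiest] if 0 < chunk_loads[c] < gap]
        if not candidates:
            return placement, moves

        # Move the heaviest chunk that still narrows the gap.
        chunk = max(candidates, key=chunk_loads.get)
        placement[busiest].remove(chunk)
        placement[idlest].append(chunk)
        moves.append((chunk, busiest, idlest))

# Hypothetical access counts: c1 is a hot chunk, and everything starts on node 0.
new_placement, moves = rebalance(
    {0: ["c1", "c2", "c3"], 1: []},
    {"c1": 4, "c2": 1, "c3": 1},
)
print(new_placement)  # {0: ['c2', 'c3'], 1: ['c1']}
print(moves)          # [('c1', 0, 1)]
```

The loop stops once no single move can narrow the gap, which here leaves node 0 with load 2 and node 1 with load 4. A single hot chunk that outweighs everything else cannot be balanced by moving alone; it would have to be split or replicated.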

