Diff for "InfolabCluster"

Differences between revisions 1 and 13 (spanning 12 versions)
Revision 1 as of 2012-08-10 21:03:08
Size: 579
Editor: akrevl
Comment:
Revision 13 as of 2012-10-18 01:31:20
Size: 2272
Editor: akrevl
Comment:
Line 3: Line 3:
== Hardware == {{{#!wiki caution
'''Beta warning'''

If Google can keep things in Beta, why can't we? So beware: things might break. Please join the mailing list and report any glitches that you come across.
}}}

We recently decided that a split personality is not a good thing for a cluster: two resource managers that are unaware of each other end up competing for the same resources. That is why we have separated the cluster into a Compute cluster and a Hadoop cluster. Read on for more about the two.

== Compute cluster ==

The compute cluster comes in handy whenever you need a lot of cores to get your job done. Using it is much like checking the CPU and memory load of the other servers yourself and then deciding which one to run your job on, except that the job scheduler checks the load for you and allocates resources on a first-come, first-served basis (at least for the time being; queue priorities may change in the future).

=== Hardware ===

 * 1 head node: iln1
 * 2 development nodes: ild1, ild2
 * 28 compute nodes:
  * 896 CPU cores
  * 1792 GB RAM

=== Software ===

 * Torque resource manager
 * MAUI job scheduler
 * CentOS 6.3
  
=== Resources ===
 
 * [[InfolabClusterCompute|Compute cluster]]

=== Hardware ===
Line 7: Line 37:
 * 38 compute nodes:
  * 1216 cores
  * 2.4 TB RAM
 * 36 compute nodes:
  * 1152 cores
  * 2.25 TB RAM
Line 14: Line 44:
  
== Mailing list ==

=== Mailing list ===
Line 22: Line 52:
== Access == === Software ===

We decided that a split personality is not good for a cluster, which is why we now have one set of nodes dedicated to the compute cluster and another set dedicated to a Hadoop cluster.

 * [[InfolabClusterCompute|Compute cluster]]
   * TORQUE resource manager, MAUI job scheduler
   * Nodes: iln01-iln28
   * Submission node: ilhead1
 * [[InfolabClusterHadoop|Hadoop cluster]]
   * Apache Hadoop
   * Nodes: iln29-iln36
   * Submission node: iln29

Infolab cluster

Beta warning

If Google can keep things in Beta, why can't we? So beware: things might break. Please join the mailing list and report any glitches that you come across.

We recently decided that a split personality is not a good thing for a cluster: two resource managers that are unaware of each other end up competing for the same resources. That is why we have separated the cluster into a Compute cluster and a Hadoop cluster. Read on for more about the two.

Compute cluster

The compute cluster comes in handy whenever you need a lot of cores to get your job done. Using it is much like checking the CPU and memory load of the other servers yourself and then deciding which one to run your job on, except that the job scheduler checks the load for you and allocates resources on a first-come, first-served basis (at least for the time being; queue priorities may change in the future).
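To make the first-come, first-served policy concrete, here is a minimal Python sketch. It is not how Torque or MAUI are implemented or configured on this cluster; the `Job` class, the `schedule_fifo` function and the example queue are all hypothetical and only illustrate the idea that jobs start strictly in arrival order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int  # number of cores the job requests


def schedule_fifo(jobs, free_cores):
    """Start jobs strictly in arrival order while cores remain.

    Returns (started, waiting). The first job that does not fit blocks
    every job queued behind it, even ones that would fit.
    """
    queue = deque(jobs)
    started = []
    while queue and queue[0].cores <= free_cores:
        job = queue.popleft()
        free_cores -= job.cores
        started.append(job.name)
    return started, [job.name for job in queue]


# Hypothetical queue on a pool of 896 cores (the compute cluster total).
jobs = [Job("a", 512), Job("b", 256), Job("c", 256), Job("d", 64)]
started, waiting = schedule_fifo(jobs, free_cores=896)
print(started)  # ['a', 'b']
print(waiting)  # ['c', 'd']
```

Note that job "d" would fit into the 128 cores left over, but it waits behind "c". That is the price of a strict first-come, first-served queue, and one reason queue priorities may change later.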

Hardware

  • 1 head node: iln1
  • 2 development nodes: ild1, ild2
  • 28 compute nodes:
    • 896 CPU cores
    • 1792 GB RAM

Software

  • Torque resource manager
  • MAUI job scheduler
  • CentOS 6.3

Resources

  • Compute cluster

Hardware

  • 2 head nodes
  • 2 development nodes
  • 36 compute nodes:
    • 1152 cores
    • 2.25 TB RAM
  • Each node:
    • 2x AMD Opteron 6276 (Interlagos) @2.3GHz - 3.2GHz, 16 cores/CPU, AMD64, VT
    • 64 GB RAM
    • 2 TB local HDD
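The totals in the hardware lists follow directly from the per-node figures. The short Python sketch below reproduces the arithmetic, assuming that every compute node is identical and that 1 TB is counted as 1024 GB; the function name `cluster_totals` is made up for this example.

```python
# Per-node figures taken from the hardware list above.
CPUS_PER_NODE = 2
CORES_PER_CPU = 16
RAM_GB_PER_NODE = 64


def cluster_totals(nodes):
    """Return (total cores, total RAM in GB) for a number of identical nodes."""
    cores = nodes * CPUS_PER_NODE * CORES_PER_CPU
    ram_gb = nodes * RAM_GB_PER_NODE
    return cores, ram_gb


for label, nodes in [("all compute nodes", 36), ("compute cluster", 28)]:
    cores, ram_gb = cluster_totals(nodes)
    print(f"{label}: {cores} cores, {ram_gb} GB RAM ({ram_gb / 1024:.2f} TB)")
```

For 36 nodes this gives 1152 cores and 2304 GB (2.25 TB), and for the 28 nodes of the compute cluster 896 cores and 1792 GB, which matches the numbers listed on this page.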

Mailing list

There is a mailing list for all those interested in what is currently happening with the cluster and the configuration of the cluster:

Software

We decided that a split personality is not good for a cluster, which is why we now have one set of nodes dedicated to the compute cluster and another set dedicated to a Hadoop cluster.

  • Compute cluster
    • TORQUE resource manager, MAUI job scheduler
    • Nodes: iln01-iln28
    • Submission node: ilhead1
  • Hadoop cluster

    • Apache Hadoop
    • Nodes: iln29-iln36
    • Submission node: iln29