COSMOS News Bulletins
This web page provides the most up-to-date information on the operational status of COSMOS and on future downtimes. If COSMOS is unreachable, consult this page to find out why.
Operational status
COSMOS (Altix 4700) is OFFLINE (it will be brought online only on specific request).
UNIVERSE (Altix UV) is UP.
OBERON (NFS server) is UP.
MOAB/Torque (workload management system) is UP.
Please notify cosmos_sys immediately (telephone 01223 339740) if you find the system unresponsive despite the information on this page.
Scheduled downtimes:
- Exceptional maintenance: -
- Regular maintenance: every week, 9am-3pm Wednesday.
Recent news
(20/03/11) .amtp.cam.ac.uk domain names gone
NOTE: the old way of referring to machines with .amtp.cam.ac.uk names has stopped working.
We (DAMTP) have recently been working towards finally removing the 'amtp' forms of names, as we promised to do when we were allowed to start using 'damtp' names in 1993.
As of 11th March we have removed all machine names of the form xxx.amtp.cam.ac.uk, so attempting to use those DNS names (e.g. ssh'ing to universe.amtp.cam.ac.uk) will no longer work. Simply replace the 'amtp' part of the name with 'damtp' to get a name which will continue to work.
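If you have old host names scattered through scripts or configuration files, the substitution described above can be sketched in a few lines of Python. This is only an illustration: the function names modernise_hostname and modernise_text are made up for this example and are not part of any COSMOS tooling.

```python
import re

OLD_SUFFIX = ".amtp.cam.ac.uk"
NEW_SUFFIX = ".damtp.cam.ac.uk"

# Matches ".amtp.cam.ac.uk" inside free text; the leading dot ensures that
# names already ending in ".damtp.cam.ac.uk" are left alone.
_OLD_NAME = re.compile(r"\.amtp(\.cam\.ac\.uk)\b", re.IGNORECASE)


def modernise_hostname(name: str) -> str:
    """Return a single host name with the old 'amtp' suffix replaced by 'damtp'."""
    stripped = name.rstrip(".")  # tolerate a trailing dot (fully qualified form)
    if stripped.lower().endswith(OLD_SUFFIX):
        return stripped[: -len(OLD_SUFFIX)] + NEW_SUFFIX
    return name


def modernise_text(text: str) -> str:
    """Replace every old-style name found in a block of text (hypothetical helper)."""
    return _OLD_NAME.sub(r".damtp\1", text)


if __name__ == "__main__":
    print(modernise_hostname("universe.amtp.cam.ac.uk"))
    print(modernise_text("ssh universe.amtp.cam.ac.uk"))
```

Names that already use the 'damtp' form pass through unchanged, so it should be safe to run the same text through modernise_text more than once.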
(24/03/10) COSMOS is getting MOAB Suite
The MOAB/Torque installation is taking longer than expected. The work will have to be continued the next day.
(17/03/10) PBSPro is to be replaced soon
The PBSPro license has nearly run its course and we have opted for a more advanced workload management system - MOAB Grid Suite from Adaptive Computing (formerly ClusterResources).
News Archive
(18/03/09)
COSMOS going for ProPack6
COSMOS has been running quite steadily since the last system update in July 2008. Now is the time to give it another boost by going to the latest ProPack6 from SGI.
This update requires a base OS upgrade too, so the procedure is rather complex and time-consuming.
(09/07/08)
COSMOS updated to ProPack5
Finally, we have the COSMOS environment updated to the much more recent Novell SLES 10SP1 operating system with SGI ProPack 5SP5 add-ons.
It took so long (ProPack 5 and SLES10 were released quite some time ago) because it was not a simple question of an OS update. During the update CXFS was removed; Microcosm (the CXFS server, running on IRIX) was retired; file systems were migrated; a new storage RAID was installed; existing RAIDs were re-connected; a new version of LSF was installed; new Intel compilers were installed; COSMOLIB was re-built... And the process is not over yet - there are areas which need some work.
There is also a new service pack for SLES10 out already, waiting to be installed (that should not be as involved, though).
(30/05/08)
STFC Final Report submitted
Our report about COSMOS activities in the period 2006-08 was submitted to STFC.
Here is the submitted version and here is the report in PDF format for offline reading (about 1.3Mb).
(30/05/08)
Victor's last day
Today was Victor Travieso's last day with the COSMOS consortium before
moving on to greener pastures in the entertainment/technology industry.
His contributions loom large in the STFC Final Report; he particularly
played a vital role when he ran COSMOS single-handedly during 2007. We
thank him for his very capable contributions to UK cosmology over the
past 4.5 years and wish him well for the future.
(19/05/08)
STFC Report is due
COSMOS REPORT. We need a final STFC report about our scientific
achievements over the last 3 years - before the end of the month. Much
of this has been done implicitly through our recent proposals for HPC
funding, but we need an update for the last year. Andrey will circulate
everyone with their COSMOS publications and we would be grateful if
you could cooperate by updating these (using SPIRES BibTeX format)
and writing a sentence or two about your own research highlights.
(19/05/08) PPAN Presentation today
This is a critical week concerning STFC funding for HPC. Today
Carlos Frenk is giving a PPAN presentation about the findings of the
Theory and Computation panel (of which Andrew Liddle is also a
member).
(19/05/08)
Victor is leaving
VICTOR TRAVIESO. In light of the recent funding problems at
STFC, it was sad but prudent for Victor to look around for other
employment opportunities. It wasn't long before Frontiers,
a large entertainment/technology company, offered him a senior
role in the development team for their core physics and graphics
engine. We are extremely grateful for all his efforts over the past 4
years on behalf of the UK cosmology community. Victor is clearly an
outstanding programmer: he was able to extract vastly improved
efficiency and performance out of almost any code he touched, and he is
very friendly, flexible, patient and helpful. Moreover, he is also a
talented HPC sysadmin, having run COSMOS in the interregnum between Stuart and Andrey,
during which he single-handedly oversaw our last major upgrade. We are
sure the whole consortium will join us in wishing him well in this new
and exciting challenge. He will remain in touch in Cambridge ...
(19/05/08) COSMOS upgrade delayed (a bit)
The previously announced COSMOS upgrade, which was scheduled for this week, is going to be delayed by two weeks due to the vendor's move across the border (LSI has moved its factory to Mexico). As a result, we have a change of plans.
We are going to close COSMOS for maintenance this Wednesday (21/05/08) and the next one (28/05/08), to prepare the filesystems for the migration and also to test the system within the new environment (OS SLES10SP1 and SGI ProPack5) with Cosmolib rebuilt.
(08/05/08) COSMOS is evolving
As you know very well, the Universe is evolving and so are the tools to study this evolution. COSMOS has been an integral part of those tools for many years, steadily growing in size and capabilities, so we are happy to announce that it is time for it to evolve yet again.
Presuming that our STFC grant allocation becomes available later this year, you will get a huge boost in computing power and storage, but before that can happen, COSMOS needs an intermediate upgrade. In two weeks' time (Wednesday, 21/05/08), COSMOS is going down for an uplift; in brief - more RAID, larger 100GB backed-up user home directories, and comprehensive OS and job scheduler upgrades.
We are going to retire the IRIX file server, which has served us so well for an era, and with it CXFS - the clustered file system, which is no longer compatible with the further planned COSMOS upgrades. We are also going to install a new RAID storage appliance, which will allow us to streamline the storage arrangements and give you more space and better backup capabilities.
There is one inevitable drawback of this upgrade - the Viz server (Cosmos2) will no longer share a file system with Cosmos, so it won't be possible to run visualizations directly, as you were accustomed to before; data files will have to be transferred to the local storage first. It is a small inconvenience, but hopefully not for long, as we plan to get a brand new and much more powerful Viz server with the said upgrade later this year.
After this coming May upgrade you will no longer have the separate cosmos-med storage. Instead, you will get significantly larger home storage, which is fully backed up. (At the moment your quota on cosmos home is 1GB - it will go up to 100GB after the upgrade.)
An essential part of this upgrade is also the installation of the latest OS, SGI ProPack and job scheduler versions, which will allow for better overall performance and better reporting/system tuning capabilities. Not all the consequences of this move will be apparent until the upgrade is done, but one is pretty certain - most of the in-house developed libraries and applications will require a complete rebuild (re-compilation) for the new system.
Please use the remaining time (two weeks from now) to go through your files in all three locations on COSMOS (/local/cosmos/ /local/cosmos-med/ /local/cosmos-tmp/) and clean them up as much as possible (back up to outside locations, remove duplicates, unnecessary files, commonly available source codes, etc.). Pay particular attention to the /local/cosmos-med/ folders - move them out to /local/cosmos-tmp/ folders if necessary.
Those files which remain in /local/cosmos-med/ will be archived to the /local/cosmos/ location during the upgrade. We do not expect any data loss, but we cannot fully back up everything we have on COSMOS, so we cannot guarantee 100% data safety.
Better to be prudent now than sorry later.
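To help with the duplicate hunt, here is a minimal Python sketch that groups files with identical contents under the directories you give it. It is an illustration only, not a supported COSMOS tool: the function names are made up for this example, and it only reports duplicates and never deletes or moves anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots):
    """Return a list of groups (sorted lists of paths) whose contents are identical."""
    # First pass: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue  # skip symlinks and special files
                try:
                    by_size[os.path.getsize(path)].append(path)
                except OSError:
                    continue  # unreadable or vanished file

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(group) for group in by_hash.values() if len(group) > 1)
    return groups
```

Call find_duplicates with a list of your own directories under /local/cosmos/, /local/cosmos-med/ and /local/cosmos-tmp/, and review each reported group by hand before removing anything.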
That is all for now. If you have any questions regarding the upgrade, please do not hesitate to ask cosmos_sys.