torque-2.5.7-1.el6 security update
Status: stable
Release: Fedora EPEL 6
Update ID: FEDORA-EPEL-2011-3928
Builds: torque-2.5.7-1.el6
Pushed: True
Date Submitted: 2011-07-29 19:01:45
Date Released: 2011-07-30 10:02:59
Date Modified: 2011-08-22 11:23:01
Submitter: stevetraylen
Karma: 0
Details

Warning: Significant change, munge enabled - action required:

Unlike previous versions, the updated EPEL6 build of torque-2.5.7-1 enables munge[1] as an inter-node authentication method.

Installing and enabling munge before upgrading to version 2.5.7-1 of this torque package is strongly advised. A munge package[2] is available in EPEL6.

  • [1] http://code.google.com/p/munge/
  • [2] https://admin.fedoraproject.org/community/?package=munge#package_maintenance

To enable munge on your torque cluster:

  • Install the munge package on the pbs_server and submission hosts in your cluster.
  • On one host, generate a key with /usr/sbin/create-munge-key
  • Copy the key, /etc/munge/munge.key, to the pbs_server and submission hosts in your cluster.
  • Start the munge daemon on these nodes: service munge start && chkconfig munge on
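
The steps above can be sketched as a small shell script. This is a dry-run sketch under stated assumptions, not a tested deployment procedure: the host name submit1.example.org is hypothetical, yum, ssh and scp are assumed as the install and copy tools (the notice names none of them), and the script is assumed to run as root on the host that generates the key. Unless DRY_RUN=0 is set it only prints the commands it would run.

```shell
#!/bin/sh
# Dry-run sketch: enable munge on a torque cluster, run as root on the host
# that generates the key. Nothing is executed unless DRY_RUN=0.
REMOTE_HOSTS="${REMOTE_HOSTS:-submit1.example.org}"   # hypothetical host names
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

munge_plan() {
    # Step 1: install munge locally and on the other hosts (yum is assumed).
    run yum -y install munge
    for host in $REMOTE_HOSTS; do
        run ssh "root@$host" "yum -y install munge"
    done

    # Step 2: generate the key once, on this host only.
    run /usr/sbin/create-munge-key

    # Step 3: copy the key to the other hosts (scp is assumed).
    for host in $REMOTE_HOSTS; do
        run scp -p /etc/munge/munge.key "root@$host:/etc/munge/munge.key"
    done

    # Step 4: start the daemon now and enable it at boot, on every host.
    run service munge start && run chkconfig munge on
    for host in $REMOTE_HOSTS; do
        run ssh "root@$host" "service munge start && chkconfig munge on"
    done
}

munge_plan
```

Set REMOTE_HOSTS to a space-separated list of the remaining pbs_server and submission hosts, and review the printed plan before running with DRY_RUN=0. Note that the dry-run output drops the quoting around the remote command strings.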

ChangeLog:

2.5.7:

  • e - Added new qsub argument -F. This argument takes a quoted string as an argument. The string is a list of space separated commandline arguments which are available to the job script.
  • e - Added an option to asynchronously delete jobs (currently cannot work for qdel -a all due to limitations of single threads). Backported from 3.0.2.
  • c - Fix an issue where job_purge didn't protect key variables that resulted in crashes
  • b - fix bugzilla #134, qmgr -= was deleting all entries (backported from 3.0.2)
  • b - do not prepend ${HOME} with the current dir for -o and -e in qsub (backported from 3.0.2)
  • b - fix jobs named with -J not always having the server name appended correctly (backported from 3.0.2)
  • b - make it so that jobs named like arrays via -J have legal output and error file names (backported from 3.0.2)
  • b - Fixed a bug for high availability. The -l listener option for pbs_server was not complete and did not allow pbs_server to properly communicate with the scheduler. Also fixed a bug with job dependencies where the second or later server in the $TORQUE_HOME/server_name file was not added as part of the job dependency, so dependent jobs would get stuck on hold if the current server was not the first server in the server_name file.
  • b - Fixed a potential buffer overflow problem in src/resmom/checkpoint.c function mom_checkpoint_recover. I modified the code to change strcpy and strcat to strncpy and strncat.
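
As an illustration of the new qsub -F argument listed above, the sketch below shows how a space-separated argument string could be consumed by a job script. It is an assumption-laden sketch: the script name job.sh and the argument values are hypothetical, the arguments are assumed to reach the job script as positional parameters ($1, $2, ...), and the function job_body merely stands in for a job script so the argument handling can be tried without a cluster.

```shell
#!/bin/sh
# Stand-in for the body of a job script (hypothetical). The changelog entry
# says the quoted string is a space-separated list of arguments made available
# to the job script; this sketch assumes they arrive as $1, $2, ...
job_body() {
    echo "argument count: $#"
    for arg in "$@"; do
        echo "arg: $arg"
    done
}

# On a cluster the submission would look like this (job.sh is hypothetical):
#   qsub -F "input.dat 42" job.sh
# Locally, the same splitting on spaces is imitated by leaving ARGS unquoted.
ARGS="input.dat 42"
job_body $ARGS
```

Because the list is split on spaces, an individual argument that itself contains a space cannot be expressed this way in the sketch.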

2.5.6:

  • b - Made changes to record_jobinfo and supporting functions to be able to use dynamically allocated buffers for data. This fixed a problem where incoming data overran fixed-size buffers.
  • b - restored functionality for -W umask as reported in bugzilla 115 (backported from 3.0.1)
  • b - Updated torque.spec.in to be able to handle the snapshot names of builds.
  • e - Added new MOM configuration option job_starter. This option passes the script submitted in qsub to the executable or script provided as the argument to the job_starter option of the MOM configuration file.
  • b - fix pbs_mom -q to work with parallel jobs (backported from 3.0.1)
  • b - fixed a problem with pbs_server high availability where the current server could not keep the HA lock. The problem was a result of truncating the directory name where the lock file was kept. TORQUE would fail to validate permissions because it would do a stat on the wrong directory.
  • b - Added code to free the mom.lock file during MOM shutdown.
  • b - fixed a bug in set_resources that prevented the last resource in a list from being checked. As a result the last item in the list would always be added without regard to previous entries.
  • e - Added new symbol JOB_EXEC_OVERLIMIT. When a job exceeds a limit (e.g. walltime) the job will fail with the JOB_EXEC_OVERLIMIT value and also produce an abort case for mailing purposes. Before this change, a job exceeding a limit returned 0 (success) and no mail was sent to the user, even if mail on abort was requested.
  • e - Added options to buildutils/torque.spec.in to conditionally build munge, BLCR, high-availability, cpusets, and spooling. Also allows customization of the sendmail path and allows for optional XML conversion to serverdb.
  • b - --with-tcp-retry-limit now actually changes things without needing to run autoheader
  • e - Added a new queue resource named procct. procct allows the administrator to set queue limits based on the number of total processors requested in a job. Patch provided by Martin Siegert.
  • e - allow more than 5 concurrent connections to TORQUE using pbsD_connect. Increase it to 10 (backported from 3.0.1)
  • b - fix a segfault when receiving an obit for a job that no longer exists (backported from 3.0.1)
  • b - also remove the procct resource when it is applied because of a default (backported from 3.0.1)
  • e - allow an administrator using the proxy user submission to also set the job id to be used in TORQUE. This makes TORQUE easier to use in grid configurations. (backported from 3.0.2)
  • c - fix a segfault when queue has acl_group_enable and acl_group_sloppy set true and no acl_groups are defined. (backported from 3.0.1)
  • f - Added the ability to detect Nvidia gpus using nvidia-smi (default) or NVML. Server receives gpu statuses from pbs_mom. Added server attribute auto_node_gpu that allows automatically setting number of gpus for nodes based on gpu statuses. Added new configure options --enable-nvidia-gpus, --with-nvml-include and --with-nvml-lib.
  • e - The -e and -o options of qsub allow a user to specify a path or optionally a filename for output. If the path given by the user ended with a directory name but no '/' character at the end, TORQUE was confused and would not convert the .OU or .ER file to the final output/error file. The code now stats the path to see whether the last path element is a file or a directory and handles it appropriately.
  • c - fix a segfault when using --enable-nvidia-gpus and pbs_mom has Nvidia driver older than 260 that still has nvidia-smi command
  • e - Added new MOM configuration option $rpp_throttle. The syntax for this in the $TORQUE_HOME/mom_priv/config file is $rpp_throttle <value> where value is a long representing microseconds. Setting this value causes rpp data to pause after every sendto for <value> microseconds. This may help with large jobs where full data does not arrive at sister nodes.
  • c - check if the file pointer to /dev/console can be opened. If not, don't attempt to write it (backported from 3.0.2)
  • b - Added patch from Michael Jennings to buildutils/torque.spec.in. This patch allows an rpm configured with DRMAA to complete even if all of the support files are not present on the system.
  • b - committed patch submitted by Michael Jennings to fix bug 130. TORQUE on the MOM would call lstat as root when it should call it as user in open_std_file.
  • e - Added capability to automatically set mode on Nvidia gpus. Added support for gpu reseterr option on qsub. Removed server attribute auto_node_gpu. The nodes file will be updated with Nvidia gpu count when the --enable-nvidia-gpus configure option is used. Moved some code out of job_purge_thread to prevent segfault on mom.
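
The two MOM options introduced in the 2.5.6 list above, $rpp_throttle and job_starter, could be set along the following lines. This is a sketch under stated assumptions: the value 500 and the starter path /usr/local/bin/job_starter.sh are purely illustrative, the '$' prefix on job_starter is assumed by analogy with $rpp_throttle, and the script writes to a scratch file rather than the live $TORQUE_HOME/mom_priv/config.

```shell
#!/bin/sh
# Sketch: write the two MOM options to a scratch file for review.
# MOM_CONFIG defaults to a temporary file, not the live MOM config.
MOM_CONFIG="${MOM_CONFIG:-$(mktemp)}"

write_mom_options() {
    # The quoted heredoc delimiter keeps the leading '$' characters literal.
    # Line 1: pause 500 microseconds after every rpp sendto (illustrative value).
    # Line 2: hand each submitted script to a starter (hypothetical path).
    cat >> "$1" <<'EOF'
$rpp_throttle 500
$job_starter /usr/local/bin/job_starter.sh
EOF
}

: > "$MOM_CONFIG"                  # start from an empty scratch file
write_mom_options "$MOM_CONFIG"
cat "$MOM_CONFIG"
```

After review, the two lines would be merged by hand into $TORQUE_HOME/mom_priv/config on each compute node.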
Bugs Fixed
713090 - CVE-2011-2907: torque: Authorization Bypass Vulnerability
Feedback
bodhi - 2011-07-29 19:01:47
This update has been submitted for testing by stevetraylen.
bodhi - 2011-07-30 10:34:52
This update has been pushed to testing
bodhi - 2011-08-13 17:03:16
This update has reached 14 days in testing and can be pushed to stable now if the maintainer wishes
bodhi - 2011-08-26 08:09:05
This update has been submitted for stable by stevetraylen.
bodhi - 2011-08-26 21:56:44
This update has been pushed to stable
