Introduction to HTCondor Submit Parameters
twins, posted 4 years ago

 1. Run two processes, each writing its output to a different directory (relative to the current directory)

#################### 
# 
# Example 3: demonstrate use of multiple 
# directories for data organization. 
# 
####################
executable = mathematica 
universe = vanilla 
input = test.data 
output = loop.out 
error = loop.error 
log = loop.log 
request_memory = 1 GB
initialdir = run_1 
queue
initialdir = run_2 
queue

2. Run 150 processes, preferring machines with at least 64 MB of physical memory

 ####################                    
  #
  # Example 4: Show off some fancy features including
  # the use of pre-defined macros.
  #
  ####################                                                    

  Executable     = foo                                                    
  Universe       = standard                                                    
  requirements   = OpSys == "LINUX" && Arch =="INTEL"     
  rank           = Memory >= 64
  image_size     = 28000
  request_memory = 32

  error   = err.$(Process)                                                
  input   = in.$(Process)                                                 
  output  = out.$(Process)                                                
  log     = foo.log

  queue 150
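
The effect of $(Process) together with queue 150 can be illustrated with a small Python sketch. It is not HTCondor code: the helper names expand_process_macro and files_for_queue are made up for illustration, and it only mimics the substitution of $(Process) with the process numbers 0 through 149.

```python
def expand_process_macro(template: str, process: int) -> str:
    """Substitute $(Process) in a submit-file value with a process number."""
    return template.replace("$(Process)", str(process))


def files_for_queue(count: int) -> list[dict[str, str]]:
    """Return the per-process file names implied by 'queue <count>'."""
    # These templates are copied from Example 4 above.
    templates = {
        "error": "err.$(Process)",
        "input": "in.$(Process)",
        "output": "out.$(Process)",
    }
    return [
        {command: expand_process_macro(value, process)
         for command, value in templates.items()}
        for process in range(count)  # process numbers start at 0
    ]


jobs = files_for_queue(150)
print(jobs[0])    # file names used by the first process
print(jobs[-1])   # file names used by the last process
```

Each of the 150 processes therefore reads its own in.N file and writes its own out.N and err.N, while all of them share the single log file foo.log.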

3. Declare when and how files are transferred.

  # Declare whether to transfer files
  should_transfer_files = IF_NEEDED
  when_to_transfer_output = ON_EXIT
  
  # Declare which files to transfer
  should_transfer_files = YES
  when_to_transfer_output = ON_EXIT
  transfer_input_files = file1,file2

  The three possible values of should_transfer_files:

YES: 
HTCondor transfers both the executable and the file defined by the input command from the machine where the job is submitted to the remote machine where the job is to be executed. The file defined by the output command as well as any files created by the execution of the job are transferred back to the machine where the job was submitted. The command when_to_transfer_output determines when these files are transferred and where they are placed.

IF_NEEDED:
HTCondor transfers files if the job is matched with and to be executed on a machine in a different FileSystemDomain than the one the submit machine belongs to, the same as if should_transfer_files = YES. If the job is matched with a machine in the local FileSystemDomain, HTCondor will not transfer files and relies on the shared file system.

NO: 
HTCondor's file transfer mechanism is disabled.

4. How to specify directories for file transfer

  4.1 Directory contents produced by the job:

  o1
  o2
  d1 (directory)
     o3
     o4

transfer_output_files = o1,o2,d1

  4.2 Result

      o1
      o2
      d1 (directory)
          o3
          o4

   4.3 Input parameter

transfer_output_files = o1,o2,d1/o3

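  4.4 Result

      o1
      o2
      o3

  Only the base file name is used for the name and location of a file transferred back, so o3 lands next to o1 and o2 rather than inside d1.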
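
The two cases in this section can be modelled with a short Python sketch. It rests on an assumption taken from the HTCondor manual: a directory entry is transferred back with its contents, while a file named by a path is transferred back under its base name only. The function transferred_back is a hypothetical helper, not part of HTCondor, and it ignores trailing-slash handling and remaps.

```python
import posixpath


def transferred_back(entries, sandbox_files):
    """Return the paths that appear on the submit side for a given
    transfer_output_files list (illustrative model only)."""
    result = set()
    for entry in entries:
        if entry in sandbox_files:
            # A single file: only its base name is kept.
            result.add(posixpath.basename(entry))
            continue
        # Otherwise treat the entry as a directory: it is copied with its
        # contents, rooted at the directory's own base name.
        parent = posixpath.dirname(entry)
        for path in sandbox_files:
            if path.startswith(entry + "/"):
                result.add(posixpath.relpath(path, parent) if parent else path)
    return sorted(result)


# Files the job leaves in its working directory (from 4.1 above).
sandbox = ["o1", "o2", "d1/o3", "d1/o4"]

print(transferred_back(["o1", "o2", "d1"], sandbox))
print(transferred_back(["o1", "o2", "d1/o3"], sandbox))
```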

5. Putting it all together:

5.1 Directory layout (the directory the job is submitted from)

/scratch/test (directory)
      my_program.condor (the submit description file)
      my_program (the executable)
      files (directory)
          logs2 (directory)
          in1 (file)
          in2 (file)
      logs (directory)

5.2 Submit description file

# file name:  my_program.condor
# HTCondor submit description file for my_program
Executable      = my_program
Universe        = vanilla
Error           = logs/err.$(cluster)
Output          = logs/out.$(cluster)
Log             = logs/log.$(cluster)

should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = files/in1,files/in2

Arguments       = in1 in2 out1
Queue

5.3 Result:

This first example explicitly transfers input files. These input files to be transferred are specified relative to the directory where the job is submitted. An output file specified in the arguments command, out1, is created when the job is executed. It will be transferred back into the directory /scratch/test.

5.4 Submit description file

# file name:  my_program.condor
# HTCondor submit description file for my_program
Executable      = my_program
Universe        = vanilla
Error           = logs2/err.$(cluster)
Output          = logs2/out.$(cluster)
Log             = logs2/log.$(cluster)

initialdir      = files

should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = in1,in2

Arguments       = in1 in2 out1
Queue

5.5 Result:

This example illustrates the use of the submit command initialdir and its effect on the paths used for the various files. The expected location of the executable is not affected by the initialdir command. All other files (specified by input, output, error, transfer_input_files, as well as files modified or created by the job and automatically transferred back) are located relative to the specified initialdir. Therefore, the output file, out1, will be placed in the files directory. Note that the logs2 directory inside files must already exist for this example to work correctly.
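
The path rule described here can be sketched in Python. This is only a model of the rule as stated in the text, not HTCondor's implementation: resolve_paths is a hypothetical helper, and the cluster number 7 in the file names is an assumed value.

```python
import posixpath


def resolve_paths(submit_dir, initialdir, executable, other_files):
    """Resolve submit-file paths: the executable stays relative to the
    submit directory, everything else is relative to initialdir."""
    base = posixpath.join(submit_dir, initialdir) if initialdir else submit_dir
    resolved = {"executable": posixpath.join(submit_dir, executable)}
    for command, path in other_files.items():
        resolved[command] = posixpath.join(base, path)
    return resolved


paths = resolve_paths(
    submit_dir="/scratch/test",
    initialdir="files",
    executable="my_program",
    other_files={
        "output": "logs2/out.7",   # cluster number 7 is assumed
        "error": "logs2/err.7",
        "input_file": "in1",       # one entry of transfer_input_files
        "job_output": "out1",      # file created by the job
    },
)
for command, path in paths.items():
    print(f"{command:12} {path}")
```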

6. Interactive jobs:

An interactive job is a Condor job that is provisioned and scheduled like any other vanilla universe Condor job onto an execute machine within the pool. The result of a running interactive job is a shell prompt issued on the execute machine where the job runs. The user that submitted the interactive job may then use the shell as desired, perhaps to interactively run an instance of what is to become a Condor job. This might aid in checking that the set up and execution environment are correct, or it might provide information on the RAM or disk space needed. This job (shell) continues until the user logs out or any other policy implementation causes the job to stop running. A useful feature of the interactive job is that the users and jobs are accounted for within Condor's scheduling and priority system.

Neither the submit nor the execute host for interactive jobs may be on Windows platforms.
The current working directory of the shell will be the initial working directory of the running job. The shell type will be the default for the user that submits the job. At the shell prompt, X11 forwarding is enabled.

Each interactive job will have a job ClassAd attribute of
  InteractiveJob = True

Submission of an interactive job specifies the option -interactive on the condor_submit command line.
A submit description file may be specified for this interactive job. Within this submit description file, a specification of these 5 commands will be either ignored or altered:
executable
transfer_executable
arguments
universe. The interactive job is a vanilla universe job.
queue <n>. In this case the value of <n> is ignored; exactly one interactive job is queued.
The submit description file may specify anything else needed for the interactive job, such as files to transfer. If no submit description file is specified for the job, a default one is utilized as identified by the value of the configuration variable INTERACTIVE_SUBMIT_FILE.
Here are examples of situations where interactive jobs may be of benefit.
An application that cannot be batch processed might be run as an interactive job. Where input or output cannot be captured in a file and the executable may not be modified, the interactive nature of the job may still be run on a pool machine, and within the purview of Condor.
A pool machine with specialized hardware that requires interactive handling can be scheduled with an interactive job that utilizes the hardware.
The debugging and set up of complex jobs or environments may benefit from an interactive session. This interactive session provides the opportunity to run scripts or applications, and as errors are identified, they can be corrected on the spot.
Development may have an interactive nature, and proceed more quickly when done on a pool machine. It may also be that the development platforms required reside within Condor's purview as execute hosts.

Interactive jobs cannot be used on Windows, so the English documentation is quoted above as is.

7. Running on different operating systems

  ####################
  #
  # Example of submission targeting RedHat platforms in a heterogeneous Linux pool
  #
  ####################

  universe     = vanilla
  Executable   = /bin/date
  Log          = distro.log
  Output       = distro.out
  Error        = distro.err

  Requirements = (OpSysName == "RedHat")

  Queue
  ####################
  #
  # Example of submission targeting RedHat 6 platforms in a heterogeneous Linux pool
  #
  ####################

  universe     = vanilla
  Executable   = /bin/date
  Log          = distro.log
  Output       = distro.out
  Error        = distro.err

  Requirements = ( OpSysName == "RedHat" && OpSysMajorVersion == 6)

  Queue
Here is a more compact way to specify a RedHat 6 platform.
  ####################
  #
  # Example of submission targeting RedHat 6 platforms in a heterogeneous Linux pool
  #
  ####################

  universe     = vanilla
  Executable   = /bin/date
  Log          = distro.log
  Output       = distro.out
  Error        = distro.err

  Requirements = ( OpSysAndVer == "RedHat6")

  Queue

8. Running on different instruction set architectures:

   ####################
  #
  # Example of heterogeneous submission
  #
  ####################

  universe     = standard
  Executable   = povray.$$(OpSys).$$(Arch)
  Log          = povray.log
  Output       = povray.out.$(Process)
  Error        = povray.err.$(Process)

  # HTCondor automatically adds the correct expressions to ensure that the
  # checkpointed jobs will restart on the correct platform types.
  Requirements = ( (Arch == "INTEL" && OpSys == "LINUX") || \
                 (Arch == "X86_64" && OpSys == "LINUX") )

  Arguments    = +W1024 +H768 +Iimage1.pov
  Queue 

  Arguments    = +W1024 +H768 +Iimage2.pov
  Queue 

  Arguments    = +W1024 +H768 +Iimage3.pov
  Queue
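
The $$(OpSys) and $$(Arch) references in the Executable line are filled in from the ClassAd of the machine the job is matched with. The Python sketch below imitates that substitution with a regular expression. The function expand_matched_attrs and the two machine dictionaries are illustrative assumptions, and the lookup is case-sensitive, which is a simplification.

```python
import re


def expand_matched_attrs(value, machine_ad):
    """Replace each $$(Name) with the matched machine's attribute value."""
    return re.sub(
        r"\$\$\((\w+)\)",
        lambda match: str(machine_ad[match.group(1)]),
        value,
    )


template = "povray.$$(OpSys).$$(Arch)"

# Two hypothetical machines that satisfy the Requirements expression above.
print(expand_matched_attrs(template, {"OpSys": "LINUX", "Arch": "INTEL"}))
print(expand_matched_attrs(template, {"OpSys": "LINUX", "Arch": "X86_64"}))
```

With this naming scheme, one binary per platform (for example povray.LINUX.INTEL and povray.LINUX.X86_64) has to exist in the submit directory.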

9. File transfer options:

The requirements expression for a job must depend on the should_transfer_files command. The job must specify the correct logic to ensure that the job is matched with a resource that meets the file transfer needs. If no requirements expression is in the submit description file, or if the expression specified does not refer to the attributes listed below, condor_submit adds an appropriate clause to the requirements expression for the job. condor_submit appends these clauses with a logical AND, &&, to ensure that the proper conditions are met. Here are the default clauses corresponding to the different values of should_transfer_files:

  1. should_transfer_files = YES results in the addition of the clause (HasFileTransfer). If the job is always going to transfer files, it is required to match with a machine that has the capability to transfer files.

  2. should_transfer_files = NO results in the addition of (TARGET.FileSystemDomain == MY.FileSystemDomain). In addition, HTCondor automatically adds the FileSystemDomain attribute to the job ClassAd, with whatever string is defined for the condor_schedd to which the job is submitted. If the job is not using the file transfer mechanism, HTCondor assumes it will need a shared file system, and therefore, a machine in the same FileSystemDomain as the submit machine.

  3. should_transfer_files = IF_NEEDED results in the addition of

      (HasFileTransfer || (TARGET.FileSystemDomain == MY.FileSystemDomain))

    If HTCondor will optionally transfer files, it must require that the machine is either capable of transferring files or in the same file system domain.

To ensure that the job is matched to a machine with enough local disk space to hold all the transferred files, HTCondor automatically adds the DiskUsage job attribute. This attribute includes the total size of the job's executable and all input files to be transferred. HTCondor then adds an additional clause to the Requirements expression that states that the remote machine must have at least enough available disk space to hold all these files:

  && (Disk >= DiskUsage)
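
As a rough illustration of how these clauses combine, the following Python sketch builds the resulting expression as a string. It is a simplified model, not what condor_submit actually runs: effective_requirements is a hypothetical helper, and it always appends the default clauses instead of first checking whether the user's expression already refers to the attributes.

```python
# Default clauses as listed in the text above.
DEFAULT_CLAUSES = {
    "YES": "(HasFileTransfer)",
    "NO": "(TARGET.FileSystemDomain == MY.FileSystemDomain)",
    "IF_NEEDED": "(HasFileTransfer || "
                 "(TARGET.FileSystemDomain == MY.FileSystemDomain))",
}


def effective_requirements(user_requirements, should_transfer_files):
    """Join the user's expression and the default clauses with &&."""
    clauses = []
    if user_requirements:
        clauses.append(f"({user_requirements})")
    clauses.append(DEFAULT_CLAUSES[should_transfer_files.upper()])
    clauses.append("(Disk >= DiskUsage)")  # disk-space clause from the text
    return " && ".join(clauses)


print(effective_requirements('OpSysName == "RedHat"', "IF_NEEDED"))
print(effective_requirements("", "YES"))
```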

If should_transfer_files = IF_NEEDED and the job prefers to run on a machine in the local file system domain over transferring files, but is still willing to allow the job to run remotely and transfer files, the Rank expression works well. Use:

rank = (TARGET.FileSystemDomain == MY.FileSystemDomain)

The Rank expression is a floating point value, so if other items are considered in ranking the possible machines this job may run on, add the items:

Rank = kflops + (TARGET.FileSystemDomain == MY.FileSystemDomain)

The value of kflops can vary widely among machines, so this Rank expression will likely not do as it intends. To place emphasis on the job running in the same file system domain, but still consider floating point speed among the machines in the file system domain, weight the part of the expression that is matching the file system domains. For example:

Rank = kflops + (10000 * (TARGET.FileSystemDomain == MY.FileSystemDomain))
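
A small numeric sketch in Python shows why the weight matters. The two machines, their kflops values, and the domain names are invented for illustration. A true comparison is counted as 1.0, as in a Rank expression. The weight only has the intended effect when it exceeds the kflops differences found in the pool.

```python
def rank(machine, job_domain, weight):
    """Compute kflops + weight * (same file system domain ? 1 : 0)."""
    same_domain = 1.0 if machine["FileSystemDomain"] == job_domain else 0.0
    return machine["kflops"] + weight * same_domain


# Hypothetical pool: a slow local machine and a faster remote one.
machines = [
    {"Name": "local", "FileSystemDomain": "cs.example.edu", "kflops": 2000},
    {"Name": "remote", "FileSystemDomain": "other.example.org", "kflops": 9000},
]


def best(weight, job_domain="cs.example.edu"):
    """Return the name of the machine with the highest rank."""
    return max(machines, key=lambda m: rank(m, job_domain, weight))["Name"]


print(best(1))       # unweighted: the remote machine's kflops dominates
print(best(10000))   # weighted: the local file system domain wins
```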