Configurations

Azkaban can be configured in many ways. The following describes the knobs and switches that can be set. For the most part, there is no need to deviate from the default values.

Azkaban Web Server Configurations

These are properties to configure the web server. They should be set in azkaban.properties.

General Properties

| Parameter | Description | Default |
| --- | --- | --- |
| azkaban.name | The name of the Azkaban instance that will show up in the UI. Useful if you run more than one Azkaban instance. | Local |
| azkaban.label | A label to describe the Azkaban instance. | My Local Azkaban |
| azkaban.color | Hex value that allows you to set a style color for the Azkaban UI. | #FF3601 |
| web.resource.dir | Sets the directory for the UI's CSS and JavaScript files. | web/ |
| default.timezone | The timezone that will be displayed by Azkaban. | America/Los_Angeles |
| viewer.plugin.dir | Directory where viewer plugins are installed. | plugins/viewer |
| job.max.Xms | The maximum initial amount of memory each job can request. This validation is performed at project upload time. | 1GB |
| job.max.Xmx | The maximum amount of memory each job can request. This validation is performed at project upload time. | 2GB |
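
As a rough illustration, the general properties could appear in azkaban.properties like this. The instance name and label are made-up values for a hypothetical staging deployment; the remaining values are the documented defaults.

```properties
# Hypothetical name and label for a second Azkaban instance
azkaban.name=Staging
azkaban.label=Staging Azkaban
# Documented defaults below
azkaban.color=#FF3601
web.resource.dir=web/
default.timezone=America/Los_Angeles
viewer.plugin.dir=plugins/viewer
```

Comments are kept on their own lines because the Java properties format treats a trailing `#` as part of the value.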

Multiple Executor Mode Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| azkaban.use.multiple.executors | Whether Azkaban should run in multi-executor mode. Required for multiple executor mode. | false |
| azkaban.executorselector.filters | A comma-separated list of hard filters to be used while dispatching. To be chosen from StaticRemaining, FlowSize, MinimumFreeMemory and CpuStatus. The order of the filters does not matter. | |
| azkaban.executorselector.comparator.{ComparatorName} | Integer weight used to rank available executors for a given flow. Currently, {ComparatorName} can be NumberOfAssignedFlowComparator, Memory, LastDispatched or CpuUsage. For example: azkaban.executorselector.comparator.Memory=2 | |
| azkaban.queueprocessing.enabled | Whether the queue processor should be enabled from web server initialization. | true |
| azkaban.webserver.queue.size | Maximum number of flows that can be queued at the web server. | 100000 |
| azkaban.activeexecutor.refresh.milisecinterval | Maximum time in milliseconds that flows can be processed without refreshing executor statistics. | 50000 |
| azkaban.activeexecutor.refresh.flowinterval | Maximum number of queued flows that can be processed without refreshing executor statistics. | 5 |
| azkaban.executorinfo.refresh.maxThreads | Maximum number of threads used to refresh executor statistics. | 5 |
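
A minimal sketch of a multi-executor setup in azkaban.properties follows. The filter and comparator names come from the table above; the weights are arbitrary values chosen only to show that a larger integer gives a comparator more influence, not recommended settings.

```properties
azkaban.use.multiple.executors=true
# Hard filters: an executor must pass every listed filter to be considered
azkaban.executorselector.filters=MinimumFreeMemory,CpuStatus
# Comparator weights (illustrative values only)
azkaban.executorselector.comparator.NumberOfAssignedFlowComparator=1
azkaban.executorselector.comparator.Memory=2
azkaban.executorselector.comparator.LastDispatched=1
azkaban.executorselector.comparator.CpuUsage=1
# Documented defaults for queue processing
azkaban.queueprocessing.enabled=true
azkaban.webserver.queue.size=100000
```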

Jetty Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| jetty.maxThreads | Max request threads | 25 |
| jetty.ssl.port | The ssl port | 8443 |
| jetty.keystore | The keystore file | |
| jetty.password | The jetty password | |
| jetty.keypassword | The keypassword | |
| jetty.truststore | The trust store | |
| jetty.trustpassword | The trust password | |
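
For orientation, an SSL-enabled Jetty section might look like the fragment below. The keystore and truststore file names and all passwords are placeholders invented for this example; substitute the files and secrets of your own deployment.

```properties
jetty.maxThreads=25
jetty.ssl.port=8443
# Placeholder file names and passwords (not defaults)
jetty.keystore=keystore
jetty.password=changeme
jetty.keypassword=changeme
jetty.truststore=keystore
jetty.trustpassword=changeme
```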

Project Manager Settings

| Parameter | Description | Default |
| --- | --- | --- |
| project.temp.dir | The temporary directory used when uploading projects. | temp |
| project.version.retention | The number of unused project versions retained before cleaning. | 3 |
| creator.default.proxy | Automatically adds the creator of a project as a proxy user to the project. | true |
| lockdown.create.projects | Prevents anyone except users with Admin roles from creating new projects. | false |
| lockdown.upload.projects | Prevents anyone but admin users and users with permissions from uploading projects. | false |

MySQL Connection Parameter

| Parameter | Description | Default |
| --- | --- | --- |
| database.type | The database type. Currently, the only database supported is mysql. | mysql |
| mysql.port | The port to the mysql db | 3306 |
| mysql.host | The mysql host | localhost |
| mysql.database | The mysql database | |
| mysql.user | The mysql user | |
| mysql.password | The mysql password | |
| mysql.numconnections | The number of connections that Azkaban web client can open to the database | 100 |
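
The connection settings have no defaults for the database name, user, and password, so they must be supplied. A sketch, assuming a database and account both named `azkaban` (hypothetical names) on the local host:

```properties
database.type=mysql
mysql.host=localhost
mysql.port=3306
# Hypothetical database name and credentials
mysql.database=azkaban
mysql.user=azkaban
mysql.password=changeme
mysql.numconnections=100
```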

Executor Manager Properties

| Parameter | Description | Default |
| --- | --- | --- |
| executor.port | The port for the azkaban executor server | 12321 |
| executor.host | The host for azkaban executor server | localhost |
| execution.logs.retention.ms | Time in milliseconds that execution logs are retained | 7257600000L (12 weeks) |
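
The retention default follows from 12 weeks × 7 days × 24 hours × 3600 seconds × 1000 ms = 7257600000 ms. The fragment below spells that out. It assumes the trailing `L` in the documented default is a Java long-literal suffix that does not belong in the properties file; verify this against your Azkaban version.

```properties
executor.host=localhost
executor.port=12321
# 12 weeks expressed in milliseconds (written without the "L" suffix, by assumption)
execution.logs.retention.ms=7257600000
```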

Notification Email Properties

| Parameter | Description | Default |
| --- | --- | --- |
| mail.sender | The email address that Azkaban uses to send emails. | |
| mail.host | The email server host machine. | |
| mail.user | The email server user name. | |
| mail.password | The email server password. | |
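
None of the mail properties has a default, so notification email only works once they are set. The addresses, host, and credentials below are placeholders under the reserved example.com domain, not real values.

```properties
# All values are placeholders
mail.sender=azkaban@example.com
mail.host=smtp.example.com
mail.user=azkaban
mail.password=changeme
```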

User Manager Properties

| Parameter | Description | Default |
| --- | --- | --- |
| user.manager.class | The user manager that is used to authenticate a user. The default is an XML user manager, but it can be overridden to support other authentication methods, such as JNDI. | azkaban.user.XmlUserManager |
| user.manager.xml.file | XML file for the XmlUserManager. | conf/azkaban-users.xml |
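
Written out explicitly, the default user manager configuration corresponds to the fragment below. Replacing the class name with a custom implementation is how other authentication methods would be plugged in; no such class is shown here because its name depends on your own code.

```properties
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=conf/azkaban-users.xml
```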

User Session Properties

| Parameter | Description | Default |
| --- | --- | --- |
| session.time.to.live | The session time to live, in milliseconds. | 86400000 |
| max.num.sessions | The maximum number of sessions before people are evicted. | 10000 |
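
The default time to live of 86400000 ms equals 24 hours. As an example of tightening it, the sketch below uses an 8-hour session (8 × 3600 × 1000 = 28800000 ms); the 8-hour figure is an arbitrary choice for illustration.

```properties
# 8 hours in milliseconds (illustrative; the default is 86400000, i.e. 24 hours)
session.time.to.live=28800000
max.num.sessions=10000
```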

Azkaban Executor Server Configuration

Executor Server Properties

| Parameter | Description | Default |
| --- | --- | --- |
| executor.port | The port for the Azkaban executor server. | 12321 |
| executor.global.properties | A path to the properties that will be the parent for all jobs. | none |
| azkaban.execution.dir | The folder for execution working directories. | executions |
| azkaban.project.dir | The folder for storing temporary copies of project files used for executions. | projects |
| executor.flow.threads | The number of simultaneous flows that can be run. These threads are mostly idle. | 30 |
| job.log.chunk.size | For rolling job logs. The chunk size for each rollover. | 5MB |
| job.log.backup.index | The number of log chunks. The maximum size of each log is then the index * chunk size. | 4 |
| flow.num.job.threads | The number of concurrently running jobs in each flow. These threads are mostly idle. | 10 |
| job.max.Xms | The maximum initial amount of memory each job can request. If a job requests more than this, the Azkaban server will not launch the job. | 1GB |
| job.max.Xmx | The maximum amount of memory each job can request. If a job requests more than this, the Azkaban server will not launch the job. | 2GB |
| azkaban.server.flow.max.running.minutes | The maximum time in minutes that a flow may live inside Azkaban after it starts executing. If a flow runs longer than this, it will be killed. If less than or equal to 0, there is no restriction on running time. | -1 |
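
To make the interplay of these settings concrete, here is a sketch of an executor server section. The global properties path is a hypothetical location, and the 24-hour flow limit (1440 minutes) is an arbitrary example; the other values are the documented defaults.

```properties
executor.port=12321
# Hypothetical path to the shared parent properties for all jobs
executor.global.properties=conf/global.properties
azkaban.execution.dir=executions
azkaban.project.dir=projects
executor.flow.threads=30
flow.num.job.threads=10
# Per the description above, each job log is capped at index * chunk size = 4 * 5MB = 20MB
job.log.chunk.size=5MB
job.log.backup.index=4
# Kill flows that run longer than 24 hours (illustrative; the default -1 means no limit)
azkaban.server.flow.max.running.minutes=1440
```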

MySQL Connection Parameter

| Parameter | Description | Default |
| --- | --- | --- |
| database.type | The database type. Currently, the only database supported is mysql. | mysql |
| mysql.port | The port to the mysql db | 3306 |
| mysql.host | The mysql host | localhost |
| mysql.database | The mysql database | |
| mysql.user | The mysql user | |
| mysql.password | The mysql password | |
| mysql.numconnections | The number of connections that Azkaban web client can open to the database | 100 |

Plugin Configurations

Execute-As-User

With a new security enhancement in Azkaban 3.0, Azkaban jobs can now run as the submit user or the user.to.proxy of the flow by default. This ensures that Azkaban takes advantage of the Linux permission security mechanism, and it simplifies resource monitoring and visibility operationally. Set up this behavior by doing the following:

  • execute.as.user is set to true by default. If needed, it can be set to false in azkaban-plugin’s commonprivate.properties.
  • Configure azkaban.native.lib= to the place where you are going to put the compiled execute-as-user.c file (see below).
  • Generate an executable on the Azkaban box for azkaban-common/src/main/c/execute-as-user.c. It should be named execute-as-user. Below is a sample approach:
      • scp ./azkaban-common/src/main/c/execute-as-user.c onto the Azkaban box
      • run: gcc execute-as-user.c -o execute-as-user
      • run: chown root execute-as-user (you might need root privilege)
      • run: chmod 6050 execute-as-user (you might need root privilege)
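
The two properties named in the steps above could be set in commonprivate.properties roughly as follows. The directory is a hypothetical location chosen for this example; it should be wherever the compiled execute-as-user binary was actually placed.

```properties
# Enabled by default; set to false to turn the feature off
execute.as.user=true
# Hypothetical directory containing the compiled execute-as-user executable
azkaban.native.lib=/opt/azkaban/native-lib
```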