A Databricks Job Manager
The Databricks CLI is a very useful tool for working with a Databricks workspace, and it makes starting existing jobs easy. However, to stop the active executions of a particular job, we first need to query for its runs (to get their IDs) and then cancel them one by one. This is the main motivation for this tiny CLI app.
It provides two commands:
- Start an existing Databricks job given its job ID.
- Stop active runs for a particular job given its job ID.
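The two-step stop flow described above (look up the active runs of a job, then cancel each one) can be sketched in Python against the Databricks Jobs REST API 2.1. This is only an illustration of the idea, not the app's actual code: the names `http_call`, `active_run_ids` and `stop_job` are hypothetical, and the sketch reads only the first page of results.

```python
import json
import urllib.parse
import urllib.request


def http_call(host, token):
    """Build a small JSON caller for one workspace (hypothetical helper)."""
    def call(method, path, payload=None):
        url = host.rstrip("/") + path
        data = None
        if method == "GET" and payload:
            # GET parameters travel in the query string.
            url += "?" + urllib.parse.urlencode(payload)
        elif payload is not None:
            # POST parameters travel as a JSON body.
            data = json.dumps(payload).encode("utf-8")
        req = urllib.request.Request(url, data=data, method=method)
        req.add_header("Authorization", f"Bearer {token}")
        if data is not None:
            req.add_header("Content-Type", "application/json")
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read() or b"{}")
    return call


def active_run_ids(call, job_id):
    """Step 1: ask the Jobs API for the active runs of one job."""
    # Only the first page is read; a real tool would follow pagination.
    body = call("GET", "/api/2.1/jobs/runs/list",
                {"job_id": job_id, "active_only": "true"})
    return [run["run_id"] for run in body.get("runs", [])]


def stop_job(call, job_id):
    """Step 2: cancel every active run found in step 1."""
    run_ids = active_run_ids(call, job_id)
    for run_id in run_ids:
        call("POST", "/api/2.1/jobs/runs/cancel", {"run_id": run_id})
    return run_ids
```

Under these assumptions, `stop_job(http_call(host, token), 42)` would cancel the active runs of job 42 and return their IDs. Passing the transport in as `call` keeps the two steps easy to exercise without a real workspace.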
If you use Coursier you can install the application in your current working directory by running
coursier bootstrap com.alhuelamo:dbjobs_3:0.1.0 -o dbjobs
To build the application from source instead, you will need sbt as the main prerequisite.
Go to the repository folder and run
sbt stage
to generate a launcher file for the app. Then you can go to the folder
cd ./target/universal/stage
and run
./bin/dbjobs --help
The application requires a Databricks CLI config file in your home folder (~/.databrickscfg).
You need to define a profile for the Databricks workspace you want to point at.
vim ~/.databrickscfg
[myprofile]
host=https://mydatabricks.workspace.url.com
token=myapiaccesstoken
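The config file uses an INI-style layout, so a profile can be read with Python's standard `configparser`. The following is a minimal sketch of that lookup, not the app's own loader; `load_profile` and its `text` parameter are hypothetical names introduced for illustration.

```python
import configparser
from pathlib import Path


def load_profile(name, text=None):
    """Return (host, token) for one profile of a Databricks CLI config file."""
    # Interpolation is disabled so a '%' inside a token is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    if text is None:
        # Default location mentioned above: ~/.databrickscfg
        text = (Path.home() / ".databrickscfg").read_text()
    parser.read_string(text)
    if name not in parser:
        raise KeyError(f"profile {name!r} not found")
    section = parser[name]
    return section["host"], section["token"]
```

With the example file above, `load_profile("myprofile")` would return the host URL and the token defined under `[myprofile]`.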
Once configured, you can use the app to start and stop existing jobs in that workspace.
Start jobs:
dbjobs start \
--profile myprofile \
--job-ids 42,314,9
Stop active jobs:
dbjobs stop \
--profile myprofile \
--job-ids 42,314,9
You can add the --plan flag to either command to show which jobs and runs would be affected, without actually starting or stopping them.
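To make the dry-run idea concrete, here is a small Python sketch of a command line that accepts the same options and skips the real action when `--plan` is given. It is an assumption-laden illustration rather than the app's parser: `parse_job_ids`, `build_parser`, `run`, the `act` callback and the output wording are all hypothetical, and it reports jobs only, not individual runs.

```python
import argparse


def parse_job_ids(raw):
    """Turn a comma-separated string such as "42,314,9" into a list of ints."""
    return [int(part) for part in raw.split(",") if part.strip()]


def build_parser():
    """Mirror the options shown above (hypothetical re-creation)."""
    parser = argparse.ArgumentParser(prog="dbjobs-sketch")
    parser.add_argument("command", choices=["start", "stop"])
    parser.add_argument("--profile", required=True)
    parser.add_argument("--job-ids", type=parse_job_ids, required=True)
    parser.add_argument("--plan", action="store_true")
    return parser


def run(argv, act):
    """Report each affected job; call act(command, job_id) only without --plan."""
    args = build_parser().parse_args(argv)
    lines = []
    for job_id in args.job_ids:
        if args.plan:
            # Dry run: describe the action but do not perform it.
            lines.append(f"[plan] would {args.command} job {job_id}")
        else:
            act(args.command, job_id)
            lines.append(f"{args.command} job {job_id}")
    return lines
```

Keeping the side effect behind the single `act` callback means the plan and the real execution walk exactly the same list of jobs, so the plan cannot drift from what would actually happen.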