Vizier is an interactive, reactive workbook: A workflow system with a notebook-style interface.
- No Kernels: There's no long-running kernel with state to lose if you have to log out.
- Reproducibility: Vizier executes cells in order, and automatically re-executes cells when their inputs change so your notebook's outputs are always up-to-date.
- Data Snapshots: Vizier automatically snapshots data created by each cell, so you can re-run a cell without re-running all of its inputs.
- Polyglot: You can combine Python, SQL, and Scala, all seamlessly working with the same data.
- Code-Optional: Work with your data through a spreadsheet-style interface or Vizier's "data lenses"; writing code is optional.
- Workflow Snapshots: Vizier automatically keeps a record of how you edit your workflow so you can always go back to an earlier version.
- Scalable: Vizier datasets are backed by Spark and Apache Arrow, allowing you to make big changes fast.
- Reactive: Changes to a cell are reflected immediately in the cells that depend on it.
See some Screenshots
Make sure you have a JDK (v8 preferred, v11 otherwise) and Python 3.x installed.
Download Vizier from releases, or fetch it from the command line:
wget https://maven.mimirdb.info/info/vizierdb/vizier
chmod +x vizier
sudo mv vizier /usr/local/bin
vizier
... and open up http://localhost:5000 in your web browser.
docker run -p 5000:5000 --name vizier okennedy/vizier:latest
... and open up http://localhost:5000 in your web browser.
- Project Website (w/ screenshots)
- Vizier for Jupyter Users
- User Documentation
- Developer Documentation
Unlike most notebooks, Vizier is not backed by a long-running kernel. Each cell runs in a fresh interpreter.
Cells communicate by exporting artifacts:
- datasets (e.g., Pandas or Spark dataframes)
- files
- parameters
- charts
- python functions
For example, you can define and export a function in a Python cell, and then use it in a SQL cell.
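As a rough illustration of the kernel-free model, the following Python sketch runs two "cells" in separate interpreter processes that share nothing except files in an artifact directory. This is a toy model and not Vizier's implementation: `run_cell`, the JSON file names, and the directory layout are all hypothetical.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_cell(source: str, artifact_dir: Path) -> None:
    """Run one 'cell' in a brand-new Python process (hypothetical helper)."""
    # The artifact directory is passed as sys.argv[1] to the child process.
    subprocess.run([sys.executable, "-c", source, str(artifact_dir)], check=True)


# Cell 1 exports a dataset artifact; its variables die with its process.
CELL_1 = """
import json, sys, pathlib
rows = [{"x": 1}, {"x": 2}, {"x": 3}]
pathlib.Path(sys.argv[1], "numbers.json").write_text(json.dumps(rows))
"""

# Cell 2 sees only the exported artifact, never cell 1's variables.
CELL_2 = """
import json, sys, pathlib
rows = json.loads(pathlib.Path(sys.argv[1], "numbers.json").read_text())
total = sum(r["x"] for r in rows)
pathlib.Path(sys.argv[1], "total.json").write_text(json.dumps(total))
"""


def run_workflow() -> int:
    """Run both cells in order and return the final exported artifact."""
    with tempfile.TemporaryDirectory() as tmp:
        artifacts = Path(tmp)
        run_cell(CELL_1, artifacts)
        run_cell(CELL_2, artifacts)
        return json.loads((artifacts / "total.json").read_text())


if __name__ == "__main__":
    print(run_workflow())
```

Because every cell starts from a clean interpreter, the only way state moves forward is through an explicitly exported artifact, which is what makes the data flow between cells visible.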
Vizier tracks which artifacts each cell uses, and therefore which cells depend on which. When a cell is updated, all cells that depend on it are automatically re-run. For example, when you modify the Python function, Vizier automatically re-runs the SQL cell.
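The re-execution behavior can be sketched as a walk over the cells in order, re-running any cell that reads an artifact that has just been rewritten. The names below (`Cell`, `Workflow`, `reads`, `writes`) are hypothetical and chosen only for this illustration; Vizier's actual scheduler may work quite differently.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Artifacts = Dict[str, Any]


@dataclass
class Cell:
    name: str
    reads: List[str]                       # artifacts this cell consumes
    writes: List[str]                      # artifacts this cell exports
    run: Callable[[Artifacts], Artifacts]  # maps inputs to exported outputs


class Workflow:
    def __init__(self, cells: List[Cell]) -> None:
        self.cells = cells
        self.artifacts: Artifacts = {}     # latest snapshot of each artifact
        self.executed: List[str] = []      # log of cell names, in run order

    def _execute(self, cell: Cell) -> None:
        inputs = {key: self.artifacts[key] for key in cell.reads}
        outputs = cell.run(inputs)
        for key in cell.writes:
            self.artifacts[key] = outputs[key]
        self.executed.append(cell.name)

    def run_all(self) -> None:
        for cell in self.cells:
            self._execute(cell)

    def update(self, name: str, run: Callable[[Artifacts], Artifacts]) -> None:
        """Replace one cell's code, then re-run it and its dependents only."""
        stale = set()  # artifacts rewritten during this update
        for cell in self.cells:
            if cell.name == name:
                cell.run = run
                rerun = True
            else:
                rerun = any(key in stale for key in cell.reads)
            if rerun:
                self._execute(cell)
                stale.update(cell.writes)


def demo():
    cells = [
        Cell("load", [], ["data"], lambda _: {"data": [1, 2, 3]}),
        Cell("define_fn", [], ["scale"], lambda _: {"scale": lambda v: v * 2}),
        Cell("query", ["data", "scale"], ["result"],
             lambda a: {"result": [a["scale"](v) for v in a["data"]]}),
        Cell("count", ["data"], ["n"], lambda a: {"n": len(a["data"])}),
    ]
    wf = Workflow(cells)
    wf.run_all()
    wf.executed.clear()
    # Editing the function cell re-runs it and the query that uses it.
    wf.update("define_fn", lambda _: {"scale": lambda v: v * 3})
    return wf.executed, wf.artifacts["result"]


if __name__ == "__main__":
    print(demo())
```

In this sketch the `load` and `count` cells are not re-run after the edit, because neither reads the `scale` artifact; their earlier outputs are reused from the stored snapshot.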
To use this repository you'll need
An easy way to install all three is with Coursier (cs setup and cs install mill).
Some useful commands for using this repository
mill vizier.compile
Compiled class files will be in out/vizier/compile/dest
mill vizier.test
mill vizier.ui.test
The UI test cases require node and jsdom. Install node and then:
npm install jsdom
To run a single test case:
mill vizier.test.testOnly [classname]
To run a single example:
mill vizier.test.testOnly [classname] -- ex "[any text in the example label]"
mill vizier.run [vizier arguments]
Vizier defaults to running on port 5000 on localhost.
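When scripting around a development server, it can help to wait until the port is accepting connections before opening a browser or running further steps. The helper below is a minimal sketch using only the Python standard library; `wait_for_port` and its default values are hypothetical and are not part of Vizier.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 5000,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; pause briefly and try again.
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` with no arguments polls localhost on port 5000, matching the default stated above. A successful TCP connection only shows that something is listening, not that the UI has finished loading.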
If you see an error about xdg-open, it's probably because you're running Vizier in a VM. Starting vizier with the -n flag should remove the error.
Vizier takes a few seconds to start. If you're only editing the UI, you can get a faster development cycle by rebuilding and reloading only the resources directory:
mill -w vizier.ui.resourceDir
Mill's -w flag watches for changes to files in the UI directory. Any changes will trigger a recompile. Vizier uses the generated resources directory in-situ, so reloading the UI in your web browser will load the updated version.
mill vizier.publishLocal
mill mill.contrib.Bloop/install