How Hasura makes it easier to work with Jupyter notebooks

Hasura can help enhance your Jupyter notebook experience when it comes to deployment, customisation and sharing

TL;DR: Using Hasura, you can 1) deploy the notebook to the cloud for free, 2) customise your notebooks however you want, 3) set up custom authentication for sharing notebooks, and 4) persist data in tables or files.


Deploying your notebook

Deploying your Jupyter notebook onto the cloud takes just a couple of minutes with Hasura. And it’s free! To start, head over to Hasura Hub and search for jupyter or notebook.

Hub is where you will find community projects that you can clone, modify and deploy onto your own cluster in a couple of minutes. Hub projects let you avoid writing boilerplate code and get started faster.

We have two Jupyter projects on Hub. For this blog, let’s choose the scipy-jupyter-notebook project.

The Jupyter notebook quickstart on Hasura Hub

You can deploy this notebook by following these instructions:

$ hasura clone hasura/scipy-jupyter-notebook
$ cd scipy-jupyter-notebook
$ git add . && git commit -m "init"
$ git push hasura master

And that’s it! Once the push is over, you get an external (HTTPS) endpoint which you can use to create, run and share notebooks.

Customising your notebook

You may require a specific library or tool in your notebook. Although it is very simple to install one from within a notebook via a code cell that runs, for example, pip install tensorflow, it may be more convenient to package it as part of your notebook. There are at least a couple of reasons to do this:

  • The packages have complex dependencies that you don’t want mixing with your code; code should be separated from the environment.
  • You do not want to repeatedly install packages (for example, if you are restarting containers or changing the underlying VMs).

You can add your own libraries or tools by editing the Dockerfile in your project directory. For example, to add the tensorflow package to your scipy-jupyter-notebook, you only have to add one line, as below:

# For example, to add the tensorflow package,
# edit microservices/jupyter/Dockerfile in your project directory

$ cat Dockerfile

FROM jupyter/scipy-notebook

RUN pip install tensorflow

Once you are done editing, just git push again!
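
After the push finishes, you may want to confirm from inside the notebook that the new package actually made it into the image. Here is a minimal sketch that uses only the Python standard library; the list of package names is an assumption, so adjust it to match what your Dockerfile installs.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this kernel."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of packages the image should provide; edit to match your Dockerfile.
expected = ["numpy", "scipy", "tensorflow"]
print("Missing packages:", missing_packages(expected))
```

An empty list means every package is importable. find_spec only locates a package without importing it, so the check stays fast even for heavy libraries.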

Adding access control to make sharing easier

A simple way of sharing notebooks is to share their URLs. But you will need to send your collaborators a token or a password to access the notebook. If you have a large team or if your notebook needs more sophisticated access control, then you will require a custom solution.

Hasura’s API gateway can control access to services based on a user’s role. For example, if we restrict our jupyter service to only allow clients with the role user, then anyone who isn’t logged in with the role user will be redirected to a URL of your choosing. You can, for instance, redirect them to Hasura’s Auth UI kit endpoint so that they can sign up or log in.


Once the user logs in, they are redirected to the jupyter service.
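
To make the behaviour concrete, here is a small Python sketch of the decision described above. It only illustrates the logic and is not Hasura’s implementation; the function name, the role names and the login URL are all hypothetical.

```python
def gateway_decision(user_roles, allowed_roles, redirect_url):
    """Forward the request if the user holds an allowed role, else redirect.

    Returns ("forward", None) or ("redirect", redirect_url).
    """
    if set(user_roles) & set(allowed_roles):
        return ("forward", None)
    return ("redirect", redirect_url)


# Hypothetical login page; in practice this would be your Auth UI kit endpoint.
LOGIN_URL = "https://auth.example.com/ui"

print(gateway_decision(["anonymous"], ["user"], LOGIN_URL))  # visitor without the role
print(gateway_decision(["user"], ["user"], LOGIN_URL))       # logged-in user
```

The first call results in a redirect to the login page, while the second is forwarded to the jupyter service.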

Adding persistent storage

Storing your data in a persistent store helps you avoid repeatedly fetching the data from an external source. You can store data in a relational table using Hasura’s data service or store raw files using Hasura’s file service.

If your data can be stored in the data service, then you get the added benefit of using Hasura’s Data APIs to retrieve data as JSON in your notebook.

Using Hasura’s data APIs to retrieve data in your notebook
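
As a rough idea of what such a notebook cell could look like, here is a sketch that builds a JSON select query and posts it to the data service using only the standard library. The endpoint URL, the table and column names, the payload shape and the bearer-token header are assumptions made for illustration; check the Data API documentation for your project before relying on them.

```python
import json
import urllib.request


def build_select_query(table, columns, limit=None):
    """Build a JSON-serialisable select query (payload shape is an assumption)."""
    args = {"table": table, "columns": list(columns)}
    if limit is not None:
        args["limit"] = limit
    return {"type": "select", "args": args}


def run_query(endpoint, query, token=None):
    """POST the query as JSON to `endpoint` and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumed auth scheme: a bearer token issued by the Auth service.
        headers["Authorization"] = "Bearer " + token
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(query).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage; replace the endpoint and table with your own:
# rows = run_query("https://data.<cluster-name>.example/v1/query",
#                  build_select_query("article", ["id", "title"], limit=10))
```

If the response is a list of JSON rows, it can be handed straight to pandas.DataFrame for analysis.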

Both the Data and File services are integrated with the Auth service to provide role-based access control to tables and files.

Coming soon: Instantly upgrading underlying infra

If your notebook is running out of memory or running very slowly, it may be consuming more resources than are available. In such cases, your only option might be to increase the compute power. Soon, you will simply need to edit your infra.yaml file to upgrade the infrastructure underlying your Hasura project. Watch this space!
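
In the meantime, a quick way to see whether memory is the bottleneck is to check the peak memory used by the notebook kernel. This is a minimal sketch that uses Python’s resource module, which is only available on Unix-like systems; the function name is our own.

```python
import resource
import sys


def peak_memory_mib():
    """Return the peak resident memory of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


print(f"Peak memory used by this kernel: {peak_memory_mib():.1f} MiB")
```

If this number is close to the memory available to your container, more compute is probably what you need.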

Are there any other ways you would like your Jupyter notebook experience enhanced? Let us know in the comments.

PS: Shoutout to the Awesome Python newsletter for featuring this!


Hasura is an open-source engine that gives you realtime GraphQL APIs on new or existing Postgres databases, with built-in support for stitching custom GraphQL APIs and triggering webhooks on database changes.