Open Notebook Systems

Posted Monday, September 26, 2022 by Sri. Tagged MEMO
EDITING PHASE:gathering info...

My friend N mentioned something he's doing with Jupyter Notebooks, and it sounds cool. In the data science work he does, he uses "PyNotebooks" and describes them as a REPL that produces browser output. He runs them from the command line, I believe, but he'd like to come up with a way for non-technical data scientists to host their own versions without all the hassle of Linux, while keeping their research data private rather than sitting in someone else's cloud service.

The Challenge

From what I gather, data scientists without command-line skills still want to process and analyze large collections of data files. They need to extract the data they want, perform some operation on it, and then gather the results. Traditionally, I believe data scientists and analysts used statistical packages like SPSS and mathematical software like Mathematica, but the recent trend is to use Python to build semi-custom tools.
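
A minimal sketch of that extract / operate / gather loop in plain Python, the kind of thing that would live in a notebook cell. Everything here is an assumption made up for illustration: the file names, the "reading" column, the helper names (extract, analyze, run), and the choice of a mean as the "operation". It uses only the standard library so it runs anywhere Python does.

```python
import csv
import statistics
import tempfile
from pathlib import Path


def extract(path, column):
    """Extract step: pull one numeric column out of a CSV file."""
    with open(path, newline="") as f:
        return [float(row[column]) for row in csv.DictReader(f)]


def analyze(values):
    """Operation step: a mean here, but any calculation could go in its place."""
    return statistics.mean(values)


def run(folder, column):
    """Gather step: apply extract + analyze to every CSV in a folder."""
    return {
        path.name: analyze(extract(path, column))
        for path in sorted(Path(folder).glob("*.csv"))
    }


# Tiny made-up dataset (hypothetical files) so the sketch runs on its own.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "a.csv").write_text("reading\n1\n2\n3\n")
(demo_dir / "b.csv").write_text("reading\n10\n20\n")

results = run(demo_dir, "reading")
print(results)
```

In a real notebook the demo files would be replaced by the researcher's own folder of data, which stays on their own machine.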

Examples

Some examples of "in the cloud" versions, which have better usability for non-technical users:

  • This example is the "classical version" that spits out URLs so you can diagnose problems.
  • Google has a version called Colaboratory.
  • Observable HQ is the JavaScript version. I have always loved what these guys are doing. I believe Mike Bostock, the creator of D3.js and a New York Times graphics alum, co-founded this. Beautiful work.

For more technical users:

  • Visual Studio Code directly supports Jupyter Notebooks for the hardcore. It will spin up a local server, I think?
  • Command-line people can also install it with pip, the Python package installer: pip install jupyter
  • I'm intrigued by this blog post about installing Python inside an Electron app.

Trivia

  • On a side note, N says "this is the de facto way to learn Python."