This is a mirror of a GitHub project page. The actual project, complete with clone URLs, issue tracking, etc., is hosted on GitHub.
dsnb: Data Science Notebook Template
Prerequisites | Features | Running the Code
About
This project is one part playground, one part project skeleton. It is derived from the Jupyter allspark stack, combined with guidelines for standardizing data science project layouts.
Prerequisites
You really just need docker and docker-compose.
If you run into problems, here is my version info, which is confirmed working:
$ docker --version
Docker version 17.03.1-ce, build c6d412e
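If you want to compare your own setup against that version programmatically, a minimal sketch might look like the following. The helper name `parse_docker_version` and the constant `KNOWN_GOOD` are hypothetical (they are not part of this project), and the sketch assumes the `Docker version X.Y.Z...` output format shown above.

```python
import re


def parse_docker_version(output):
    """Extract (major, minor, patch) from `docker --version` output.

    Hypothetical helper for illustration; assumes output shaped like
    'Docker version 17.03.1-ce, build c6d412e'.
    """
    match = re.search(r"Docker version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognized docker version output: %r" % output)
    return tuple(int(part) for part in match.groups())


# The version reported as confirmed working in this README.
KNOWN_GOOD = (17, 3, 1)

sample = "Docker version 17.03.1-ce, build c6d412e"
print(parse_docker_version(sample) >= KNOWN_GOOD)
```

To check a live install, you could feed the function the output of `subprocess.check_output(["docker", "--version"], text=True)` instead of the sample string.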
Features
So what's in the box?
- Everything already shipping with allspark:
  - Jupyter Notebook
  - Python 2 and 3, scipy, numpy, pandas
  - Scala
  - R, Spark
  - lots more
Besides that you get:
- A Dockerfile, used by docker-compose. Mods on top of allspark are:
  - preconfigured to install local requirements
  - preconfigured to use SSL and authentication
  - preconfigured to enable plugins like jupyter_dashboards
- A docker-compose.yml to reduce painful command-line invocations. It includes:
  - a jupyter service built from the Dockerfile, preconfigured with a volume for the working directory and port forwarding
  - a mesos service (not finished yet), ready for a mesos client connection from allspark
- Various Jupyter notebook demos and PoCs I've found useful:
  - graph ops and visualization with networkx
  - csv loading/summarizing/graphing with pandas
  - images with IPython.display and PIL
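As a rough idea of what the pandas demo covers, here is a minimal sketch of loading and summarizing CSV data. It is not taken from the project's notebooks: the inline data, column names, and variable names are made up for illustration.

```python
import io

import pandas as pd

# Made-up inline CSV so the sketch runs without any external file.
csv_text = """city,year,rainfall_mm
Oslo,2015,800
Oslo,2016,760
Lima,2015,16
Lima,2016,12
"""

df = pd.read_csv(io.StringIO(csv_text))

# Summary statistics for the numeric columns (count, mean, std, min, quartiles, max).
print(df.describe())

# Mean rainfall per city, as a Series indexed by city name.
mean_rainfall = df.groupby("city")["rainfall_mm"].mean()
print(mean_rainfall)
```

Inside a notebook, something like `mean_rainfall.plot(kind="bar")` would cover the graphing step, assuming matplotlib is available in the image.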
Running the Code
To bring up the notebook service, run:
$ docker-compose up dsnb
If you make changes to the Dockerfile, add requirements, etc., rebuild the image and recreate the container with:
$ docker-compose up --force-recreate --build dsnb
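If you would rather drive these two invocations from a script, one possible wrapper is sketched below. The function names `compose_up_command` and `compose_up` are hypothetical and not part of this project; the sketch only assembles the same docker-compose arguments described above and assumes docker-compose is on your PATH.

```python
import subprocess


def compose_up_command(service, rebuild=False):
    """Build the docker-compose argument list for bringing up a service."""
    command = ["docker-compose", "up"]
    if rebuild:
        # Rebuild the image and recreate the container, e.g. after
        # changing the Dockerfile or adding requirements.
        command += ["--force-recreate", "--build"]
    command.append(service)
    return command


def compose_up(service, rebuild=False):
    """Run docker-compose in the foreground and return its exit code."""
    return subprocess.call(compose_up_command(service, rebuild=rebuild))


# Print the commands rather than running them, so this works without docker.
print(" ".join(compose_up_command("dsnb")))
print(" ".join(compose_up_command("dsnb", rebuild=True)))
```

Calling `compose_up("dsnb", rebuild=True)` would then run the rebuild variant directly.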