dsnb
Data Science Notebook Template

About

This project is one part playground, one part project skeleton. It is derived from the Jupyter allspark stack, plus these guidelines for standardizing data science project layouts.

Prerequisites

You really just need docker and docker-compose.

If you run into problems, compare your setup against my version, which is at least confirmed working:

    $ docker --version
    Docker version 17.03.1-ce, build c6d412e
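If you'd rather script that comparison, here's a minimal sketch. The helper names docker_major_minor and version_at_least are made up for this example, and it assumes the usual "Docker version X.Y.Z, build ..." output format shown above. It only compares major and minor numbers, so treat it as a rough sanity check, not a compatibility guarantee.

```shell
# Hypothetical helpers, not part of this project.

# Extract "MAJOR.MINOR" from a `docker --version` line.
docker_major_minor() {
  printf '%s\n' "$1" | sed -n 's/^Docker version \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if version $1 (MAJOR.MINOR) is at least version $2.
# Sorts the two versions numerically and checks that $2 comes first.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -t. -k1,1n -k2,2n | head -n 1)" = "$2" ]
}

# Compare the installed docker against the version confirmed working above.
if command -v docker >/dev/null 2>&1; then
  installed="$(docker_major_minor "$(docker --version)")"
  if version_at_least "$installed" 17.03; then
    echo "docker $installed: at least 17.03"
  else
    echo "docker $installed: older than 17.03"
  fi
else
  echo "docker not found on PATH"
fi
```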

Features

So what's in the box? To start with, everything that already ships with allspark:

  • Jupyter Notebook
  • Python 2 and 3, scipy, numpy, pandas
  • Scala
  • R
  • Spark
  • lots more

Besides that you get:

  • A Dockerfile, used by docker-compose. Mods on top of allspark are:
  • A docker-compose.yml to reduce painful command-line invocations. It includes:
    • a jupyter service built from the Dockerfile, preconfigured with a volume for the working directory and port forwarding
    • a mesos service (not finished yet), ready for a mesos client connection from allspark
  • Various Jupyter notebook demos & PoCs I've found useful
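
For a sense of what the compose file saves you from typing, here's a rough sketch of an equivalent plain docker invocation. The image name jupyter/all-spark-notebook, port 8888, and the /home/jovyan/work mount point are assumptions based on the upstream Jupyter Docker Stacks defaults, not values read from this project's docker-compose.yml. The function name plain_docker_cmd is made up for the example, and it only prints the command so you can inspect it before running anything.

```shell
# Hypothetical helper: print (do not run) a docker command roughly equivalent
# to a compose service with a working-directory volume and port forwarding.
# Assumed, not taken from this repo: image name, container port 8888,
# and the /home/jovyan/work mount point.
plain_docker_cmd() {
  workdir="${1:-$PWD}"   # host directory to mount as the working directory
  port="${2:-8888}"      # host port to forward to the notebook server
  printf 'docker run --rm -p %s:8888 -v "%s":/home/jovyan/work jupyter/all-spark-notebook\n' \
    "$port" "$workdir"
}

# Show the command for the current directory on the default port.
plain_docker_cmd "$PWD" 8888
```

Compare that with a single docker-compose up and the appeal of keeping the volume and port settings in docker-compose.yml should be clear.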

Running the Code

To bring up the notebook service, use:

    $ docker-compose up dsnb

If you make changes to the Dockerfile, add requirements, etc., you'll want to rebuild the image and recreate the container instead:

    $ docker-compose up --force-recreate --build dsnb