Installing the classic Jupyter Notebook interface
This section includes instructions on how to get started with Jupyter Notebook. There are multiple Jupyter user interfaces to choose from, depending on your needs; please check out the list and links below for additional information and instructions on getting started with each of them.
This section explains how to install the Jupyter Notebook and the IPython kernel.
Prerequisite: Python
While Jupyter runs code in many programming languages, Python is a requirement for installing the Jupyter Notebook. The Python version required differs between Jupyter Notebook releases (e.g. Python 3.6+ for Notebook v6.3, and Python 3.7+ for Notebook v7).
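The version requirement above can be checked from a terminal before installing anything. This is a minimal sketch, not part of Jupyter: the helper name python_at_least is made up for illustration, and it assumes a python3 executable is on your PATH.

```shell
# python_at_least MAJOR MINOR: hypothetical helper, not part of Jupyter.
# Succeeds (exit status 0) when the python3 on PATH is at least MAJOR.MINOR.
python_at_least() {
    python3 -c 'import sys; sys.exit(0 if sys.version_info[:2] >= (int(sys.argv[1]), int(sys.argv[2])) else 1)' "$1" "$2"
}

# Notebook v7 needs Python 3.7+ (per the text above).
if python_at_least 3 7; then
    echo "python3 is new enough for Notebook v7"
else
    echo "python3 is missing or too old for Notebook v7"
fi
```

Change the two numbers to match the Notebook release you plan to install.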
We recommend using the Anaconda distribution to install Python and Jupyter. We’ll go through its installation in the next section.
Installing Jupyter using Anaconda and conda
For new users, we highly recommend installing Anaconda. Anaconda conveniently installs Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science.
Use the following installation steps:
Download Anaconda. We recommend downloading Anaconda’s latest Python 3 version (currently Python 3.9).
Install the version of Anaconda which you downloaded, following the instructions on the download page.
Congratulations, you have installed Jupyter Notebook. To run the notebook, enter jupyter notebook in a terminal.
Alternative for experienced Python users: Installing Jupyter with pip
Older Jupyter releases required Python 3.3 or greater, or Python 2.7; current Notebook releases require a newer Python 3, as noted in the prerequisite above. IPython 1.x, which included the parts that later became Jupyter, was the last version to support Python 3.2 and 2.6.
As an existing Python user, you may wish to install Jupyter using Python’s package manager, pip, instead of Anaconda.
First, ensure that you have the latest pip by running pip3 install --upgrade pip; older versions may have trouble with some dependencies.
Then install the Jupyter Notebook using pip3 install jupyter.
(Use pip if using legacy Python 2.)
Congratulations. You have installed Jupyter Notebook. See Running the Notebook for more details.
Install Anaconda Python and Jupyter Notebooks for Data Science
To explain what Anaconda is, we will quote its definition from the official website:
Anaconda is a free, easy-to-install package manager, environment manager and Python distribution with a collection of 1,000+ open source packages with free community support. Anaconda is platform-agnostic, so you can use it whether you are on Windows, macOS or Linux.
It is easy to secure and scale any data science project with Anaconda, as it natively allows you to take a project from your laptop directly to a deployment cluster. The complete set of features is also shown in an image on the official website.
To show in brief what Anaconda is, here are some quick points:
- It contains Python and hundreds of packages which are especially useful whether you are getting started with Data Science and Machine Learning or already experienced in them
- It comes with the conda package manager and virtual environments, which make development very easy
- It allows you to get started with development very fast, without wasting time setting up tools for Data Science and Machine Learning
You can install Anaconda from the Anaconda website. It will automatically install Python on your machine, so you don’t have to install it separately.
Anaconda vs Jupyter Notebooks
Whenever I try to discuss Anaconda with people who are beginners with Python and Data Science, they get confused between Anaconda and Jupyter Notebooks. We will quote the difference in one line:
Anaconda is a package manager. Jupyter is a presentation layer.
Anaconda tries to solve dependency hell in Python, where different projects need different versions of the same dependency, by keeping each project's dependencies isolated so that they cannot interfere with each other.
Jupyter tries to solve the issue of reproducibility in the analysis by enabling an iterative and hands-on approach to explaining and visualizing code; by using rich text documentation combined with visual representations, in a single solution.
Anaconda is similar to pyenv, venv and miniconda; it’s meant to achieve a Python environment that’s 100% reproducible on another environment, independent of whatever other versions of a project’s dependencies are available. It’s a bit similar to Docker, but restricted to the Python ecosystem.
Jupyter is an amazing presentation tool for analytical work, where you can present code in “blocks,” combined with rich text descriptions between blocks, formatted output from the blocks, and graphs generated in a well-designed manner by another block’s code.
Jupyter is incredibly good in analytical work to ensure reproducibility in someone’s research, so anyone can come back many months later and visually understand what someone tried to explain, and see exactly which code drove which visualization and conclusion.
Often in analytical work, you will end up with tons of half-finished notebooks explaining Proof-of-Concept ideas, of which most will not lead anywhere initially. Some of these presentations might months later—or even years later—present a foundation to build from for a new problem.
Using Anaconda and Jupyter Notebook from Anaconda
Finally, we will have a look at some commands with which we will be able to use Anaconda, Python and Jupyter on our Ubuntu machine. First, we will download the installer script from the Anaconda website with this command:
We also need to ensure the data integrity of this script:
We will get the following output:
Check Anaconda integrity
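The integrity check above can be sketched as a small helper that compares the installer's SHA-256 digest with the value published on the Anaconda website. The helper name verify_sha256 is hypothetical, and the sketch assumes the sha256sum and awk utilities that ship with Ubuntu.

```shell
# verify_sha256 FILE EXPECTED_HASH: hypothetical helper for illustration.
# Succeeds only when the SHA-256 digest of FILE equals EXPECTED_HASH.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
```

A call might look like verify_sha256 ./installer.sh "$EXPECTED" && echo "installer OK", where both installer.sh and EXPECTED are placeholders for the file you downloaded and the hash listed for it.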
We can now run the Anaconda script:
Once you accept the terms, provide a location for installation of packages or just hit Enter for it to take the default location. Once the installation is completed, we can activate the installation with this command:
Finally, test the installation:
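One simple way to test the installation is to confirm that the tools it provides are reachable from your shell. The sketch below is an assumption about how you might do that, not the command from the original lesson; report_tool is a made-up helper name.

```shell
# report_tool NAME: hypothetical helper that reports whether NAME is on PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Check the two tools this lesson relies on; a miss is reported, not fatal.
report_tool conda || true
report_tool jupyter || true
```

If conda is reported as missing right after installation, opening a new terminal or reloading your shell startup file usually picks up the PATH change the installer made.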
Making an Anaconda Environment
Once we have a complete installation in place, we can use the following command to create a new environment:
We can now activate the environment we made:
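The two steps above can be sketched as follows. The wrapper name make_conda_env, the environment name and the Python version are all illustrative choices, not values from the original lesson.

```shell
# make_conda_env NAME PYTHON_VERSION: hypothetical wrapper around `conda create`.
make_conda_env() {
    env_name=$1
    python_version=$2
    # --yes skips the confirmation prompt; --name sets the environment's name.
    conda create --yes --name "$env_name" "python=$python_version"
}
```

For example, make_conda_env my_env 3.9 creates the environment, and conda activate my_env then switches to it (older conda releases used source activate my_env instead).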
With this, our command prompt will change to reflect an active Anaconda environment. To continue setting up a Jupyter environment, see the lesson How to install Jupyter Notebooks on Ubuntu and start using them.
Conclusion: Install Anaconda Python and Jupyter Notebooks for Data Science
In this lesson, we studied how to install and start using the Anaconda environment on Ubuntu 18.04. Anaconda is an excellent environment manager to have, especially for beginners in Data Science and Machine Learning. This is just a very simple introduction, with many more lessons to come on Anaconda, Python, Data Science and Machine Learning. Share your feedback on the lesson with me or with the LinuxHint Twitter handle.
About the author
Shubham Aggarwal
I’m a Java EE Engineer with about 4 years of experience in building quality products. I have excellent problem-solving skills in Spring Boot, Hibernate ORM, AWS, Git, Python and I am an emerging Data Scientist.
Running Jupyter Notebooks on a Ubuntu Server
Configuring a VPS from scratch to host Jupyter notebooks with Anaconda.
It dawned on me the other day that for a publication which regularly uses and talks about Jupyter notebooks, we’ve never actually taken the time to explain what they are or how to start using them. No matter where you may have been in your career, first exposure to Jupyter and the IPython shell is often a confusingly magical experience. Writing programs line-by-line and receiving feedback in real-time feels more like painting oil on canvas than programming. I suppose we can finally chalk up a win for dynamically typed languages.
There are a couple of barriers for practical devs to overcome before using Jupyter, the most obvious being hardware costs. If you’re utilizing a full Anaconda installation, chances are you’re not the type of person to mess around. Real machine learning algorithms take real resources, and real resources take real money. A few vendors have popped up offering managed cloud-hosted notebooks for this reason. For those of us who bothered to do the math, it turns out most of these services are more expensive than spinning up a dedicated VPS.
Data scientists with impressive machines have no problem running notebooks locally for most use cases. While that’s fine and good for scientists, this setup is problematic for those of us with commitments to Python outside of notebooks. Upon installation, Anaconda barges into your system’s ~/.bash_profile, shouts “I am the captain now,” and crowns itself as your system’s default Python path. Conda and Pip have some trouble getting along, so for those of us who build Python applications and use notebooks, it's best to keep these things isolated.
Setting Up a VPS
We're going to spin up a barebones Ubuntu 18.04 instance from scratch. I opted for DigitalOcean in my case, both for simplicity and the fact that I'm incredibly broke. Depending on how broke you may or may not be, this is where you'll have to make a judgment call for your system resources:
My kind sir, I would like to order the most exquisite almost-cheapest Droplet on the menu
SSH into that bad boy. You know what to do next:
With that out of the way, next we'll grab the latest version of Python:
Finally, we'll open port 8888 for good measure, since this is the port Jupyter runs on:
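The three preparation steps above (update the system, install Python, open the port) might be scripted roughly like this. It is a sketch under assumptions: the helper name prepare_server is invented, the package names python3 and python3-pip are the standard Ubuntu ones rather than necessarily the ones the original post installed, and ufw is assumed to be the firewall in use.

```shell
# prepare_server: hypothetical helper; run it as root on the new VPS.
prepare_server() {
    # Refresh the package index and apply pending upgrades.
    apt-get update || return 1
    apt-get upgrade -y || return 1
    # Install Python 3 and pip from the Ubuntu repositories.
    apt-get install -y python3 python3-pip || return 1
    # Allow inbound TCP traffic on port 8888, where Jupyter listens.
    ufw allow 8888/tcp
}
```

Each step stops the function on failure, so a broken upgrade never leads to a half-configured firewall rule.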
Create a New User
As always, we should create a Linux user besides root to do just about anything:
Then, add them to the sudoers group:
Log in as the user:
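A minimal sketch of the user setup described above, using the standard Ubuntu adduser and usermod tools. The helper name create_admin_user and the user name in the example are placeholders.

```shell
# create_admin_user NAME: hypothetical helper; run it as root.
create_admin_user() {
    username=$1
    # adduser prompts interactively for a password and profile details.
    adduser "$username" || return 1
    # Append (-a) the user to the supplementary group (-G) named sudo.
    usermod -aG sudo "$username"
}
```

After create_admin_user myuser succeeds, su - myuser switches to the new account with its own login environment.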
Install The Latest Anaconda Distribution
Anaconda comes with all the fantastic Data Science Python packages we'll need for our notebook. To find the latest distribution, check here: https://www.anaconda.com/download/. We'll install this to a /tmp folder:
Once downloaded, begin the installation:
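The download-and-run steps can be sketched as one helper. No installer URL is hard-coded, because the file name changes with every Anaconda release; you pass in the link copied from the download page. The helper name fetch_and_run_installer is hypothetical.

```shell
# fetch_and_run_installer URL: hypothetical helper. URL is the installer link
# copied from the Anaconda download page; none is hard-coded here.
fetch_and_run_installer() {
    url=$1
    script=${url##*/}   # file name = everything after the last slash
    (
        # Work in /tmp inside a subshell so the caller's directory is unchanged.
        cd /tmp || exit 1
        # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name.
        curl -fLO "$url" || exit 1
        bash "$script"
    )
}
```

The installer itself remains interactive, so the licence and path prompts still appear as usual.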
Complete the resulting prompts:
Get ready for the wall of text.
This kicks off a rather lengthy install process. Afterward, you'll be prompted to add Conda to your startup script. Say yes:
The final part of the installation will ask if you'd like to install VS Code. Decline this offer because Microsoft sucks.
Finally, reload your ~/.bashrc file with source ~/.bashrc to apply Conda's changes.
Setting Up Conda Environments
Conda installations can be isolated to separate environments similarly to how we would with Virtualenv. Unlike Virtualenv, however, Conda environments can be activated from anywhere (not just in the directory containing the environment). Create and activate a Conda env:
Congrats, you're now in an active Conda environment!
Starting Up Jupyter
Make sure you're in a directory you'd like to be running Jupyter in. Entering jupyter notebook in this directory should result in the following:
This next part is tricky. To run our notebook, we need to reconnect to our VPS via an SSH tunnel. Close the terminal and reconnect to your server with the following format:
Indeed, localhost is intended to stay the same, but your_server_ip is to be replaced with the address of your server.
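To make the shape of the tunnel command concrete, here is a small sketch that only prints it. The helper name tunnel_command and the user name myuser are placeholders; your_server_ip is the same stand-in used above.

```shell
# tunnel_command USER HOST [PORT]: hypothetical helper that prints the ssh
# command for forwarding a local port to the same port on the server.
tunnel_command() {
    user=$1
    host=$2
    port=${3:-8888}   # defaults to 8888, the port Jupyter runs on
    # -L LOCAL_PORT:localhost:REMOTE_PORT forwards the local port over SSH;
    # "localhost" here is resolved on the server side, so it stays as-is.
    echo "ssh -L ${port}:localhost:${port} ${user}@${host}"
}

tunnel_command myuser your_server_ip
# prints: ssh -L 8888:localhost:8888 myuser@your_server_ip
```

With the tunnel open, requests to port 8888 on your own machine are carried over SSH to port 8888 on the server, which is why the notebook links start working.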
With that done, let's try this one more time. Remember to reactivate your Conda environment first!
This time around, the links which appear in the terminal should work!
WE DID IT
BONUS ROUND: Theme Your Notebooks
If ugly interfaces bother you as much as they bother me, I highly recommend taking a look at the jupyter-themes package on Github. This package allows you to customize the look and feel of your notebook, either as simple as activating a style, or as complex as setting your margin width. I highly recommend checking out the available themes to spice up your notebook!
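As a rough sketch of what activating a style looks like: jupyter-themes is published on PyPI as jupyterthemes and installs a command named jt, where jt -l lists the available themes, jt -t NAME applies one and jt -r restores the default look. The wrapper name apply_theme below is made up for illustration.

```shell
# apply_theme NAME: hypothetical wrapper around the jt command that
# jupyter-themes installs (PyPI package name: jupyterthemes).
apply_theme() {
    theme=$1
    if ! command -v jt >/dev/null 2>&1; then
        echo "jt not found; install the jupyterthemes package first" >&2
        return 1
    fi
    # -t selects the theme by name; run `jt -l` to see valid names.
    jt -t "$theme"
}
```

Refresh the notebook page in your browser after applying a theme to see the change.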
What is Jupyter?
Jupyter is a browser-based interactive notebook for programming, mathematics, and data science. It supports a number of languages via plugins (“kernels”), such as Python, Ruby, Haskell, R, Scala, and Julia.
Optimus is an open source library based on PySpark to prepare, process and explore your Big Data.
Anaconda is a complete development environment with over 300 Python packages such as pywin32, numpy, scipy. A complete list of packages can be found here.
Create a New Ubuntu Server on DigitalOcean
I recommend creating a droplet with at least 2GB of RAM and 2 vCPUs.
Connect to your server
You will need an SSH client to connect to your server. You can use PuTTY or Bitvise Tunnelier on Windows, or use Terminal on Linux and macOS.
Type or copy and paste these command lines into your console, executing one line at a time.
Install Anaconda3, Optimus & Spark
Linux Anaconda Installation (Anaconda conveniently installs Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science.)