How to Set Up IPython Notebook with Spark 1.5 in a Minute
by Shahid Ashraf
IPython Notebook provides a browser-based notebook with support for code, text, mathematical expressions, inline plots and other media, as well as support for interactive data visualization. It lets users create rich documents with embedded source code with very little effort. In 2014, Fernando Pérez announced a spin-off project from IPython called Project Jupyter. IPython will continue to exist as a Python shell and a kernel for Jupyter, while the notebook and other language-agnostic parts of IPython will move under Jupyter. Jupyter added support for Julia, R, Haskell and Ruby.
In this post, we will see how to install IPython Notebook and quickly start using IPython notebooks with PySpark. Other posts describe ways to achieve this; however, they did not work perfectly for my required configuration on Spark versions later than 1.4.0.
pip install ipython
If you are unable to install it with this command, see the IPython installation instructions.
Set Up Spark
After downloading Spark, set the SPARK_HOME environment variable to the Spark installation path, e.g. by adding an export line to your .zshrc or your Bash profile.
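Before going further, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch using only the standard library. The helper name spark_home_problems is hypothetical, and it only looks for the bin/pyspark launcher and the python directory that Spark distributions ship, so treat it as a sanity check rather than a full validation.

```python
import os


def spark_home_problems(environ=os.environ):
    """Return a list of problems found with SPARK_HOME (empty if it looks fine)."""
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        return ["SPARK_HOME is not set"]
    if not os.path.isdir(spark_home):
        return ["SPARK_HOME is not a directory: %s" % spark_home]

    problems = []
    # Spark distributions ship a bin/pyspark launcher and a python/ directory.
    if not os.path.isfile(os.path.join(spark_home, "bin", "pyspark")):
        problems.append("bin/pyspark not found under SPARK_HOME")
    if not os.path.isdir(os.path.join(spark_home, "python")):
        problems.append("python/ directory not found under SPARK_HOME")
    return problems


print(spark_home_problems() or "SPARK_HOME looks fine")
```

If this prints a list of problems, open a new terminal (so the updated profile is loaded) and check the path again.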
pip install findspark
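findspark's job is to make the pyspark package that ships inside the Spark distribution importable from a regular Python session. Roughly, this amounts to putting the python directory under SPARK_HOME and the bundled py4j archive on sys.path. The sketch below illustrates that idea only; it is not findspark's source, the function name add_pyspark_to_path is hypothetical, and it assumes the py4j archive sits under python/lib with a name matching py4j-*.zip.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home):
    """Prepend Spark's Python sources (and py4j, if found) to sys.path."""
    spark_python = os.path.join(spark_home, "python")
    # Assumption: the bundled py4j archive lives in python/lib/py4j-*.zip.
    py4j_zips = sorted(glob.glob(os.path.join(spark_python, "lib", "py4j-*.zip")))
    new_entries = [spark_python] + py4j_zips[:1]
    # Skip entries that are already present so repeated calls are harmless.
    new_entries = [p for p in new_entries if p not in sys.path]
    sys.path[:0] = new_entries
    return new_entries
```

In practice findspark.init() takes care of this, so a notebook only needs the import and the init call.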
Start IPython Notebook with the following command:
ipython notebook
The notebook's main page will open in your browser.
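Create a new notebook and add the following to the first cell:
# Make the pyspark package under SPARK_HOME importable
import findspark
findspark.init()
import pyspark
# Create the SparkContext for this notebook
sc = pyspark.SparkContext(appName="first spark based notebook")
print(sc)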
If all goes well, it will print the SparkContext object.
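Printing the object only shows that the context was created. To check that it can actually run a job, a small computation helps. The following is a sketch: the function name sum_of_squares is made up for illustration, and it expects to be given the sc created in the first cell.

```python
def sum_of_squares(sc, n):
    """Distribute the integers 0..n-1, square each one, and add them up."""
    # Build an RDD from a local range of numbers.
    numbers = sc.parallelize(range(n))
    # Square every element, then sum the results and return the total.
    return numbers.map(lambda x: x * x).sum()
```

Calling sum_of_squares(sc, 10) in a new cell should return 285, the sum of the squares of 0 through 9.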
This is how we can quickly start using Apache Spark in IPython notebooks without messy configurations.
For feedback and queries, email me at email@example.com or follow me on Twitter.