Python

 

 

 

 

Pandas

 

Pandas is a data structure and analysis tool developed for Python. You may have heard of many cases where Python is used for Big Data, Data Science, and similar fields. Pandas is one of the most common tools used for these kinds of applications.

 

Pandas is a popular open-source Python library for data manipulation and analysis. It has become a widely-used library for handling structured data, especially for tasks such as data cleaning, transformation, and analysis. The name 'pandas' is derived from the term "panel data", which is a term used in econometrics for multi-dimensional structured data sets.

 

Pandas is not the only tool used for Big Data and Data Science applications. In many tutorials you will see pandas used along with additional tools such as numpy or matplotlib.

 

 

 

Where to get

 

If Installation Method 1 does not work with your Python and pip, you need to download the proper pandas installation package and install it manually. You can download the pandas releases from http://pandas.pydata.org/ . The official pandas documentation is available from the same site.

 

 

How to know which version to install?

 

When you install pandas using Installation Method 2, one of the most challenging parts is figuring out which pandas package to download and install. The pandas distribution packages depend very strictly on the specific PC platform and the specific Python version, and I found it difficult to pick the right file from the package names alone.

 

You probably already know which Python version is installed on your PC. If you are not sure, just run Python from the command line and it prints the version information as shown below.

 

 

< Finding Operating System Information >

 

c:\Python35>python

 

Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32

Type "help", "copyright", "credits" or "license" for more information.
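
The same information can also be read from inside Python. The following is a minimal sketch using only the standard library; the function name describe_interpreter is my own choice for illustration. It reports the version and whether the interpreter is a 32-bit or 64-bit build, which matters because the whl file has to match the Python build rather than just the operating system.

```python
import struct
import sys


def describe_interpreter():
    """Return (version string, pointer size in bits) for the running Python."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    bits = struct.calcsize("P") * 8  # size of a C pointer: 32 or 64
    return version, bits


if __name__ == "__main__":
    version, bits = describe_interpreter()
    print("Python", version, "({}-bit)".format(bits))
```

On the PC used in this note, this should agree with the banner above (Python 3.5.1, 32 bit).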

 

Then you need to find out your PC information, which is a little trickier. One way to get rough information about your PC is the systeminfo command on Windows 7 or higher, or lscpu on Linux.

 

C:\> systeminfo

 

Host Name:                 ...

OS Name:                   Microsoft Windows 7 Enterprise

OS Version:                6.1.7601 Service Pack 1 Build 7601

OS Manufacturer:           Microsoft Corporation

OS Configuration:          Member Workstation

OS Build Type:             Multiprocessor Free

Registered Owner:          ...

Registered Organization:   ...

Product ID:                ...

Original Install Date:     12/1/2015, 5:39:01 PM

System Boot Time:          12/5/2016, 12:44:32 AM

System Manufacturer:       Dell Inc.

System Model:              Latitude E6430

System Type:               x64-based PC

Processor(s):              1 Processor(s) Installed.

                           [01]: Intel64 Family 6 Model 58 Stepping 9 GenuineIntel ~2701 Mhz

BIOS Version:              Dell Inc. A12, 5/20/2013

Windows Directory:         C:\Windows

System Directory:          C:\Windows\system32

Boot Device:               ...

System Locale:             en-us;English (United States)

Input Locale:              en-us;English (United States)

Time Zone:                 (UTC-05:00) Eastern Time (US & Canada)

Total Physical Memory:     16,338 MB

Available Physical Memory: 4,920 MB

Virtual Memory: Max Size:  32,675 MB

Virtual Memory: Available: 18,952 MB

Virtual Memory: In Use:    13,723 MB

Page File Location(s):     C:\pagefile.sys

Domain:                    ...

Logon Server:              ...

Hotfix(s):                 ....

                           ....
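
If you prefer not to read through the whole systeminfo output, Python's standard platform module gives a rough summary of the same facts. This is only a sketch; describe_platform is a hypothetical helper name, and the exact strings returned differ from one operating system to another.

```python
import platform


def describe_platform():
    """Collect basic OS and machine details using the standard library."""
    return {
        "system": platform.system(),    # e.g. 'Windows' or 'Linux'
        "release": platform.release(),  # OS release string
        "machine": platform.machine(),  # e.g. 'AMD64' or 'x86_64'
        # Bitness of the Python executable itself, not of the OS
        "python_bits": platform.architecture()[0],
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print("{:12s}: {}".format(key, value))
```

Note that a 32-bit Python can run on an x64-based PC, as in my case, so the python_bits value is the one to compare against the win32 or win_amd64 part of a whl file name.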

 

 

 

< Finding the package for the specific python version >

 

The tips above help you find the whl file that matches a specific operating system. Now you need to figure out which package matches the Python version you are using. You can read this from the pandas whl file name as shown below. The cpNN part of the name (cp27, cp35, cp36) indicates the Python version it supports.

 

pandas-0.21.0-cp27-cp27m-win32.whl  <-- this supports Python 2.7x

pandas-0.18.1-cp35-cp35m-win32.whl  <-- this supports Python 3.5x

pandas-0.20.3-cp36-cp36m-win32.whl  <-- this supports Python 3.6x
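
The naming rule can be sketched in a few lines of Python. This is an illustration only: it assumes the common five-part file name (name, version, Python tag, ABI tag, platform tag) with no optional build tag, and the helper names parse_wheel_name and python_version_from_tag are made up for this example.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated fields.

    Assumes the five-part form name-version-pythontag-abitag-platformtag.whl.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel name layout: " + filename)
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def python_version_from_tag(python_tag):
    """Turn a CPython tag such as 'cp35' into '3.5'; return None otherwise."""
    digits = python_tag[2:]
    if python_tag.startswith("cp") and len(digits) >= 2 and digits.isdigit():
        return digits[0] + "." + digits[1:]
    return None


if __name__ == "__main__":
    info = parse_wheel_name("pandas-0.18.1-cp35-cp35m-win32.whl")
    print(info["python_tag"], "->", python_version_from_tag(info["python_tag"]))
    print("platform:", info["platform_tag"])
```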

 

 

 

Pandas Installation

 

There are several ways to install pandas, but I want to introduce a couple of simple (at least simple to me) methods. I think Method 1 is the simplest, so I recommend trying Method 1 first and trying Method 2 only if Method 1 does not work for you. (It is better to uninstall the package first and then try Method 2.)

 

 

Method 1 >  Install from PIP

 

The first method is to open a Windows command prompt and run 'pip install pandas' in the folder where Python is installed. On my PC, I am using Python 3.5.1 installed in C:\Python35, so I ran the command as shown below.

 

c:\Python35>pip install pandas

 

NOTE : the package name is pandas (note the 's' at the end). pip will also accept panda, but that is a different, unrelated package, and 'import pandas' will not work after installing it.

 

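If pip can install the package, you will see the process going on in a form similar to the log below. (This particular log was captured with the package name panda, so the package names and versions differ when you install pandas.) Otherwise, you will get an error message. If it does not work, upgrade pip to the latest version as described in the PIP page and try again.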

 

Collecting panda

  Downloading panda-0.3.1.tar.gz

Requirement already satisfied: setuptools in c:\python35\lib\site-packages (from panda)

Collecting requests (from panda)

  Downloading requests-2.12.4-py2.py3-none-any.whl (576kB)

    100% |################################| 583kB 1.5MB/s

Installing collected packages: requests, panda

  Running setup.py install for panda ... done

Successfully installed panda-0.3.1 requests-2.12.4

 

 

If the pandas installation was successful, try installing matplotlib as shown below, since many applications need matplotlib in addition to pandas.

 

c:\Python35>pip install matplotlib

 

Collecting matplotlib

  Downloading matplotlib-1.5.3-cp35-cp35m-win32.whl (6.2MB)

    100% |################################| 6.2MB 178kB/s

Requirement already satisfied: numpy>=1.6 in c:\python35\lib\site-packages (from matplotlib)

Requirement already satisfied: python-dateutil in c:\python35\lib\site-packages (from matplotlib)

Collecting pyparsing!=2.0.4,!=2.1.2,>=1.5.6 (from matplotlib)

  Downloading pyparsing-2.1.10-py2.py3-none-any.whl (56kB)

    100% |################################| 61kB 5.1MB/s

Collecting cycler (from matplotlib)

  Downloading cycler-0.10.0-py2.py3-none-any.whl

Requirement already satisfied: pytz in c:\python35\lib\site-packages (from matplotlib)

Requirement already satisfied: six>=1.5 in c:\python35\lib\site-packages (from python-dateutil->matplotlib)

Installing collected packages: pyparsing, cycler, matplotlib

Successfully installed cycler-0.10.0 matplotlib-1.5.3 pyparsing-2.1.10

 

 

If the pandas installation was successful, try installing numpy as shown below, since many applications need numpy in addition to pandas.

 

c:\Python35>pip install numpy

 

Requirement already satisfied: numpy in c:\python35\lib\site-packages

 

In my case, numpy was already installed, so I got the message above. If it is not installed yet, you will see the installation process instead.

 

 

Method 2 >  Install the downloaded package

 

In this method, I install the pandas package that I downloaded as described in the Where to get section. I was not lucky enough to make it work on the first try. I tried several different versions and finally got the following package working. So don't be disappointed if you fail on the first attempt. Keep trying :)

 

c:\Python35>pip install pandas-0.18.0-cp35-cp35m-win32.whl

 

Processing c:\python35\pandas-0.18.0-cp35-cp35m-win32.whl

Collecting numpy>=1.7.0 (from pandas==0.18.0)

  Downloading numpy-1.11.3-cp35-none-win32.whl (6.6MB)

    100% |################################| 6.6MB 172kB/s

Collecting pytz>=2011k (from pandas==0.18.0)

  Downloading pytz-2016.10-py2.py3-none-any.whl (483kB)

    100% |################################| 491kB 1.7MB/s

Collecting python-dateutil>=2 (from pandas==0.18.0)

  Downloading python_dateutil-2.6.0-py2.py3-none-any.whl (194kB)

    100% |################################| 194kB 2.3MB/s

Collecting six>=1.5 (from python-dateutil>=2->pandas==0.18.0)

  Downloading six-1.10.0-py2.py3-none-any.whl

Installing collected packages: numpy, pytz, six, python-dateutil, pandas

Successfully installed numpy-1.11.3 pandas-0.18.0 python-dateutil-2.6.0 pytz-2016.10 six-1.10.0

 

 

Basic Check

 

Now let's do a final check to see whether the installation was really successful. The simplest way is just to try it.

 

Run Python in command line as shown below.

 

c:\Python35>python

Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32

Type "help", "copyright", "credits" or "license" for more information.

 

At the Python prompt, try the following command. If it does not give you any error message, the pandas installation was successful.

 

>>> import pandas as pd

 

At the Python prompt, try the following command. If it does not give you any error message, the numpy installation was successful.

 

>>> import numpy as np
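
If you want to check several packages at once, the manual import test can be wrapped in a small script. This is just a sketch; check_import is a helper name I made up, and it relies on the packages exposing a __version__ attribute, which pandas, numpy and matplotlib do but which is not guaranteed for every package.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return its version string, or None if missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Fall back to 'unknown' for modules without a __version__ attribute
    return str(getattr(module, "__version__", "unknown"))


if __name__ == "__main__":
    for name in ("pandas", "numpy", "matplotlib"):
        version = check_import(name)
        if version is None:
            print(name, "-> NOT installed")
        else:
            print(name, "-> OK, version", version)
```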

 

 

 

How to Uninstall Panda

 

If you want to uninstall a package that has already been installed, you can remove it with pip as shown below. This example removes the panda package; to remove pandas itself, run 'pip uninstall pandas' instead.

 

c:\Python35>pip uninstall panda

 

Uninstalling panda-0.3.1:

  c:\...\python\python36-32\lib\site-packages\panda-0.3.1-py3.6.egg-info

  c:\...\python\python36-32\lib\site-packages\panda\__init__.py

  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\__init__.cpython-36.pyc

  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\models.cpython-36.pyc

  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\request.cpython-36.pyc

  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\test.cpython-36.pyc

  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\upload_session.cpython-36.pyc

  c:\...\python\python36-32\lib\site-packages\panda\models.py

  c:\...\python\python36-32\lib\site-packages\panda\request.py

  c:\...\python\python36-32\lib\site-packages\panda\test.py

  c:\...\python\python36-32\lib\site-packages\panda\upload_session.py

Proceed (y/n)? y

  Successfully uninstalled panda-0.3.1

 

 

 

Troubleshooting Tips for Pandas Installation

 

 

Case 1 : pandas package import error.

 

If you get an error like the one below even after installing the pandas library, it is likely that the package is not properly installed. One possible cause is that the similarly named panda package was installed instead of pandas. In my case, I installed the package using pip (Method 1) for Python 3.5 and it worked OK, but when I tried the same way for Python 3.6.2, I got this error. So I uninstalled the package, reinstalled it with Method 2, and it worked.

 

>>> import pandas as pd

 

Traceback (most recent call last):

  File "<pyshell#2>", line 1, in <module>

    import pandas as pd

ModuleNotFoundError: No module named 'pandas'
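
When more than one Python is installed on the same PC (3.5 and 3.6 in my case), it helps to confirm which interpreter is actually running and whether it can locate the module. The sketch below uses only the standard library; is_installed is a hypothetical helper name. Running pip as 'python -m pip install pandas' with that same interpreter is one way to make sure the package lands where this Python looks for it.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the running interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which python executable is being used for this check
    print("Interpreter:", sys.executable)
    for name in ("pandas", "panda"):
        print(name, "found" if is_installed(name) else "not found")
```

If panda is found but pandas is not, the wrong package name was most likely used during installation.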

 

 

Case 2 :  *.whl is not a supported wheel on this platform

 

If you get an error like the one below when installing a whl file, it is highly likely that the whl file does not match the Python version you are using. See the 'Finding the package for the specific python version' section above to find the right whl package for your Python version.

 

C:\...\Python36-32>pip install pandas-0.18.1-cp35-cp35m-win32.whl

 

pandas-0.18.1-cp35-cp35m-win32.whl is not a supported wheel on this platform.
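
Before running pip, you can do a rough check of the whl file name against the Python you are running. The sketch below is a simplification under my own assumptions: it compares only the Python tag (such as cp35) and ignores the ABI and platform tags that pip also checks, and the helper names expected_python_tag and wheel_matches_python are made up for this example.

```python
import sys


def expected_python_tag(version_info=sys.version_info):
    """Build the CPython tag for an interpreter, e.g. (3, 5) -> 'cp35'."""
    return "cp{}{}".format(version_info[0], version_info[1])


def wheel_matches_python(filename, version_info=sys.version_info):
    """Compare the Python tag in a wheel file name with the interpreter's tag.

    Simplified check: the Python tag is the third field from the end of the
    dash-separated name; ABI and platform tags are not examined.
    """
    if not filename.endswith(".whl"):
        return False
    fields = filename[:-len(".whl")].split("-")
    return len(fields) >= 5 and fields[-3] == expected_python_tag(version_info)


if __name__ == "__main__":
    wheel = "pandas-0.18.1-cp35-cp35m-win32.whl"
    print("This Python expects:", expected_python_tag())
    print(wheel, "matches:", wheel_matches_python(wheel))
```

For the failing case above, the file carries the cp35 tag while Python 3.6 expects cp36, so the check returns False.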