Python
Pandas
pandas is a data structure and data analysis library for Python. You have probably heard that Python is widely used for Big Data and Data Science work; pandas is one of the most common tools for that kind of application.
Pandas is a popular open-source Python library for data manipulation and analysis. It has become a widely-used library for handling structured data, especially for tasks such as data cleaning, transformation, and analysis. The name 'pandas' is derived from the term "panel data", which is a term used in econometrics for multi-dimensional structured data sets.
pandas is not the only tool used for Big Data / Data Science applications. In many tutorials you will see pandas used along with additional tools such as numpy or matplotlib.
Where to Get
If Installation Method 1 (described further down) does not work with your Python and PIP, you need to download the proper pandas installation package and install it manually. You can download every pandas version from http://pandas.pydata.org/ . The official pandas documentation is available on the same site.
How to know which version to install?
When you use Installation Method 2, one of the most challenging parts is figuring out which pandas package to download and install. The pandas distribution packages depend very strictly on the specific PC platform and the specific Python version, and it was difficult for me to pick the right one just from the list of available packages.
Of course, you probably know which Python version is installed on your PC. If you are not sure, just start the Python interpreter from the command line and it will print the version information as shown below.
< Finding Operating System Information >
c:\Python35>python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
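If you would rather check this from a script than read the interpreter banner, the standard library exposes the same information. The following is only a minimal sketch; the helper name describe_python is made up for illustration.

```python
import struct
import sys


def describe_python():
    """Return (version string, pointer width in bits) for the running interpreter."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    # Pointer size in bytes * 8 gives 32 or 64 for the Python build itself.
    # This can differ from the OS: a 32-bit Python can run on 64-bit Windows.
    bits = struct.calcsize("P") * 8
    return version, bits


version, bits = describe_python()
print("Python", version, "-", bits, "bit")
```

The bit width reported here is the one that matters for choosing a package, because it describes the Python build rather than the operating system.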
Next you need your PC information, which is a little trickier to find. One way to get rough information about your PC is the systeminfo command on Windows 7 or higher, or lscpu on Linux.
Host Name: ...
OS Name: Microsoft Windows 7 Enterprise
OS Version: 6.1.7601 Service Pack 1 Build 7601
OS Manufacturer: Microsoft Corporation
OS Configuration: Member Workstation
OS Build Type: Multiprocessor Free
Registered Owner: ...
Registered Organization: ...
Product ID: ...
Original Install Date: 12/1/2015, 5:39:01 PM
System Boot Time: 12/5/2016, 12:44:32 AM
System Manufacturer: Dell Inc.
System Model: Latitude E6430
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
    [01]: Intel64 Family 6 Model 58 Stepping 9 GenuineIntel ~2701 Mhz
BIOS Version: Dell Inc. A12, 5/20/2013
Windows Directory: C:\Windows
System Directory: C:\Windows\system32
Boot Device: ...
System Locale: en-us;English (United States)
Input Locale: en-us;English (United States)
Time Zone: (UTC-05:00) Eastern Time (US & Canada)
Total Physical Memory: 16,338 MB
Available Physical Memory: 4,920 MB
Virtual Memory: Max Size: 32,675 MB
Virtual Memory: Available: 18,952 MB
Virtual Memory: In Use: 13,723 MB
Page File Location(s): C:\pagefile.sys
Domain: ...
Logon Server: ...
Hotfix(s): ....
....
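The same kind of rough platform information can also be read from inside Python with the standard platform module, which works the same way on Windows and Linux. This is just a sketch; describe_platform is a hypothetical helper name, and the values in the comments are examples rather than guaranteed output.

```python
import platform


def describe_platform():
    """Collect the OS details that matter when picking a package file."""
    return {
        "system": platform.system(),    # for example "Windows" or "Linux"
        "release": platform.release(),  # OS release string
        "machine": platform.machine(),  # CPU architecture reported by the OS
    }


for key, value in describe_platform().items():
    print(key, ":", value)
```

Note that in the listing above the PC is x64-based while the Python build is 32 bit, and the packages used in this note are win32 ones, so the Python build is what the package has to match.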
< Finding the package for the specific python version >
The tips above help you find the whl file that matches your operating system. Now you need to figure out which package matches the Python version you are using. You can tell this from the pandas whl file name as shown below: the cpXX tag in the name indicates the Python version it supports.
pandas-0.21.0-cp27-cp27m-win32.whl <-- this supports Python 2.7x
pandas-0.18.1-cp35-cp35m-win32.whl <-- this supports Python 3.5x
pandas-0.20.3-cp36-cp36m-win32.whl <-- this supports Python 3.6x
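The file names above follow the wheel naming convention name-version-pythontag-abitag-platformtag.whl, so the tags can be pulled apart with a simple split. The sketch below assumes the plain five-field form shown above (wheels with an optional build tag are not handled); parse_wheel_filename and WheelTags are names invented for this example.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_filename(filename):
    """Split a simple wheel filename into its five dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        # Wheels with an optional build tag have six fields; not handled here.
        raise ValueError("unexpected wheel filename layout: " + filename)
    return WheelTags(*parts)


tags = parse_wheel_filename("pandas-0.18.1-cp35-cp35m-win32.whl")
print(tags.python)    # cp35
print(tags.platform)  # win32
```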
There are several ways to install pandas, but I want to introduce a couple of simple (at least simple to me) methods. I think Method 1 is the simplest, so I recommend trying Method 1 first and trying Method 2 only if Method 1 does not work for you (you had better uninstall the package first and then try Method 2).
Method 1 > Install from PIP
The first method is to open a Windows command prompt and run 'pip install pandas' in the folder where Python is installed. On my PC, I am using Python 3.5.1 installed in C:\Python35, so I ran the command as shown below.
c:\Python35>pip install pandas
NOTE : the package name is pandas (note the 's' at the end). pip will also accept 'pip install panda', but panda is a different package, so 'import pandas' will still fail after installing it.
If PIP can install the package, you will see the process going on as shown below (this sample log was captured while installing panda, which is why that name appears in it). Otherwise, you will get an error message. If that happens, upgrade PIP to the latest version as described on the PIP page and try again.
Collecting panda
  Downloading panda-0.3.1.tar.gz
Requirement already satisfied: setuptools in c:\python35\lib\site-packages (from panda)
Collecting requests (from panda)
  Downloading requests-2.12.4-py2.py3-none-any.whl (576kB)
    100% |################################| 583kB 1.5MB/s
Installing collected packages: requests, panda
  Running setup.py install for panda ... done
Successfully installed panda-0.3.1 requests-2.12.4
If the pandas installation was successful, install matplotlib as shown below, since many applications need matplotlib in addition to pandas.
c:\Python35>pip install matplotlib
Collecting matplotlib
  Downloading matplotlib-1.5.3-cp35-cp35m-win32.whl (6.2MB)
    100% |################################| 6.2MB 178kB/s
Requirement already satisfied: numpy>=1.6 in c:\python35\lib\site-packages (from matplotlib)
Requirement already satisfied: python-dateutil in c:\python35\lib\site-packages (from matplotlib)
Collecting pyparsing!=2.0.4,!=2.1.2,>=1.5.6 (from matplotlib)
  Downloading pyparsing-2.1.10-py2.py3-none-any.whl (56kB)
    100% |################################| 61kB 5.1MB/s
Collecting cycler (from matplotlib)
  Downloading cycler-0.10.0-py2.py3-none-any.whl
Requirement already satisfied: pytz in c:\python35\lib\site-packages (from matplotlib)
Requirement already satisfied: six>=1.5 in c:\python35\lib\site-packages (from python-dateutil->matplotlib)
Installing collected packages: pyparsing, cycler, matplotlib
Successfully installed cycler-0.10.0 matplotlib-1.5.3 pyparsing-2.1.10
Likewise, install numpy as shown below, since many applications need numpy in addition to pandas.
c:\Python35>pip install numpy
Requirement already satisfied: numpy in c:\python35\lib\site-packages
In my case, numpy was already installed, so I got the message above. If it is not installed yet, you will see the installation process going on.
Method 2 > Install the downloaded package
In this method, I install the pandas package that I downloaded as described in the Where to Get section. I was not lucky enough to make it work on the first try. I tried several different versions and finally got the following package working. So don't be disappointed if you fail on the first attempt. Keep on trying :)
c:\Python35>pip install pandas-0.18.0-cp35-cp35m-win32.whl
Processing c:\python35\pandas-0.18.0-cp35-cp35m-win32.whl
Collecting numpy>=1.7.0 (from pandas==0.18.0)
  Downloading numpy-1.11.3-cp35-none-win32.whl (6.6MB)
    100% |################################| 6.6MB 172kB/s
Collecting pytz>=2011k (from pandas==0.18.0)
  Downloading pytz-2016.10-py2.py3-none-any.whl (483kB)
    100% |################################| 491kB 1.7MB/s
Collecting python-dateutil>=2 (from pandas==0.18.0)
  Downloading python_dateutil-2.6.0-py2.py3-none-any.whl (194kB)
    100% |################################| 194kB 2.3MB/s
Collecting six>=1.5 (from python-dateutil>=2->pandas==0.18.0)
  Downloading six-1.10.0-py2.py3-none-any.whl
Installing collected packages: numpy, pytz, six, python-dateutil, pandas
Successfully installed numpy-1.11.3 pandas-0.18.0 python-dateutil-2.6.0 pytz-2016.10 six-1.10.0
Now let's do a final check to see whether the installation was really successful. The simplest way is just to try it.
Run Python from the command line as shown below.
c:\Python35>python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
At the Python prompt, try the following command. If it does not give you any error message, the pandas installation was successful.
>>> import pandas as pd
At the Python prompt, try the following command. If it does not give you any error message, the numpy installation was successful.
>>> import numpy as np
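The two manual checks above can also be wrapped into a small script that reports each package and its version. This is a sketch only: installed_version is a hypothetical helper, and it assumes the packages expose a __version__ attribute (pandas, numpy and matplotlib do, but not every package does).

```python
import importlib
import importlib.util


def installed_version(module_name):
    """Return the module's version string, or None if the import fails."""
    if importlib.util.find_spec(module_name) is None:
        return None  # module is not installed for this interpreter
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # installed but broken (for example a mismatched build)
    return getattr(module, "__version__", "unknown")


for name in ("pandas", "numpy", "matplotlib"):
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```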
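If you want to uninstall a package that has already been installed (for example the panda package, if you installed it by mistake), you can uninstall it using pip as shown below. To remove pandas itself, use 'pip uninstall pandas' in the same way.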
c:\Python35>pip uninstall panda
Uninstalling panda-0.3.1:
  c:\...\python\python36-32\lib\site-packages\panda-0.3.1-py3.6.egg-info
  c:\...\python\python36-32\lib\site-packages\panda\__init__.py
  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\__init__.cpython-36.pyc
  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\models.cpython-36.pyc
  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\request.cpython-36.pyc
  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\test.cpython-36.pyc
  c:\...\python\python36-32\lib\site-packages\panda\__pycache__\upload_session.cpython-36.pyc
  c:\...\python\python36-32\lib\site-packages\panda\models.py
  c:\...\python\python36-32\lib\site-packages\panda\request.py
  c:\...\python\python36-32\lib\site-packages\panda\test.py
  c:\...\python\python36-32\lib\site-packages\panda\upload_session.py
Proceed (y/n)? y
  Successfully uninstalled panda-0.3.1
Troubleshooting Tips for Pandas Installation
Case 1 : pandas package import error.
If you get an error like the one shown below even after installing the pandas library, it is likely that the package is not properly installed. In my case, I installed the package using pip (Method 1) for Python 3.5 and it worked fine. But when I tried the same way for Python 3.6.2, I got this error. So I uninstalled the package, reinstalled it with Method 2, and it worked.
>>> import pandas as pd
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
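One possible cause of this error (an assumption on my side, not something confirmed in the case above) is that more than one Python is installed and the pip command belongs to a different installation than the interpreter doing the import. The sketch below prints which interpreter is running and builds a pip command tied to that same interpreter; pip_command_for_this_python is a made-up helper name.

```python
import importlib.util
import sys


def pip_command_for_this_python(package):
    """Build a pip command that targets the interpreter running this script."""
    # "python -m pip" uses the pip that belongs to that python executable, so
    # the package lands in the same installation that will later import it.
    python = sys.executable or "python"
    return [python, "-m", "pip", "install", package]


print("Interpreter:", sys.executable)
if importlib.util.find_spec("pandas") is None:
    print("pandas not found; try:", " ".join(pip_command_for_this_python("pandas")))
else:
    print("pandas is visible to this interpreter")
```

Running this from the same shell or IDE where the import failed shows whether the install and the import are using the same Python.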
Case 2 : *.whl is not a supported wheel on this platform
If you get an error like the one shown below when you try to install a whl file, it is highly likely that the whl file does not match the Python version you are using. See the "Finding the package for the specific python version" section above to find the right whl package for your Python version.
C:\...\Python36-32>pip install pandas-0.18.1-cp35-cp35m-win32.whl
pandas-0.18.1-cp35-cp35m-win32.whl is not a supported wheel on this platform.
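Before running pip, you can compare the tags in the whl file name with the interpreter you intend to use. The sketch below is a rough check only: it assumes CPython on Windows, assumes that 64-bit packages use the platform tag win_amd64 (that tag does not appear elsewhere in this note), and ignores the ABI tag. wheel_matches is a hypothetical helper name.

```python
import struct
import sys


def wheel_matches(filename, version=None, bits=None):
    """Rough check that a Windows wheel's tags fit the given interpreter."""
    # Default to the interpreter that is running this script.
    version = version or sys.version_info[:2]
    bits = bits or struct.calcsize("P") * 8
    fields = filename[:-len(".whl")].split("-")
    python_tag, platform_tag = fields[-3], fields[-1]
    expected_python = "cp{}{}".format(version[0], version[1])
    expected_platform = "win32" if bits == 32 else "win_amd64"
    return python_tag == expected_python and platform_tag == expected_platform


# The cp35 wheel fits a 32-bit Python 3.5 but not a 32-bit Python 3.6.
print(wheel_matches("pandas-0.18.1-cp35-cp35m-win32.whl", version=(3, 5), bits=32))  # True
print(wheel_matches("pandas-0.18.1-cp35-cp35m-win32.whl", version=(3, 6), bits=32))  # False
```

The second call mirrors the failing case above, where a cp35 package was installed from a Python36-32 folder.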