Top 10 Python Libraries
These top 10 Python libraries are quick and easy to pick up, and the language itself is pretty neat too. While Python is easy to learn, the libraries available for it are powerful. I have compiled a list of the Top 10 Python Libraries every programmer should know. This is not a list of every library I recommend, but it is a great starting point for most Python developers. I have not touched on any libraries for Graphical User Interfaces here, but I will discuss those in a later article.
1. Requests
Requests is a wonderful Python library for sending HTTP/1.1 requests from within a program. It provides a simpler, higher-level way of making requests for internet interactions. Requests is a definite recommended addition to your collection of Python libraries. Below is a quick script to scrape HTML using the lower-level urllib3, for comparison:
import urllib3
from bs4 import BeautifulSoup
conn = urllib3.PoolManager()
req = conn.request('GET', 'https://madhatlabs.com')
soup_parser = BeautifulSoup(req.data, 'html.parser')
print("Title is: " + str(soup_parser.title))
print("Content is: " + str(soup_parser.get_text()))
Similarly, a script with the same functionality but using Requests is:
import requests
from bs4 import BeautifulSoup
req = requests.get('https://madhatlabs.com')
soup_parser = BeautifulSoup(req.content, 'html.parser')
print("Title is: " + str(soup_parser.title))
print("Content is: " + str(soup_parser.get_text()))
We have only made a simple HTTP request for the website's front page, but we are already one line shorter. The Requests library also handles details like Keep-Alive, connection pooling, SSL verification, authentication, and other modern web functionality for you.
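Beyond a simple GET, Requests makes features like sessions and query parameters easy. Below is a minimal sketch of both; the /search path and the query parameter are made up for illustration, and preparing the request shows the final URL without anything being sent over the network.

```python
import requests

# A Session reuses the underlying TCP connection (Keep-Alive) across
# requests and stores shared settings such as headers and cookies.
session = requests.Session()
session.headers.update({"User-Agent": "mad-hat-demo/1.0"})

# Query parameters can be passed as a dict instead of building the URL
# by hand; preparing the request shows the final URL without sending it.
req = requests.Request("GET", "https://madhatlabs.com/search",
                       params={"q": "python"})
prepared = session.prepare_request(req)
print(prepared.url)  # https://madhatlabs.com/search?q=python
```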
2. Scrapy
Scrapy is a Python library that provides an easy way to build website scrapers for web crawling. It makes it easy to parse and pull data from websites, so we can quickly write a simple spider to scrape the front page of a site.
import scrapy

class MadHatSpider(scrapy.Spider):
    # name of the Spider
    name = 'MadHatSpider'
    # where to begin crawling
    start_urls = ['https://madhatlabs.com']

    # function that takes the response and does things
    def parse(self, response):
        print(response.body)
3. Pillow (PIL)
Pillow (PIL) is a Python library that provides an easy way to interact with and manipulate images, with simple methods to resize, rotate, and otherwise transform them. Pillow is easy to use; below is a simple program to get the format and size of an image.
from PIL import Image
pic = Image.open("MadHat.jpg")
print(f"The format of this image is {pic.format} and its size is {pic.size}")
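Resizing and rotating are just as simple. Here is a small sketch that builds an image in memory, so no image file is needed; the dimensions and colour are arbitrary values chosen for illustration.

```python
from PIL import Image

# Create a solid-colour image in memory so no file on disk is needed.
pic = Image.new("RGB", (200, 100), color=(255, 0, 0))

# resize() and rotate() return new Image objects; the original is untouched.
thumbnail = pic.resize((50, 25))
rotated = pic.rotate(90, expand=True)  # expand keeps the whole rotated image

print(thumbnail.size)  # (50, 25)
print(rotated.size)    # (100, 200) - width and height swap after 90 degrees
```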
4. SQLAlchemy
SQLAlchemy is a Python library for database programming that provides a programmatic way to access and interact with SQL databases. It includes an Object Relational Mapper (ORM) and automates redundant tasks, so the object model and database schema can be decoupled from the beginning of development. Above all, it is a powerful tool for SQL database interactions; below is a simple program to create a database with two tables.
from sqlalchemy import Column, Integer, ForeignKey, String
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

MadHat = declarative_base()

class Person(MadHat):
    __tablename__ = 'hat_wearer'
    id = Column(Integer, primary_key=True)
    first_name = Column(String(250), nullable=False)
    last_name = Column(String(250), nullable=False)

class Hat(MadHat):
    __tablename__ = 'hat'
    id = Column(Integer, primary_key=True)
    hat_color = Column(String(250))
    person_id = Column(Integer, ForeignKey('hat_wearer.id'))
    hat_wearer = relationship(Person)

hat_engine = create_engine('sqlite:///mad_hat.db')
MadHat.metadata.create_all(hat_engine)
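Once tables exist, the ORM lets you insert and query rows as plain Python objects. Below is a minimal sketch using an in-memory SQLite database so it is self-contained; the table is reduced to one model and the names inserted are made up for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Person(Base):
    __tablename__ = "hat_wearer"
    id = Column(Integer, primary_key=True)
    first_name = Column(String(250), nullable=False)
    last_name = Column(String(250), nullable=False)

# An in-memory SQLite database keeps the sketch self-contained.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()

# Rows are plain Python objects; the ORM translates to INSERT/SELECT.
session.add(Person(first_name="Mad", last_name="Hatter"))
session.commit()

wearer = session.query(Person).filter_by(last_name="Hatter").one()
print(wearer.first_name)  # Mad
```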
5. BeautifulSoup
BeautifulSoup is another Python library for web scraping; it provides a powerful but easy way to parse documents and traverse the document tree. It easily navigates and parses HTML and XML, which makes it powerful for working with websites and XML documents, and it can save developers countless hours while remaining simple to use. I have written a simple request and parser to find all hyperlinks in a website.
import requests
from bs4 import BeautifulSoup

website = 'https://madhatlabs.com'
req = requests.get(website)
soup_parser = BeautifulSoup(req.content, 'html.parser')
print(f"Found all links in {website}")
for anchor in soup_parser.find_all('a', href=True):
    print(f"{anchor['href']}")
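The same parsing works on any HTML string, not just a live website. Here is a self-contained sketch using an inline snippet, so it runs without a network connection; the page content is made up for illustration.

```python
from bs4 import BeautifulSoup

# An inline HTML snippet stands in for a downloaded page.
html = """
<html><head><title>Mad Hat Labs</title></head>
<body>
  <a href="https://madhatlabs.com/blog">Blog</a>
  <a href="https://madhatlabs.com/about">About</a>
  <p>No link here.</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
print(soup.title.string)  # Mad Hat Labs

# find_all walks the document tree and returns matching tags in order.
links = [a["href"] for a in soup.find_all("a", href=True)]
print(links)  # ['https://madhatlabs.com/blog', 'https://madhatlabs.com/about']
```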
6. Twisted
Twisted is one of the best Python libraries for networking because it removes a lot of the complexity of network programming. It is event-driven and makes it easy to add networking functionality to a Python program, with built-in support for SSL, UDP, IMAP, SSHv2, and more. Above all, it is a simple but powerful way to add networking to Python programs and a strong addition to your libraries for application development and communication.
This is a simple server that listens for connections and prints out what it receives.
from twisted.internet import reactor, protocol

class Observer(protocol.Protocol):
    """Observes and prints the incoming data"""
    def dataReceived(self, data):
        print(data)

server = protocol.ServerFactory()
server.protocol = Observer
reactor.listenTCP(8000, server)
reactor.run()
Similarly, this is the client that will connect and send a message to the server.
from twisted.internet import reactor, protocol

class SimpleProtocol(protocol.Protocol):
    def connectionMade(self):
        print("Connection Made. Transmitting simple message.")
        self.transport.write(b"Simple Message")
        self.transport.loseConnection()

class SimpleClient(protocol.ClientFactory):
    protocol = SimpleProtocol

    def clientConnectionFailed(self, connector, reason):
        print("Connection failed. Shutting down.")
        reactor.stop()

    def clientConnectionLost(self, connector, reason):
        print("Connection lost. Shutting down.")
        reactor.stop()

client = SimpleClient()
reactor.connectTCP("localhost", 8000, client)
reactor.run()
7. NumPy
The NumPy library is another Python library for data science that makes computational tasks in Python easier. NumPy is useful for data science, data mining, linear algebra, N-dimensional array calculations, and scientific applications. It provides easy ways to operate on N-dimensional arrays, such as computing matrix products.
import numpy as np

first_matrix = np.array([[1, 1], [2, 2]])
second_matrix = np.array([[1, 2], [2, 1]])
print(first_matrix.dot(second_matrix))
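To see the difference between a matrix product, element-wise multiplication, and NumPy's broadcasting, here is a short sketch using the same two matrices:

```python
import numpy as np

first_matrix = np.array([[1, 1], [2, 2]])
second_matrix = np.array([[1, 2], [2, 1]])

# The @ operator is the idiomatic spelling of .dot() for matrix products.
print(first_matrix @ second_matrix)   # [[3 3]
                                      #  [6 6]]

# Plain * is element-wise, not a matrix product -- an easy mistake to make.
print(first_matrix * second_matrix)   # [[1 2]
                                      #  [4 2]]

# Broadcasting stretches a 1-D array across each row automatically.
print(first_matrix + np.array([10, 20]))   # [[11 21]
                                           #  [12 22]]
```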
8. Scapy
The Scapy library is a Python library for networking that allows a programmer to craft and manipulate network packets, similarly to tools like Nmap or tcpdump. Scapy can also do things other tools won't, like injecting custom 802.11 frames or sending invalid frames. Overall, it is great for getting access to and information from network packets, but it requires a deeper knowledge of networking to take full advantage of its capabilities. A simple TCP SYN scan of port 80 is shown below. Scapy is a great addition to any security-oriented collection of Python libraries.
from scapy.all import sr1, IP, TCP

ip = input("What IP do you want to scan? ")
# Send a TCP SYN to port 80 and wait for the first reply
active = sr1(IP(dst=ip)/TCP(dport=80, flags="S"))
if active:
    active.show()
9. Pandas
Pandas is a Python library for data science that provides an easy way to manipulate and prepare data for analysis. This is great for Data Mining, Machine Learning, Data Analysis, and so much more. Thanks to its data-preparation capabilities, Pandas has become one of the great Python libraries for Machine Learning and Data Mining, and it allows a Python-centered workflow when doing data analysis.
Reading in a CSV file is as easy as:
import pandas as pd
csv_data = pd.read_csv('mycsvfile.csv')
Similarly, it is just as simple to select the first 2 rows of the csv_data.
print(csv_data[:2])
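Pandas also makes selecting and summarising data simple. Here is a self-contained sketch that reads CSV text from memory instead of a file; the column names and values are made up for illustration.

```python
import pandas as pd
from io import StringIO

# StringIO stands in for a CSV file on disk so the sketch is self-contained.
csv_text = """name,hats
Alice,3
Bob,1
Carol,5
"""
csv_data = pd.read_csv(StringIO(csv_text))

print(csv_data[:2])            # first two rows, as above
print(csv_data["hats"].sum())  # 9

# Boolean indexing filters rows by a condition on a column.
print(csv_data[csv_data["hats"] > 2]["name"].tolist())  # ['Alice', 'Carol']
```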
10. TensorFlow
TensorFlow is a Python Machine Learning framework that gives developers access to machine learning functionality without needing to know the extensive mathematics behind the different models. The TensorFlow library is a bit more extensive than a simple 20-line program for getting some cool Machine Learning/Artificial Intelligence functionality working, but you can check out the article discussing it here: TensorFlow and Keras QuickDive. Below I create two 1-D tensors and multiply them together element-wise. Note that this example uses the TensorFlow 1.x Session API; in TensorFlow 2.x, operations execute eagerly and tf.Session is no longer needed.
import tensorflow as tf
first = tf.constant([4,3,2,1])
second = tf.constant([2,3,1,2])
result = tf.multiply(first,second)
session = tf.Session()
print(session.run(result))
session.close()
11. Keras (Bonus)
The Keras library is a high-level neural network framework for machine learning that works with TensorFlow for Deep Learning. Keras with TensorFlow can be used for rapid creation and testing of Deep Learning models. It is worth noting that both Keras and TensorFlow are Python libraries for machine learning, but Keras runs on top of TensorFlow.
These Python libraries are far from an exhaustive list of great libraries that can help speed up your workflow, but they are a great addition to any developer's collection of useful Python libraries. I always suggest constant learning and integration of third-party software to maximize what you can create. And it is always important to follow the software design principle of DRY.
Don’t Repeat Yourself!