Python
Scripting & Data
Overview
General-purpose, high-level language known for readability. Widely used in data science, AI/ML, web backends, scripting, and automation.
Installation & Getting Started
Most Linux distros ship Python pre-installed; recent macOS versions no longer do. Use the official installer, Homebrew, or a version manager such as pyenv or uv.
# macOS (Homebrew)
brew install python
# Using pyenv (recommended for version management)
brew install pyenv
pyenv install 3.12
pyenv global 3.12
# Using uv (fast Python version management)
uv python install 3.12
uv python pin 3.12
# Verify installation
python3 --version
# REPL — Python has an excellent built-in REPL
python3 # Standard REPL
python3 -i script.py # Run script then drop into REPL
# IPython — enhanced REPL with autocomplete, syntax highlighting
pip install ipython
ipython
# Jupyter — interactive notebooks
pip install jupyter
jupyter notebook
# Run a script
python3 script.py
# Quick one-liner
python3 -c "print('Hello!')"
Project Scaffolding
Create a virtual environment and a pyproject.toml or requirements.txt.
# Basic project with venv
mkdir my-project && cd my-project
python -m venv .venv
source .venv/bin/activate
# Using uv (recommended)
uv init my-project
cd my-project
uv run python main.py
# Using poetry
poetry new my-project
cd my-project
poetry install
# Django web project
pip install django
django-admin startproject mysite
# FastAPI project
pip install "fastapi[standard]"
Package Management
pip is the standard package manager. uv and poetry are modern alternatives.
# Install a package
pip install requests
# Install from requirements file
pip install -r requirements.txt
# Using uv (fast, modern)
uv pip install requests
uv add requests
# Using poetry
poetry add requests
# Virtual environments
python -m venv .venv
source .venv/bin/activate # macOS/Linux
Tooling & Formatter/Linter
Python has mature formatting and linting tools. Ruff is rapidly becoming the standard for both.
# Ruff — extremely fast linter + formatter (Rust-based)
pip install ruff
ruff check . # Lint
ruff format . # Format
ruff check --fix . # Auto-fix
# Black — opinionated formatter
pip install black
black .
# isort — import sorting (built into Ruff)
pip install isort
isort .
# mypy — static type checker
pip install mypy
mypy src/
# pyright — fast type checker (Microsoft)
pip install pyright
pyright
# pyproject.toml
[tool.ruff]
line-length = 100
target-version = "py312"
[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP", "B"]
[tool.mypy]
strict = true
Build & Compile Model
Interpreted with bytecode compilation. Python compiles source to bytecode (.pyc), then executes on the CPython VM.
# No explicit compile step — run directly
python3 app.py
# Bytecode is auto-cached in __pycache__/
# Force compile all files
python3 -m compileall src/
# PyInstaller — bundle into standalone executable
pip install pyinstaller
pyinstaller --onefile app.py
# Output: dist/app (single binary)
# Nuitka — compile to C, then native binary
pip install nuitka
nuitka --standalone app.py
# Cython — compile Python to C extension modules
cython module.pyx
Execution model:
- Source → Bytecode (.pyc) → CPython VM interpreter
- CPython has a GIL (Global Interpreter Lock) — one thread executes Python at a time
- Alternative runtimes: PyPy (JIT), GraalPy, Cython (AOT to C)
- Python 3.13+ has experimental free-threaded mode (no GIL)
- No static binary output by default — requires third-party tools
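The bytecode stage described above can be inspected directly with the standard-library dis module — a quick way to see what the CPython VM actually executes:

```python
import dis

def add(a: int, b: int) -> int:
    return a + b

# Disassemble to show the bytecode instructions (exact opcodes vary by Python version)
dis.dis(add)

# The raw compiled bytecode lives on the function's code object
raw = add.__code__.co_code
```

Every function carries its compiled form on `__code__`; this is the same bytecode that gets cached in `__pycache__/`.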
Libraries & Frameworks
Python has a vast ecosystem spanning web, data science, machine learning, and automation.
Web Frameworks - Django, Flask, FastAPI, Starlette, Litestar
Data Science - NumPy, Pandas, Polars, SciPy, Matplotlib, Seaborn, Plotly
Machine Learning - scikit-learn, PyTorch, TensorFlow, Keras, Hugging Face Transformers, JAX
LLM / AI - LangChain, LlamaIndex, OpenAI SDK, Anthropic SDK
HTTP / API - Requests, httpx, aiohttp, urllib3
Databases / ORM - SQLAlchemy, Django ORM, Tortoise ORM, Peewee, databases
CLI - Click, Typer, argparse, Rich
Async - asyncio, trio, anyio, uvloop
Testing - pytest, unittest, hypothesis, coverage
DevOps / Automation - Fabric, Ansible, Boto3 (AWS), Celery
Testing
pytest is the de facto standard. Python also includes unittest in the standard library. hypothesis provides property-based testing.
# pytest (recommended)
def test_addition():
    assert 1 + 2 == 3

def test_list_contains():
    assert 2 in [1, 2, 3]
# Fixtures for setup/teardown
import pytest
@pytest.fixture
def sample_list():
    return [1, 2, 3]

def test_length(sample_list):
    assert len(sample_list) == 3
# Parametrize — run test with multiple inputs
@pytest.mark.parametrize("a,b,expected", [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
])
def test_add(a, b, expected):
    assert a + b == expected
# unittest (standard library)
import unittest
class TestMath(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 2, 3)
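pytest.raises covers the common case of asserting that code fails as expected — a small sketch (the divide function is illustrative):

```python
import pytest

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b

def test_divide_by_zero():
    # The with-block succeeds only if the expected exception is raised inside it
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)
```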
# Run tests
pytest # pytest (auto-discovers tests)
pytest -v # Verbose output
pytest --cov=mypackage # With coverage
python -m unittest # Standard library
Debugging
Built-in pdb debugger, enhanced ipdb/pudb, and VS Code/PyCharm GUI debuggers.
# pdb — built-in debugger
import pdb; pdb.set_trace() # Breakpoint
# Python 3.7+ — built-in breakpoint()
breakpoint() # Drops into pdb (or configured debugger)
# pdb commands:
# n (next), s (step into), c (continue)
# p expr (print), pp expr (pretty print)
# l (list source), w (where/stack trace)
# b 42 (breakpoint at line 42)
# q (quit)
# Run with debugger
python -m pdb script.py
# ipdb — IPython-enhanced debugger
pip install ipdb
import ipdb; ipdb.set_trace()
# pudb — full-screen TUI debugger
pip install pudb
python -m pudb script.py
# VS Code debugging
# launch.json:
{
  "type": "debugpy",
  "request": "launch",
  "program": "${file}",
  "console": "integratedTerminal"
}
# Click gutter for breakpoints, F5 to start
# PyCharm — built-in debugger
# Click gutter, Shift+F9 to debug
# Logging (prefer over print)
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
logger.debug("Processing %d items", len(items))
logger.error("Failed to connect", exc_info=True)
# Rich — beautiful tracebacks
pip install rich
from rich.traceback import install
install(show_locals=True)
Variables
Variables are dynamically typed. No declaration keyword needed — just assign.
name = "Alice"
age = 30
is_active = True
# Multiple assignment
x, y, z = 1, 2, 3
# Swap
a, b = b, a
# Constants (convention only, not enforced)
MAX_SIZE = 100
# Type hints (optional, for tooling)
count: int = 0
label: str = "hello"
Types
Dynamically typed with built-in types: int, float, str, bool, list, dict, tuple, set.
# Primitives
s: str = "hello"
n: int = 42
f: float = 3.14
b: bool = True
# Collections
items: list[int] = [1, 2, 3]
mapping: dict[str, int] = {"a": 1, "b": 2}
pair: tuple[int, str] = (1, "hello")
unique: set[int] = {1, 2, 3}
# None (null equivalent)
value: str | None = None
# Type checking
isinstance(42, int) # True
type(42) # <class 'int'>
Data Structures
Rich built-in types: list, dict, set, tuple, frozenset. collections module adds deque, Counter, defaultdict, and more.
# List — ordered, mutable
nums = [1, 2, 3]
nums.append(4)
nums.pop()
nums[0] # 1
nums[1:3] # [2, 3] (slicing)
# Tuple — ordered, immutable
point = (10, 20)
x, y = point # Unpacking
# Dict — key-value (ordered since 3.7)
user = {"name": "Alice", "age": 30}
user["email"] = "a@b.com"
user.get("missing", "default")
user.keys()
user.items()
# Set — unique values
s = {1, 2, 3, 3} # {1, 2, 3}
s.add(4)
s & {2, 3, 5} # Intersection: {2, 3}
s | {5, 6} # Union: {1, 2, 3, 4, 5, 6}
# Comprehensions
squares = [x**2 for x in range(10)]
evens = {x for x in range(10) if x % 2 == 0}
mapping = {k: v for k, v in pairs}
# collections module
from collections import deque, Counter, defaultdict, namedtuple
dq = deque([1, 2, 3])
dq.appendleft(0) # O(1) prepend
counts = Counter("abracadabra")
# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
dd = defaultdict(list)
dd["key"].append(1) # No KeyError
Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)
Functions
Defined with def. Support default args, keyword args, and type hints.
# Basic function
def greet(name: str) -> str:
    return f"Hello, {name}!"
# Default parameters
def power(base: int, exp: int = 2) -> int:
    return base ** exp
# *args and **kwargs
def log(*args, **kwargs):
    print(*args, **kwargs)
# Lambda (anonymous function)
add = lambda a, b: a + b
# Decorators
import time
from functools import wraps

def timer(func):
    @wraps(func)  # preserve func's name and docstring
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print(f"{time.time() - start:.2f}s")
        return result
    return wrapper
Conditionals
if/elif/else blocks. Python 3.10+ adds match/case for pattern matching.
# If / elif / else
if x > 0:
    print("positive")
elif x == 0:
    print("zero")
else:
    print("negative")
# Ternary (inline if)
label = "positive" if x > 0 else "non-positive"
# Match / case (Python 3.10+)
match command:
    case "quit":
        exit()
    case "hello":
        print("Hi!")
    case _:
        print("Unknown command")
# Truthy / falsy
if not []:
    print("empty list is falsy")
Loops
for loops iterate over any iterable. while for condition-based loops.
# For loop
for i in range(5):
    print(i)
# Iterate over a list
for item in ["a", "b", "c"]:
    print(item)
# Enumerate (index + value)
for i, val in enumerate(["a", "b", "c"]):
    print(i, val)
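Alongside enumerate, zip iterates over several iterables in lockstep (stopping at the shortest) — a common companion worth showing:

```python
names = ["Alice", "Bob", "Cara"]
scores = [85, 92, 78]

# Pair items positionally
for name, score in zip(names, scores):
    print(f"{name}: {score}")

# Build a dict from two parallel lists
grades = dict(zip(names, scores))
```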
# While loop
n = 0
while n < 3:
    n += 1
# List comprehension
squares = [x ** 2 for x in range(10)]
# Dict comprehension
mapping = {k: v for k, v in [("a", 1), ("b", 2)]}
Generics & Type System
Dynamically typed with optional type hints (PEP 484+). Generics via typing module. Type checkers (mypy, pyright) enforce at analysis time.
# Generic functions (Python 3.12+ syntax)
def identity[T](value: T) -> T:
    return value
# Constrained generics
from typing import Protocol
class HasLength(Protocol):
    def __len__(self) -> int: ...

def get_length[T: HasLength](x: T) -> int:
    return len(x)
# Generic classes
from collections.abc import Callable

class Box[T]:
    def __init__(self, value: T) -> None:
        self.value = value
    def map[U](self, fn: Callable[[T], U]) -> "Box[U]":
        return Box(fn(self.value))
# Pre-3.12 syntax (still common)
from typing import TypeVar, Generic
T = TypeVar("T")

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []
    def push(self, item: T) -> None:
        self._items.append(item)
    def pop(self) -> T:
        return self._items.pop()
# Union types
def process(value: int | str) -> str:
    return str(value)
# TypeGuard — narrow types
from typing import TypeGuard
def is_string_list(val: list[object]) -> TypeGuard[list[str]]:
    return all(isinstance(x, str) for x in val)
# Variance
from typing import TypeVar
T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)
Inheritance & Composition
Supports single and multiple inheritance with MRO (Method Resolution Order). Mixins and protocols (structural typing) for composition.
# Single inheritance
class Animal:
    def __init__(self, name: str):
        self.name = name
    def speak(self) -> str:
        return f"{self.name} makes a sound"

class Dog(Animal):
    def speak(self) -> str:
        return f"{self.name} barks"
# Multiple inheritance
class Swimmer:
    def swim(self): return "swimming"

class Flyer:
    def fly(self): return "flying"

class Duck(Animal, Swimmer, Flyer):
    pass
# Super and MRO
class Puppy(Dog):
    def __init__(self, name: str):
        super().__init__(name)
        self.young = True
# Abstract base classes
from abc import ABC, abstractmethod
class Shape(ABC):
    @abstractmethod
    def area(self) -> float: ...
# Protocols (structural typing, Python 3.8+)
from typing import Protocol
class Drawable(Protocol):
    def draw(self) -> None: ...
# Any class with draw() satisfies Drawable
Functional Patterns
First-class functions, comprehensions, generators, and functools for functional programming.
# Higher-order functions
def apply(fn, x):
    return fn(x)
double = lambda x: x * 2
apply(double, 5) # 10
# Map, filter, reduce
from functools import reduce
nums = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x * x, nums))
evens = list(filter(lambda x: x % 2 == 0, nums))
total = reduce(lambda a, b: a + b, nums, 0)
# List comprehensions (preferred over map/filter)
squared = [x * x for x in nums]
evens = [x for x in nums if x % 2 == 0]
# Generator expressions (lazy)
gen = (x * x for x in range(1_000_000))
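Generator functions (def with yield) are the statement-level counterpart of the generator expressions above — values are produced lazily, one at a time:

```python
def countdown(n):
    # Execution suspends at each yield and resumes on the next request
    while n > 0:
        yield n
        n -= 1

first = list(countdown(3))  # consume eagerly into a list

# Generators compose into pipelines without materializing intermediate lists
def squares(it):
    for x in it:
        yield x * x

total = sum(squares(countdown(4)))  # 16 + 9 + 4 + 1
```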
# Closures
def counter(start=0):
    count = start
    def inc():
        nonlocal count
        count += 1
        return count
    return inc
# Partial application
from functools import partial
add = lambda a, b: a + b
add5 = partial(add, 5)
# Decorators (higher-order functions)
def log(fn):
    def wrapper(*args, **kwargs):
        print(f"Calling {fn.__name__}")
        return fn(*args, **kwargs)
    return wrapper
Concurrency
GIL limits true thread parallelism for CPU work. Use asyncio for I/O concurrency, multiprocessing for CPU parallelism.
# asyncio — async/await for I/O concurrency
import asyncio
async def fetch_data(url: str) -> str:
    # simulated async I/O
    await asyncio.sleep(1)
    return f"data from {url}"

async def main():
    # Run concurrently
    results = await asyncio.gather(
        fetch_data("/api/a"),
        fetch_data("/api/b"),
        fetch_data("/api/c"),
    )
    print(results)

asyncio.run(main())
# Threading — concurrent I/O (GIL limits CPU work)
from concurrent.futures import ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=4) as pool:
    # fetch: any blocking function; urls: an iterable of inputs
    futures = [pool.submit(fetch, url) for url in urls]
    results = [f.result() for f in futures]
# Multiprocessing — true CPU parallelism (separate processes)
from multiprocessing import Pool
with Pool(4) as pool:
    # cpu_heavy_task: any picklable top-level function
    results = pool.map(cpu_heavy_task, data)
# Python 3.13+ free-threading (experimental)
# Run without GIL: python3.13t -X gil=0 script.py
Modules & Imports
Every .py file is a module. Directories with __init__.py are packages. Use import to bring in modules. Distribution is done via wheels/sdists uploaded to PyPI.
# Import entire module
import os
import json
# Import specific names
from os.path import join, exists
# Import with alias
import numpy as np
from collections import defaultdict as dd
# Relative imports (within a package)
from . import sibling_module
from ..utils import helper
# Import everything (discouraged)
from math import *
# Dynamic import
import importlib
mod = importlib.import_module('my_module')
# Conditional import
try:
    import ujson as json
except ImportError:
    import json
# __init__.py controls package exports
# my_package/__init__.py:
# from .core import App
# from .utils import helper
# __all__ = ['App', 'helper']
Error Handling
Use try/except/finally. Python uses exceptions extensively.
# Try / except
try:
    result = 10 / 0
except ZeroDivisionError as e:
    print(f"Error: {e}")
finally:
    print("always runs")
# Multiple exceptions
try:
    data = json.loads(raw)
except (ValueError, KeyError) as e:
    print(f"Bad data: {e}")
# Raising exceptions
def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
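When translating a low-level error into a domain-specific one, raise ... from preserves the original cause in the traceback (exception chaining) — a sketch with an illustrative ConfigError:

```python
class ConfigError(Exception):
    pass

def load_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as e:
        # The original ValueError is attached as __cause__
        raise ConfigError(f"invalid port: {raw!r}") from e
```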
# Custom exceptions
class AppError(Exception):
    def __init__(self, message, code):
        super().__init__(message)
        self.code = code
Memory Management
Reference counting + cyclic garbage collector. CPython uses reference counting as the primary mechanism, with a cycle detector for circular references.
import sys
import gc
# Reference counting
a = [1, 2, 3]
sys.getrefcount(a) # Shows reference count
# When refcount drops to 0, memory is freed immediately
b = a # refcount = 2
del b # refcount = 1
a = None # refcount = 0 → freed
# Context managers — deterministic cleanup
with open('file.txt') as f:
    data = f.read()
# File is closed here, regardless of exceptions
# Custom context manager
class ManagedResource:
    def __enter__(self):
        self.acquire()
        return self
    def __exit__(self, *args):
        self.release()
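The same enter/exit protocol can be written as a generator using the stdlib contextlib.contextmanager decorator — everything before the yield is setup, everything after (in the finally) is teardown:

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed_resource(name):
    events.append(f"acquire {name}")      # runs on entry (__enter__)
    try:
        yield name                         # value bound by `as`
    finally:
        events.append(f"release {name}")  # runs on exit, even on error (__exit__)

with managed_resource("db") as r:
    events.append(f"use {r}")
```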
# __del__ — destructor (not guaranteed to run)
class MyClass:
    def __del__(self):
        print("Cleaning up")
# Garbage collector control
gc.collect() # Force collection
gc.disable() # Disable automatic GC
gc.get_stats() # GC statistics
gc.get_referrers(obj) # What references this object?
# weakref — references that don't prevent GC
import weakref
obj = SomeClass()
weak = weakref.ref(obj)
weak() # Returns obj or None if GC'd
# Slots — reduce per-instance memory
class Point:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y
# Uses noticeably less per-instance memory than a regular class (no __dict__)
Performance Profiling
Built-in cProfile and timeit. Third-party tools: py-spy, scalene, line_profiler.
# timeit — quick benchmarks
import timeit
timeit.timeit('sum(range(1000))', number=10000)
# cProfile — built-in profiler
import cProfile
cProfile.run('my_function()')
# Profile to file + visualize
python -m cProfile -o output.prof script.py
# Visualize with snakeviz:
pip install snakeviz
snakeviz output.prof
# py-spy — sampling profiler (no code changes)
pip install py-spy
py-spy top --pid 12345 # Live top-like view
py-spy record -o profile.svg -- python app.py # Flame graph
# Scalene — CPU + memory + GPU profiler
pip install scalene
scalene script.py
# line_profiler — line-by-line timing
pip install line_profiler
# Decorate function with @profile, then:
kernprof -l -v script.py
# memory_profiler — line-by-line memory
pip install memory-profiler
python -m memory_profiler script.py
# tracemalloc — built-in memory tracing
import tracemalloc
tracemalloc.start()
# ... code ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)
Interop
Python excels at interop via ctypes, cffi, Cython, and pybind11 for C/C++. Also strong at subprocess and REST/HTTP integration.
# ctypes — call C shared libraries directly
import ctypes
libc = ctypes.CDLL("libc.so.6") # or "libc.dylib" on macOS
libc.printf(b"Hello from C!\n")
libm = ctypes.CDLL("libm.so.6")
libm.sqrt.restype = ctypes.c_double
libm.sqrt(ctypes.c_double(16.0)) # 4.0
# cffi — cleaner C FFI
from cffi import FFI
ffi = FFI()
ffi.cdef("double sqrt(double x);")
lib = ffi.dlopen("libm.so.6")
lib.sqrt(16.0) # 4.0
# subprocess — call any system command
import subprocess
result = subprocess.run(
    ["ls", "-la"],
    capture_output=True, text=True,
)
print(result.stdout)
# pybind11 — C++ ↔ Python binding
# C++ side:
# #include <pybind11/pybind11.h>
# int add(int a, int b) { return a + b; }
# PYBIND11_MODULE(mymod, m) { m.def("add", &add); }
# Python side:
import mymod
mymod.add(1, 2) # 3
# Cython — write C extensions in Python-like syntax
# .pyx files compiled to C for performance
# REST / HTTP — universal interop
import requests
resp = requests.get("https://api.example.com/data")
data = resp.json()
Packaging & Distribution
Publish libraries to PyPI. Distribute apps as Docker containers, pip-installable packages, or standalone executables.
# pyproject.toml — modern package config
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "my-package"
version = "1.0.0"
description = "My awesome library"
requires-python = ">=3.10"
dependencies = ["requests>=2.28"]
[project.scripts]
mycli = "my_package.cli:main"
[project.optional-dependencies]
dev = ["pytest", "ruff"]
# Build (pip install build)
python -m build # Creates dist/*.whl and dist/*.tar.gz
# Publish to PyPI
pip install twine
twine upload dist/*
# Or use uv
uv build
uv publish
# Standalone executable
pip install pyinstaller
pyinstaller --onefile app.py
# Output: dist/app (single binary)
# Docker
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]