Python
Scripting & Data
Overview
General-purpose, high-level language known for readability. Widely used in data science, AI/ML, web backends, scripting, and automation.
Installation & Getting Started
Most Linux distros ship Python pre-installed; recent macOS versions no longer do. Use the official installer, Homebrew, or a version manager such as pyenv or uv.
# macOS (Homebrew)
brew install python
# Using pyenv (recommended for version management)
brew install pyenv
pyenv install 3.12
pyenv global 3.12
# Using uv (fast Python version management)
uv python install 3.12
uv python pin 3.12
# Verify installation
python3 --version
# REPL — Python has an excellent built-in REPL
python3 # Standard REPL
python3 -i script.py # Run script then drop into REPL
# IPython — enhanced REPL with autocomplete, syntax highlighting
pip install ipython
ipython
# Jupyter — interactive notebooks
pip install jupyter
jupyter notebook
# Run a script
python3 script.py
# Quick one-liner
python3 -c "print('Hello!')"
Project Scaffolding
Create a virtual environment and a pyproject.toml or requirements.txt.
# Basic project with venv
mkdir my-project && cd my-project
python -m venv .venv
source .venv/bin/activate
# Using uv (recommended)
uv init my-project
cd my-project
uv run python main.py
# Using poetry
poetry new my-project
cd my-project
poetry install
# Django web project
pip install django
django-admin startproject mysite
# FastAPI project
pip install "fastapi[standard]"
Package Management
pip is the standard package manager. uv and poetry are modern alternatives.
# Install a package
pip install requests
# Install from requirements file
pip install -r requirements.txt
# Using uv (fast, modern)
uv pip install requests
uv add requests
# Using poetry
poetry add requests
# Virtual environments
python -m venv .venv
source .venv/bin/activate # macOS/Linux
Tooling & Formatter/Linter
Python has mature formatting and linting tools. Ruff is rapidly becoming the standard for both.
# Ruff — extremely fast linter + formatter (Rust-based)
pip install ruff
ruff check . # Lint
ruff format . # Format
ruff check --fix . # Auto-fix
# Black — opinionated formatter
pip install black
black .
# isort — import sorting (built into Ruff)
pip install isort
isort .
# mypy — static type checker
pip install mypy
mypy src/
# pyright — fast type checker (Microsoft)
pip install pyright
pyright
# pyproject.toml
[tool.ruff]
line-length = 100
target-version = "py312"
[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP", "B"]
[tool.mypy]
strict = true
Build & Compile Model
Interpreted with bytecode compilation. Python compiles source to bytecode (.pyc), then executes on the CPython VM.
# No explicit compile step — run directly
python3 app.py
# Bytecode is auto-cached in __pycache__/
# Force compile all files
python3 -m compileall src/
# PyInstaller — bundle into standalone executable
pip install pyinstaller
pyinstaller --onefile app.py
# Output: dist/app (single binary)
# Nuitka — compile to C, then native binary
pip install nuitka
nuitka --standalone app.py
# Cython — compile Python to C extension modules
cython module.pyx
Execution model:
- Source → Bytecode (.pyc) → CPython VM interpreter
- CPython has a GIL (Global Interpreter Lock) — one thread executes Python at a time
- Alternative runtimes: PyPy (JIT), GraalPy, Cython (AOT to C)
- Python 3.13+ has experimental free-threaded mode (no GIL)
- No static binary output by default — requires third-party tools
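The bytecode stage described above can be inspected directly with the standard-library dis module — a quick way to see what the CPython VM actually executes:

```python
import dis

def add(a: int, b: int) -> int:
    return a + b

# Disassemble to show the bytecode instructions (exact opcodes vary by Python version)
dis.dis(add)

# The raw compiled bytecode lives on the function's code object
raw = add.__code__.co_code
```

Every function carries its compiled form on `__code__`; this is the same bytecode that gets cached in `__pycache__/`.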
Libraries & Frameworks
Python has a vast ecosystem spanning web, data science, machine learning, and automation.
Web Frameworks - Django, Flask, FastAPI, Starlette, Litestar
Data Science - NumPy, Pandas, Polars, SciPy, Matplotlib, Seaborn, Plotly
Machine Learning - scikit-learn, PyTorch, TensorFlow, Keras, Hugging Face Transformers, JAX
LLM / AI - LangChain, LlamaIndex, OpenAI SDK, Anthropic SDK
HTTP / API - Requests, httpx, aiohttp, urllib3
Databases / ORM - SQLAlchemy, Django ORM, Tortoise ORM, Peewee, databases
CLI - Click, Typer, argparse, Rich
Async - asyncio, trio, anyio, uvloop
Testing - pytest, unittest, hypothesis, coverage
DevOps / Automation - Fabric, Ansible, Boto3 (AWS), Celery
Testing
pytest is the de facto standard. Python also includes unittest in the standard library. hypothesis provides property-based testing.
# pytest (recommended)
def test_addition():
    assert 1 + 2 == 3

def test_list_contains():
    assert 2 in [1, 2, 3]
# Fixtures for setup/teardown
import pytest
@pytest.fixture
def sample_list():
    return [1, 2, 3]

def test_length(sample_list):
    assert len(sample_list) == 3
# Parametrize — run test with multiple inputs
@pytest.mark.parametrize("a,b,expected", [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
])
def test_add(a, b, expected):
    assert a + b == expected
# unittest (standard library)
import unittest
class TestMath(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 2, 3)
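pytest.raises covers the common case of asserting that code fails as expected — a small sketch (the divide function is illustrative):

```python
import pytest

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b

def test_divide_by_zero():
    # The with-block succeeds only if the expected exception is raised inside it
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)
```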
# Run tests
pytest # pytest (auto-discovers tests)
pytest -v # Verbose output
pytest --cov=mypackage # With coverage
python -m unittest # Standard library
Debugging
Built-in pdb debugger, enhanced ipdb/pudb, and VS Code/PyCharm GUI debuggers.
# pdb — built-in debugger
import pdb; pdb.set_trace() # Breakpoint
# Python 3.7+ — built-in breakpoint()
breakpoint() # Drops into pdb (or configured debugger)
# pdb commands:
# n (next), s (step into), c (continue)
# p expr (print), pp expr (pretty print)
# l (list source), w (where/stack trace)
# b 42 (breakpoint at line 42)
# q (quit)
# Run with debugger
python -m pdb script.py
# ipdb — IPython-enhanced debugger
pip install ipdb
import ipdb; ipdb.set_trace()
# pudb — full-screen TUI debugger
pip install pudb
python -m pudb script.py
# VS Code debugging
# launch.json:
{
  "type": "debugpy",
  "request": "launch",
  "program": "${file}",
  "console": "integratedTerminal"
}
# Click gutter for breakpoints, F5 to start
# PyCharm — built-in debugger
# Click gutter, Shift+F9 to debug
# Logging (prefer over print)
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
logger.debug("Processing %d items", len(items))
logger.error("Failed to connect", exc_info=True)
# Rich — beautiful tracebacks
pip install rich
from rich.traceback import install
install(show_locals=True)
Variables
Variables are dynamically typed. No declaration keyword needed — just assign.
name = "Alice"
age = 30
is_active = True
# Multiple assignment
x, y, z = 1, 2, 3
# Swap
a, b = b, a
# Constants (convention only, not enforced)
MAX_SIZE = 100
# Type hints (optional, for tooling)
count: int = 0
label: str = "hello"
Types
Dynamically typed with built-in types: int, float, str, bool, list, dict, tuple, set.
# Primitives
s: str = "hello"
n: int = 42
f: float = 3.14
b: bool = True
# Collections
items: list[int] = [1, 2, 3]
mapping: dict[str, int] = {"a": 1, "b": 2}
pair: tuple[int, str] = (1, "hello")
unique: set[int] = {1, 2, 3}
# None (null equivalent)
value: str | None = None
# Type checking
isinstance(42, int) # True
type(42) # <class 'int'>
Data Structures
Rich built-in types: list, dict, set, tuple, frozenset. collections module adds deque, Counter, defaultdict, and more.
# List — ordered, mutable
nums = [1, 2, 3]
nums.append(4)
nums.pop()
nums[0] # 1
nums[1:3] # [2, 3] (slicing)
# Tuple — ordered, immutable
point = (10, 20)
x, y = point # Unpacking
# Dict — key-value (ordered since 3.7)
user = {"name": "Alice", "age": 30}
user["email"] = "a@b.com"
user.get("missing", "default")
user.keys()
user.items()
# Set — unique values
s = {1, 2, 3, 3} # {1, 2, 3}
s.add(4)
s & {2, 3, 5} # Intersection: {2, 3}
s | {5, 6} # Union: {1, 2, 3, 4, 5, 6}
# Comprehensions
squares = [x**2 for x in range(10)]
evens = {x for x in range(10) if x % 2 == 0}
mapping = {k: v for k, v in pairs}
# collections module
from collections import deque, Counter, defaultdict, namedtuple
dq = deque([1, 2, 3])
dq.appendleft(0) # O(1) prepend
counts = Counter("abracadabra")
# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
dd = defaultdict(list)
dd["key"].append(1) # No KeyError
Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)
Functions
Defined with def. Support default args, keyword args, and type hints.
# Basic function
def greet(name: str) -> str:
    return f"Hello, {name}!"
# Default parameters
def power(base: int, exp: int = 2) -> int:
    return base ** exp
# *args and **kwargs
def log(*args, **kwargs):
    print(*args, **kwargs)
# Lambda (anonymous function)
add = lambda a, b: a + b
# Decorators
import time
from functools import wraps

def timer(func):
    @wraps(func)  # preserve func's name and docstring
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print(f"{time.time() - start:.2f}s")
        return result
    return wrapper
Conditionals
if/elif/else blocks. Python 3.10+ adds match/case for pattern matching.
# If / elif / else
if x > 0:
    print("positive")
elif x == 0:
    print("zero")
else:
    print("negative")
# Ternary (inline if)
label = "positive" if x > 0 else "non-positive"
# Match / case (Python 3.10+)
match command:
    case "quit":
        exit()
    case "hello":
        print("Hi!")
    case _:
        print("Unknown command")
# Truthy / falsy
if not []:
    print("empty list is falsy")
Loops
for loops iterate over any iterable. while for condition-based loops.
# For loop
for i in range(5):
    print(i)
# Iterate over a list
for item in ["a", "b", "c"]:
    print(item)
# Enumerate (index + value)
for i, val in enumerate(["a", "b", "c"]):
    print(i, val)
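Alongside enumerate, zip iterates over several iterables in lockstep (stopping at the shortest) — a common companion worth showing:

```python
names = ["Alice", "Bob", "Cara"]
scores = [85, 92, 78]

# Pair items positionally
for name, score in zip(names, scores):
    print(f"{name}: {score}")

# Build a dict from two parallel lists
grades = dict(zip(names, scores))
```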
# While loop
n = 0
while n < 3:
    n += 1
# List comprehension
squares = [x ** 2 for x in range(10)]
# Dict comprehension
mapping = {k: v for k, v in [("a", 1), ("b", 2)]}
Generics & Type System
Dynamically typed with optional type hints (PEP 484+). Generics via typing module. Type checkers (mypy, pyright) enforce at analysis time.
# Generic functions (Python 3.12+ syntax)
def identity[T](value: T) -> T:
    return value
# Constrained generics
from typing import Protocol
class HasLength(Protocol):
    def __len__(self) -> int: ...

def get_length[T: HasLength](x: T) -> int:
    return len(x)
# Generic classes
from collections.abc import Callable

class Box[T]:
    def __init__(self, value: T) -> None:
        self.value = value
    def map[U](self, fn: Callable[[T], U]) -> "Box[U]":
        return Box(fn(self.value))
# Pre-3.12 syntax (still common)
from typing import TypeVar, Generic
T = TypeVar("T")

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []
    def push(self, item: T) -> None:
        self._items.append(item)
    def pop(self) -> T:
        return self._items.pop()
# Union types
def process(value: int | str) -> str:
    return str(value)
# TypeGuard — narrow types
from typing import TypeGuard
def is_string_list(val: list[object]) -> TypeGuard[list[str]]:
    return all(isinstance(x, str) for x in val)
# Variance
from typing import TypeVar
T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)
Inheritance & Composition
Supports single and multiple inheritance with MRO (Method Resolution Order). Mixins and protocols (structural typing) for composition.
# Single inheritance
class Animal:
    def __init__(self, name: str):
        self.name = name
    def speak(self) -> str:
        return f"{self.name} makes a sound"

class Dog(Animal):
    def speak(self) -> str:
        return f"{self.name} barks"
# Multiple inheritance
class Swimmer:
    def swim(self): return "swimming"

class Flyer:
    def fly(self): return "flying"

class Duck(Animal, Swimmer, Flyer):
    pass
# Super and MRO
class Puppy(Dog):
    def __init__(self, name: str):
        super().__init__(name)
        self.young = True
# Abstract base classes
from abc import ABC, abstractmethod
class Shape(ABC):
    @abstractmethod
    def area(self) -> float: ...
# Protocols (structural typing, Python 3.8+)
from typing import Protocol
class Drawable(Protocol):
    def draw(self) -> None: ...
# Any class with draw() satisfies Drawable
Functional Patterns
First-class functions, comprehensions, generators, and functools for functional programming.
# Higher-order functions
def apply(fn, x):
    return fn(x)
double = lambda x: x * 2
apply(double, 5) # 10
# Map, filter, reduce
from functools import reduce
nums = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x * x, nums))
evens = list(filter(lambda x: x % 2 == 0, nums))
total = reduce(lambda a, b: a + b, nums, 0)
# List comprehensions (preferred over map/filter)
squared = [x * x for x in nums]
evens = [x for x in nums if x % 2 == 0]
# Generator expressions (lazy)
gen = (x * x for x in range(1_000_000))
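Generator functions (def with yield) are the statement-level counterpart of the generator expressions above — values are produced lazily, one at a time:

```python
def countdown(n):
    # Execution suspends at each yield and resumes on the next request
    while n > 0:
        yield n
        n -= 1

first = list(countdown(3))  # consume eagerly into a list

# Generators compose into pipelines without materializing intermediate lists
def squares(it):
    for x in it:
        yield x * x

total = sum(squares(countdown(4)))  # 16 + 9 + 4 + 1
```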
# Closures
def counter(start=0):
    count = start
    def inc():
        nonlocal count
        count += 1
        return count
    return inc
# Partial application
from functools import partial
add = lambda a, b: a + b
add5 = partial(add, 5)
# Decorators (higher-order functions)
def log(fn):
    def wrapper(*args, **kwargs):
        print(f"Calling {fn.__name__}")
        return fn(*args, **kwargs)
    return wrapper
Concurrency
GIL limits true thread parallelism for CPU work. Use asyncio for I/O concurrency, multiprocessing for CPU parallelism.
# asyncio — async/await for I/O concurrency
import asyncio
async def fetch_data(url: str) -> str:
    # simulated async I/O
    await asyncio.sleep(1)
    return f"data from {url}"

async def main():
    # Run concurrently
    results = await asyncio.gather(
        fetch_data("/api/a"),
        fetch_data("/api/b"),
        fetch_data("/api/c"),
    )
    print(results)

asyncio.run(main())
# Threading — concurrent I/O (GIL limits CPU work)
from concurrent.futures import ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=4) as pool:
    # fetch: any blocking function; urls: an iterable of inputs
    futures = [pool.submit(fetch, url) for url in urls]
    results = [f.result() for f in futures]
# Multiprocessing — true CPU parallelism (separate processes)
from multiprocessing import Pool
with Pool(4) as pool:
    # cpu_heavy_task: any picklable top-level function
    results = pool.map(cpu_heavy_task, data)
# Python 3.13+ free-threading (experimental)
# Run without GIL: python3.13t -X gil=0 script.py
Modules & Imports
Every .py file is a module. Directories with __init__.py are packages. Use import to bring in modules. Distribution is done via wheels/sdists uploaded to PyPI.
# Import entire module
import os
import json
# Import specific names
from os.path import join, exists
# Import with alias
import numpy as np
from collections import defaultdict as dd
# Relative imports (within a package)
from . import sibling_module
from ..utils import helper
# Import everything (discouraged)
from math import *
# Dynamic import
import importlib
mod = importlib.import_module('my_module')
# Conditional import
try:
    import ujson as json
except ImportError:
    import json
# __init__.py controls package exports
# my_package/__init__.py:
# from .core import App
# from .utils import helper
# __all__ = ['App', 'helper']
Error Handling
Use try/except/finally. Python uses exceptions extensively.
# Try / except
try:
    result = 10 / 0
except ZeroDivisionError as e:
    print(f"Error: {e}")
finally:
    print("always runs")
# Multiple exceptions
try:
    data = json.loads(raw)
except (ValueError, KeyError) as e:
    print(f"Bad data: {e}")
# Raising exceptions
def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
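When translating a low-level error into a domain-specific one, raise ... from preserves the original cause in the traceback (exception chaining) — a sketch with an illustrative ConfigError:

```python
class ConfigError(Exception):
    pass

def load_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as e:
        # The original ValueError is attached as __cause__
        raise ConfigError(f"invalid port: {raw!r}") from e
```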
# Custom exceptions
class AppError(Exception):
    def __init__(self, message, code):
        super().__init__(message)
        self.code = code
Memory Management
Reference counting + cyclic garbage collector. CPython uses reference counting as the primary mechanism, with a cycle detector for circular references.
import sys
import gc
# Reference counting
a = [1, 2, 3]
sys.getrefcount(a) # Shows reference count
# When refcount drops to 0, memory is freed immediately
b = a # refcount = 2
del b # refcount = 1
a = None # refcount = 0 → freed
# Context managers — deterministic cleanup
with open('file.txt') as f:
    data = f.read()
# File is closed here, regardless of exceptions
# Custom context manager
class ManagedResource:
    def __enter__(self):
        self.acquire()
        return self
    def __exit__(self, *args):
        self.release()
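The same enter/exit protocol can be written as a generator using the stdlib contextlib.contextmanager decorator — everything before the yield is setup, everything after (in the finally) is teardown:

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed_resource(name):
    events.append(f"acquire {name}")      # runs on entry (__enter__)
    try:
        yield name                         # value bound by `as`
    finally:
        events.append(f"release {name}")  # runs on exit, even on error (__exit__)

with managed_resource("db") as r:
    events.append(f"use {r}")
```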
# __del__ — destructor (not guaranteed to run)
class MyClass:
    def __del__(self):
        print("Cleaning up")
# Garbage collector control
gc.collect() # Force collection
gc.disable() # Disable automatic GC
gc.get_stats() # GC statistics
gc.get_referrers(obj) # What references this object?
# weakref — references that don't prevent GC
import weakref
obj = SomeClass()
weak = weakref.ref(obj)
weak() # Returns obj or None if GC'd
# Slots — reduce per-instance memory
class Point:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y
# Uses noticeably less per-instance memory than a regular class (no __dict__)
Performance Profiling
Built-in cProfile and timeit. Third-party tools: py-spy, scalene, line_profiler.
# timeit — quick benchmarks
import timeit
timeit.timeit('sum(range(1000))', number=10000)
# cProfile — built-in profiler
import cProfile
cProfile.run('my_function()')
# Profile to file + visualize
python -m cProfile -o output.prof script.py
# Visualize with snakeviz:
pip install snakeviz
snakeviz output.prof
# py-spy — sampling profiler (no code changes)
pip install py-spy
py-spy top --pid 12345 # Live top-like view
py-spy record -o profile.svg -- python app.py # Flame graph
# Scalene — CPU + memory + GPU profiler
pip install scalene
scalene script.py
# line_profiler — line-by-line timing
pip install line_profiler
# Decorate function with @profile, then:
kernprof -l -v script.py
# memory_profiler — line-by-line memory
pip install memory-profiler
python -m memory_profiler script.py
# tracemalloc — built-in memory tracing
import tracemalloc
tracemalloc.start()
# ... code ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)
Interop
Python excels at interop via ctypes, cffi, Cython, and pybind11 for C/C++. Also strong at subprocess and REST/HTTP integration.
# ctypes — call C shared libraries directly
import ctypes
libc = ctypes.CDLL("libc.so.6") # or "libc.dylib" on macOS
libc.printf(b"Hello from C!\n")
libm = ctypes.CDLL("libm.so.6")
libm.sqrt.restype = ctypes.c_double
libm.sqrt(ctypes.c_double(16.0)) # 4.0
# cffi — cleaner C FFI
from cffi import FFI
ffi = FFI()
ffi.cdef("double sqrt(double x);")
lib = ffi.dlopen("libm.so.6")
lib.sqrt(16.0) # 4.0
# subprocess — call any system command
import subprocess
result = subprocess.run(
    ["ls", "-la"],
    capture_output=True, text=True,
)
print(result.stdout)
# pybind11 — C++ ↔ Python binding
# C++ side:
# #include <pybind11/pybind11.h>
# int add(int a, int b) { return a + b; }
# PYBIND11_MODULE(mymod, m) { m.def("add", &add); }
# Python side:
import mymod
mymod.add(1, 2) # 3
# Cython — write C extensions in Python-like syntax
# .pyx files compiled to C for performance
# REST / HTTP — universal interop
import requests
resp = requests.get("https://api.example.com/data")
data = resp.json()
Packaging & Distribution
Publish libraries to PyPI. Distribute apps as Docker containers, pip-installable packages, or standalone executables.
# pyproject.toml — modern package config
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "my-package"
version = "1.0.0"
description = "My awesome library"
requires-python = ">=3.10"
dependencies = ["requests>=2.28"]
[project.scripts]
mycli = "my_package.cli:main"
[project.optional-dependencies]
dev = ["pytest", "ruff"]
# Build (pip install build)
python -m build # Creates dist/*.whl and dist/*.tar.gz
# Publish to PyPI
pip install twine
twine upload dist/*
# Or use uv
uv build
uv publish
# Standalone executable
pip install pyinstaller
pyinstaller --onefile app.py
# Output: dist/app (single binary)
# Docker
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]