DSPython Logo DSPython

Modules and Packages

Organize your code for reusability and scalability.

Python Intermediate Intermediate 45 min

πŸ“¦ Introduction: Why Organize Code?

As your Python projects grow, keeping all your functions and classes in one file becomes messy. Modules and Packages are Python's way of organizing code into reusable and manageable chunks.

Imagine a library:
A **Module** is like a single book containing related topics (functions).
A **Package** is like a bookshelf (a folder) containing many related books (modules).

This system promotes **reusability**, prevents **name collisions** (two functions with the same name), and improves **readability** for large-scale projects, especially in Data Science (e.g., NumPy, Pandas).

🎯 Topic 1: The Python Module

A module is simply a Python file (`.py` extension) containing Python definitions and statements.

πŸ“ Creating and Using a Module:

1. Create `calculator.py`:
PI = 3.14159
def add(a, b):
return a + b
2. Use `import` in `main.py`:
import calculator
result = calculator.add(5, 3)
print(result)

The module name acts as a **namespace** (calculator.add) to avoid confusion if another module also has an `add` function.

⚑ Topic 2: Import Statements and Aliases

There are three primary ways to import code, each with different effects on your namespace.

πŸ“˜ Import Types:

1. Standard Import (import module) import math

Access: math.sqrt(4). **Recommended** for clarity.

2. Import with Alias (import module as alias) import pandas as pd

Access: pd.DataFrame(). Used for long module names (e.g., Data Science libraries).

3. Direct Import (from module import name) from math import pi, sqrt

Access: sqrt(4). Brings names directly into the local namespace. Be cautious of name collisions.

πŸ’» Example: The Math Module

import math
from math import pi

# Access via module name
print(math.ceil(4.1))

# Access directly (no prefix)
print(pi)

πŸ—ƒοΈ Topic 3: Python Packages

A package is a collection of modules organized in a directory structure. This allows for hierarchical code organization.

The key to a Python package is the file: __init__.py.

πŸ“‚ Package Structure Example:

my_project/
__init__.py
data/
__init__.py
loader.py # Module 1
models/
__init__.py
train.py # Module 2

Any directory with an **\_\_init\_\_.py** file is considered a Python package. This file runs when the package is imported and can be used to set up package-wide variables or imports. (In modern Python 3.3+, this file is optional but still standard practice).

Importing from a Package:

from my_project.data import loader
from my_project.models.train import start_training

πŸ’» Topic 4: The Script vs. Module ($__name__$)

When Python executes a file, it sets a special variable called $__name__$. This is crucial for creating modules that can also be run directly.

The $__name__$ Variable:

  • If the file is run directly: $__name__$ is set to '__main__'.
  • If the file is imported as a module: $__name__$ is set to the module's name (e.g., 'calculator').

We use the following idiom to ensure certain code (like tests or program start) only runs when the file is executed as the main script:

def main():
print("Running main application logic...")

if __name__ == "__main__":
main()

πŸ“š Module Summary

  • Module: A single .py file containing functions, classes, and variables.
  • Package: A directory containing an (optional) __init__.py file and other modules/sub-packages.
  • Import: Used to load a module. Use import module or from module import name.
  • Alias: Use import pandas as pd to shorten the name.
  • __name__: Special variable used to determine if a file is the main script or an imported module.

πŸ€” Interview Q&A

Tap on the questions below to reveal the answers.

A **Module** is a single file (.py) containing code. A **Package** is a directory (folder) containing multiple modules and an optional __init__.py file, used to structure a large codebase hierarchically.

Aliasing (e.g., import numpy as np) is used to shorten long module names, making the code cleaner and easier to read, especially in Data Science where standard aliases (like np for NumPy) are common practice.

It imports ALL public names (functions, variables, classes) from the module directly into the current namespace. It is discouraged because it can lead to **name collisions** (overwriting existing variables/functions) and makes it unclear where a specific function came from.

It historically marks a directory as a Python package. When a package is imported, the code in __init__.py is executed, allowing for initialization tasks, setting up package-level variables, and specifying which modules should be available when a user runs from package import *.

πŸ€–
DSPython AI Assistant βœ–
πŸ‘‹ Hi! I’m your AI assistant. Paste your code here, I will find bugs for you.