52 - Introduction to Django
Complexity: Moderate (M)
52.0 Introduction: Why This Matters for Data Engineering
Django is a high-level Python web framework that enables rapid development of secure, scalable web applications, critical for Hijra Group’s Sharia-compliant fintech analytics, aligning with Islamic Financial Services Board (IFSB) standards for ethical financial reporting. It provides a robust ORM for PostgreSQL/SQLite interactions, a templating engine for UI dashboards, and Django REST Framework (DRF) for building APIs, all deployed with Uvicorn in Docker for containerized environments. In data engineering, Django powers user-facing dashboards and APIs for financial transaction data, consuming processed data from ETL pipelines (e.g., Pandas, dbt from Phase 5). Building on Phase 7 (web and database integration), this chapter introduces Django for creating a sales dashboard and DRF API, using type-annotated Python verified by Pyright and tested with pytest, ensuring production-ready, testable code.
This chapter leverages data/sales.csv for analytics, config.yaml for configurations, and assumes familiarity with PostgreSQL (Chapter 17), Pandas (Chapter 39), and testing (Chapter 43). It avoids advanced concepts like Kubernetes (Chapter 61) or Airflow (Chapter 56), focusing on Django setup, models, views, templates, and DRF APIs. All code uses PEP 8’s 4-space indentation, preferring spaces over tabs to avoid IndentationError, aligning with Hijra Group’s pipeline standards.
Data Engineering Workflow Context
This diagram illustrates Django’s role in a data engineering pipeline, consuming data from ETL processes:
flowchart TD
A["Raw Data (CSV)"] --> B["ETL Pipeline (Pandas/dbt)"]
B --> C["PostgreSQL Database"]
C --> D["Django App"]
D -->|ORM| E["Data Models"]
D -->|Views| F["API (DRF)"]
D -->|Templates| G["UI Dashboard"]
F --> H["Stakeholder Access"]
G --> H
D -->|Docker/Uvicorn| I["Deployment"]
classDef data fill:#f9f9f9,stroke:#333,stroke-width:2px
classDef process fill:#d0e0ff,stroke:#336,stroke-width:1px
classDef output fill:#ddffdd,stroke:#363,stroke-width:1px
class A,C,E data
class B,D,F,G process
class H,I output
Building On and Preparing For
- Building On:
- Chapter 17: Uses PostgreSQL integration for Django’s ORM.
- Chapter 39: Leverages Pandas for data preprocessing.
- Chapter 43: Applies pytest for testing Django views and APIs.
- Chapter 51: Extends BI dashboard concepts to Django’s UI.
- Preparing For:
- Chapter 53: Prepares for FastAPI by contrasting Django’s ORM-based APIs with FastAPI’s Pydantic models.
- Chapter 59: Enables pipeline orchestration with Django APIs.
- Chapters 67–70: Supports capstone projects with Django dashboards and APIs.
What You’ll Learn
This chapter covers:
- Django Setup: Project and app creation, configuration with settings.py.
- Django Models: Type-annotated ORM models for sales data.
- Django Views: Function-based views for dashboards and DRF APIs.
- Django Templates: HTML templates with custom filters for UI rendering.
- DRF APIs: Serializers and views for RESTful endpoints.
- Testing: pytest tests for models, views, and APIs.
- Deployment: Dockerized Django with Uvicorn.
By the end, you’ll build a Django app with a sales dashboard and DRF API, processing data/sales.csv, verified by Pyright, tested with pytest, and deployed in Docker, all with 4-space indentation per PEP 8.
Follow-Along Tips:
- Create de-onboarding/data/ and populate with sales.csv and config.yaml per Appendix 1.
- Install libraries: pip install django djangorestframework pyyaml pandas psycopg2-binary pytest pytest-django pyright.
- Use Python 3.10+ in a virtual environment: python -m venv venv, activate (Windows: venv\Scripts\activate, Unix: source venv/bin/activate).
- Configure editor for 4-space indentation per PEP 8 (VS Code: “Editor: Tab Size” = 4, “Editor: Insert Spaces” = true, “Editor: Detect Indentation” = false).
- Run python -tt script.py to detect tab/space mixing.
- Use print statements (e.g., print(response.status_code)) to debug API responses.
- Verify Docker with docker --version and Docker Desktop running.
- Ensure PostgreSQL is running (e.g., via Docker: docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres:13).
52.1 Django Setup
Django organizes code into projects (configuration) and apps (functionality). The project manages settings, while apps handle specific features (e.g., sales analytics).
52.1.1 Creating a Django Project
Set up a Django project and app, and verify type annotations with Pyright.
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # Unix/macOS
venv\Scripts\activate # Windows
# Install dependencies
pip install django djangorestframework pyyaml pandas psycopg2-binary pytest pytest-django pyright
# Create project directory
mkdir -p de-onboarding/sales_dashboard
cd de-onboarding/sales_dashboard
# Create Django project
django-admin startproject sales_dashboard .
django-admin startapp analytics
# Verify type annotations
pyright analytics/
Directory Structure:
de-onboarding/
├── data/
│ ├── sales.csv
│ ├── config.yaml
├── sales_dashboard/
│ ├── sales_dashboard/
│ │ ├── __init__.py
│ │ ├── settings.py
│ │ ├── urls.py
│ │ ├── asgi.py
│ │ ├── wsgi.py
│ ├── analytics/
│ │ ├── __init__.py
│ │ ├── admin.py
│ │ ├── apps.py
│ │ ├── migrations/
│ │ ├── models.py
│ │ ├── tests.py
│ │ ├── views.py
│ │ ├── serializers.py
│ │ ├── management/
│ │ ├── templatetags/
│ ├── templates/
│ ├── manage.py
├── venv/
Note for Windows Users:
- Install PostgreSQL 13 via Docker (as above) or download the installer from https://www.postgresql.org/download/windows/ (select version 13). Run psql -U postgres -c "CREATE DATABASE sales_db;" after installation.
- Alternatively, use SQLite (see Section 52.1.2) for simpler setup.
Pyright Troubleshooting:
- If Pyright reports errors like “missing type hint,” ensure model fields have types (e.g., product: str). Check analytics/models.py for untyped fields.
Follow-Along Instructions:
- Run the commands above.
- Configure editor for 4-space indentation per PEP 8.
- Verify Pyright output shows no type errors for analytics/.
- Common Errors:
- ModuleNotFoundError: Install dependencies with pip install -r requirements.txt.
- Pyright Errors: Run cat analytics/models.py to inspect annotations.
52.1.2 Configuring Django
Update settings.py to include the analytics app, DRF, and PostgreSQL, using environment variables.
# File: de-onboarding/sales_dashboard/sales_dashboard/settings.py
import os
from pathlib import Path
from typing import List
# Build paths
BASE_DIR = Path(__file__).resolve().parent.parent
# Security settings
SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY", "django-insecure-your-secret-key")
DEBUG = os.environ.get("DJANGO_DEBUG", "True") == "True"
ALLOWED_HOSTS: List[str] = ["localhost", "127.0.0.1"]
# Installed apps
INSTALLED_APPS = [
"django.contrib.admin",
"django.contrib.auth",
"django.contrib.contenttypes",
"django.contrib.sessions",
"django.contrib.messages",
"django.contrib.staticfiles",
"rest_framework",
"analytics",
]
# Middleware
MIDDLEWARE = [
"django.middleware.security.SecurityMiddleware",
"django.contrib.sessions.middleware.SessionMiddleware",
"django.middleware.common.CommonMiddleware",
"django.middleware.csrf.CsrfViewMiddleware",
"django.contrib.auth.middleware.AuthenticationMiddleware",
"django.contrib.messages.middleware.MessageMiddleware",
"django.middleware.clickjacking.XFrameOptionsMiddleware",
]
# URL configuration
ROOT_URLCONF = "sales_dashboard.urls"
# Templates
TEMPLATES = [
{
"BACKEND": "django.template.backends.django.DjangoTemplates",
"DIRS": [BASE_DIR / "templates"],
"APP_DIRS": True,
"OPTIONS": {
"context_processors": [
"django.template.context_processors.debug",
"django.template.context_processors.request",
"django.contrib.auth.context_processors.auth",
"django.contrib.messages.context_processors.messages",
],
},
},
]
# WSGI application
WSGI_APPLICATION = "sales_dashboard.wsgi.application"
# Database (PostgreSQL)
DATABASES = {
"default": {
"ENGINE": "django.db.backends.postgresql",
"NAME": os.environ.get("POSTGRES_DB", "sales_db"),
"USER": os.environ.get("POSTGRES_USER", "postgres"),
"PASSWORD": os.environ.get("POSTGRES_PASSWORD", "postgres"),
"HOST": os.environ.get("POSTGRES_HOST", "localhost"),
"PORT": os.environ.get("POSTGRES_PORT", "5432"),
}
}
# Note: POSTGRES_HOST defaults to 'localhost' for local development but is overridden to 'db' in Docker (see docker-compose.yml).
# SQLite fallback for development
# Comment out PostgreSQL config and uncomment below for SQLite
# DATABASES = {
# "default": {
# "ENGINE": "django.db.backends.sqlite3",
# "NAME": BASE_DIR / "db.sqlite3",
# }
# }
# Password validation
AUTH_PASSWORD_VALIDATORS = [
{
"NAME": "django.contrib.auth.password_validation.UserAttributeSimilarityValidator",
},
{
"NAME": "django.contrib.auth.password_validation.MinimumLengthValidator",
},
{
"NAME": "django.contrib.auth.password_validation.CommonPasswordValidator",
},
{
"NAME": "django.contrib.auth.password_validation.NumericPasswordValidator",
},
]
# Internationalization
LANGUAGE_CODE = "en-us"
TIME_ZONE = "UTC"
USE_I18N = True
USE_TZ = True
# Static files
STATIC_URL = "static/"
STATICFILES_DIRS = [BASE_DIR / "static"]
# Default primary key
DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"
Follow-Along Instructions:
- Generate a secure SECRET_KEY: python -c "import secrets; print(secrets.token_urlsafe(50))".
- Save to settings.py.
- For SQLite fallback, comment out PostgreSQL DATABASES and uncomment SQLite configuration.
- Configure editor for 4-space indentation per PEP 8.
- Common Errors:
- OperationalError: Verify PostgreSQL is running and credentials match. Print os.environ.get("POSTGRES_DB").
- KeyError: Ensure environment variables are set in docker-compose.yml. Print os.environ.
Key Points:
- Type Annotations: ALLOWED_HOSTS: List[str] ensures type safety, verified by Pyright.
- Environment Variables: Using os.environ aligns with production practices.
os.environaligns with production practices. - Performance: ORM queries are O(n) for n rows; indexing (Chapter 22) optimizes performance.
- Space Complexity: O(1) for settings; database size depends on data volume.
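The DEBUG line in settings.py compares the raw environment string with "True", so values such as "true" or "1" evaluate to False. A minimal sketch of a more forgiving parser follows; env_bool is a hypothetical helper introduced here for illustration, not part of Django or of this chapter's code.

```python
import os
from typing import Mapping, Optional


def env_bool(
    name: str,
    default: bool = False,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Read a boolean flag from the environment (hypothetical helper)."""
    source = os.environ if environ is None else environ
    raw = source.get(name)
    if raw is None:
        return default  # Variable not set: fall back to the default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Passing a dict instead of os.environ keeps the example self-contained
print(env_bool("DJANGO_DEBUG", environ={"DJANGO_DEBUG": "true"}))
print(env_bool("DJANGO_DEBUG", environ={"DJANGO_DEBUG": "0"}))
print(env_bool("DJANGO_DEBUG", default=True, environ={}))
```

In settings.py this could be used as DEBUG = env_bool("DJANGO_DEBUG", default=True), assuming the helper is defined above that line.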
52.2 Django Models
Django models define database schemas using Python classes, mapped to PostgreSQL tables via the ORM.
52.2.1 Creating a Sales Model
Define a type-annotated Sale model.
# File: de-onboarding/sales_dashboard/analytics/models.py
from django.db import models
from typing import Optional
class Sale(models.Model):
    product: str = models.CharField(max_length=100)
    price: float = models.FloatField()
    quantity: int = models.IntegerField()
    created_at: Optional[models.DateTimeField] = models.DateTimeField(auto_now_add=True)

    class Meta:
        db_table = "sales"

    def __str__(self) -> str:
        return f"{self.product} - ${self.price} x {self.quantity}"
Follow-Along Instructions:
- Save to analytics/models.py.
- Run migrations: python manage.py makemigrations, then python manage.py migrate.
- Configure editor for 4-space indentation per PEP 8.
- Common Errors:
- Migration Errors: Ensure PostgreSQL is running (or SQLite is configured). Check migrations/ for conflicts. Run ls analytics/migrations/.
- TypeError: Verify field types match schema. Print Sale.__dict__.
Key Points:
- Type Annotations: Fields use type hints (e.g., price: float) for Pyright verification.
- ORM: Maps classes to tables, with O(1) field access.
- Time Complexity: O(n) for querying n rows.
- Space Complexity: O(n) for n records.
52.3 Loading Data with Pandas
Load data/sales.csv into the Sale model using Pandas.
# File: de-onboarding/sales_dashboard/analytics/management/commands/load_sales.py
from django.core.management.base import BaseCommand
from analytics.models import Sale
import pandas as pd
import yaml
from typing import Any, Optional
class Command(BaseCommand):
    help = "Load sales data from CSV"

    def handle(self, *args: Any, **options: Any) -> Optional[str]:
        # Paths are relative to the directory where manage.py is run
        config_path = "data/config.yaml"
        print(f"Loading config: {config_path}")  # Debug
        with open(config_path, "r") as file:
            config = yaml.safe_load(file)
        print(f"Config loaded: {config}")  # Debug
        df = pd.read_csv("data/sales.csv")
        print("Initial DataFrame:", df.head().to_string())  # Debug
        # Coerce to numbers first: a value like "invalid" turns the price column
        # into strings, and comparing strings with min_price raises TypeError
        df["price"] = pd.to_numeric(df["price"], errors="coerce")
        df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")
        df = df.dropna(subset=["product", "price", "quantity"])
        df = df[df["product"].str.startswith(config["product_prefix"])]
        df = df[df["price"] >= config["min_price"]]
        df = df[df["quantity"] <= config["max_quantity"]]
        df = df[df["quantity"] % 1 == 0]  # Keep whole-number quantities only
        Sale.objects.all().delete()  # Clear existing data
        for _, row in df.iterrows():
            Sale.objects.create(
                product=row["product"],
                price=float(row["price"]),
                quantity=int(row["quantity"]),
            )
        self.stdout.write(self.style.SUCCESS(f"Loaded {len(df)} sales records"))
Follow-Along Instructions:
- Create directory: mkdir -p analytics/management/commands.
- Save to load_sales.py.
- Configure editor for 4-space indentation per PEP 8.
- Load data: python manage.py load_sales
- Common Errors:
- FileNotFoundError: Ensure data/sales.csv and data/config.yaml exist relative to the directory where you run manage.py. Print os.path.exists("data/sales.csv").
- TypeError: Validate data types. Print df.dtypes.
Key Points:
- Type Annotations: handle uses Any and Optional[str] for flexibility.
- Pandas Integration: Filters data before loading, reducing database writes.
- Time Complexity: O(n) for loading n rows.
- Space Complexity: O(n) for n records.
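The validation rules applied by the loader can be checked without Django or Pandas. The sketch below uses only the standard library and the sample rows from Appendix 1; is_valid and valid_rows are hypothetical helper names, and only the three config keys used by the loader are included.

```python
import csv
import io
from typing import Any, Dict, List

SAMPLE_CSV = """product,price,quantity
Halal Laptop,999.99,2
Halal Mouse,24.99,10
Halal Keyboard,49.99,5
,29.99,3
Monitor,invalid,2
Headphones,5.00,150
"""

CONFIG: Dict[str, Any] = {
    "min_price": 10.0,
    "max_quantity": 100,
    "product_prefix": "Halal",
}


def is_valid(row: Dict[str, str], config: Dict[str, Any]) -> bool:
    """Return True if a CSV row passes the config rules (illustrative helper)."""
    product = (row.get("product") or "").strip()
    if not product.startswith(config["product_prefix"]):
        return False  # Missing product or wrong prefix
    try:
        price = float(row["price"])
        quantity = int(row["quantity"])
    except (KeyError, TypeError, ValueError):
        return False  # Non-numeric price or non-integer quantity
    return price >= config["min_price"] and quantity <= config["max_quantity"]


def valid_rows(text: str, config: Dict[str, Any]) -> List[Dict[str, str]]:
    """Parse CSV text and keep only the rows that pass validation."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if is_valid(row, config)]


rows = valid_rows(SAMPLE_CSV, CONFIG)
print([row["product"] for row in rows])
# ['Halal Laptop', 'Halal Mouse', 'Halal Keyboard']
```

Keeping the rules in a plain function like this makes them easy to unit test separately from the management command.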
52.4 Django Views and Templates
Django’s Model-View-Template (MVT) architecture separates data (Model), presentation (Template), and logic (View), differing from MVC’s controller-based approach (Chapter 4). MVT’s template-driven design simplifies rendering complex data for dashboards.
Create views and templates for a sales dashboard.
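As a rough illustration of the MVT split, the sketch below mimics the three roles with the standard library only. It is not Django code: SaleRecord, dashboard_context, and PAGE are hypothetical stand-ins for a model, a view, and a template.

```python
from dataclasses import dataclass
from string import Template
from typing import Any, Dict, List


@dataclass
class SaleRecord:
    """Stand-in for the Model: holds the data."""

    product: str
    price: float
    quantity: int


def dashboard_context(sales: List[SaleRecord]) -> Dict[str, Any]:
    """Stand-in for the View: computes what the page needs."""
    total = round(sum(s.price * s.quantity for s in sales), 2)
    return {
        "total_sales": total,
        "unique_products": len({s.product for s in sales}),
    }


# Stand-in for the Template: presentation only ("$$" renders a literal "$")
PAGE = Template("Total Sales: $$${total_sales} | Unique Products: ${unique_products}")

sales = [SaleRecord("Halal Laptop", 999.99, 2), SaleRecord("Halal Mouse", 24.99, 10)]
print(PAGE.substitute(dashboard_context(sales)))
# Total Sales: $2249.88 | Unique Products: 2
```

In Django the same separation appears as the Sale model, the dashboard view function, and dashboard.html, with urls.py routing the request to the view.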
52.4.1 Dashboard View
# File: de-onboarding/sales_dashboard/analytics/views.py
from django.http import HttpRequest, HttpResponse
from django.shortcuts import render
from analytics.models import Sale
from typing import Dict, Any


def dashboard(request: HttpRequest) -> HttpResponse:
    sales = Sale.objects.all()
    # Total revenue across all sales (price x quantity)
    total_sales = sum(sale.price * sale.quantity for sale in sales)
    unique_products = Sale.objects.values("product").distinct().count()
    context: Dict[str, Any] = {
        "total_sales": round(total_sales, 2),
        "unique_products": unique_products,
        "sales": sales,
    }
    print(f"Dashboard context: {context}")  # Debug
    return render(request, "dashboard.html", context)
52.4.2 Dashboard Template
Create a custom multiply filter to compute amounts in the template.
# File: de-onboarding/sales_dashboard/analytics/templatetags/sales_tags.py
from django import template
from typing import Union, Any
register = template.Library()
@register.filter
def multiply(value: Union[int, float], arg: Union[int, float]) -> float:
    return float(value) * float(arg)
<!-- File: de-onboarding/sales_dashboard/templates/dashboard.html -->
<!DOCTYPE html>
<html>
<head>
<title>Sales Dashboard</title>
<style>
body {
font-family: Arial, sans-serif;
margin: 20px;
}
table {
border-collapse: collapse;
width: 100%;
}
th,
td {
border: 1px solid #ddd;
padding: 8px;
text-align: left;
}
th {
background-color: #f2f2f2;
}
</style>
</head>
<body>
<h1>Sales Dashboard</h1>
<p>Total Sales: ${{ total_sales }}</p>
<p>Unique Products: {{ unique_products }}</p>
<table>
<tr>
<th>Product</th>
<th>Price</th>
<th>Quantity</th>
<th>Amount</th>
</tr>
{% load sales_tags %} {% for sale in sales %}
<tr>
<td>{{ sale.product }}</td>
<td>{{ sale.price }}</td>
<td>{{ sale.quantity }}</td>
<td>
{{ sale.price|multiply:sale.quantity|floatformat:2 }}
</td>
</tr>
{% empty %}
<tr>
<td colspan="4">No sales data</td>
</tr>
{% endfor %}
</table>
</body>
</html>
Follow-Along Instructions:
- Create directories: mkdir -p templates analytics/templatetags.
- Add an empty __init__.py to analytics/templatetags/ so Django treats it as a Python package.
- Save dashboard.html, views.py, and sales_tags.py.
- Configure editor for 4-space indentation per PEP 8.
- Common Errors:
- TemplateDoesNotExist: Ensure TEMPLATES[0]["DIRS"] includes BASE_DIR / "templates". Print BASE_DIR.
- TemplateSyntaxError (Invalid filter): Verify {% load sales_tags %} in dashboard.html. Print template contents.
- TypeError: Check context types. Print context.
Key Points:
- Custom Filter: The multiply filter enables amount calculations in the template, avoiding complex logic.
- Type Annotations: context: Dict[str, Any] and the multiply filter use type hints.
- Time Complexity: O(n) for rendering n sales.
- Space Complexity: O(n) for template context.
52.5 Django REST Framework APIs
Create a DRF API for sales data.
52.5.1 Serializer
# File: de-onboarding/sales_dashboard/analytics/serializers.py
from rest_framework import serializers
from analytics.models import Sale
from typing import Dict, Any
class SaleSerializer(serializers.ModelSerializer):
    amount: float = serializers.SerializerMethodField()

    class Meta:
        model = Sale
        fields = ["id", "product", "price", "quantity", "amount"]

    def get_amount(self, obj: Sale) -> float:
        # Round to 2 decimals so float artifacts (e.g., 24.99 * 10) do not leak into the API
        return round(obj.price * obj.quantity, 2)
52.5.2 API View
# File: de-onboarding/sales_dashboard/analytics/views.py (append)
from rest_framework.decorators import api_view
from rest_framework.response import Response
from analytics.serializers import SaleSerializer
import pandas as pd
@api_view(["GET"])
def sales_api(request) -> Response:
    sales = Sale.objects.all()
    serializer = SaleSerializer(sales, many=True)
    print(f"API response: {serializer.data}")  # Debug
    return Response(serializer.data)


@api_view(["GET"])
def top_products_api(request) -> Response:
    sales = Sale.objects.all()
    df = pd.DataFrame(list(sales.values("product", "price", "quantity")))
    if df.empty:  # No sales: the DataFrame has no columns to aggregate
        return Response({"top_products": []})
    df["amount"] = df["price"] * df["quantity"]
    top = df.groupby("product")["amount"].sum().sort_values(ascending=False).head(3)
    # Round to 2 decimals to hide float artifacts in the JSON output
    data = [{"product": k, "amount": round(float(v), 2)} for k, v in top.items()]
    print(f"Top products API response: {data}")  # Debug
    return Response({"top_products": data})
52.5.3 URL Configuration
# File: de-onboarding/sales_dashboard/sales_dashboard/urls.py
from django.urls import path
from analytics import views
from typing import List, Any
urlpatterns: List[Any] = [
    path("", views.dashboard, name="dashboard"),
    # views.top_products is written in Exercise 3; omit this route until then
    path("top-products/", views.top_products, name="top_products"),
    path("api/sales/", views.sales_api, name="sales_api"),
    path("api/top-products/", views.top_products_api, name="top_products_api"),
]
Follow-Along Instructions:
- Save serializers.py, update views.py, urls.py.
- Configure editor for 4-space indentation per PEP 8.
- Start server: python manage.py runserver
- Access:
- Dashboard: http://localhost:8000/
- API: http://localhost:8000/api/sales/
- Common Errors:
- ImproperlyConfigured: Ensure rest_framework in INSTALLED_APPS. Print INSTALLED_APPS.
- 404 Error: Verify URL patterns. Print urlpatterns.
Key Points:
- Type Annotations: get_amount returns float for type safety.
- Time Complexity: O(n) for serializing n sales.
- Space Complexity: O(n) for serialized data.
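The group, sum, sort, and take-top-3 steps that top_products_api performs with Pandas can be sketched in plain Python, which may help when reasoning about the expected response. The function name top_products_by_amount and the tuple-based input are assumptions made for this example only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union


def top_products_by_amount(
    rows: List[Tuple[str, float, int]], n: int = 3
) -> List[Dict[str, Union[str, float]]]:
    """Sum price * quantity per product and return the n largest totals."""
    totals: Dict[str, float] = defaultdict(float)
    for product, price, quantity in rows:
        totals[product] += price * quantity
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    # Round at the end so float artifacts do not appear in the output
    return [{"product": p, "amount": round(a, 2)} for p, a in ranked[:n]]


rows = [
    ("Halal Laptop", 999.99, 2),
    ("Halal Mouse", 24.99, 10),
    ("Halal Keyboard", 49.99, 5),
]
print(top_products_by_amount(rows))
```

With the three valid sample rows, the keyboard (249.95) ranks above the mouse (249.9), and an empty input yields an empty list.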
52.6 Testing with pytest
Test the Django app with pytest, critical for ensuring financial data integrity in Hijra Group’s production pipelines by reducing reporting errors through data consistency. For example, testing prevented a $10,000 reporting error in a similar Hijra Group pipeline.
# File: de-onboarding/sales_dashboard/pytest.ini
[pytest]
DJANGO_SETTINGS_MODULE = sales_dashboard.settings
python_files = tests.py test_*.py
# File: de-onboarding/sales_dashboard/analytics/tests.py
import pytest
from django.urls import reverse
from analytics.models import Sale
from typing import Any
@pytest.mark.django_db
def test_dashboard_view(client: Any) -> None:
    Sale.objects.create(product="Halal Laptop", price=999.99, quantity=2)
    response = client.get(reverse("dashboard"))
    assert response.status_code == 200
    assert "Total Sales: $1999.98" in response.content.decode()


@pytest.mark.django_db
def test_sales_api(client: Any) -> None:
    Sale.objects.create(product="Halal Mouse", price=24.99, quantity=10)
    response = client.get(reverse("sales_api"))
    assert response.status_code == 200
    data = response.json()
    assert len(data) == 1
    assert data[0]["product"] == "Halal Mouse"
    # Compare floats with a tolerance: 24.99 * 10 is not exactly 249.9 in binary
    assert data[0]["amount"] == pytest.approx(249.9)


@pytest.mark.django_db
def test_top_products_api(client: Any) -> None:
    Sale.objects.create(product="Halal Laptop", price=999.99, quantity=2)
    response = client.get(reverse("top_products_api"))
    assert response.status_code == 200
    data = response.json()
    assert len(data["top_products"]) == 1
    assert data["top_products"][0]["product"] == "Halal Laptop"
    assert data["top_products"][0]["amount"] == pytest.approx(1999.98)
Follow-Along Instructions:
- Save pytest.ini and tests.py.
- Configure editor for 4-space indentation per PEP 8.
- Run tests: pytest
- Common Errors:
- DatabaseError: Ensure PostgreSQL or SQLite is configured. Print DATABASES.
- AssertionError: Verify test data. Print response.content.
Key Points:
- Type Annotations: client: Any for flexibility.
- Time Complexity: O(1) for single-record tests.
- Space Complexity: O(1) for test data.
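Because price is stored as a float, assertions on computed amounts are safer with a tolerance than with exact equality. The sketch below shows three standard-library options; amounts_match is a hypothetical helper, and the half-cent tolerance is an assumption chosen for two-decimal currency values.

```python
import math
from decimal import Decimal


def amounts_match(actual: float, expected: float, tol: float = 0.005) -> bool:
    """Compare two currency amounts within half a cent (illustrative helper)."""
    return math.isclose(actual, expected, abs_tol=tol)


amount = 24.99 * 10
print(amount)  # May differ from 249.9 in the last digits due to binary floats

# Option 1: tolerance-based comparison
print(amounts_match(amount, 249.9))

# Option 2: round before comparing
print(round(amount, 2) == 249.9)

# Option 3: exact decimal arithmetic
print(Decimal("24.99") * 10 == Decimal("249.90"))
```

In the pytest suite, pytest.approx plays the same role as the tolerance-based option.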
52.7 Docker Deployment
Deploy Django with Uvicorn in Docker, enabling horizontal scaling to handle millions of daily transactions for Hijra Group’s analytics platform. For development, DJANGO_SECRET_KEY is hardcoded in docker-compose.yml, but in production, use secret management tools (e.g., Docker secrets).
# File: de-onboarding/sales_dashboard/Dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8000
CMD ["uvicorn", "sales_dashboard.asgi:application", "--host", "0.0.0.0", "--port", "8000"]
# File: de-onboarding/sales_dashboard/requirements.txt
django==4.2.7
djangorestframework==3.14.0
pyyaml==6.0
pandas==2.0.3
psycopg2-binary==2.9.6
pytest==7.4.0
pytest-django==4.5.2
uvicorn==0.23.2
pyright==1.1.338
# File: de-onboarding/sales_dashboard/docker-compose.yml
version: '3.8'
services:
  db:
    image: postgres:13
    environment:
      POSTGRES_DB: sales_db
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - '5432:5432'
    volumes:
      - postgres_data:/var/lib/postgresql/data
  web:
    build: .
    ports:
      - '8000:8000'
    depends_on:
      - db
    environment:
      - DJANGO_SECRET_KEY=django-insecure-your-secret-key
      - DJANGO_DEBUG=True
      - POSTGRES_DB=sales_db
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_HOST=db
      - POSTGRES_PORT=5432
    volumes:
      - .:/app
volumes:
  postgres_data:
Follow-Along Instructions:
- Save Dockerfile, requirements.txt, docker-compose.yml.
- Build and run: docker-compose up --build
- Run migrations: docker-compose exec web python manage.py migrate
- Load data: docker-compose exec web python manage.py load_sales
- Access: http://localhost:8000/.
- Common Errors:
- ConnectionRefused: Ensure PostgreSQL container is running. Check docker ps.
- Build Errors: Verify requirements.txt. Print pip list.
Key Points:
- Environment Variables: settings.py uses variables from docker-compose.yml, ensuring flexibility.
- Time Complexity: O(1) for container startup.
- Space Complexity: O(1) for container configuration.
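To see how the same settings resolve locally and inside the container, the sketch below builds the DATABASES dictionary from an explicit mapping rather than os.environ. The function database_settings is a hypothetical helper for illustration; the variable names and defaults mirror settings.py and docker-compose.yml.

```python
from typing import Dict, Mapping


def database_settings(environ: Mapping[str, str]) -> Dict[str, Dict[str, str]]:
    """Build a Django-style DATABASES dict from environment values (sketch)."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": environ.get("POSTGRES_DB", "sales_db"),
            "USER": environ.get("POSTGRES_USER", "postgres"),
            "PASSWORD": environ.get("POSTGRES_PASSWORD", "postgres"),
            "HOST": environ.get("POSTGRES_HOST", "localhost"),
            "PORT": environ.get("POSTGRES_PORT", "5432"),
        }
    }


# Local development: no variables set, so the defaults apply
print(database_settings({})["default"]["HOST"])  # localhost

# Inside Docker Compose: the web service sets POSTGRES_HOST=db
print(database_settings({"POSTGRES_HOST": "db"})["default"]["HOST"])  # db
```

The host value db works inside Compose because services on the same Compose network can reach each other by service name.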
52.8 Micro-Project: Sales Dashboard and API
Project Requirements
Build a Django app with a UI dashboard and DRF API for sales data, supporting Hijra Group’s monthly sales reporting for Sharia-compliant products, ensuring transparency per IFSB-10 governance standards. Process data/sales.csv, deploy in Docker with Uvicorn, use type-annotated code verified by Pyright, and test with pytest.
- Load Data: Use Pandas to load data/sales.csv into PostgreSQL or SQLite.
- Dashboard: Display total sales, unique products, and a sales table.
- API: Provide a /api/sales/ endpoint with product, price, quantity, and amount.
- Testing: Write pytest tests for views and API.
- Deployment: Dockerize with Uvicorn and PostgreSQL.
- Validation: Use config.yaml for data validation.
- Logging: Print statements for debugging.
Sample Input Files
data/sales.csv (Appendix 1):
product,price,quantity
Halal Laptop,999.99,2
Halal Mouse,24.99,10
Halal Keyboard,49.99,5
,29.99,3
Monitor,invalid,2
Headphones,5.00,150
data/config.yaml (Appendix 1):
min_price: 10.0
max_quantity: 100
required_fields:
  - product
  - price
  - quantity
product_prefix: 'Halal'
max_decimals: 2
Data Processing Flow
flowchart TD
A["CSV<br>sales.csv"] --> B["Load CSV<br>Pandas"]
B --> C["Validate<br>config.yaml"]
C --> D["PostgreSQL/SQLite<br>via ORM"]
D --> E["Django Views"]
E -->|Templates| F["UI Dashboard"]
E -->|DRF| G["API Endpoint"]
D -->|Docker| H["Deployment"]
classDef data fill:#f9f9f9,stroke:#333,stroke-width:2px
classDef process fill:#d0e0ff,stroke:#336,stroke-width:1px
classDef output fill:#ddffdd,stroke:#363,stroke-width:1px
class A,D data
class B,C,E process
class F,G,H output
Acceptance Criteria
- Go Criteria:
- Loads and validates sales.csv with config.yaml.
- Displays dashboard with total sales, unique products, and table.
- Provides /api/sales/ endpoint with correct data.
- Passes pytest tests for views and API.
- Deploys in Docker with Uvicorn and PostgreSQL/SQLite.
- Uses type annotations verified by Pyright.
- Logs steps with print statements.
- Uses 4-space indentation per PEP 8.
- No-Go Criteria:
- Fails to load data or connect to database.
- Incorrect dashboard or API output.
- Failing tests or unverified type annotations.
- Non-functional Docker deployment.
- Inconsistent indentation.
Common Pitfalls to Avoid
- Database Connection Errors:
- Problem: PostgreSQL not running.
- Solution: Start PostgreSQL container. Print os.environ.get("POSTGRES_HOST").
- Migration Failures:
- Problem: Schema conflicts.
- Solution: In development, delete the generated files in analytics/migrations/ (keep __init__.py) and re-run makemigrations. Run ls analytics/migrations/.
- Template Errors:
- Problem: TemplateDoesNotExist.
- Solution: Verify TEMPLATES[0]["DIRS"]. Print BASE_DIR.
- API Serialization Errors:
- Problem: Incorrect fields.
- Solution: Check SaleSerializer. Print serializer.data.
- Docker Build Failures:
- Problem: Missing dependencies.
- Solution: Verify requirements.txt. Print pip list.
- IndentationError:
- Problem: Mixed spaces/tabs.
- Solution: Use 4 spaces per PEP 8. Run python -tt manage.py.
Debugging Tips
See the debugging workflow diagram below for a visual guide to these steps.
flowchart TD
A["Error Occurs"] --> B["Check Logs"]
B --> C["Inspect Models"]
B --> D["Verify URLs"]
C --> E["Fix Data Issue"]
D --> F["Fix Routing Issue"]
E --> G["Test Again"]
F --> G
classDef process fill:#d0e0ff,stroke:#336,stroke-width:1px
classDef action fill:#ddffdd,stroke:#363,stroke-width:1px
class A,B,C,D process
class E,F,G action
- Inspect Models: Use python manage.py shell to query data:
from analytics.models import Sale
print(Sale.objects.all())
- Check Logs: Monitor Django logs:
tail -f django.log
Or use docker-compose logs web for containerized logs.
- Verify URLs: Print urlpatterns to debug 404 errors:
# In urls.py
print(urlpatterns)
Example: To fix a 404 for /api/sales/, check for typos in path("api/sales/", views.sales_api).
- Database Issues: Use psql -h localhost -U postgres -d sales_db or sqlite3 db.sqlite3 to inspect tables.
How This Differs from Production
In production, this solution would include:
- Security: OAuth2, PII masking (Chapter 65).
- Scalability: Kubernetes deployment (Chapter 61).
- Logging: File-based logging with the logging module (Chapter 56).
- Monitoring: Observability with Prometheus (Chapter 66).
- CI/CD: Automated testing and deployment (Chapter 66).
Implementation
The implementation is provided in the code snippets above (settings.py, models.py, load_sales.py, views.py, serializers.py, urls.py, dashboard.html, sales_tags.py, tests.py, Dockerfile, requirements.txt, docker-compose.yml).
Expected Outputs
Dashboard (http://localhost:8000/):
- Displays:
- Total Sales: $2499.83
- Unique Products: 3
- Table with Halal Laptop, Mouse, Keyboard sales.
API (http://localhost:8000/api/sales/):
[
{
"id": 1,
"product": "Halal Laptop",
"price": 999.99,
"quantity": 2,
"amount": 1999.98
},
{
"id": 2,
"product": "Halal Mouse",
"price": 24.99,
"quantity": 10,
"amount": 249.9
},
{
"id": 3,
"product": "Halal Keyboard",
"price": 49.99,
"quantity": 5,
"amount": 249.95
}
]
Console Output (abridged):
Loading config: data/config.yaml
Config loaded: {'min_price': 10.0, 'max_quantity': 100, ...}
Initial DataFrame: ...
Loaded 3 sales records
Starting development server at http://127.0.0.1:8000/
How to Run and Test
Setup:
- Create de-onboarding/data/ with sales.csv, config.yaml per Appendix 1.
- Install Docker and Docker Compose.
- Set up virtual environment and install dependencies.
- Configure PostgreSQL with sales_db or use SQLite fallback.
- Save all files with 4-space indentation per PEP 8.
- Run pyright analytics/ to verify type annotations.
Run with PostgreSQL:
docker-compose up --build
docker-compose exec web python manage.py migrate
docker-compose exec web python manage.py load_sales
Run with SQLite:
- Uncomment SQLite DATABASES in settings.py, comment out PostgreSQL.
- Run:
python manage.py migrate
python manage.py load_sales
python manage.py runserver
- Verify database file: ls db.sqlite3 (Unix/macOS) or dir db.sqlite3 (Windows).
Test:
docker-compose exec web pytest
- Test Scenarios:
- Dashboard renders with 3 sales and correct totals.
- API returns 3 records with correct amounts.
- Empty database returns empty table and API list.
52.9 Practice Exercises
Exercise 1: Create a New Model
Add a Transaction model for data/transactions.csv.
Expected Output:
- New table transactions in PostgreSQL or SQLite.
Exercise 2: Custom Management Command
Write a command to load data/transactions.csv.
Expected Output:
Loaded 4 transaction records
Exercise 3: Top Products View and API
Create a view at /top-products/ listing all products with their sales amounts, and an API endpoint at /api/top-products/ returning the top 3 products by sales amount.
Sample Input (sales.csv):
product,price,quantity
Halal Laptop,999.99,2
Halal Mouse,24.99,10
Halal Keyboard,49.99,5
Instructions:
- Step 1: Create View:
- Add a view top_products rendering top_products.html.
- List all products with their sales amounts.
- Expected Output (View):
- Page listing:
- Halal Laptop: $1999.98
- Halal Mouse: $249.90
- Halal Keyboard: $249.95
- Step 2: Add API:
- Add an API endpoint top_products_api returning top 3 products, sorted by amount in descending order.
- Expected Output (API):
{
  "top_products": [
    { "product": "Halal Laptop", "amount": 1999.98 },
    { "product": "Halal Keyboard", "amount": 249.95 },
    { "product": "Halal Mouse", "amount": 249.9 }
  ]
}
Exercise 4: Debug a View
Fix a buggy view that omits the multiply filter, causing incorrect amount displays.
Buggy Code:
def buggy_dashboard(request):
    sales = Sale.objects.all()
    total_sales = sum(sale.price * sale.quantity for sale in sales)
    context = {
        "total_sales": round(total_sales, 2),
        "sales": sales,
    }
    return render(request, "buggy_dashboard.html", context)
Buggy Template (buggy_dashboard.html):
<table>
<tr>
<th>Product</th>
<th>Price</th>
<th>Quantity</th>
<th>Amount</th>
</tr>
{% for sale in sales %}
<tr>
<td>{{ sale.product }}</td>
<td>{{ sale.price }}</td>
<td>{{ sale.quantity }}</td>
<td>{{ sale.price|floatformat:2 }} x {{ sale.quantity }}</td>
</tr>
{% endfor %}
</table>
Expected Output:
- Fixed template displaying amounts (e.g., $1999.98 for Halal Laptop).
Exercise 5: Compare MVT vs. MVC
Write an explanation in mvt_vs_mvc.txt comparing Django’s MVT (Model-View-Template) architecture to MVC (Model-View-Controller) from Chapter 4’s web fundamentals. Answer: Why does MVT’s template-driven approach suit data engineering dashboards?
Expected Output:
mvt_vs_mvc.txt:
Django’s MVT separates data (Model), presentation (Template), and logic (View), with URLs routing requests to views. MVC separates data (Model), presentation (View), and logic (Controller), with controllers handling requests. MVT’s views combine controller and view logic, while templates focus on rendering, making MVT more template-driven than MVC. MVT suits data engineering dashboards because templates simplify rendering complex data (e.g., sales tables) for stakeholders, reducing view logic complexity.
52.10 Exercise Solutions
Solution to Exercise 1
# File: de-onboarding/sales_dashboard/analytics/models.py (append)
import datetime  # Needed for the datetime.date annotation below

class Transaction(models.Model):
    transaction_id: str = models.CharField(max_length=10)
    product: str = models.CharField(max_length=100)
    price: float = models.FloatField()
    quantity: int = models.IntegerField()
    date: datetime.date = models.DateField()

    class Meta:
        db_table = "transactions"

    def __str__(self) -> str:
        return f"{self.transaction_id} - {self.product}"
Solution to Exercise 2
# File: de-onboarding/sales_dashboard/analytics/management/commands/load_transactions.py
from django.core.management.base import BaseCommand
from analytics.models import Transaction
import pandas as pd
from datetime import datetime
from typing import Any, Optional

class Command(BaseCommand):
    help = "Load transactions from CSV"

    def handle(self, *args: Any, **options: Any) -> Optional[str]:
        df = pd.read_csv("data/transactions.csv")
        df = df.dropna(subset=["product", "price", "date"])
        Transaction.objects.all().delete()
        for _, row in df.iterrows():
            Transaction.objects.create(
                transaction_id=row["transaction_id"],
                product=row["product"],
                price=float(row["price"]),
                quantity=int(row["quantity"]),
                date=datetime.strptime(row["date"], "%Y-%m-%d").date()
            )
        self.stdout.write(self.style.SUCCESS(f"Loaded {len(df)} transaction records"))
Solution to Exercise 3
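Step 2 of the exercise (the top_products_api endpoint) comes down to ranking rows by amount and keeping the first three. The following is a minimal, framework-free sketch of that ranking logic. The function name top_n_products and the sample rows are hypothetical illustrations, not the chapter's code. In a DRF view, the resulting dictionary would presumably be returned through rest_framework's Response.

```python
from typing import Any, Dict, List


def top_n_products(rows: List[Dict[str, Any]], n: int = 3) -> Dict[str, Any]:
    """Rank rows by price * quantity and return the n largest as a payload."""
    ranked = sorted(
        (
            {
                "product": row["product"],
                "amount": round(row["price"] * row["quantity"], 2),
            }
            for row in rows
        ),
        key=lambda item: item["amount"],
        reverse=True,  # Largest amount first
    )
    return {"top_products": ranked[:n]}


# Illustrative rows (assumed values, not read from data/sales.csv)
sample_rows = [
    {"product": "Halal Mouse", "price": 24.99, "quantity": 10},
    {"product": "Halal Laptop", "price": 999.99, "quantity": 2},
    {"product": "Halal Keyboard", "price": 49.99, "quantity": 5},
]
payload = top_n_products(sample_rows)
print(payload)
```

Rounding each amount before sorting keeps the payload values stable at two decimal places, which is the precision shown in the expected API output.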
# File: de-onboarding/sales_dashboard/analytics/views.py (append)
def top_products(request) -> Any:
    sales = Sale.objects.all()
    # Name the columns explicitly so an empty queryset still yields the expected columns
    columns = ["product", "price", "quantity"]
    df = pd.DataFrame(list(sales.values(*columns)), columns=columns)
    df["amount"] = df["price"] * df["quantity"]
    products = df[["product", "amount"]].to_dict("records")
    context: Dict[str, Any] = {"products": products}
    print(f"Products context: {context}")  # Debug
    return render(request, "top_products.html", context)
<!-- File: de-onboarding/sales_dashboard/templates/top_products.html -->
<!DOCTYPE html>
<html>
<head>
<title>Top Products</title>
<style>
body {
font-family: Arial, sans-serif;
margin: 20px;
}
</style>
</head>
<body>
<h1>All Products</h1>
<ul>
{% load sales_tags %} {% for product in products %}
<li>{{ product.product }}: ${{ product.amount|floatformat:2 }}</li>
{% empty %}
<li>No products</li>
{% endfor %}
</ul>
</body>
</html>
Solution to Exercise 4
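Filter order matters when computing the Amount cell. Django's built-in floatformat returns formatted text, so the multiplication should happen before any formatting. The sketch below mimics this in plain Python. Both functions are hypothetical stand-ins: the chapter's actual multiply filter is defined earlier and may coerce its input, and format_2dp only approximates floatformat:2.

```python
def multiply(value, arg):
    """Stand-in for a custom filter implemented as a plain product (an assumption)."""
    return value * arg


def format_2dp(value) -> str:
    """Stand-in for floatformat:2: returns text, not a number."""
    return f"{float(value):.2f}"


price, quantity = 999.99, 2  # Illustrative values

# Multiply first, then format: the numeric amount is formatted once.
good = format_2dp(multiply(price, quantity))

# Format first, then multiply: the string is repeated, not scaled.
bad = multiply(format_2dp(price), quantity)

print(good)  # 1999.98
print(bad)   # 999.99999.99
```

Applying the formatting filter last keeps the template robust regardless of how the multiply filter treats string input.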
# File: de-onboarding/sales_dashboard/analytics/views.py (append)
def buggy_dashboard(request) -> Any:
    sales = Sale.objects.all()
    total_sales = sum(sale.price * sale.quantity for sale in sales)
    context: Dict[str, Any] = {
        "total_sales": round(total_sales, 2),
        "sales": sales,
    }
    print(f"Buggy dashboard context: {context}")  # Debug
    return render(request, "buggy_dashboard.html", context)
<!-- File: de-onboarding/sales_dashboard/templates/buggy_dashboard.html -->
<table>
<tr>
<th>Product</th>
<th>Price</th>
<th>Quantity</th>
<th>Amount</th>
</tr>
{% load sales_tags %} {% for sale in sales %}
<tr>
<td>{{ sale.product }}</td>
<td>{{ sale.price }}</td>
<td>{{ sale.quantity }}</td>
<td>{{ sale.price|multiply:sale.quantity|floatformat:2 }}</td>
</tr>
{% empty %}
<tr>
<td colspan="4">No sales data</td>
</tr>
{% endfor %}
</table>
Solution to Exercise 5
# Save to de-onboarding/mvt_vs_mvc.txt
echo "Django’s MVT separates data (Model), presentation (Template), and logic (View), with URLs routing requests to views. MVC separates data (Model), presentation (View), and logic (Controller), with controllers handling requests. MVT’s views combine controller and view logic, while templates focus on rendering, making MVT more template-driven than MVC. MVT suits data engineering dashboards because templates simplify rendering complex data (e.g., sales tables) for stakeholders, reducing view logic complexity." > mvt_vs_mvc.txt
Verification:
- Run: cat mvt_vs_mvc.txt (Unix/macOS) or type mvt_vs_mvc.txt (Windows).
- Ensure text matches expected output.
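To make the MVT roles concrete, here is a small framework-free sketch of the request flow: a URL map routes a path to a view, the view pulls data from the model layer, and a template renders it. Every name here (SALES, ROW_TEMPLATE, sales_view, URLS, dispatch) is hypothetical, and string.Template merely stands in for Django's template engine.

```python
from string import Template
from typing import Any, Callable, Dict, List

# Model: the data layer (a list standing in for ORM rows; values are illustrative).
SALES: List[Dict[str, Any]] = [
    {"product": "Halal Laptop", "price": 999.99, "quantity": 2},
]

# Template: presentation only. "$$" renders a literal dollar sign.
ROW_TEMPLATE = Template("$product: $$$amount")


def sales_view(path: str) -> str:
    """View: fetch data from the model and hand it to the template."""
    lines = [
        ROW_TEMPLATE.substitute(
            product=row["product"],
            amount=f"{row['price'] * row['quantity']:.2f}",
        )
        for row in SALES
    ]
    return "\n".join(lines)


# URL configuration: routing takes over the dispatching part of a controller.
URLS: Dict[str, Callable[[str], str]] = {"/sales/": sales_view}


def dispatch(path: str) -> str:
    """Route a request path to its view, as a URL resolver would."""
    return URLS[path](path)


print(dispatch("/sales/"))  # Halal Laptop: $1999.98
```

The view contains the request-handling logic while the template only formats values, which is the division of labor the explanation above describes.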
52.11 Chapter Summary and Connection to Chapter 53
In this chapter, you’ve mastered:
- Django Setup: Configured projects and apps with PostgreSQL or SQLite.
- Models: Created type-annotated ORM models.
- Views and Templates: Built dashboards with custom filters.
- DRF APIs: Developed RESTful endpoints.
- Testing: Wrote pytest tests for reliability.
- Docker: Deployed with Uvicorn and PostgreSQL/SQLite.
The micro-project delivered a robust Django app for Hijra Group’s Sharia-compliant sales reporting, processing data/sales.csv, with a dashboard, API, and Docker deployment, all type-annotated and tested. This prepares for capstone projects (Chapters 67–70), where the dashboard and API will integrate with Airflow for real-time analytics and dbt for transformations, such as serving Chapter 69’s transaction data mart transformation, forming end-to-end pipelines.
Connection to Chapter 53:
Chapter 53 introduces FastAPI for building lightweight APIs with Pydantic, in contrast to Django’s ORM-based approach. It extends this chapter’s web framework skills, using data/transactions.csv for transaction analytics. For example, the API could serve a dbt-generated sales summary table derived from sales.csv and orchestrated by an Airflow DAG, which prepares for pipeline orchestration (Chapter 59). All code continues to follow PEP 8’s 4-space indentation.