Student reference: myths, rules vs ML, Python simulations, full ML story (splits, loss, metrics, bias), expandable examples, self-check activities, and quiz — no separate teacher script on the page.
A core question for understanding how real AI systems are built — and what role people still play.
“Can AI build itself?” Short answer for beginners: No — people still write the instructions, choose the data, and design the system. AI can help write code, but humans set goals and responsibility.
Programming = communication with machines. We use languages like Python so the computer can follow steps reliably, millions of times faster than we can.
Recipe = code · Chef = computer · Dish = output. A wrong or vague recipe → bad dish. Same for bugs or missing steps in programs.
Animated flow: follow left to right as an idea becomes code, the machine runs it, and you get output.
Bridge from Module 1
Quick verbal recap: AI is the big goal (smart behaviour). Machine learning is one way to achieve it using data. Programming is how we tell the machine what to do with that data — load it, train, show results. Today we slow down on ML + light Python, not on coding drills.
Try with a friend or write in your notes: Name one app you used recently that “feels smart.” What might it have learned from — your taps, location, voice, or something else?
Myths vs facts
| Common myth | Closer to the truth |
|---|---|
| “AI can think like a human brain.” | Today’s systems match patterns in data; they don’t “understand” like people. |
| “AI will replace all programmers.” | Tools change; humans still define problems, data, ethics, and checks. |
| “More data always fixes everything.” | Bad, biased, or wrong data makes worse outcomes — quality matters. |
| “If the code runs, the AI is correct.” | Software can run perfectly and still give unfair or silly predictions. |
| Topic | Illustrative fact (industry surveys, rounded) |
|---|---|
| Developers using / learning AI-assisted coding tools | Stack Overflow Developer Survey (2025): a large majority of professional developers reported using AI tools in their workflow. |
| Python in data / ML teaching | Consistently among the most-taught languages in university and bootcamp data-science curricula worldwide. |
| ML project time (rule of thumb) | Practitioners often report that data work (collection, cleaning, labels) takes more calendar time than picking an algorithm. |
Exact percentages change every year; the point is scale and workflow, not memorising numbers.
Rules vs learning
A human writes every rule: “If email contains ‘win lottery,’ mark spam.” Works until spammers change words. Does not learn from new examples by itself.
The system adjusts from many examples. It finds patterns humans did not hand-write. Still needs good data, programming to train it, and human oversight.
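The contrast between a hand-written rule and a system that adjusts from examples can be sketched in a few lines. This is only an illustrative toy: the example messages, the word-counting approach, and the function names are assumptions made for this sketch, not how a real spam filter works.

```python
# Hand-written rule: a person chose the phrase, and nothing ever updates it.
def rule_based_is_spam(text):
    return "win lottery" in text.lower()

# Invented labelled examples (text, is_spam) for illustration only.
examples = [
    ("win lottery now", True),
    ("claim your free prize", True),
    ("free prize waiting for you", True),
    ("meeting at noon", False),
    ("lunch at noon tomorrow", False),
]

def train_word_counts(labelled_examples):
    """Count, for every word, how many spam and non-spam examples contain it."""
    counts = {}  # word -> [seen in spam, seen in not spam]
    for text, is_spam in labelled_examples:
        for word in set(text.lower().split()):
            pair = counts.setdefault(word, [0, 0])
            pair[0 if is_spam else 1] += 1
    return counts

def learned_is_spam(text, counts):
    """Say 'spam' when the words were seen more often in spam examples."""
    spam_votes = 0
    other_votes = 0
    for word in set(text.lower().split()):
        in_spam, in_other = counts.get(word, (0, 0))
        spam_votes += in_spam
        other_votes += in_other
    return spam_votes > other_votes

counts = train_word_counts(examples)
new_message = "free prize inside"  # spammer changed the wording
print(rule_based_is_spam(new_message))       # False: the fixed rule misses it
print(learned_is_spam(new_message, counts))  # True: "free" and "prize" were seen in spam
```

Adding new labelled examples changes the counts without anyone editing the rule, which is the core difference. The learned version is still only as good as its examples.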
What is an “algorithm”? (plain English)
An algorithm is a clear sequence of steps to solve a task — like a precise recipe. A program implements algorithms in a language the computer runs. Machine learning uses algorithms that update internal numbers (weights) when they see data; classical programs use fixed logic unless someone edits the code.
This module builds mental models; deeper math comes in later modules.
You do not need to code fluently yet — only follow the logic (storage, decisions, repetition). Use the simulations below to see what the computer would print.
What is Python? A language we use to tell the computer what to do. In AI courses it is popular because it is readable and many ML tools are built around it.
You will see Python in notebooks, scripts, and courses — but the computer does the heavy math inside libraries (written in fast languages). Your job first is to read logic: what is stored, what repeats, what branch runs.
Why Python shows up in AI
Reflection: Python loops can be slow in pure bytecode — why do researchers still use Python for AI? (Hint: libraries do heavy work in C/C++/CUDA; Python orchestrates experiments.)
Example — output to the screen:
print("Hello AI")
The computer displays exactly what we ask — that is the idea behind all later programs.
age = 20
A name (age) pointing to a value we can reuse.
if age > 18:
    print("Adult")
The computer chooses a path based on a true/false check.
for i in range(3):
    print("Hello")
Same action multiple times — essential for processing lots of data later.
Two more tiny patterns
name = "Priya"
print(name)
Variables can hold text, not only numbers. ML pipelines often pass file paths, labels, and messages as strings.
scores = [72, 85, 90]
for s in scores:
    print(s)
A loop can walk through a list — same idea as walking through rows of a dataset later.
Simulated output of the list loop, one value per line: 72, 85, 90.
More basic simulations
Numbers and arithmetic
a = 5
b = 3
print(a + b)
if / else (two branches)
age = 16
if age >= 18:
    print("Adult")
else:
    print("Minor")
Loop with index i
for i in range(4):
    print(i)
Joining strings
name = "Aisha"
print("Hello, " + name)
If you change age = 16 to age = 20 in the if/else example and run again, the output becomes Adult. The machine follows the condition exactly; it does not guess your intention.
A bug is when the program does what you wrote, not what you wanted — wrong variable, wrong indent, wrong condition. ML has a cousin idea: the code runs, but predictions are wrong because of data, features, or model choice. Part 5 returns to this.
Once you can read variables, if checks, loops, and print, you can follow how ML code is structured.
In a practical sense, yes: Machine learning = learning from data. No data → no useful learning. This part ties together almost every idea you need before algorithms in Module 3.
| Human learning | Machine learning |
|---|---|
| Experience | Data (examples) |
| Brain | Model |
| Practice / study | Training |
| Guess on exam | Prediction |
With a study partner or in your notes: pick one row in the table and write an everyday analogy (e.g. “exam practice ≈ training”).
What “learn” means here
We are not claiming a laptop has feelings or consciousness. In ML, “learn” means: the system updates from data so that its predictions improve on similar future examples. That update is implemented with math and code — usually by minimising mistakes on training data while hoping it still works on new data (generalisation — a theme for later modules).
| Phase | What happens (story version) |
|---|---|
| Training (learning) | Model sees many labelled or unlabelled examples and adjusts internal parameters. |
| Inference (prediction) | Trained model receives a new example and outputs a label, score, or action — fast, like using a finished calculator. |
Core pipeline — animation suggests information moving stage to stage.
Illustrative “more examples → richer patterns” (not real metrics)
Labels: the “answer key”
Labelled data means each training example comes with the correct output we want the model to imitate later — spam/not spam, price sold, disease yes/no. Unlabelled data is only inputs; the algorithm must discover structure (unsupervised learning). Most beginner stories start with labels — that is supervised learning.
Self-check: Where do labels come from? Humans (annotators), sensors, historical records, or rules — if labels are wrong or noisy, the model learns that noise.
| House ID | Size (100 sq.ft units) | Rooms | City zone (code) | Label: sold price (₹ lakhs) |
|---|---|---|---|---|
| H01 | 12 | 3 | 2 | 85 |
| H02 | 9 | 2 | 1 | 52 |
| H03 | 15 | 4 | 3 | 118 |
| H04 | 11 | 3 | 2 | 79 |
| H05 | 8 | 2 | 1 | 45 |
Toy numbers for learning only — not a real market dataset.
One supervised row: inputs on the left, label on the right — the model learns to predict the label from the inputs.
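To make "learn to predict the label from the inputs" concrete, here is a minimal sketch that uses only the size column of the toy table and fits a straight line to it. Using a single feature, the least-squares formula, and the names `fit_line` and `predict` are choices made for this illustration; real projects would normally use a library and more features.

```python
# Toy rows from the table above: (size in 100 sq.ft units, sold price in ₹ lakhs).
houses = [(12, 85), (9, 52), (15, 118), (11, 79), (8, 45)]

def fit_line(rows):
    """Least-squares fit of price = slope * size + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(size, slope, intercept):
    """Apply the fitted line to one input."""
    return slope * size + intercept

# Training: these two numbers are the learned parameters.
slope, intercept = fit_line(houses)

# Inference: use the finished model on a size that is not in the table.
new_size = 10  # hypothetical new house
print(round(slope, 1), round(intercept, 1))            # 10.6 -40.8
print(round(predict(new_size, slope, intercept), 1))   # 65.2
```

The fit happens once (training); afterwards `predict` is cheap to call on any new input (inference).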
Splitting data (train · validation · test)
Real projects usually divide examples into training (fit the model), validation (tune choices — which feature set, how complex the model), and test (one final honest check on data that did not influence those choices). If you tune on the test set, scores look artificially high — a form of cheating the metric.
| Split | Role (simple) |
|---|---|
| Training | Model updates its parameters to reduce error here. |
| Validation | Compare variants and hyperparameters without touching the test set. |
| Test | Final estimate of how the chosen model behaves on new-like data. |
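A three-way split can be sketched with the standard library alone. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are assumptions for this example; teams pick proportions to suit their data.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of the rows, then cut it into train / validation / test."""
    rows = list(rows)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)   # fixed seed makes the split repeatable
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]       # whatever remains is the test set
    return train, val, test

rows = list(range(10))  # stand-in for ten examples
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 6 2 2
```

Each example lands in exactly one split, so the test rows cannot influence training or tuning.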
Long walkthrough · Recommendations
Generalisation
Generalisation means the model performs well on new examples drawn from the same kind of reality — not only on the rows it memorised. Overfitting = fits training noise; underfitting = too simple to capture real patterns. Domain shift = training data and real-world data differ (new city, new device, new slang), so performance drops even if the code is unchanged.
Not like a diary. It stores patterns (weights), not your exact clips. Services still log events for product and policy reasons — privacy is a separate, important topic (Module 7 touches deployment & ethics).
Different histories, locations, languages, and A/B tests. The model is personalising to your signals and the app’s business rules.
Same big picture (data → train → predict), different model family and data. ChatGPT predicts the next piece of text; recommenders score items for you. We keep details for later modules.
Supervised is the most important for beginners — it matches “question + answer.” Reinforcement and unsupervised add other toolkits.
Compare: supervised has explicit targets; unsupervised discovers structure without provided labels.
Each example includes the right output (label): e.g. “this email is spam,” “this house sold for ₹X.” The model learns to map inputs → labels. Think: teacher shows the class the answer key while practising.
Everyday supervised tasks:
Self-check: Where does the label come from? Humans, sensors, historical records, or rules — noisy labels teach noisy behaviour.
Only the inputs are given. The algorithm looks for groups, patterns, or structure — like sorting people into similar taste clusters without naming the clusters first. You might later name a cluster “budget shoppers” after you inspect it.
Typical uses:
Pitfall: Clusters are mathematically real but need human interpretation — “cluster 3” is not automatically “good customers.”
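A very small clustering sketch shows both the grouping and the pitfall. It uses a simplified one-dimensional k-means on invented monthly spending values; the data, the starting centres, and the function name `kmeans_1d` are assumptions for illustration.

```python
def kmeans_1d(values, centres, rounds=10):
    """Simplified 1-D k-means: assign each value to its nearest centre, then re-average."""
    groups = [[] for _ in centres]
    for _ in range(rounds):
        groups = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups

# Invented monthly spends; no labels are given.
spends = [100, 120, 110, 900, 950, 1000]
centres, groups = kmeans_1d(spends, centres=[min(spends), max(spends)])
print(centres)  # [110.0, 950.0]
print(groups)   # [[100, 120, 110], [900, 950, 1000]]
```

The output is only "group 0" and "group 1". Calling one of them "budget shoppers" is a human decision made after inspecting the values.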
An agent tries actions, gets rewards or penalties, and improves over time — like scoring points in a game. There is often no single “correct label” per step; instead, many steps build toward a goal.
Where you may have seen this story:
Contrast with supervised: RL needs a defined environment and reward signal; supervised needs many input–output pairs. Both still need programming to set up.
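The try, get reward, improve loop can be sketched with a tiny agent that chooses among three actions. Everything here is a simplifying assumption for illustration: the fixed reward values, the epsilon-greedy strategy, and the name `run_agent`. Real environments have states and noisy rewards.

```python
import random

# Reward for each action; hidden from the agent and invented for this sketch.
TRUE_REWARDS = [0.2, 1.0, 0.5]

def run_agent(steps=500, epsilon=0.2, seed=0):
    """Epsilon-greedy agent: mostly pick the best-looking action, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(TRUE_REWARDS)  # agent's current guess per action
    tries = [0] * len(TRUE_REWARDS)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(TRUE_REWARDS))  # explore: random action
        else:
            action = max(range(len(estimates)), key=lambda a: estimates[a])  # exploit
        reward = TRUE_REWARDS[action]  # the environment responds
        tries[action] += 1
        # Update the running average reward for the chosen action.
        estimates[action] += (reward - estimates[action]) / tries[action]
    return estimates, tries

estimates, tries = run_agent()
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, tries)
```

No step comes with a "correct label"; the agent only sees rewards and gradually prefers the action that pays best.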
Quick “which type?” — guess, then reveal
Cover the answers, decide supervised / unsupervised / reinforcement for each, then tap Reveal.
A. Predict tomorrow’s temperature from past weather readings (numbers).
B. Group news articles into themes without telling the system theme names.
C. Drone learns to hover by trying motor speeds and staying stable.
D. Detect credit-card fraud using past transactions marked fraud / not fraud.
| Type | You mainly have… | Typical question |
|---|---|---|
| Supervised | Inputs + labels | “What category / number for this new input?” |
| Unsupervised | Inputs only | “What groups or structure exist?” |
| Reinforcement | States, actions, rewards | “What policy maximises long-term reward?” |
End-to-end language people use in ML teams: pipeline, splits, loss, epochs, metrics, errors, bias — still intuitive, no heavy formulas.
Extended pipeline — features refine raw data; training updates the model before prediction.
| Symbol / word (common in courses) | Meaning (beginner) |
|---|---|
| Input x | One example’s features (one row of data). |
| Label y | Correct output for supervised learning (class or number). |
| Prediction ŷ (“y-hat”) | What the model outputs for that input. |
| Dataset | Many (x, y) pairs or many x alone (unsupervised). |
Self-check: in your own words, trace the extended pipeline from “raw logs” to “prediction” in one short paragraph.
1 · Data — fuel of AI
Images, text, clicks, GPS traces, audio… quality and quantity both matter.
2 · Features (very important)
Not every byte goes into the model directly. We pick meaningful inputs — for house price: size, location, number of rooms, floor, age…
Garbage in, garbage out: If data is wrong, incomplete, or collected unfairly, the fanciest model cannot invent truth. Cleaning data — removing duplicates, fixing typos, handling missing values — is a huge part of real ML work (often more than choosing an algorithm).
Noise & outliers: Wrong labels, sensor glitches, or one-off extreme values can pull a model off course. Teams use cleaning rules, robust losses, or outlier detection — you only need the idea that not every row is equally trustworthy.
Raw data vs features — two more examples
Raw: full message bytes. Features might include: count of “free/offer/click,” presence of suspicious links, sender reputation score, time since account created — not the whole email dumped naively into one number.
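Turning a raw message into a handful of features might look like the sketch below. The word list comes from the example above; the feature names, the simple "contains a link" check (standing in for a real suspicious-link detector), and the sample values are assumptions for illustration.

```python
SUSPICIOUS_WORDS = ("free", "offer", "click")  # words named in the example above

def extract_features(message, sender_reputation, account_age_days):
    """Summarise one raw email as a small dictionary of numbers."""
    text = message.lower()
    words = [w.strip(".,!?:") for w in text.split()]
    return {
        "suspicious_word_count": sum(w in SUSPICIOUS_WORDS for w in words),
        # Simplification: flags any link, not only suspicious ones.
        "has_link": int("http://" in text or "https://" in text),
        "sender_reputation": sender_reputation,  # assumed to come from another system
        "account_age_days": account_age_days,
    }

features = extract_features(
    "Click here for a FREE offer: http://example.com",
    sender_reputation=0.2,
    account_age_days=3,
)
print(features)
```

The model then receives these few numbers instead of the full message bytes.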
Raw: grid of pixel brightness/colour. Deep models learn their own internal features; classical ML might use hand-crafted summaries (edges, colours). In short: pixels are data; the model gradually builds useful summaries.
House price story — features feed the model; output is a predicted price.
Loss, epochs, batches (how training “runs”)
Loss (error): A number that says how wrong the model is on the examples it is training on — e.g. “predicted price minus true price, squared.” Training tries to reduce loss step by step.
Epoch: One full pass through the training set (or through the sampling plan used in practice). Often you need many epochs; too many can encourage overfitting unless you regularise or stop early.
Batch / mini-batch: Training is usually done on small chunks of examples at a time (e.g. 32 or 256 rows) for speed and stable updates, rather than one row at a time or, on huge datasets, everything at once.
| Term | One-line memory hook |
|---|---|
| Loss | “How bad are we?” — training minimises it. |
| Epoch | “Saw the whole training set once.” |
| Batch | “Small pack of examples per update step.” |
| Learning rate | “Step size” when adjusting the model — too big can diverge; too small can be slow (Module 3). |
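The four terms in the table fit together in one short training loop. This is a sketch under stated assumptions: it reuses the toy house sizes and prices, fits price = w * size + b with plain gradient descent, and the values for epochs, batch size, and learning rate are picked only so the toy example stays stable.

```python
# Toy rows from the house table: (size in 100 sq.ft units, price in ₹ lakhs).
data = [(12, 85), (9, 52), (15, 118), (11, 79), (8, 45)]

def mean_squared_loss(w, b, rows):
    """Loss: average of (prediction minus true price) squared."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

def train(rows, epochs=50, batch_size=2, learning_rate=0.001):
    w, b = 0.0, 0.0  # parameters start with no knowledge
    for _ in range(epochs):  # one epoch = one pass over all rows
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]  # one mini-batch
            # Gradient of the mean squared loss on this batch.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # The learning rate is the step size of each update.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

initial_loss = mean_squared_loss(0.0, 0.0, data)
w, b = train(data)
final_loss = mean_squared_loss(w, b, data)
print(initial_loss, final_loss)  # the loss should drop sharply after training
```

Raising `learning_rate` well above this value on the unscaled sizes can make the loss grow instead of shrink, which is the "too big can diverge" case from the table.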
Bias vs variance (picture in words)
| Idea | Beginner explanation |
|---|---|
| High bias (underfitting) | Model too simple — misses real patterns even on training data (like using a straight line for a curved trend). |
| High variance (overfitting) | Model too flexible — fits training noise; great on train, worse on new data. |
| Goal | Balance: capture real signal, ignore noise — more data, better features, simpler model, or regularisation help. |
3 · Model
The learned “brain” that turns inputs into outputs — can be simple or complex.
4 · Training
The process of adjusting the model using data (learning).
5 · Prediction
The output on new data: class (spam/not), price, ETA, recommended dish, etc.
Training data vs “new” data
We usually split the world into examples the model studied during training and examples reserved to check if it generalises. If it only memorises training examples, it may fail on new people, new cities, or new slang — that failure mode is often called overfitting (rote learning). A model that is too simple and misses real patterns is sometimes called underfitting.
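Rote learning versus a simpler rule can be shown on the toy house rows. In this sketch the first four houses are treated as training data and H05 as the "new" example; that split, the fallback used by the memoriser, and the price-per-unit model are assumptions made for illustration.

```python
train_rows = [(12, 85), (9, 52), (15, 118), (11, 79)]  # (size, price) for H01 to H04
new_rows = [(8, 45)]                                   # H05, treated as unseen

# Memoriser: exact lookup of sizes it has seen, average training price otherwise.
memory = dict(train_rows)
fallback = sum(price for _, price in train_rows) / len(train_rows)

def memoriser(size):
    return memory.get(size, fallback)

# Simple model: one learned number, the average price per size unit.
rate = sum(price for _, price in train_rows) / sum(size for size, _ in train_rows)

def simple_model(size):
    return rate * size

def mean_abs_error(model, rows):
    return sum(abs(model(size) - price) for size, price in rows) / len(rows)

print(mean_abs_error(memoriser, train_rows), mean_abs_error(memoriser, new_rows))
print(mean_abs_error(simple_model, train_rows), mean_abs_error(simple_model, new_rows))
```

The memoriser is perfect on the rows it studied and far off on the new one, while the simple model makes small mistakes everywhere but transfers better. With one held-out row this is only a picture of the idea, not evidence about real models.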
Train vs held-out / new data — if the real world drifts away from training (domain shift), error rises even with the same code.
Metrics — how we score predictions
Accuracy = fraction of examples where prediction matches the label. Easy to read but misleading when one class is rare (e.g. 99% “not fraud” — a dumb model that always says “not fraud” gets 99% accuracy and catches zero fraud).
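The "always says not fraud" case is easy to check by counting. The 1000 transactions below are invented to match the 99% example.

```python
# Invented data: 10 fraud cases (True) among 1000 transactions.
labels = [True] * 10 + [False] * 990
predictions = [False] * len(labels)  # the model always answers "not fraud"

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)
fraud_caught = sum(p and y for p, y in zip(predictions, labels))

print(accuracy)      # 0.99
print(fraud_caught)  # 0
```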
Precision (for a class): when the model says “yes,” how often was it right? Recall: of all real “yes” cases, how many did we catch? Trade-offs matter in medicine, spam, and safety.
| Confusion matrix (spam example, 100 emails) | Predicted: not spam | Predicted: spam |
|---|---|---|
| Actually not spam | 88 (good) | 2 (false alarm) |
| Actually spam | 5 (missed spam) | 5 (caught) |
Toy counts: accuracy = 93/100 = 93%, but missed 5/10 spam → recall for spam = 50%. Numbers chosen to show accuracy alone can hide pain points.
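The same toy counts can be turned into accuracy, precision, and recall in a few lines. The variable names follow the usual true/false positive/negative convention, with "spam" as the positive class.

```python
# Counts from the toy confusion matrix (positive class = spam).
tn, fp = 88, 2   # actually not spam: correctly passed, false alarm
fn, tp = 5, 5    # actually spam: missed, caught

total = tn + fp + fn + tp
accuracy = (tp + tn) / total   # share of all emails handled correctly
precision = tp / (tp + fp)     # when the model says spam, how often it is right
recall = tp / (tp + fn)        # share of real spam that was caught

print(round(accuracy, 2), round(precision, 2), round(recall, 2))  # 0.93 0.71 0.5
```

Precision here is 5 out of 7 spam predictions, so roughly 71%, which sits between the comfortable accuracy and the weak recall.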
| Metric | When teams care a lot |
|---|---|
| Accuracy | Balanced classes and similar mistake costs. |
| Precision | False alarms are expensive (e.g. blocking legit transactions). |
| Recall | Missing a positive is dangerous (disease, fraud, safety). |
| RMSE / MAE | Regression — how far off predicted numbers are from true values. |
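For the regression row, MAE and RMSE can be computed directly from their definitions. The true prices are taken from the toy house table; the predicted values are invented for this sketch.

```python
import math

true_prices = [85, 52, 118]   # ₹ lakhs, from the toy house table
predicted = [80, 60, 110]     # invented model outputs for illustration

errors = [p - t for p, t in zip(predicted, true_prices)]

# MAE: average size of the error, ignoring direction.
mae = sum(abs(e) for e in errors) / len(errors)

# RMSE: square the errors, average, then take the square root (big misses count more).
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print(mae, round(rmse, 2))  # 7.0 7.14
```

Both are in the same unit as the label (here ₹ lakhs), which makes them easy to explain to non-specialists.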
| Classification (categories) | Regression (predicting numbers) |
|---|---|
| Yes / No, Spam / Not spam | House price, exam score, delivery time |
| Discrete labels | Continuous values |
Correct or useful on new real cases — not only on examples it memorised.
Often wrong predictions, or only works on a tiny test set — misleading in practice.
If data is wrong or unfair, the AI copies that. Example: biased hiring data → unfair suggestions. Fix data and process, not only code.
Good vs bad — extend the idea
Not all mistakes cost the same. False alarm on spam (inbox mail lost) vs missed spam (phishing) vs wrong medical triage — different costs. Teams choose metrics and safeguards accordingly. This connects to ethics and product design (Module 7).
Short exercises you can do alone or with a study group. Local apps (Swiggy, Zomato, Instagram, UPI) are good examples when you relate ideas to real life.
Guess, then reveal.
Spam filter — labelled emails.
Customer grouping — no segment names given, only behaviour.
List ideas you might use in a food-delivery app:
Follow-up: For each idea, note what data you need, what prediction you output, and whether it is likely supervised, unsupervised, or RL.
These steps are shuffled. Arrange them mentally or on paper: Prediction, Training, Raw logs, Features, Model file. (Optional distractor: “Internet memes” — not a pipeline stage.)
For student marks prediction, which are features vs label vs not useful?
Scenario: College trains a model on past admission data that mostly accepted one gender or one region historically. Should we deploy it unchanged?
10 questions on ML basics: paradigms, features, loss, data, bias, and train/test splits. Instant feedback on every answer.
Module 2 distilled: Python logic, ML pipeline, splits, loss and epochs, metrics and baselines, three ML types, bias, and self-check activities.