Handwritten Digit Softmax Classifier in C: From 784 Pixels to submission.csv
Once the dataset layout is clear, the most useful part of this handwritten digit project is the C implementation itself. It does not rely on a deep learning framework. Instead, it uses a direct multi-class softmax model that maps a 784-dimensional input vector to ten digit classes.
This kind of project is a good way to learn how model formulas turn into code. You can inspect the weight matrix, the softmax probability calculation, the cross-entropy loss accumulation, and the gradient-based parameter updates without a large abstraction layer getting in the way.
1. The model structure is deliberately small
The model has only two groups of parameters:
- W[10][784]: one weight vector of length 784 for each class
- b[10]: one bias term for each class
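In C terms these are just fixed-size arrays. A minimal declaration sketch, assuming the FEATURES macro used in the snippets below (the exact names in the real source may differ):
#define FEATURES 784            /* 28 x 28 pixels, flattened into one row */

double W[10][FEATURES];         /* one weight row per digit class */
double b[10];                   /* one bias per digit class */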
For a single input sample x, the classifier first computes one raw score per class:
for (int k = 0; k < 10; k++) {          /* one raw score per digit class */
    z[k] = b[k];
    for (int j = 0; j < FEATURES; j++) {
        z[k] += W[k][j] * x[j];
    }
}
Those ten values are the logits for the current sample.
2. Softmax turns raw scores into probabilities
Raw linear scores are not directly interpretable as probabilities, so the implementation normalizes them with softmax:
double sum = 0.0;
for (int i = 0; i < 10; i++) {
    p[i] = exp(z[i] - max_z);   /* shifting by max_z keeps exp() in range */
    sum += p[i];
}
for (int i = 0; i < 10; i++) {
    p[i] /= sum;
}
Subtracting max_z is a numerical-stability trick: it keeps the exponentials from overflowing. After softmax, the probabilities over the ten classes add up to one, and the predicted label is simply the class with the largest probability.
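The snippet leaves out the surrounding steps. A minimal sketch of how max_z and the final prediction could be computed; variable names beyond z, p, and max_z are assumptions:
double max_z = z[0];
for (int i = 1; i < 10; i++) {
    if (z[i] > max_z) max_z = z[i];       /* largest logit becomes the shift */
}
/* ... exponentiate and normalize as shown above ... */
int predicted = 0;
for (int i = 1; i < 10; i++) {
    if (p[i] > p[predicted]) predicted = i;   /* argmax over class probabilities */
}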
3. What the training loop is actually doing
The current project runs 20 epochs with a learning rate of 0.01. In each epoch, it loops through every training sample and repeats the same sequence:
- Compute ten logits
- Apply softmax to get a probability distribution
- Compare that distribution to the true label
- Update the weights and biases with the resulting error
The update rule is written in a very transparent way:
for (int k = 0; k < 10; k++) {
    /* softmax-with-cross-entropy gradient: probability minus one-hot target */
    double error = p[k] - (k == y_train[i] ? 1.0 : 0.0);
    for (int j = 0; j < FEATURES; j++) {
        W[k][j] -= LEARNING_RATE * error * X_train[i][j];
    }
    b[k] -= LEARNING_RATE * error;
}
If you already know logistic regression or linear multi-class classification, this will look familiar. It is essentially softmax regression trained with stochastic gradient descent.
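The cross-entropy accumulation mentioned earlier is not reproduced in the post. A plausible one-liner, assuming p[] still holds the softmax output for sample i and loss is a running double; the epsilon guard is an assumption, not confirmed from the source:
loss += -log(p[y_train[i]] + 1e-12);   /* negative log-likelihood of the true class */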
4. Which metrics are worth checking
During training, the program prints epoch loss and training accuracy. After training, it prints the final training accuracy and a confusion matrix. Those are the most useful outputs to read first:
- Loss: whether optimization is moving in the right direction
- Accuracy: whether the proportion of correctly classified samples is rising
- Confusion matrix: which digits are most often mixed up
If a few classes remain confused with each other, that is usually a sign that the digit shapes are visually close or that the linear model has reached its representational limit.
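The printing code is not shown in the post, but the accumulation step behind a confusion matrix is small. A sketch, assuming an n_train counter and reusing the predict_one function named in the next section:
int confusion[10][10] = {0};
for (int i = 0; i < n_train; i++) {
    int pred = predict_one(X_train[i]);    /* model's predicted digit */
    confusion[y_train[i]][pred]++;         /* row = true label, column = prediction */
}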
5. How submission.csv is generated
After training, the program reads test.csv, calls predict_one for each sample, and writes the result back into the required CSV structure:
ImageId,Label
1,7
2,2
3,1
...
That is the final submission.csv. From an engineering perspective, this step matters because it turns the training code into a complete pipeline that can process unseen inputs and export predictions in a reusable format.
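The export step could look roughly like this; predict_one comes from the article, while out, n_test, and X_test are illustrative names, and the snippet assumes <stdio.h>:
FILE *out = fopen("submission.csv", "w");
fprintf(out, "ImageId,Label\n");                          /* required header row */
for (int i = 0; i < n_test; i++) {
    fprintf(out, "%d,%d\n", i + 1, predict_one(X_test[i]));   /* ImageId is 1-based */
}
fclose(out);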
6. How to run it locally
The downloads section now includes the source file plus compressed copies of the training and test data. The current implementation expects train.csv and test.csv in the same working directory:
unzip train.csv.zip
unzip test.csv.zip
gcc digit_softmax_classifier.c -lm -O2 -o digit_classifier
./digit_classifier
A normal run should print:
- the number of training and test samples
- loss and accuracy for each epoch
- final training accuracy and the confusion matrix
- a message confirming that submission.csv was written
7. What this C version does not try to do
The current implementation is enough for a complete multi-class practice project, but its limits are also clear:
- the model is still linear, not convolutional
- a strong training accuracy does not automatically mean the best generalization
- there is no dedicated validation split for tuning
- there is no mini-batch schedule, regularization, or more advanced optimization
Those are not flaws so much as the next layer of work. A clean, understandable, end-to-end baseline is already valuable.
8. Where to go next
If you want an interactive version before reading more source code, open the handwritten digit tab in the playground. The browser version does not retrain on the full dataset. Instead, it loads a compact pre-trained softmax demo so you can draw digits, inspect probability scores, and try labeled samples directly in the page.
The source code, zipped datasets, sample submission file, generated submission, and browser model bundle are all available on the downloads page. If you have not read the previous post yet, start with the dataset structure article so the arrays and loops in this C file are easier to place in context.