An AI model is a mathematical algorithm that can recognize patterns and make decisions. These decisions are based on data it has previously seen (and "learned"). For instance, it can determine whether an email is spam, identify objects in an image, or generate meaningful replies in a conversation.
At its core, the idea is simple: we provide data, and the model learns from it.
Everything begins with data. AI models are only as good as the data they are trained on. If a model is trained on incomplete, biased, or noisy data, it will produce faulty results. For example, a facial recognition system requires thousands of labeled images from diverse age groups, ethnicities, and lighting conditions.
Note: The more diverse and balanced your data is, the more robust your model will be.
Note: I'd like to make an analogy. In real life, we can't dream about things we've never experienced. Interesting, right? Our dreams are shaped by what we know and remember. Similarly, AI cannot know anything beyond what exists in its dataset. But because AI often "pretends" to know everything (marketing?), it rarely says "I don't know." Instead, it begins to hallucinate — it makes things up. But that's a topic for another post.
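To make the "diverse and balanced" point concrete, here is a minimal sketch of a sanity check you might run before training: counting how often each label appears in a dataset. The labels below are hypothetical, standing in for a spam classifier's training set.

```python
from collections import Counter

# Hypothetical labels for a spam classifier's training set.
labels = ["spam", "ham", "ham", "ham", "ham", "spam", "ham", "ham"]

counts = Counter(labels)
total = len(labels)
for label, count in counts.items():
    print(f"{label}: {count} ({count / total:.0%})")

# A heavily skewed ratio is a warning sign: the model will mostly
# see "ham" examples and may learn to ignore the minority class.
```

A check like this takes seconds and catches one of the most common causes of "faulty results" mentioned above: a dataset that silently under-represents part of the problem.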
You must choose a model type suited to the problem. For image recognition, CNNs (Convolutional Neural Networks) are often used. For language tasks, Transformer-based architectures (like GPT – Generative Pre-trained Transformer) are more appropriate. Each model type has strengths and limitations.
The model starts learning patterns from the data. In this phase, predictions are made and compared to actual values. Errors are calculated (loss function), and the model adjusts accordingly (backpropagation). This process is repeated across many cycles (epochs).
In other words, during training the model repeatedly updates its weights (its learnable parameters) to reduce the loss and improve accuracy.
The model is tested on data it has never seen before (validation data). If the performance is unsatisfactory, hyperparameters (like the learning rate or the number of layers) are adjusted. This process is known as hyperparameter tuning (not to be confused with fine-tuning, which means further training an already-trained model on new data).
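Continuing the toy example, here is a sketch of how evaluation and hyperparameter tuning fit together: hold out data the model never trains on, train once per candidate learning rate, and keep the one with the lowest validation loss. The data and candidate values are illustrative assumptions.

```python
# Hold-out evaluation: validation pairs are never used for training.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
validation = [(4.0, 8.0), (5.0, 10.0)]  # unseen during training

def train_model(lr, epochs=50):
    """Fit a one-weight model y = w * x with gradient descent."""
    w = 0.0
    for _ in range(epochs):
        for x, y in train:
            w -= lr * 2 * (w * x - y) * x
    return w

def validation_loss(w):
    """Mean squared error on the held-out validation data."""
    return sum((w * x - y) ** 2 for x, y in validation) / len(validation)

# Try a few candidate learning rates and keep the best performer.
best_lr = min([0.001, 0.01, 0.05],
              key=lambda lr: validation_loss(train_model(lr)))
```

Measuring on held-out data is what catches overfitting: a model can score perfectly on its training set while failing on examples it has never seen.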
Once the model performs well, it's ready for real-world use. It might be integrated into a mobile app or offered as an API. But this is not the end — models need to be retrained with new data over time to stay relevant.
If you're using a language model (like ChatGPT), your input — the prompt — determines what kind of output you get. Writing prompts is the art of steering the model. The clearer and more precise your prompt, the better the response.
Bad prompt = vague answer.
Good prompt = accurate and useful result.
For illustration, accuracy often evolves like this over training (note the growing gap between training and test accuracy by epoch 50, a classic sign of overfitting):

| Epoch | Training Accuracy | Test Accuracy |
|---|---|---|
| 1 | 60% | 55% |
| 10 | 85% | 80% |
| 50 | 99% | 88% |
⸻
Final Thoughts
In this post, I've tried to summarize the foundational steps of AI model development. Before diving into advanced topics, it's essential to understand these core ideas. If you want to think critically about AI, you first need to understand how it works.
In the next post, we can dive deeper into the topic. See you there!
Data Collection → Model Architecture → Training → Evaluation → Deployment
Prompt: "Write an email."
➡️ Too vague; the output will be generic.
Prompt: "Write a polite email to a colleague explaining a delay in the project, in a professional tone and standard business-email format."
➡️ Clear, targeted, better result.