AI (Artificial Intelligence)
AI enables machines to perform tasks that usually require human intelligence, such as classification and prediction.
# Simple threshold classifier
loans = [1200, 3400, 900, 5100, 2200]
threshold = 2500
labels = ["high" if x >= threshold else "low" for x in loans]
print("Loans:", loans)
print("Risk labels:", labels)
Step by step
Define a rule and a threshold.
Compute one label per observation.
Compare output labels with your expected risk classes.
This gives a deterministic baseline classifier to compare against before moving to neural models.
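The comparison step above can be sketched as follows. The `expected` list is hypothetical, made up purely for illustration; it is not part of the course data, and the accuracy it yields says nothing about real loans.

```python
# Compare the threshold classifier's labels with expected labels.
loans = [1200, 3400, 900, 5100, 2200]
threshold = 2500
labels = ["high" if x >= threshold else "low" for x in loans]

# Hypothetical ground truth, invented for this illustration only.
expected = ["low", "high", "low", "high", "high"]

# One boolean per observation: did the rule agree with the expected class?
matches = [pred == true for pred, true in zip(labels, expected)]
accuracy = sum(matches) / len(matches)

for amount, pred, true in zip(loans, labels, expected):
    flag = "ok" if pred == true else "MISMATCH"
    print(f"{amount:>5}  predicted={pred:<4}  expected={true:<4}  {flag}")
print(f"Accuracy: {accuracy:.2f}")
```

A mismatch on a value close to the threshold is a hint that a single cutoff may be too coarse for that case.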
An LLM is a language model trained on large volumes of text to generate, summarize, and transform content.
# Tiny toy language model with bigram counts
text = "ai helps teams make better decisions with data"
words = text.split()
bigrams = {}
for i in range(len(words) - 1):
    pair = (words[i], words[i + 1])
    bigrams[pair] = bigrams.get(pair, 0) + 1
print("Bigrams:")
for pair, count in sorted(bigrams.items()):
    print(f"{pair}: {count}")
Step by step
Split text into tokens.
Count each adjacent token pair.
Use counts to estimate next-word likelihood.
This is the conceptual bridge to autoregressive LLM decoding.
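The third step above (turning counts into next-word likelihoods) can be sketched as below. This is a minimal illustration under stated assumptions: the toy sentence is different from the one in the snippet above and was chosen so that one word has more than one successor, and the helper name `next_word_probs` is hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: a toy sentence where "data" is followed by different words.
text = "data helps teams and data helps leaders and data drives decisions"
words = text.split()

# Count how often each word follows each previous word.
next_counts = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    next_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate P(next | prev) by normalizing raw bigram counts."""
    counts = next_counts.get(prev, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

probs = next_word_probs("data")
print("P(next | 'data'):", probs)

# Greedy decoding: pick the most likely next word.
best = max(probs, key=probs.get)
print("Greedy next word:", best)
```

Repeating the last step with the chosen word as the new `prev` is the one-word-at-a-time loop that autoregressive decoding refers to.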
A very small language model (bigrams) for teaching tokens, logits, and probabilities.
# Tiny LLM teaching model (char-level bigram logits)
try:
    import torch
except Exception as e:
    print("PyTorch not available:", e)
else:
    text = "hello llm"
    # Character vocabulary plus lookup tables in both directions
    vocab = sorted(set(text))
    stoi = {c: i for i, c in enumerate(vocab)}
    itos = {i: c for c, i in stoi.items()}
    ids = torch.tensor([stoi[c] for c in text], dtype=torch.long)
    # Row i of the embedding table holds the next-token logits for token i.
    # The weights are randomly initialized and untrained, so values vary per run.
    model = torch.nn.Embedding(len(vocab), len(vocab))
    logits = model(ids)
    # Turn the logits for the last input token into probabilities
    probs = torch.softmax(logits[-1], dim=-1)
    top_p, top_i = torch.topk(probs, k=min(3, len(vocab)))
    print("Vocab:", vocab)
    print("Last input token:", repr(itos[int(ids[-1])]))
    print("Top next-token probabilities:")
    for p, i in zip(top_p, top_i):
        print(itos[int(i)], round(float(p), 4))
Step by step
Build a character vocabulary and token IDs.
Use an embedding table as a learnable lookup that maps each token to its next-token logits.
Apply softmax and read top-k next-token probabilities.
The math behind it
p(next) = softmax(logits_last_token)
Expected output:
Vocab: [...]
Last input token: ...
Top next-token probabilities: 3 token-probability lines (the values change between runs because the embedding is randomly initialized and untrained)
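The softmax and top-k steps used above can also be written in plain Python, without PyTorch, to make the arithmetic visible. This is a sketch only: the three-token vocabulary and the logit values are hypothetical, chosen for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 3-token vocabulary (illustrative values only).
vocab = ["a", "b", "c"]
logits = [2.0, 1.0, 0.0]
probs = softmax(logits)

# Top-k: sort tokens by probability, highest first, and keep the first k.
k = 2
top = sorted(zip(vocab, probs), key=lambda t: t[1], reverse=True)[:k]
for token, p in top:
    print(token, round(p, 4))
```

Softmax preserves the ordering of the logits, so the token with the largest logit always receives the largest probability.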
RAG (Retrieval-Augmented Generation)
RAG combines context retrieval with generation to produce answers that are more reliable and auditable.
# Tiny RAG demo (keyword overlap retrieval + prompt assembly)
docs = [
    "LGPD requires lawful basis, consent management, and data subject rights.",
    "Dedicated AI servers provide tenant isolation and private GPU workloads.",
    "RAG improves answer grounding by injecting retrieved context into prompts.",
]
query = "How to run AI servers with LGPD compliance?"
q_terms = set(w.strip('.,!?').lower() for w in query.split())
scored = []
for d in docs:
    d_terms = set(w.strip('.,!?').lower() for w in d.split())
    score = len(q_terms & d_terms)
    scored.append((score, d))
scored.sort(reverse=True, key=lambda x: x[0])
top_doc = scored[0][1]
prompt = f"Question: {query}\nContext: {top_doc}\nAnswer:"
print("Query:", query)
print("Top document:", top_doc)
print("Prompt preview:", prompt[:120] + "...")
Step by step
1) Index a small knowledge base (documents).
2) Retrieve the best context for the question.
3) Build the final prompt from the question and the retrieved context.
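The three steps above can be made explicit with a small inverted index, so that documents are indexed once and then queried. This is a minimal sketch, assuming the same keyword-overlap scoring as the demo; the helper names `tokenize` and `retrieve` and the choice of `k=2` are hypothetical, and a real system would more likely use embeddings or a search engine.

```python
from collections import defaultdict

docs = [
    "LGPD requires lawful basis, consent management, and data subject rights.",
    "Dedicated AI servers provide tenant isolation and private GPU workloads.",
    "RAG improves answer grounding by injecting retrieved context into prompts.",
]

def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,!?").lower() for w in text.split()}

# Step 1: index. Map each term to the set of document IDs that contain it.
index = defaultdict(set)
for doc_id, doc in enumerate(docs):
    for term in tokenize(doc):
        index[term].add(doc_id)

def retrieve(query, k=2):
    """Step 2: return up to k (doc_id, score) pairs ranked by shared terms."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by lower document ID.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]

# Step 3: assemble the prompt from the question and the retrieved context.
query = "How to run AI servers with LGPD compliance?"
hits = retrieve(query)
context = "\n".join(docs[doc_id] for doc_id, _ in hits)
prompt = f"Question: {query}\nContext:\n{context}\nAnswer:"
print(prompt)
```

Returning document IDs alongside scores keeps the retrieval step auditable: the prompt can be traced back to the exact documents that were injected.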