Backpropagation by Hand
Backpropagation is the algorithm that trains neural networks by adjusting their weights to minimize the loss. It applies the chain rule of calculus to compute efficiently how the loss changes with respect to each weight. Starting from the output layer, it propagates the error backward through the network, layer by layer, and each weight is then updated in proportion to its contribution to that error. For a layer $l$ with pre-activation $z_l = W_l a_{l-1} + b_l$ and activation $a_l = \sigma(z_l)$, the chain rule gives

$$\frac{\partial \text{Loss}}{\partial W_l} = \frac{\partial \text{Loss}}{\partial a_l} \times \frac{\partial a_l}{\partial z_l} \times \frac{\partial z_l}{\partial W_l}$$
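To make the chain rule concrete, here is a minimal sketch of backpropagation in NumPy, assuming a one-hidden-layer network with sigmoid activations and a mean-squared-error loss; the data, layer sizes, and learning rate are all illustrative choices, not anything specified above. Each gradient line mirrors one factor of the chain rule in the equation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy data: 4 samples, 3 input features, 1 target each (illustrative values).
X = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 1))

# One hidden layer: W1, b1 map inputs to 5 hidden units; W2, b2 map to the output.
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)
lr = 0.1

for step in range(100):
    # Forward pass: z_l = W_l a_{l-1} + b_l, a_l = sigmoid(z_l).
    z1 = X @ W1 + b1
    a1 = sigmoid(z1)
    z2 = a1 @ W2 + b2
    a2 = sigmoid(z2)
    loss = np.mean((a2 - y) ** 2)      # mean squared error

    # Backward pass: apply the chain rule starting from the output layer.
    dL_da2 = 2 * (a2 - y) / y.size     # dLoss/da_2
    dL_dz2 = dL_da2 * a2 * (1 - a2)    # x da_2/dz_2 (sigmoid derivative)
    dL_dW2 = a1.T @ dL_dz2             # x dz_2/dW_2
    dL_db2 = dL_dz2.sum(axis=0)

    dL_da1 = dL_dz2 @ W2.T             # propagate the error to the hidden layer
    dL_dz1 = dL_da1 * a1 * (1 - a1)
    dL_dW1 = X.T @ dL_dz1
    dL_db1 = dL_dz1.sum(axis=0)

    # Gradient-descent update: step each weight against its gradient.
    W2 -= lr * dL_dW2; b2 -= lr * dL_db2
    W1 -= lr * dL_dW1; b1 -= lr * dL_db1

print(f"final loss: {loss:.4f}")
```

Note how the backward pass reuses the activations computed in the forward pass: that caching is what makes backpropagation efficient compared to differentiating each weight independently.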