Posted 2025-04-28 | Updated 2025-04-28 | Machine Learning | 5 minutes read (About 805 words)

Backpropagation Hand by Hand

Backpropagation is the algorithm that computes the gradients used to train neural networks. It applies the chain rule of calculus to work out efficiently how the loss changes with respect to each weight. Starting from the output layer, it propagates the error backward through the network, layer by layer, so that an optimizer such as gradient descent can update each weight according to its contribution to the error. For layer $l$, with pre-activation $z_l$, activation $a_l$, and weights $W_l$:

$$\frac{d(\text{Loss})}{dW_l} = \frac{d(\text{Loss})}{da_l} \times \frac{da_l}{dz_l} \times \frac{dz_l}{dW_l}$$
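
The three factors of the chain rule above can be checked by hand on the smallest possible case. The sketch below assumes a single neuron with a sigmoid activation and a squared-error loss (both are illustrative choices, not taken from the post), computes each factor separately, and compares the product against a finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x, y):
    """One neuron: z = w*x + b, a = sigmoid(z), loss = 0.5*(a - y)^2."""
    z = w * x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2
    return z, a, loss


def grad_w(w, b, x, y):
    """dLoss/dw as the product dLoss/da * da/dz * dz/dw."""
    _, a, _ = forward(w, b, x, y)
    dloss_da = a - y          # derivative of 0.5*(a - y)^2 w.r.t. a
    da_dz = a * (1.0 - a)     # derivative of the sigmoid
    dz_dw = x                 # derivative of w*x + b w.r.t. w
    return dloss_da * da_dz * dz_dw


def numeric_grad_w(w, b, x, y, eps=1e-6):
    """Central finite difference, used only to sanity-check grad_w."""
    _, _, loss_plus = forward(w + eps, b, x, y)
    _, _, loss_minus = forward(w - eps, b, x, y)
    return (loss_plus - loss_minus) / (2.0 * eps)


# Hypothetical values chosen for illustration.
w, b, x, y = 0.5, 0.1, 1.5, 1.0
analytic = grad_w(w, b, x, y)
numeric = numeric_grad_w(w, b, x, y)
print(analytic, numeric)
```

The two printed values should agree to several decimal places. In a multi-layer network the same product is evaluated layer by layer, reusing the upstream factor instead of recomputing it.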

Posted 2025-04-23 | Updated 2025-04-28 | Machine Learning | 2 minutes read (About 294 words)

GNN: Graph Neural Networks

Graph Neural Networks (GNNs) are a class of neural networks designed to work directly with graph-structured data. They have gained significant attention in recent years due to their ability to model complex relationships and interactions in various domains, including social networks, molecular biology, and recommendation systems.

Posted 2025-04-10 | Updated 2025-04-10 | Machine Learning / LM / Protein | a minute read (About 178 words)

Rosetta, the Pioneer of Protein Structure Prediction

Rosetta is a comprehensive computational suite that plays a pivotal role in the protein folding field by predicting and designing protein structures based on amino acid sequences. It employs a combination of physics-based energy functions and advanced algorithms, such as fragment assembly and Monte Carlo sampling, to simulate the folding process and explore the vast conformational landscape of proteins. By iteratively optimizing potential structures, Rosetta helps researchers identify low-energy, stable configurations that closely resemble naturally occurring proteins. This tool not only aids in elucidating fundamental principles of protein structure and function but also supports the design of novel proteins and therapeutic interventions, making it an indispensable resource in structural biology and bioengineering.

Posted 2025-04-05 | Updated 2025-04-09 | Machine Learning / LM / Protein | 11 minutes read (About 1671 words)

AlphaFold

AlphaFold2

Posted 2024-10-23 | Updated 2024-10-30 | Machine Learning / Data Format | 7 minutes read (About 1027 words)

HDF5 Data Format Introduction

HDF5 (Hierarchical Data Format version 5) is a file format designed for efficiently storing and organizing large, complex datasets. It uses a hierarchical structure of **groups** (like directories) and **datasets** (like files) to store data, supporting multidimensional arrays, metadata, and a wide variety of data types. Key advantages include **compression**, **cross-platform compatibility**, and the ability to handle large datasets that don’t fit in memory. It’s widely used in fields like scientific computing, machine learning, and bioinformatics due to its efficiency and flexibility.

Posted 2024-08-16 | Updated 2024-10-09 | Machine Learning / Regression | 5 minutes read (About 709 words)

Kernel Density Estimation (KDE)

Kernel Density Estimation (KDE) is a non-parametric method for estimating the probability density function (PDF) of a random variable from a finite set of data points. Unlike parametric methods, which assume that the data follow a specific distribution (normal, exponential, and so on), KDE makes no such assumption and can model more complex distributions. Given $n$ data points $x_1, \dots, x_n$, a kernel function $K$, and a bandwidth $h > 0$, the estimate is:

$$ \hat{f}(x) = \frac{1}{n \cdot h} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) $$
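
The estimator above translates almost line for line into code. This is a minimal sketch that assumes a Gaussian kernel; the sample values and the bandwidth are made up for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """f_hat(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)


# Hypothetical data and bandwidth.
samples = [1.0, 1.5, 2.0, 4.0, 4.2]
h = 0.5

density_at_2 = kde(2.0, samples, h)

# A density should integrate to roughly 1; check with a Riemann sum on [-5, 10).
step = 0.01
area = sum(kde(-5.0 + i * step, samples, h) for i in range(1500)) * step
print(density_at_2, area)
```

A smaller `h` gives a spikier estimate that follows individual points, while a larger `h` smooths the two clusters together.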

Posted 2024-08-09 | Updated 2024-10-09 | Machine Learning / Math | 9 minutes read (About 1323 words)

Understanding the Taylor Series and Its Applications in Machine Learning

The Taylor Series is a mathematical tool that approximates complex functions with polynomials, and it plays a crucial role in machine learning optimization. A second-order expansion of the loss adds curvature information to gradient descent, which can lead to faster and more stable convergence. The series also helps linearize non-linear models and informs regularization techniques. This post explores how the Taylor Series improves training efficiency and helps explain model behavior. For example:

$$\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} x^{2n}$$
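
To see how quickly the cosine series above converges, it can be truncated after a few terms and compared with the library value. This is a small sketch; the evaluation point `x = 1.0` and the term counts are arbitrary choices.

```python
import math


def cos_taylor(x, terms):
    """Partial sum of the cosine series: sum over n < terms of (-1)^n x^(2n) / (2n)!."""
    return sum(
        (-1) ** n * x ** (2 * n) / math.factorial(2 * n)
        for n in range(terms)
    )


x = 1.0
for terms in (1, 2, 3, 6):
    approx = cos_taylor(x, terms)
    error = abs(approx - math.cos(x))
    print(terms, approx, error)
```

Each extra term shrinks the error sharply for small `x`. The same idea, truncating after the first- or second-order term, is what turns a loss function into a local linear or quadratic model.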

Posted 2024-05-30 | Updated 2024-08-16 | Machine Learning / Math | 6 minutes read (About 942 words)

Simulated Annealing (SA)

Simulated Annealing (SA) is a probabilistic technique used for finding an approximate solution to an optimization problem. It is particularly useful for problems where the search space is large and complex, and other methods might get stuck in local optima.
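
A minimal sketch of the idea, under stated assumptions: a one-dimensional objective, uniform random proposals, the Metropolis acceptance rule, and a geometric cooling schedule. The function `bumpy` and every parameter value are hypothetical choices for illustration, not taken from the post.

```python
import math
import random


def simulated_annealing(f, x0, temp=10.0, cooling=0.995,
                        steps=5000, step_size=0.5, seed=0):
    """Minimize f starting from x0; returns the best point seen and its value."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        f_candidate = f(candidate)
        delta = f_candidate - fx
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = candidate, f_candidate
            if fx < best_fx:
                best_x, best_fx = x, fx
        temp *= cooling  # geometric cooling schedule
    return best_x, best_fx


def bumpy(x):
    """Toy objective with more than one local minimum."""
    return x * x + 10.0 * math.sin(x)


best_x, best_fx = simulated_annealing(bumpy, x0=4.0)
print(best_x, best_fx)
```

The occasional acceptance of worse moves at high temperature is what gives the search a chance to leave a local optimum. The result is still an approximation, and it depends on the seed and the cooling schedule.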