Backpropagation Hand by Hand

Backpropagation is the algorithm that trains neural networks by adjusting their weights to minimize the loss. It works by applying the chain rule of calculus to efficiently compute how the loss changes with respect to each weight. Starting from the output layer, it propagates the error backward through the network, layer by layer, updating the weights according to their contribution to the error.

$$\frac{\partial(\text{Loss})}{\partial W_l} = \frac{\partial(\text{Loss})}{\partial a_l} \times \frac{\partial a_l}{\partial z_l} \times \frac{\partial z_l}{\partial W_l}$$
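The chain rule above can be sketched for a single sigmoid layer with a squared-error loss (all names and sizes here are illustrative), checking the analytic gradient against finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # input
W = rng.normal(size=(2, 3))   # layer weights W_l
y = np.array([0.0, 1.0])      # target

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = W @ x                     # pre-activation z_l
a = sigmoid(z)                # activation a_l
loss = 0.5 * np.sum((a - y) ** 2)

# Chain rule: dLoss/dW = (dLoss/da) * (da/dz) * (dz/dW)
dL_da = a - y                 # dLoss/da_l for squared error
da_dz = a * (1 - a)           # sigmoid derivative da_l/dz_l
delta = dL_da * da_dz         # dLoss/dz_l
grad_W = np.outer(delta, x)   # dz_l/dW_l contributes the input x

# Sanity check against a numerical gradient
eps = 1e-6
num = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy()
        Wp[i, j] += eps
        num[i, j] = (0.5 * np.sum((sigmoid(Wp @ x) - y) ** 2) - loss) / eps
print(np.allclose(grad_W, num, atol=1e-4))
```

In a deep network the same product of local derivatives is accumulated layer by layer, which is exactly what the backward pass does.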
Read more

GNN: Graph Neural Networks

Graph Neural Networks (GNNs) are a class of neural networks designed to work directly with graph-structured data. They have gained significant attention in recent years due to their ability to model complex relationships and interactions in various domains, including social networks, molecular biology, and recommendation systems.
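A minimal sketch of the core message-passing idea, in the style of a single GCN layer (the tiny graph, random weights, and ReLU choice are purely illustrative): each node mixes its own features with its neighbors' before a learned linear transform.

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)    # adjacency of a 3-node graph
X = np.eye(3)                             # one-hot node features

A_hat = A + np.eye(3)                     # add self-loops so a node keeps its own signal
D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalize by node degree

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))               # weights (random here, learned in practice)

H = np.maximum(D_inv @ A_hat @ X @ W, 0)  # aggregate neighbors, transform, ReLU
print(H.shape)                            # one 2-dim embedding per node
```

Stacking several such layers lets information propagate along longer paths in the graph.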
Read more

Rosetta, the Pioneer of Protein Structure Prediction

Rosetta is a comprehensive computational suite that plays a pivotal role in the protein folding field by predicting and designing protein structures based on amino acid sequences. It employs a combination of physics-based energy functions and advanced algorithms, such as fragment assembly and Monte Carlo sampling, to simulate the folding process and explore the vast conformational landscape of proteins. By iteratively optimizing potential structures, Rosetta helps researchers identify low-energy, stable configurations that closely resemble naturally occurring proteins. This tool not only aids in elucidating fundamental principles of protein structure and function but also supports the design of novel proteins and therapeutic interventions, making it an indispensable resource in structural biology and bioengineering.
Read more

HDF5 Data Format Introduction

HDF5 (Hierarchical Data Format version 5) is a file format designed for efficiently storing and organizing large, complex datasets. It uses a hierarchical structure of **groups** (like directories) and **datasets** (like files) to store data, supporting multidimensional arrays, metadata, and a wide variety of data types. Key advantages include **compression**, **cross-platform compatibility**, and the ability to handle large datasets that don’t fit in memory. It’s widely used in fields like scientific computing, machine learning, and bioinformatics due to its efficiency and flexibility.
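A short sketch using the `h5py` library (the file path, group, and dataset names are arbitrary): groups behave like directories, datasets like files, and attributes carry metadata.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    grp = f.create_group("experiment1")                  # group = directory
    dset = grp.create_dataset("measurements",            # dataset = file
                              data=np.arange(12).reshape(3, 4),
                              compression="gzip")        # transparent compression
    dset.attrs["units"] = "mV"                           # metadata on the dataset

with h5py.File(path, "r") as f:
    d = f["experiment1/measurements"]
    print(d.shape, d.attrs["units"])
```

Because datasets are read lazily, slices like `d[0, :]` pull only the needed chunk from disk, which is how HDF5 handles arrays larger than memory.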
Read more

Kernel Density Estimation (KDE)

Kernel Density Estimation (KDE) is a non-parametric method to estimate the probability density function (PDF) of a random variable based on a finite set of data points. Unlike parametric methods, which assume that the underlying data follows a specific distribution (like normal, exponential, etc.), KDE makes no such assumptions and can model more complex data distributions.

$$ \hat{f}(x) = \frac{1}{n \cdot h} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) $$
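The estimator above translates almost line for line into code with a Gaussian kernel $K$ (the sample points and the bandwidth $h$ below are chosen by hand, purely for illustration):

```python
import numpy as np

def kde(x, samples, h):
    u = (x - samples[:, None]) / h                    # (x - x_i) / h for every pair
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)    # Gaussian kernel K(u)
    return K.sum(axis=0) / (len(samples) * h)         # (1 / (n * h)) * sum_i K(...)

samples = np.array([-1.0, 0.0, 0.2, 1.1])
grid = np.linspace(-3, 3, 601)
density = kde(grid, samples, h=0.5)

# A valid density should integrate to ~1 over the support
area = np.sum(density) * (grid[1] - grid[0])
print(round(area, 3))
```

The bandwidth $h$ controls the bias-variance trade-off: small $h$ gives a spiky estimate that follows individual points, large $h$ oversmooths.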
Read more

Understanding the Taylor Series and Its Applications in Machine Learning

The Taylor Series is a mathematical tool that approximates complex functions with polynomials, playing a crucial role in machine learning optimization. It enhances gradient descent by incorporating second-order information, leading to faster and more stable convergence. Additionally, it aids in linearizing non-linear models and informs regularization techniques. This post explores the significance of the Taylor Series in improving model training efficiency and understanding model behavior.

$$\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} x^{2n}$$
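Truncating the cosine series above after a handful of terms already gives a good approximation near 0; a quick sketch of the partial sums:

```python
import math

def cos_taylor(x, terms):
    # Partial sum of sum_{n=0}^{terms-1} (-1)^n * x^(2n) / (2n)!
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))

x = 0.5
for terms in (1, 2, 3, 5):
    print(terms, cos_taylor(x, terms))
print(math.cos(x))  # the partial sums converge toward this value
```

The same idea underlies second-order optimization: a local quadratic (two-term) expansion of the loss is what Newton-type methods minimize at each step.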
Read more

Simulated Annealing (SA)

Simulated Annealing (SA) is a probabilistic technique used for finding an approximate solution to an optimization problem. It is particularly useful for problems where the search space is large and complex, and other methods might get stuck in local optima.
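A minimal sketch of the idea (the objective function, cooling schedule, and step size are illustrative choices, not a tuned implementation): worse moves are sometimes accepted with probability $e^{-\Delta/T}$, which lets the search climb out of local optima while the temperature is high.

```python
import math
import random

def f(x):
    return x ** 2 + 10 * math.sin(x)     # many local minima; global one near x ≈ -1.3

random.seed(0)
x = 5.0                                  # start far from the optimum
T = 10.0                                 # initial temperature
while T > 1e-3:
    cand = x + random.uniform(-1, 1)     # propose a neighboring solution
    delta = f(cand) - f(x)
    # Always accept improvements; accept worse moves with prob exp(-delta / T)
    if delta < 0 or random.random() < math.exp(-delta / T):
        x = cand
    T *= 0.99                            # cool down gradually

print(round(x, 2), round(f(x), 2))
```

As $T \to 0$ the acceptance rule degenerates to plain hill descent, so the schedule controls the balance between exploration early on and exploitation at the end.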
Read more