->The attention mechanism is a type of neural network layer that allows a model to selectively focus on different parts of the input when making predictions.
->It is commonly used in natural language processing tasks such as machine translation, text classification, and question answering.
Attention: described as alertness or the ability to selectively engage with one's surroundings
-when a person is shown multiple images, eye movement reveals which part attracts the most attention
->Neurons at the earliest stages are tuned to simple visual attributes such as intensity contrast, colour opponency, orientation, direction and velocity of motion, or stereo disparity at several spatial scales. Neuronal tuning becomes increasingly more specialized with the progression from low-level to high-level visual areas, such that higher-level visual areas include neurons that respond only to corners or junctions, shape-from-shading cues or views of specific real-world objects.
#Relation between human memory and attention:
-the brain preferentially stores in memory the information that the human subject pays most attention to.
-This model allows the network to focus on different aspects of a complex input individually until the entire dataset is categorized.
-The goal is to break down complex tasks into smaller areas of attention that are processed sequentially.
-they are used to overcome the limitations of traditional machine translation models
-In translation systems, instead of compressing the entire input into a single fixed-length vector (the bottleneck of traditional encoder-decoder models), attention lets the decoder weight all encoder states at each step, capturing the overall meaning of the input and resulting in more accurate translations (see the sketch below).
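To make this concrete, here is a minimal NumPy sketch of one attention step in a translation decoder. The function name, dot-product scoring, and shapes are illustrative assumptions, not tied to any particular library or paper; the point is that the decoder forms a context vector as a weighted average over all encoder hidden states rather than relying on one fixed summary vector.

```python
# Illustrative sketch: one attention step in a translation decoder.
# Instead of a single fixed summary vector, the decoder weights every
# encoder hidden state by its relevance to the current decoder state.
import numpy as np

def attention_context(decoder_state, encoder_states):
    """decoder_state: (d,), encoder_states: (src_len, d)."""
    scores = encoder_states @ decoder_state          # relevance score per source position
    scores = scores - scores.max()                   # numerical stability for the softmax
    weights = np.exp(scores) / np.exp(scores).sum()  # normalize scores into a distribution
    return weights @ encoder_states, weights         # context vector, alignment weights

# Toy usage: a source sentence of 5 tokens, hidden size 16.
rng = np.random.default_rng(1)
encoder_states = rng.normal(size=(5, 16))   # one hidden state per source token
decoder_state = rng.normal(size=16)
context, weights = attention_context(decoder_state, encoder_states)
print(weights.round(3))  # which source positions the decoder "looks at" this step
```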
#Working:
->attention models make use of a function that maps a query and a set of key-value pairs to generate an output.
->These elements, including the query, keys, values, and final output, are all represented as vectors.
->The output is calculated by taking a weighted sum of the values, with the weights determined by a compatibility function that evaluates the similarity between the query and the corresponding key (see the sketch after this list).
->the model focuses intensely on a specific point in an image, providing a “high-resolution” understanding, while perceiving the surrounding areas with less detail, akin to “low-resolution.” As the network gains a better understanding of the scene, it adjusts the focal point accordingly.
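The query/key/value computation described above can be sketched in a few lines of NumPy. This is a minimal sketch of scaled dot-product attention, assuming dot-product similarity as the compatibility function and a softmax to turn scores into weights; the function names and shapes are illustrative.

```python
# Minimal sketch of the query/key/value weighted sum described above.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: turns compatibility scores into weights that sum to 1.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)."""
    d_k = Q.shape[-1]
    # Compatibility function: similarity between each query and each key.
    scores = Q @ K.T / np.sqrt(d_k)        # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)     # attention weights per query
    # Output: weighted sum of the values.
    return weights @ V, weights

# Toy usage: one query attending over three key-value pairs.
rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 8))
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(3), output.shape)  # weights over the 3 keys, output shape (1, 8)
```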