content/blog/digital-neuromorphic-hardware-read-list/index.md (2 additions, 2 deletions)
@@ -12,7 +12,7 @@ show_author_bios: true
Here's a list of articles and theses related to digital hardware designs for neuromorphic applications. I plan to update it regularly. To be redirected directly to the sources, click on the titles!
- If you are new to neuromorphic computing, I strongly suggest getting a grasp of how an SNN works from [this paper](https://arxiv.org/abs/2109.12894). Otherwise, it will be pretty difficult to understand the content of the papers listed here.
+ If you are new to [neuromorphic computing](/neuromorphic-computing/), I strongly suggest getting a grasp of how an SNN works from [this paper](https://arxiv.org/abs/2109.12894). Otherwise, it will be pretty difficult to understand the content of the papers listed here.
## 2015
@@ -44,7 +44,7 @@ The Loihi chip employs **128 neuromorphic cores**, each of which consisting of *
[*A 0.086-mm2 12.7-pJ/SOP 64k-Synapse 256-Neuron Online-Learning Digital Spiking Neuromorphic Processor in 28nm CMOS*](https://arxiv.org/abs/1804.07858), Charlotte Frenkel et al., 2019
- In this paper, a digital neuromorphic processor is presented. The Verilog is also [open source](https://github.com/ChFrenkel/ODIN)!
+ In this paper, a digital neuromorphic processor is presented. The Verilog is also [open source](https://github.com/ChFrenkel/ODIN)! The processor is also known as [ODIN](/neuromorphic-computing/hardware/odin-frenkel/).
The neuron states and synapse weights are stored in two foundry SRAMs on chip. To emulate a crossbar, **time-multiplexing** is adopted: the synapse weights and neuron states are updated sequentially instead of in parallel, as sketched below. On the core, **256 neurons (4kB SRAM)** and **256x256 synapses (64kB SRAM)** are embedded. This yields very high synapse and neuron densities: **741k synapses per square millimeter** and **3k neurons per square millimeter**, using a **28nm CMOS FDSOI** process.
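To make the time-multiplexing idea concrete, here is a minimal Python sketch of our own. The array sizes follow the paper, but the update rule, threshold and leak values are invented for illustration and are not ODIN's actual neuron model:

```python
import numpy as np

# Toy model of ODIN-style time-multiplexing: one shared update circuit
# walks sequentially over the neuron-state and synapse SRAMs instead of
# updating a physical 256x256 crossbar in parallel.
rng = np.random.default_rng(0)
N = 256
weights = rng.integers(-8, 8, size=(N, N), dtype=np.int8)  # 64 kB synapse SRAM
v = np.zeros(N, dtype=np.int16)                            # 4 kB neuron-state SRAM
THRESHOLD, LEAK = 64, 1  # made-up values, not ODIN's parameters

def handle_input_spike(pre: int) -> list[int]:
    """Process one presynaptic event, time-multiplexed over all neurons."""
    fired = []
    for post in range(N):                  # sequential, not parallel
        v[post] += weights[pre, post]      # integrate the synaptic weight
        v[post] = max(v[post] - LEAK, 0)   # leak toward rest
        if v[post] >= THRESHOLD:           # fire and reset
            v[post] = 0
            fired.append(post)
    return fired

for spike in [3, 17, 3, 200]:              # a made-up input spike train
    print(handle_input_spike(spike))
```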
content/blog/efficient-compression-event-based-data-neuromorphic-applications/index.md (2 additions, 2 deletions)
@@ -14,7 +14,7 @@ show_author_bios: true
---
## Datasets grow larger in size
- As neuromorphic algorithms tackle more complex tasks that are linked to bigger datasets, and event cameras mature to have higher spatial resolution, it is worth looking at how to encode that data efficiently when storing it on disk. To give you an example, Prophesee's latest automotive [object detection dataset](https://docs.prophesee.ai/stable/datasets.html) is some 3.5 TB in size for under 40h of recordings with a single camera.
+ As [neuromorphic algorithms](/neuromorphic-computing/software/) tackle more complex tasks that are linked to bigger datasets, and event cameras mature to have higher spatial resolution, it is worth looking at how to encode that data efficiently when storing it on disk. To give you an example, Prophesee's latest automotive [object detection dataset](https://docs.prophesee.ai/stable/datasets.html) is some 3.5 TB in size for under 40h of recordings with a single camera.
## Event cameras record with fine-grained temporal resolution
In contrast to conventional cameras, event cameras output changes in illumination, which is already a form of compression. But the output data rate is still a lot higher than that of conventional cameras, because of the microsecond temporal resolution that event cameras are able to record with. When streaming data, we get millions of tuples of microsecond timestamps, x/y coordinates and polarity indicators per second that look nothing like a frame but are a list of events:
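For illustration, a handful of such events might look like this (all values invented):

```python
# (timestamp in microseconds, x, y, polarity) — all values invented
events = [
    (1_000_001, 120,  64, 1),   # brightness increased at pixel (120, 64)
    (1_000_003, 121,  64, 1),
    (1_000_007,  43, 210, 0),   # brightness decreased at pixel (43, 210)
    (1_000_012, 121,  65, 1),
]
```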
@@ -39,7 +39,7 @@ Ideally, we want to be close to the origin where we read fast and compression is
The authors of this post have released [Expelliarmus](/neuromorphic-computing/software/data-tools/expelliarmus/) as a lightweight, well-tested, pip-installable framework that can read and write different formats easily. If you're working with dat, evt2 or evt3 formats, why not give it a try?
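As a quick sketch of how that might look, based on our reading of the Expelliarmus docs (file names are made up and the exact call signatures should be treated as assumptions to verify against the project documentation):

```python
from expelliarmus import Wizard

# Decode a Prophesee evt3 recording into a structured NumPy array
# with (t, x, y, p) fields. File names are hypothetical.
wizard = Wizard(encoding="evt3")
events = wizard.read("recording_evt3.raw")
print(events.shape, events.dtype)

# Re-encode the same events to evt2 (assumed API, check the docs).
Wizard(encoding="evt2").save(fpath="recording_evt2.raw", arr=events)
```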
## Summary
- When training spiking neural networks on event-based data, we want to be able to feed new data to the network as fast as possible. But given the high data rate of an event camera, the amount of data quickly becomes an issue itself, especially for more complex tasks. So we want to choose a good trade-off between a dataset size that's manageable and reading speed. We hope that this article will help future groups that record large-scale datasets to pick a good encoding format.
+ When training [spiking neural networks](/neuromorphic-computing/software/snn-frameworks/) on event-based data, we want to be able to feed new data to the network as fast as possible. But given the high data rate of an event camera, the amount of data quickly becomes an issue itself, especially for more complex tasks. So we want to choose a good trade-off between a dataset size that's manageable and reading speed. We hope that this article will help future groups that record large-scale datasets to pick a good encoding format.
## Comments
The aedat4 file contains IMU events as well as change detection events, which artificially increases its file size in comparison with the other benchmarked formats.
content/blog/spiking-neural-network-framework-benchmarking/index.md (2 additions, 2 deletions)
@@ -18,13 +18,13 @@ show_author_bios: true
## Introduction
- Open Neuromorphic's [list of SNN frameworks](https://github.com/open-neuromorphic/open-neuromorphic) currently counts 11 libraries, and those are only the most popular ones! As the sizes of spiking neural network models grow thanks to deep learning, optimization becomes more important for researchers and practitioners alike. Training SNNs is often slow, as the stateful networks are typically fed sequential inputs. The most popular training method today is some form of backpropagation through time, whose time complexity scales with the number of time steps. We benchmark libraries that all take slightly different approaches to extending deep learning frameworks for gradient-based optimization of SNNs. We focus on the total time it takes to pass data forward and backward through the network, as well as the memory required to do so. However, there are obviously other, non-tangible qualities of frameworks, such as extensibility, quality of documentation, ease of install or support for neuromorphic hardware, that we're not going to try to capture here. In our benchmarks, we use a single fully-connected (linear) layer and a leaky integrate-and-fire (LIF) layer. The input data has a batch size of 16, 500 time steps and n neurons.
+ Open Neuromorphic's [list of SNN frameworks](https://github.com/open-neuromorphic/open-neuromorphic) currently counts 11 libraries, and those are only the most popular ones! As the sizes of [spiking neural network models](/neuromorphic-computing/software/snn-frameworks/) grow thanks to deep learning, optimization becomes more important for researchers and practitioners alike. Training SNNs is often slow, as the stateful networks are typically fed sequential inputs. The most popular training method today is some form of backpropagation through time, whose time complexity scales with the number of time steps. We benchmark libraries that all take slightly different approaches to extending deep learning frameworks for gradient-based optimization of SNNs. We focus on the total time it takes to pass data forward and backward through the network, as well as the memory required to do so. However, there are obviously other, non-tangible qualities of frameworks, such as extensibility, quality of documentation, ease of install or support for neuromorphic hardware, that we're not going to try to capture here. In our benchmarks, we use a single fully-connected (linear) layer and a leaky integrate-and-fire (LIF) layer. The input data has a batch size of 16, 500 time steps and n neurons.
## Benchmark Results
{{< chart data="framework-benchmarking-16k" caption="Comparison of time taken for forward and backward passes in different frameworks, for 16k neurons." mobile="framework-benchmarking-16k.png">}}
- The first figure shows runtime results for a 16k neuron network. The SNN libraries evaluated can be broken into three categories: 1. frameworks with tailored/custom CUDA kernels, 2. frameworks that purely leverage PyTorch functionality, and 3. a library that uses JAX exclusively for acceleration. For the custom CUDA libraries, [SpikingJelly](https://github.com/fangwei123456/spikingjelly) with a CuPy backend clocks in at just 0.26s for forward and backward calls combined. The libraries that use an implementation of [SLAYER](https://proceedings.neurips.cc/paper_files/paper/2018/hash/82f2b308c3b01637c607ce05f52a2fed-Abstract.html) ([Lava DL](https://github.com/lava-nc/lava-dl)) or [EXODUS](https://www.frontiersin.org/articles/10.3389/fnins.2023.1110444/full) ([Sinabs EXODUS](https://github.com/synsense/sinabs-exodus) / [Rockpool EXODUS](https://rockpool.ai/reference/_autosummary/nn.modules.LIFExodus.html?)) benefit from custom CUDA code and vectorization across the time dimension in both forward and backward passes and come within 1.5-2x the latency. It is noteworthy that such custom implementations exist for specific neuron models (such as the LIF under test), but not for arbitrary neuron models. On top of that, custom CUDA/CuPy backend implementations need to be compiled, and it is then up to the maintainer to test them on different systems. Networks that are implemented in SLAYER, EXODUS or SpikingJelly with a CuPy backend cannot be executed on a CPU (unless converted).
+ The first figure shows runtime results for a 16k neuron network. The SNN libraries evaluated can be broken into three categories: 1. frameworks with tailored/custom CUDA kernels, 2. frameworks that purely leverage PyTorch functionality, and 3. a library that uses JAX exclusively for acceleration. For the custom CUDA libraries, [SpikingJelly](https://github.com/fangwei123456/spikingjelly) with a CuPy backend clocks in at just 0.26s for forward and backward calls combined. The libraries that use an implementation of [SLAYER](https://proceedings.neurips.cc/paper_files/paper/2018/hash/82f2b308c3b01637c607ce05f52a2fed-Abstract.html) ([Lava DL](https://github.com/lava-nc/lava-dl)) or [EXODUS](https://www.frontiersin.org/articles/10.3389/fnins.2023.1110444/full) ([Sinabs EXODUS](/neuromorphic-computing/software/snn-frameworks/sinabs/) / [Rockpool EXODUS](/neuromorphic-computing/software/snn-frameworks/rockpool/)) benefit from custom CUDA code and vectorization across the time dimension in both forward and backward passes and come within 1.5-2x the latency. It is noteworthy that such custom implementations exist for specific neuron models (such as the LIF under test), but not for arbitrary neuron models. On top of that, custom CUDA/CuPy backend implementations need to be compiled, and it is then up to the maintainer to test them on different systems. Networks that are implemented in SLAYER, EXODUS or SpikingJelly with a CuPy backend cannot be executed on a CPU (unless converted).
In contrast, frameworks such as [snnTorch](/neuromorphic-computing/software/snn-frameworks/snntorch/), [Norse](/neuromorphic-computing/software/snn-frameworks/norse/), [Sinabs](/neuromorphic-computing/software/snn-frameworks/sinabs/) or [Rockpool](/neuromorphic-computing/software/snn-frameworks/rockpool/) are very flexible when it comes to defining custom neuron models.
For some libraries, that flexibility comes at the cost of slower computation.
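For contrast with the step-by-step snnTorch loop above, here is a toy sketch of the CuPy-backend path, based on our reading of SpikingJelly's documented activation_based API (not the authors' benchmark code; treat the details as assumptions):

```python
import torch
from spikingjelly.activation_based import neuron, layer, functional

# A linear layer feeding a LIF layer, vectorized over the time dimension
# in multi-step ('m') mode. backend='cupy' selects the custom kernels
# discussed above and requires a CUDA GPU.
n = 4096
net = torch.nn.Sequential(
    layer.Linear(n, n),
    neuron.LIFNode(backend='cupy'),
).cuda()
functional.set_step_mode(net, step_mode='m')   # vectorize over time

x = torch.rand(500, 16, n, device='cuda')      # [time, batch, neurons]
out = net(x)
out.sum().backward()                           # BPTT across all 500 steps
functional.reset_net(net)                      # clear membrane state
```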
content/blog/strategic-vision-open-neuromorphic/index.md (1 addition, 1 deletion)
@@ -28,7 +28,7 @@ Join the discussion [on Discord](https://discord.gg/hUygPUdD8E), star us [on Git
Open Neuromorphic is almost 4 years old.
- We set out to make the field of neuromorphic engineering more transparent, open, and accessible to newcomers. It's been a tremendous success: Open Neuromorphic is the biggest online neuromorphic community *in the world*, our videos are seen by thousands of researchers, our material is reaching even further, and the 2000+ academics and students on our Discord server are actively and happily collaborating to further the scientific vision of neuromorphic engineering.
+ We set out to make the field of [neuromorphic engineering](/neuromorphic-computing/) more transparent, open, and accessible to newcomers. It's been a tremendous success: Open Neuromorphic is the biggest online neuromorphic community *in the world*, our videos are seen by thousands of researchers, our material is reaching even further, and the 2000+ academics and students on our Discord server are actively and happily collaborating to further the scientific vision of neuromorphic engineering.
But, let's face it: we still have a long way to go.
content/blog/truenorth-deep-dive-ibm-neuromorphic-chip-design/index.md (3 additions, 3 deletions)
@@ -14,13 +14,13 @@ show_author_bios: true
## Why do we want to emulate the brain?
- If you have ever read an article on neuromorphic computing, you might have noticed that in the introduction of each of them there is the same statement: "The brain is much more powerful than any AI machine when it comes to cognitive tasks, but it runs on a **10W** power budget!". This is absolutely true: neurons in the brain communicate with each other by means of **spikes**, which are short voltage pulses that propagate from one neuron to the other. The average spiking activity is estimated to be around **10Hz** (i.e. a spike every 100ms). This yields **very low processing power consumption**, since activity in the brain turns out to be **really sparse** (at least, this is the hypothesis).
+ If you have ever read an article on [neuromorphic computing](/neuromorphic-computing/), you might have noticed that in the introduction of each of them there is the same statement: "The brain is much more powerful than any AI machine when it comes to cognitive tasks, but it runs on a **10W** power budget!". This is absolutely true: neurons in the brain communicate with each other by means of **spikes**, which are short voltage pulses that propagate from one neuron to the other. The average spiking activity is estimated to be around **10Hz** (i.e. a spike every 100ms). This yields **very low processing power consumption**, since activity in the brain turns out to be **really sparse** (at least, this is the hypothesis).
How can the brain do all this? There are several reasons (or hypotheses, I should say):
* the **3D connectivity** among neurons. While in today's chips we can place connections among logic gates and circuits only in 2D space, in the brain we have the whole 3D space at our disposal; this allows the mammalian brain to reach a fanout on the order of **10 thousand connections** per neuron.
* **extremely low-power operation**. Through thousands of years of evolution, the most power-efficient "brain implementation" has won, since the ones that consume less energy to live are the ones that turn out to survive when there is no food (not entirely correct, but I hope that true scientists won't kill me). The power density of the brain is estimated to be **10mW per square centimeter**, while in a modern digital processor we easily reach **100W per square centimeter**.
- Hence, IBM decided to try to emulate the brain with **TrueNorth**, a **4096-core** chip packing **1 million neurons** and **256 million synapses**. Let's dive into its design!
+ Hence, IBM decided to try to emulate the brain with **[TrueNorth](/neuromorphic-computing/hardware/truenorth-ibm/)**, a **4096-core** chip packing **1 million neurons** and **256 million synapses**. Let's dive into its design!
## Introduction
@@ -42,7 +42,7 @@ zoomable="false"
In general, in a GALS architecture, there is an array of processing elements (PEs), each synchronised by its own local clock. The local clocks of the PEs can differ, since each PE may be running at a different speed. When two different **clock domains** have to be interfaced, the communication between them is effectively asynchronous: **handshake** protocols have to be implemented between them in order to guarantee proper global operation.
- In TrueNorth, as in [SpiNNaker](http://apt.cs.manchester.ac.uk/projects/SpiNNaker/SpiNNchip/), there is no global clock: the PEs, which are **neurosynaptic cores**, are interconnected through a **completely asynchronous network**. In this way, the chip's operation is event-driven, since the network gets activated only when there are spikes (and other kinds of events) to be transmitted.
+ In TrueNorth, as in [SpiNNaker](/neuromorphic-computing/hardware/spinnaker-2-university-of-dresden/), there is no global clock: the PEs, which are **neurosynaptic cores**, are interconnected through a **completely asynchronous network**. In this way, the chip's operation is event-driven, since the network gets activated only when there are spikes (and other kinds of events) to be transmitted.
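To make the handshake idea from the GALS paragraph concrete, here is a toy Python sketch of our own. TrueNorth's real circuits are asynchronous logic, not software; this only mimics the request/acknowledge ordering between two clock domains, with made-up spike data:

```python
import threading

data = None
req = threading.Semaphore(0)   # "request" wire: data is valid
ack = threading.Semaphore(0)   # "acknowledge" wire: data was latched

def sender(spikes):
    global data
    for s in spikes:
        data = s               # drive the data lines
        req.release()          # raise request
        ack.acquire()          # wait until the receiver has latched it

def receiver(n, out):
    for _ in range(n):
        req.acquire()          # wait for a request
        out.append(data)       # latch the data
        ack.release()          # acknowledge: sender may continue

spikes = [(1, 12), (3, 40), (7, 5)]    # (timestamp, neuron id), made up
received = []
t = threading.Thread(target=receiver, args=(len(spikes), received))
t.start(); sender(spikes); t.join()
print(received)
```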