Commit 5e3433d

docs: Improve internal resource linking
1 parent: f2708fc

File tree (5 files changed, +10 -10 lines):
  • content/blog
    • digital-neuromorphic-hardware-read-list
    • efficient-compression-event-based-data-neuromorphic-applications
    • spiking-neural-network-framework-benchmarking
    • strategic-vision-open-neuromorphic
    • truenorth-deep-dive-ibm-neuromorphic-chip-design


content/blog/digital-neuromorphic-hardware-read-list/index.md

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ show_author_bios: true
 Here's a list of articles and theses related to digital hardware designs for neuromorphic applications. I plan to update it regularly. To be redirected directly to the sources, click on the titles!

-If you are new to neuromorphic computing, I strongly suggest getting a grasp of how an SNN works from [this paper](https://arxiv.org/abs/2109.12894). Otherwise, it will be pretty difficult to understand the content of the papers listed here.
+If you are new to [neuromorphic computing](/neuromorphic-computing/), I strongly suggest getting a grasp of how an SNN works from [this paper](https://arxiv.org/abs/2109.12894). Otherwise, it will be pretty difficult to understand the content of the papers listed here.

 ## 2015

@@ -44,7 +44,7 @@ The Loihi chip employs **128 neuromorphic cores**, each of which consisting of *
 [*A 0.086-mm2 12.7-pJ/SOP 64k-Synapse 256-Neuron Online-Learning Digital Spiking Neuromorphic Processor in 28nm CMOS*](https://arxiv.org/abs/1804.07858), Charlotte Frenkel et al., 2019

-In this paper, a digital neuromorphic processor is presented. The Verilog is also [open source](https://github.com/ChFrenkel/ODIN)!
+In this paper, a digital neuromorphic processor is presented. The Verilog is also [open source](https://github.com/ChFrenkel/ODIN)! The processor is also known as [ODIN](/neuromorphic-computing/hardware/odin-frenkel/).

 The neuron states and the synapse weights are stored in two foundry SRAMs on chip. In order to emulate a crossbar, **time-multiplexing** is adopted: the synapse weights and neuron states are updated sequentially instead of in parallel. On the core, **256 neurons (4kB SRAM)** and **256x256 synapses (64kB SRAM)** are embedded. This allows very high synapse and neuron densities: **741k synapses per square millimeter** and **3k neurons per square millimeter**, using a **28nm CMOS FDSOI** process.
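
As a quick sanity check on those density figures, here is a back-of-envelope sketch. It assumes the 0.086 mm² area from the paper title is the relevant core area; the small gap to the quoted 741k presumably comes from the exact layout area used in the paper.

```python
core_area_mm2 = 0.086       # core area, from the paper title
num_neurons = 256           # neuron states held in 4 kB SRAM
num_synapses = 256 * 256    # 64k synapses held in 64 kB SRAM

print(num_synapses / core_area_mm2)  # ~762k synapses / mm^2 (post quotes 741k)
print(num_neurons / core_area_mm2)   # ~3.0k neurons / mm^2 (post quotes 3k)
```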

content/blog/efficient-compression-event-based-data-neuromorphic-applications/index.md

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ show_author_bios: true
 ---

 ## Datasets grow larger in size
-As neuromorphic algorithms tackle more complex tasks that are linked to bigger datasets, and event cameras mature to have higher spatial resolution, it is worth looking at how to encode that data efficiently when storing it on disk. To give you an example, Prophesee's latest automotive [object detection dataset](https://docs.prophesee.ai/stable/datasets.html) is some 3.5 TB in size for under 40h of recordings with a single camera.
+As [neuromorphic algorithms](/neuromorphic-computing/software/) tackle more complex tasks that are linked to bigger datasets, and event cameras mature to have higher spatial resolution, it is worth looking at how to encode that data efficiently when storing it on disk. To give you an example, Prophesee's latest automotive [object detection dataset](https://docs.prophesee.ai/stable/datasets.html) is some 3.5 TB in size for under 40h of recordings with a single camera.

 ## Event cameras record with fine-grained temporal resolution
 In contrast to conventional cameras, event cameras output changes in illumination, which is already a form of compression. But the output data rate is still a lot higher than that of conventional cameras, because of the microsecond temporal resolution that event cameras are able to record with. When streaming data, we get millions of tuples of microsecond timestamps, x/y coordinates and polarity indicators per second that look nothing like a frame but are a list of events:
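
The post follows this with an example stream. Purely as an illustration, here is a hedged sketch of such a list of events held in a structured NumPy array; the t/x/y/p field names follow common event-camera tooling conventions and are not prescribed by the post.

```python
import numpy as np

# Each event: microsecond timestamp, x/y pixel coordinates, polarity.
event_dtype = np.dtype([("t", "<i8"), ("x", "<i2"), ("y", "<i2"), ("p", "<i1")])

events = np.array(
    [(1, 320, 240, 1), (5, 321, 240, 0), (12, 319, 241, 1)],
    dtype=event_dtype,
)
print(events["t"])  # microsecond timestamps, nothing like a frame
```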
@@ -39,7 +39,7 @@ Ideally, we want to be close to the origin where we read fast and compression is
 The authors of this post have released [Expelliarmus](/neuromorphic-computing/software/data-tools/expelliarmus/) as a lightweight, well-tested, pip-installable framework that can read and write different formats easily. If you're working with dat, evt2 or evt3 formats, why not give it a try?

 ## Summary
-When training spiking neural networks on event-based data, we want to be able to feed new data to the network as fast as possible. But given the high data rate of an event camera, the amount of data quickly becomes an issue itself, especially for more complex tasks. So we want to choose a good trade-off between a dataset size that's manageable and reading speed. We hope that this article will help future groups that record large-scale datasets to pick a good encoding format.
+When training [spiking neural networks](/neuromorphic-computing/software/snn-frameworks/) on event-based data, we want to be able to feed new data to the network as fast as possible. But given the high data rate of an event camera, the amount of data quickly becomes an issue itself, especially for more complex tasks. So we want to choose a good trade-off between a dataset size that's manageable and reading speed. We hope that this article will help future groups that record large-scale datasets to pick a good encoding format.

 ## Comments
 The aedat4 file contains IMU events as well as change detection events, which increases the file size artificially in contrast to the other benchmarked formats.
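
A minimal sketch of reading one of those formats with Expelliarmus follows; the file path is a placeholder, and the Wizard API is taken from the project README, so check the current docs before relying on it.

```python
from expelliarmus import Wizard

# Decode a Prophesee evt3 recording into a structured NumPy array of
# (t, x, y, p) events; "recording.raw" is a placeholder path.
wizard = Wizard(encoding="evt3")
events = wizard.read("recording.raw")
print(events.shape, events.dtype)
```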

content/blog/spiking-neural-network-framework-benchmarking/index.md

Lines changed: 2 additions & 2 deletions
@@ -18,13 +18,13 @@ show_author_bios: true
 ## Introduction

-Open Neuromorphic's [list of SNN frameworks](https://github.com/open-neuromorphic/open-neuromorphic) currently counts 11 libraries, and those are only the most popular ones! As the sizes of spiking neural network models grow thanks to deep learning, optimization becomes more important for researchers and practitioners alike. Training SNNs is often slow, as the stateful networks are typically fed sequential inputs. Today's most popular training method is some form of backpropagation through time, whose time complexity scales with the number of time steps. We benchmark libraries that all take slightly different approaches to extending deep learning frameworks for gradient-based optimization of SNNs. We focus on the total time it takes to pass data forward and backward through the network, as well as the memory required to do so. However, there are obviously other, non-tangible qualities of frameworks, such as extensibility, quality of documentation, ease of install or support for neuromorphic hardware, that we're not going to try to capture here. In our benchmarks, we use a single fully-connected (linear) layer and a leaky integrate-and-fire (LIF) layer. The input data has a batch size of 16, 500 time steps and n neurons.
+Open Neuromorphic's [list of SNN frameworks](https://github.com/open-neuromorphic/open-neuromorphic) currently counts 11 libraries, and those are only the most popular ones! As the sizes of [spiking neural network models](/neuromorphic-computing/software/snn-frameworks/) grow thanks to deep learning, optimization becomes more important for researchers and practitioners alike. Training SNNs is often slow, as the stateful networks are typically fed sequential inputs. Today's most popular training method is some form of backpropagation through time, whose time complexity scales with the number of time steps. We benchmark libraries that all take slightly different approaches to extending deep learning frameworks for gradient-based optimization of SNNs. We focus on the total time it takes to pass data forward and backward through the network, as well as the memory required to do so. However, there are obviously other, non-tangible qualities of frameworks, such as extensibility, quality of documentation, ease of install or support for neuromorphic hardware, that we're not going to try to capture here. In our benchmarks, we use a single fully-connected (linear) layer and a leaky integrate-and-fire (LIF) layer. The input data has a batch size of 16, 500 time steps and n neurons.
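
To make the measured setup concrete, here is a sketch of that linear-plus-LIF network in snnTorch, one of the benchmarked libraries. It is illustrative only: the real benchmark scripts are not reproduced in this diff, and the layer width here is reduced from the 16k used in the figures.

```python
import torch
import torch.nn as nn
import snntorch as snn

batch, timesteps, n = 16, 500, 512  # the post benchmarks up to n = 16k

fc = nn.Linear(n, n)
lif = snn.Leaky(beta=0.9)  # leaky integrate-and-fire layer

x = torch.rand(timesteps, batch, n)
mem = lif.init_leaky()
spikes = []
for t in range(timesteps):        # sequential time steps make training slow
    spk, mem = lif(fc(x[t]), mem)
    spikes.append(spk)

# Backpropagation through time: cost scales with the number of time steps.
torch.stack(spikes).sum().backward()
```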
 ## Benchmark Results

 {{< chart data="framework-benchmarking-16k" caption="Comparison of time taken for forward and backward passes in different frameworks, for 16k neurons." mobile="framework-benchmarking-16k.png">}}

-The first figure shows runtime results for a 16k neuron network. The SNN libraries evaluated can be broken into three categories: 1. frameworks with tailored/custom CUDA kernels, 2. frameworks that purely leverage PyTorch functionality, and 3. a library that uses JAX exclusively for acceleration. For the custom CUDA libraries, [SpikingJelly](https://github.com/fangwei123456/spikingjelly) with a CuPy backend clocks in at just 0.26s for the forward and backward calls combined. The libraries that use an implementation of [SLAYER](https://proceedings.neurips.cc/paper_files/paper/2018/hash/82f2b308c3b01637c607ce05f52a2fed-Abstract.html) ([Lava DL](https://github.com/lava-nc/lava-dl)) or [EXODUS](https://www.frontiersin.org/articles/10.3389/fnins.2023.1110444/full) ([Sinabs EXODUS](https://github.com/synsense/sinabs-exodus) / [Rockpool EXODUS](https://rockpool.ai/reference/_autosummary/nn.modules.LIFExodus.html?)) benefit from custom CUDA code and vectorization across the time dimension in both forward and backward passes, and come within 1.5-2x the latency. It is noteworthy that such custom implementations exist for specific neuron models (such as the LIF under test), but not for arbitrary neuron models. On top of that, custom CUDA/CuPy backend implementations need to be compiled, and then it is up to the maintainer to test them on different systems. Networks that are implemented in SLAYER, EXODUS or SpikingJelly with a CuPy backend cannot be executed on a CPU (unless converted).
+The first figure shows runtime results for a 16k neuron network. The SNN libraries evaluated can be broken into three categories: 1. frameworks with tailored/custom CUDA kernels, 2. frameworks that purely leverage PyTorch functionality, and 3. a library that uses JAX exclusively for acceleration. For the custom CUDA libraries, [SpikingJelly](https://github.com/fangwei123456/spikingjelly) with a CuPy backend clocks in at just 0.26s for the forward and backward calls combined. The libraries that use an implementation of [SLAYER](https://proceedings.neurips.cc/paper_files/paper/2018/hash/82f2b308c3b01637c607ce05f52a2fed-Abstract.html) ([Lava DL](https://github.com/lava-nc/lava-dl)) or [EXODUS](https://www.frontiersin.org/articles/10.3389/fnins.2023.1110444/full) ([Sinabs EXODUS](/neuromorphic-computing/software/snn-frameworks/sinabs/) / [Rockpool EXODUS](/neuromorphic-computing/software/snn-frameworks/rockpool/)) benefit from custom CUDA code and vectorization across the time dimension in both forward and backward passes, and come within 1.5-2x the latency. It is noteworthy that such custom implementations exist for specific neuron models (such as the LIF under test), but not for arbitrary neuron models. On top of that, custom CUDA/CuPy backend implementations need to be compiled, and then it is up to the maintainer to test them on different systems. Networks that are implemented in SLAYER, EXODUS or SpikingJelly with a CuPy backend cannot be executed on a CPU (unless converted).

 In contrast, frameworks such as [snnTorch](/neuromorphic-computing/software/snn-frameworks/snntorch/), [Norse](/neuromorphic-computing/software/snn-frameworks/norse/), [Sinabs](/neuromorphic-computing/software/snn-frameworks/sinabs/) or [Rockpool](/neuromorphic-computing/software/snn-frameworks/rockpool/) are very flexible when it comes to defining custom neuron models.
 For some libraries, that flexibility comes at a cost of slower computation.
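
For the SpikingJelly numbers above, the CuPy backend is selected per neuron layer. A sketch follows, assuming the activation_based API and a CUDA-capable machine; backend='torch' is the pure-PyTorch fallback that also runs on CPU.

```python
import torch
from spikingjelly.activation_based import neuron, functional

# Multi-step LIF that dispatches to a fused CuPy/CUDA kernel, vectorized
# across the time dimension.
lif = neuron.LIFNode(step_mode='m', backend='cupy').cuda()

x = torch.rand(500, 16, 512, device='cuda')  # (time, batch, neurons)
out = lif(x)
functional.reset_net(lif)  # clear membrane state between sequences
```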

content/blog/strategic-vision-open-neuromorphic/index.md

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ Join the discussion [on Discord](https://discord.gg/hUygPUdD8E), star us [on Git
 Open Neuromorphic is almost 4 years old.

-We set out to make the field of neuromorphic engineering more transparent, open, and accessible to newcomers. It's been a tremendous success: Open Neuromorphic is the biggest online neuromorphic community *in the world*, our videos are seen by thousands of researchers, our material is reaching even further, and the 2000+ academics and students on our Discord server are actively and happily collaborating to further the scientific vision of neuromorphic engineering.
+We set out to make the field of [neuromorphic engineering](/neuromorphic-computing/) more transparent, open, and accessible to newcomers. It's been a tremendous success: Open Neuromorphic is the biggest online neuromorphic community *in the world*, our videos are seen by thousands of researchers, our material is reaching even further, and the 2000+ academics and students on our Discord server are actively and happily collaborating to further the scientific vision of neuromorphic engineering.

 But, let's face it: we still have a long way to go.
content/blog/truenorth-deep-dive-ibm-neuromorphic-chip-design/index.md

Lines changed: 3 additions & 3 deletions
@@ -14,13 +14,13 @@ show_author_bios: true
 ## Why do we want to emulate the brain?

-If you have ever read an article on neuromorphic computing, you might have noticed that the introduction of each of these makes the same statement: "The brain is much more powerful than any AI machine when it comes to cognitive tasks, but it runs on a **10W** power budget!". This is absolutely true: neurons in the brain communicate with each other by means of **spikes**, which are short voltage pulses that propagate from one neuron to the other. The average spiking activity is estimated to be around **10Hz** (i.e. a spike every 100ms). This yields **very low processing power consumption**, since the activity in the brain turns out to be **really sparse** (at least, this is the hypothesis).
+If you have ever read an article on [neuromorphic computing](/neuromorphic-computing/), you might have noticed that the introduction of each of these makes the same statement: "The brain is much more powerful than any AI machine when it comes to cognitive tasks, but it runs on a **10W** power budget!". This is absolutely true: neurons in the brain communicate with each other by means of **spikes**, which are short voltage pulses that propagate from one neuron to the other. The average spiking activity is estimated to be around **10Hz** (i.e. a spike every 100ms). This yields **very low processing power consumption**, since the activity in the brain turns out to be **really sparse** (at least, this is the hypothesis).

 How can the brain do all this? There are several reasons (or hypotheses, I should say); a back-of-envelope check follows the list:
 * the **3D connectivity** among neurons. While in today's chips we can place connections among logic gates and circuits only in 2D space, the brain has the whole 3D space at its disposal; this allows the mammalian brain to reach a fanout on the order of **10 thousand connections** per neuron.
 * **extremely low power operation**. Through thousands of years of evolution, the most power efficient "brain implementation" has won, since the ones that consume less energy to live are the ones that turn out to survive when there is no food (not entirely correct, but I hope that true scientists won't kill me). The power density of the brain is estimated to be **10mW per square centimeter**, while a modern digital processor easily reaches **100W per square centimeter**.
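
The promised back-of-envelope check on those figures, as a sketch: the ~86 billion neuron count is a common estimate layered on top of the numbers above, not something the post states.

```python
power_w = 10.0      # whole-brain power budget quoted above
neurons = 86e9      # ~86 billion neurons (assumed, common estimate)
rate_hz = 10.0      # average spiking activity, from the post
fanout = 10_000     # ~10k connections per neuron, from the post

events_per_s = neurons * rate_hz * fanout  # ~8.6e15 synaptic events/s
print(power_w / events_per_s)              # ~1.2e-15 J, i.e. ~1 fJ per event
```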

-Hence, IBM decided to try to emulate the brain with **TrueNorth**, a **4096-core** chip packing **1 million neurons** and **256 million synapses**. Let's dive into its design!
+Hence, IBM decided to try to emulate the brain with **[TrueNorth](/neuromorphic-computing/hardware/truenorth-ibm/)**, a **4096-core** chip packing **1 million neurons** and **256 million synapses**. Let's dive into its design!

 ## Introduction

@@ -42,7 +42,7 @@ zoomable="false"
 In general, in a GALS architecture, there is an array of processing elements (PEs), each of which is synchronised by its own local clock. The local clocks of the PEs can differ, since each PE may be running at a different speed. When two different **clock domains** have to be interfaced, the communication between them is effectively asynchronous: **handshake** protocols have to be implemented between them in order to guarantee proper global operation.

-In TrueNorth, as in [SpiNNaker](http://apt.cs.manchester.ac.uk/projects/SpiNNaker/SpiNNchip/), there is no global clock: the PEs, which are **neurosynaptic cores**, are interconnected through a **completely asynchronous network**. In this way, the chip's operation is event-driven, since the network gets activated only when there are spikes (and other kinds of events) to be transmitted.
+In TrueNorth, as in [SpiNNaker](/neuromorphic-computing/hardware/spinnaker-2-university-of-dresden/), there is no global clock: the PEs, which are **neurosynaptic cores**, are interconnected through a **completely asynchronous network**. In this way, the chip's operation is event-driven, since the network gets activated only when there are spikes (and other kinds of events) to be transmitted.

 ### Low power operation
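
To make the handshake idea concrete, here is a sketch of a four-phase req/ack handshake between two otherwise unsynchronised parties; this is illustrative Python using threads as stand-ins for clock domains, not TrueNorth's actual asynchronous circuitry.

```python
import threading

# Four-phase handshake: sender raises req with data valid, receiver raises
# ack once it has latched the data, sender drops req, receiver drops ack.
req, ack = threading.Event(), threading.Event()
mailbox = []

def send(spike):
    mailbox.append(spike)   # data valid before req goes high
    req.set()               # phase 1: raise req
    ack.wait()              # phase 2: wait for ack
    req.clear()             # phase 3: drop req
    while ack.is_set():     # phase 4: wait for ack to drop
        pass

def receive():
    req.wait()              # wait for req, then latch the data
    spike = mailbox.pop()
    ack.set()               # acknowledge receipt
    while req.is_set():     # wait for req to drop
        pass
    ack.clear()
    return spike

t = threading.Thread(target=send, args=("spike@(3,7)",))
t.start()
print(receive())
t.join()
```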
