Read Zarr Connectome Files #8717


Merged: 141 commits, merged on Jul 8, 2025

Commits (141)
4dd9409
WIP: Read zarr agglomerate files
fm3 May 19, 2025
380bd69
zarr group path
fm3 May 19, 2025
7fb643f
test reading from zarr array
fm3 May 19, 2025
f987ebf
axisOrder: make y optional
fm3 May 19, 2025
935571a
Merge branch 'master' into agglomerates-zarr
fm3 May 22, 2025
7c1cc8b
undo attempt to make axisOrder.y optional
fm3 May 22, 2025
5af14fc
read multi array, ignoring underlying storage and axis order
fm3 May 22, 2025
f055c1e
apply agglomerate
fm3 May 27, 2025
13ff0e3
offset can be long; pass tokencontext
fm3 May 27, 2025
d853389
WIP read agglomerate skeleton
fm3 May 27, 2025
56ce08b
fix reading agglomerate skeleton
fm3 May 27, 2025
2855d2b
Change DatasetArray shape from Int to Long. Implement reading largest…
fm3 May 27, 2025
4ed5483
remove unused agglomeratesForAllSegments
fm3 May 27, 2025
7f8f662
Merge branch 'master' into agglomerates-zarr
fm3 May 27, 2025
291aab5
add shortcut for shape.product==0; implement segmentIdsForAgglomerateId
fm3 May 27, 2025
0439bce
remove unused test
fm3 May 27, 2025
11027d7
implement positionForSegmentId; agglomerateIdsForSegmentIds
fm3 May 28, 2025
0c5d647
select mapping by request
fm3 May 28, 2025
90d97cd
shortcut for single-dimension shape+offset
fm3 May 28, 2025
8551f99
handle uint32 agglomerate_to_segments arrays
fm3 May 28, 2025
d183c99
useZarr=false to test ci
fm3 May 28, 2025
cd04466
change chunkIndices back to list
fm3 May 28, 2025
9eac53d
use headOption instead of list deconstruction
fm3 May 28, 2025
0105b8e
Merge branch 'master' into agglomerates-zarr
fm3 Jun 3, 2025
fd7a281
WIP distinguish btw hdf5 and zarr according to registered layer attac…
fm3 Jun 3, 2025
19641af
pass datasource id + layer
fm3 Jun 3, 2025
716794d
list attached agglomerate files
fm3 Jun 3, 2025
feb8cb4
format
fm3 Jun 3, 2025
5c924dc
Merge branch 'master' into agglomerates-zarr
fm3 Jun 4, 2025
f040483
use agglomeratefilekey as cache key for proper cache clear support
fm3 Jun 4, 2025
e781baf
clear agglomerate caches on layer/ds reload
fm3 Jun 4, 2025
419490f
avoid injection
fm3 Jun 4, 2025
9c78167
prioritize WebknossosZarrExplorer
fm3 Jun 5, 2025
fdd7b4a
cleanup
fm3 Jun 5, 2025
09dfc0c
changelog
fm3 Jun 5, 2025
10dc7c3
make dummy datasource id more explicit
fm3 Jun 10, 2025
73b2f80
WIP Read Zarr Meshfiles
fm3 Jun 10, 2025
ca30ebe
read metadata from zarr group header
fm3 Jun 10, 2025
29073e8
wip read neuroglancer segment manifests
fm3 Jun 10, 2025
41de8c8
enrich
fm3 Jun 10, 2025
adb7d07
find local offset in bucket
fm3 Jun 12, 2025
e63fb2a
sort meshfile services, lookup with MeshFileKey
fm3 Jun 12, 2025
fd8dc30
move more code
fm3 Jun 12, 2025
0f9e5c8
Merge branch 'master' into meshfile-zarr
fm3 Jun 16, 2025
00e775c
iterate on meshfile services
fm3 Jun 16, 2025
7d51512
adapt frontend to simplified protocol
fm3 Jun 16, 2025
5d1b768
keys
fm3 Jun 16, 2025
ca64481
Merge branch 'master' into agglomerates-zarr
fm3 Jun 16, 2025
8cf2853
Merge branch 'agglomerates-zarr' into meshfile-zarr
fm3 Jun 16, 2025
16b38d6
explore + list meshfiles
fm3 Jun 17, 2025
7c94d31
fix frontend type
fm3 Jun 17, 2025
93d560c
adapt schema to neuroglancerPrecomputed dataformat for attachments
fm3 Jun 17, 2025
74a0bc3
Merge branch 'master' into agglomerates-zarr
fm3 Jun 17, 2025
12a6423
Merge branch 'agglomerates-zarr' into meshfile-zarr
fm3 Jun 17, 2025
1d492d1
adapt to new json format
fm3 Jun 17, 2025
9af3004
clear caches
fm3 Jun 17, 2025
331d178
some cleanup
fm3 Jun 17, 2025
43d9051
in list request, only return successes
fm3 Jun 17, 2025
d6a9817
add migration to guide
fm3 Jun 17, 2025
73a7ac2
unify spelling meshFile
fm3 Jun 17, 2025
16ae134
fix class injection
fm3 Jun 17, 2025
8077858
Adapt full mesh service; introduce credentials for attachments
fm3 Jun 18, 2025
7d676bf
fix updating job status for jobs with no credit transactions
fm3 Jun 18, 2025
76cd9d6
fix adhocMag selection in create animation modal
fm3 Jun 18, 2025
09567eb
make typechecker happy
fm3 Jun 18, 2025
f79b171
Merge branch 'master' into agglomerates-zarr
fm3 Jun 18, 2025
e170ebe
pr feedback; add services as singletons for proper cache use
fm3 Jun 18, 2025
b76d473
Merge branch 'agglomerates-zarr' into meshfile-zarr
fm3 Jun 18, 2025
1360184
address coderabbit review suggestions
fm3 Jun 18, 2025
72d5c5b
typo
fm3 Jun 18, 2025
0c961eb
Fix dtype bug, remove singleton instantiations again
fm3 Jun 19, 2025
365ec5d
rename lookup function as suggested in pr review
fm3 Jun 19, 2025
be77185
add ucar dependency resolver
fm3 Jun 19, 2025
5920905
remove sciJava resolver
fm3 Jun 19, 2025
753c25b
Revert "remove sciJava resolver"
fm3 Jun 19, 2025
11d2611
Revert "add ucar dependency resolver"
fm3 Jun 19, 2025
a9aed9e
Merge branch 'master' into agglomerates-zarr
fm3 Jun 19, 2025
bb07185
Merge branch 'master' into agglomerates-zarr
fm3 Jun 19, 2025
d7a667a
Merge branch 'master' into agglomerates-zarr
fm3 Jun 23, 2025
2c53ae6
Merge branch 'agglomerates-zarr' into meshfile-zarr
fm3 Jun 23, 2025
cdd3075
Merge branch 'master' into meshfile-zarr
fm3 Jun 23, 2025
0fddb3d
unify function names
fm3 Jun 23, 2025
6f8ed21
Merge branch 'master' into meshfile-zarr
fm3 Jun 23, 2025
c5f12f5
WIP: Read Zarr Segment Index Files
fm3 Jun 24, 2025
b294061
introduce abstraction for attached segment index files
fm3 Jun 24, 2025
d47b240
implement arrSegmentIndexFileService
fm3 Jun 24, 2025
b1f5831
Correctly read segment index file as mag1 segment positions
fm3 Jun 24, 2025
d34fe29
unused imports
fm3 Jun 24, 2025
0f10892
Merge branch 'master' into meshfile-zarr
fm3 Jun 25, 2025
0eb73d2
implement pr feedback
fm3 Jun 25, 2025
9217eb5
unused import
fm3 Jun 25, 2025
63c5ebe
Merge branch 'meshfile-zarr' into zarr-segment-index
fm3 Jun 25, 2025
537a824
Merge branch 'master' into meshfile-zarr
fm3 Jun 25, 2025
fd932e6
same cleanup also when looking up agglomerates
fm3 Jun 25, 2025
99ec110
Merge branch 'meshfile-zarr' into zarr-segment-index
fm3 Jun 25, 2025
05f477d
add cache clear for segment index files
fm3 Jun 25, 2025
dd0e803
changelog
fm3 Jun 25, 2025
5318f06
WIP: Read Zarr Connectome Files
fm3 Jun 25, 2025
1d79dfc
WIP list +lookup
fm3 Jun 25, 2025
0b98d57
Merge branch 'master' into meshfile-zarr
fm3 Jun 25, 2025
d6f11cb
delegate to correct service depending on attachment dataformat
fm3 Jun 25, 2025
08b57fd
hard coded synapse type names for hdf5
fm3 Jun 25, 2025
56f8385
Update webknossos-datastore/app/com/scalableminds/webknossos/datastor…
fm3 Jun 25, 2025
c3dc703
Merge branch 'master' into meshfile-zarr
fm3 Jun 26, 2025
4f313b4
Merge branch 'meshfile-zarr' into zarr-segment-index
fm3 Jun 26, 2025
ed53532
Merge branch 'zarr-segment-index' into zarr-connectome
fm3 Jun 26, 2025
36f4cf8
read attributes
fm3 Jun 26, 2025
325d784
move functions used in both hdf5 and zarr case up to ConnectomeFileSe…
fm3 Jun 26, 2025
75fbccf
Merge branch 'master' into meshfile-zarr
MichaelBuessemeyer Jun 26, 2025
e50dc66
implement first functions in zarr connectome file service
fm3 Jun 26, 2025
95a8dab
implement remaining features
fm3 Jun 26, 2025
5c390ea
cache clear, cleanup
fm3 Jun 26, 2025
6706343
changelog
fm3 Jun 26, 2025
b01ce43
Merge branch 'master' into meshfile-zarr
fm3 Jun 30, 2025
b36841a
Merge branch 'meshfile-zarr' into zarr-segment-index
fm3 Jun 30, 2025
33eeba0
Merge branch 'zarr-segment-index' into zarr-connectome
fm3 Jun 30, 2025
78fefbc
Merge branch 'master' into zarr-segment-index
fm3 Jun 30, 2025
7532b61
Merge branch 'zarr-segment-index' into zarr-connectome
fm3 Jun 30, 2025
aed235e
fix error message
fm3 Jun 30, 2025
7761d41
normalize paths; fix reading connectome file metadata; inline delegat…
fm3 Jun 30, 2025
1fbf7b7
correctly look up local fallback hdf5 files
fm3 Jun 30, 2025
f3f5077
read correct arrays; fix returning fill value when whole shard is mis…
fm3 Jun 30, 2025
1f12060
Merge branch 'master' into zarr-segment-index
fm3 Jun 30, 2025
5fc925b
Merge branch 'zarr-segment-index' into zarr-connectome
fm3 Jun 30, 2025
9e73f73
Merge branch 'master' into zarr-segment-index
fm3 Jul 2, 2025
661b258
implement pr feedback; extract common string values to traits
fm3 Jul 2, 2025
313838c
format
fm3 Jul 2, 2025
2ab3ab2
Merge branch 'master' into zarr-segment-index
MichaelBuessemeyer Jul 3, 2025
53570e6
Merge branch 'zarr-segment-index' into zarr-connectome
fm3 Jul 3, 2025
f6ddc57
extract string constants to ConnectomeFileUtils
fm3 Jul 3, 2025
1687300
Merge branch 'master' into zarr-connectome
fm3 Jul 3, 2025
b18eb66
once more csc csr typo
fm3 Jul 3, 2025
66c6197
Merge branch 'master' into zarr-connectome
fm3 Jul 3, 2025
871256d
fix typos in error messages
fm3 Jul 3, 2025
d065282
Merge branch 'zarr-connectome' of github.com:scalableminds/webknossos…
fm3 Jul 3, 2025
79603c0
Merge branch 'master' into zarr-connectome
fm3 Jul 7, 2025
09ef7a5
implement pr feedback (part 1)
fm3 Jul 7, 2025
6beaf69
catch toPtr<fromPtr
fm3 Jul 7, 2025
6280991
add finishAccess calls to allow cache release
fm3 Jul 7, 2025
2974e9f
add missing cache clear; flip >=
fm3 Jul 7, 2025
ad2f59b
Merge branch 'master' into zarr-connectome
fm3 Jul 8, 2025
2 changes: 2 additions & 0 deletions unreleased_changes/8717.md
@@ -0,0 +1,2 @@
### Added
- Connectomes can now also be read from the new zarr3-based format, and from remote object storage.
@@ -1,5 +1,7 @@
package com.scalableminds.util.collections

import scala.collection.Searching.{Found, InsertionPoint}

object SequenceUtils {
def findUniqueElement[T](list: Seq[T]): Option[T] = {
val uniqueElements = list.distinct
@@ -51,4 +53,11 @@ object SequenceUtils {
val batchTo = Math.min(to, (batchIndex + 1) * batchSize + from - 1)
(batchFrom, batchTo)
}

// Binary search in a sorted array. Returns the index at which the element was found or, if it is missing, the index at which it would be inserted.
def searchSorted(haystack: Array[Long], needle: Long): Int =
haystack.search(needle) match {
case Found(i) => i
case InsertionPoint(i) => i
}
}
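The `searchSorted` helper added above wraps Scala's standard-library binary search. A standalone sketch of its behavior (same body as in the diff, wrapped in a demo object purely for illustration):

```scala
import scala.collection.Searching.{Found, InsertionPoint}

object SearchSortedDemo {
  // Same logic as SequenceUtils.searchSorted from this PR: binary search
  // that returns the match index, or the insertion point if absent.
  def searchSorted(haystack: Array[Long], needle: Long): Int =
    haystack.search(needle) match {
      case Found(i)          => i
      case InsertionPoint(i) => i
    }

  def main(args: Array[String]): Unit = {
    val sorted = Array(1L, 3L, 7L, 10L)
    assert(searchSorted(sorted, 7L) == 2)  // exact match at index 2
    assert(searchSorted(sorted, 4L) == 2)  // 4 would be inserted before 7
    assert(searchSorted(sorted, 99L) == 4) // past the end of the array
    println("ok")
  }
}
```

Note that for duplicate elements, Scala's `search` may return the index of any matching occurrence, not necessarily the first.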
@@ -4,6 +4,11 @@ import org.apache.pekko.actor.ActorSystem
import com.google.inject.AbstractModule
import com.google.inject.name.Names
import com.scalableminds.webknossos.datastore.services._
import com.scalableminds.webknossos.datastore.services.connectome.{
ConnectomeFileService,
Hdf5ConnectomeFileService,
ZarrConnectomeFileService
}
import com.scalableminds.webknossos.datastore.services.mapping.{
AgglomerateService,
Hdf5AgglomerateService,
@@ -52,6 +57,9 @@ class DataStoreModule extends AbstractModule {
bind(classOf[SegmentIndexFileService]).asEagerSingleton()
bind(classOf[ZarrSegmentIndexFileService]).asEagerSingleton()
bind(classOf[Hdf5SegmentIndexFileService]).asEagerSingleton()
bind(classOf[ConnectomeFileService]).asEagerSingleton()
bind(classOf[ZarrConnectomeFileService]).asEagerSingleton()
bind(classOf[Hdf5ConnectomeFileService]).asEagerSingleton()
bind(classOf[NeuroglancerPrecomputedMeshFileService]).asEagerSingleton()
bind(classOf[RemoteSourceDescriptorService]).asEagerSingleton()
bind(classOf[ChunkCacheService]).asEagerSingleton()
@@ -25,6 +25,12 @@ import com.scalableminds.webknossos.datastore.services.uploading._
import com.scalableminds.webknossos.datastore.storage.DataVaultService
import com.scalableminds.util.tools.Box.tryo
import com.scalableminds.util.tools.{Box, Empty, Failure, Full}
import com.scalableminds.webknossos.datastore.services.connectome.{
ByAgglomerateIdsRequest,
BySynapseIdsRequest,
ConnectomeFileService,
SynapticPartnerDirection
}
import com.scalableminds.webknossos.datastore.services.mapping.AgglomerateService
import play.api.data.Form
import play.api.data.Forms.{longNumber, nonEmptyText, number, tuple}
@@ -452,14 +458,16 @@ class DataSourceController @Inject()(
meshFileService.clearCache(dataSourceId, layerName)
val closedSegmentIndexFileHandleCount =
segmentIndexFileService.clearCache(dataSourceId, layerName)
val closedConnectomeFileHandleCount =
connectomeFileService.clearCache(dataSourceId, layerName)
val reloadedDataSource: InboxDataSource = dataSourceService.dataSourceFromDir(
dataSourceService.dataBaseDir.resolve(organizationId).resolve(datasetDirectoryName),
organizationId)
datasetErrorLoggingService.clearForDataset(organizationId, datasetDirectoryName)
val clearedVaultCacheEntriesOpt = dataSourceService.invalidateVaultCache(reloadedDataSource, layerName)
clearedVaultCacheEntriesOpt.foreach { clearedVaultCacheEntries =>
logger.info(
s"Cleared caches for ${layerName.map(l => s"layer '$l' of ").getOrElse("")}dataset $organizationId/$datasetDirectoryName: closed $closedAgglomerateFileHandleCount agglomerate file handles, $closedMeshFileHandleCount mesh file handles, $closedSegmentIndexFileHandleCount segment index file handles, removed $clearedBucketProviderCount bucketProviders, $clearedVaultCacheEntries vault cache entries and $removedChunksCount image chunk cache entries.")
s"Cleared caches for ${layerName.map(l => s"layer '$l' of ").getOrElse("")}dataset $organizationId/$datasetDirectoryName: closed $closedAgglomerateFileHandleCount agglomerate file handles, $closedMeshFileHandleCount mesh file handles, $closedSegmentIndexFileHandleCount segment index file handles, $closedConnectomeFileHandleCount connectome file handles, removed $clearedBucketProviderCount bucketProviders, $clearedVaultCacheEntries vault cache entries and $removedChunksCount image chunk cache entries.")
}
reloadedDataSource
}
@@ -510,21 +518,12 @@ class DataSourceController @Inject()(
Action.async { implicit request =>
accessTokenService.validateAccessFromTokenContext(
UserAccessRequest.readDataSources(DataSourceId(datasetDirectoryName, organizationId))) {
val connectomeFileNames =
connectomeFileService.exploreConnectomeFiles(organizationId, datasetDirectoryName, dataLayerName)
for {
mappingNames <- Fox.serialCombined(connectomeFileNames.toList) { connectomeFileName =>
val path =
connectomeFileService.connectomeFilePath(organizationId,
datasetDirectoryName,
dataLayerName,
connectomeFileName)
connectomeFileService.mappingNameForConnectomeFile(path)
}
connectomesWithMappings = connectomeFileNames
.zip(mappingNames)
.map(tuple => ConnectomeFileNameWithMappingName(tuple._1, tuple._2))
} yield Ok(Json.toJson(connectomesWithMappings))
(dataSource, dataLayer) <- dataSourceRepository.getDataSourceAndDataLayer(organizationId,
datasetDirectoryName,
dataLayerName)
connectomeFileInfos <- connectomeFileService.listConnectomeFiles(dataSource.id, dataLayer)
} yield Ok(Json.toJson(connectomeFileInfos))
}
}

accessTokenService.validateAccessFromTokenContext(
UserAccessRequest.readDataSources(DataSourceId(datasetDirectoryName, organizationId))) {
for {
meshFilePath <- Fox.successful(
connectomeFileService
.connectomeFilePath(organizationId, datasetDirectoryName, dataLayerName, request.body.connectomeFile))
synapses <- connectomeFileService.synapsesForAgglomerates(meshFilePath, request.body.agglomerateIds)
(dataSource, dataLayer) <- dataSourceRepository.getDataSourceAndDataLayer(organizationId,
datasetDirectoryName,
dataLayerName)
meshFileKey <- connectomeFileService.lookUpConnectomeFileKey(dataSource.id,
dataLayer,
request.body.connectomeFile)
synapses <- connectomeFileService.synapsesForAgglomerates(meshFileKey, request.body.agglomerateIds)
} yield Ok(Json.toJson(synapses))
}
}
accessTokenService.validateAccessFromTokenContext(
UserAccessRequest.readDataSources(DataSourceId(datasetDirectoryName, organizationId))) {
for {
meshFilePath <- Fox.successful(
connectomeFileService
.connectomeFilePath(organizationId, datasetDirectoryName, dataLayerName, request.body.connectomeFile))
agglomerateIds <- connectomeFileService.synapticPartnerForSynapses(meshFilePath,
directionValidated <- SynapticPartnerDirection
.fromString(direction)
.toFox ?~> "could not parse synaptic partner direction"
(dataSource, dataLayer) <- dataSourceRepository.getDataSourceAndDataLayer(organizationId,
Review comment (Contributor):

A high level thing I just noticed: Why do you always "load" the dataLayer and pass it to the lookUp....FileKey functions? At least for lookUpConnectomeFileKey only the dataLayer.name prop is being used. But isn't dataLayer.name equal to dataLayerName?

Reply (Member Author, fm3):

True, that’s the same value. However, the lookUp* functions also use dataLayer.attachments, which needs the full layer. Also, getDataSourceAndDataLayer is cheap as it only uses the in-memory dataSourceRepository. So we can use it to also validate that the dataSource really exists, even if we then only use dataSource.id

datasetDirectoryName,
dataLayerName)
meshFileKey <- connectomeFileService.lookUpConnectomeFileKey(dataSource.id,
dataLayer,
request.body.connectomeFile)
agglomerateIds <- connectomeFileService.synapticPartnerForSynapses(meshFileKey,
request.body.synapseIds,
direction)
directionValidated)
} yield Ok(Json.toJson(agglomerateIds))
}
}
accessTokenService.validateAccessFromTokenContext(
UserAccessRequest.readDataSources(DataSourceId(datasetDirectoryName, organizationId))) {
for {
meshFilePath <- Fox.successful(
connectomeFileService
.connectomeFilePath(organizationId, datasetDirectoryName, dataLayerName, request.body.connectomeFile))
synapsePositions <- connectomeFileService.positionsForSynapses(meshFilePath, request.body.synapseIds)
(dataSource, dataLayer) <- dataSourceRepository.getDataSourceAndDataLayer(organizationId,
datasetDirectoryName,
dataLayerName)
meshFileKey <- connectomeFileService.lookUpConnectomeFileKey(dataSource.id,
dataLayer,
request.body.connectomeFile)
synapsePositions <- connectomeFileService.positionsForSynapses(meshFileKey, request.body.synapseIds)
} yield Ok(Json.toJson(synapsePositions))
}
}
accessTokenService.validateAccessFromTokenContext(
UserAccessRequest.readDataSources(DataSourceId(datasetDirectoryName, organizationId))) {
for {
meshFilePath <- Fox.successful(
connectomeFileService
.connectomeFilePath(organizationId, datasetDirectoryName, dataLayerName, request.body.connectomeFile))
synapseTypes <- connectomeFileService.typesForSynapses(meshFilePath, request.body.synapseIds)
(dataSource, dataLayer) <- dataSourceRepository.getDataSourceAndDataLayer(organizationId,
datasetDirectoryName,
dataLayerName)
meshFileKey <- connectomeFileService.lookUpConnectomeFileKey(dataSource.id,
dataLayer,
request.body.connectomeFile)
synapseTypes <- connectomeFileService.typesForSynapses(meshFileKey, request.body.synapseIds)
} yield Ok(Json.toJson(synapseTypes))
}
}
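The controller endpoints above all follow the same refactored pattern: resolve `(dataSource, dataLayer)` from the in-memory repository, derive a file key via `lookUpConnectomeFileKey`, then pass the key to the service. Per the review reply, the full layer is needed because the lookup consults `dataLayer.attachments`. A minimal sketch of that idea, with simplified stand-in types (not the actual webknossos classes):

```scala
// Simplified stand-ins; field names are assumptions based on the diff.
final case class Attachment(name: String, path: String)
final case class DataLayer(name: String, attachments: Seq[Attachment])

object ConnectomeLookupSketch {
  // Why the full layer is passed rather than just its name: the lookup
  // resolves the requested file against the layer's registered attachments.
  def lookUpConnectomeFilePath(layer: DataLayer, fileName: String): Option[String] =
    layer.attachments.collectFirst { case a if a.name == fileName => a.path }

  def main(args: Array[String]): Unit = {
    val layer = DataLayer(
      "segmentation",
      Seq(Attachment("connectome_1", "s3://bucket/connectome_1.zarr")))
    assert(lookUpConnectomeFilePath(layer, "connectome_1")
      .contains("s3://bucket/connectome_1.zarr"))
    assert(lookUpConnectomeFilePath(layer, "missing").isEmpty)
    println("ok")
  }
}
```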
@@ -193,6 +193,8 @@ class DatasetArray(vaultPath: VaultPath,
tc: TokenContext): Fox[MultiArray] =
if (shape.contains(0)) {
Fox.successful(MultiArrayUtils.createEmpty(header.resolvedDataType, rank))
} else if (shape.exists(_ < 0)) {
Fox.failure(s"Trying to read negative shape from DatasetArray: ${shape.mkString(",")}")
} else {
val totalOffset: Array[Long] = offset.zip(header.voxelOffset).map { case (o, v) => o - v }.padTo(offset.length, 0)
val chunkIndices = ChunkUtils.computeChunkIndices(datasetShape, chunkShape, shape, totalOffset)
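The DatasetArray hunk adds a fail-fast for negative read shapes alongside the existing zero-shape shortcut, before any chunk indices are computed. The decision logic in isolation (simplified sketch; `Either` stands in for the real code's `Fox[MultiArray]`):

```scala
object ShapeGuardSketch {
  // Mirrors the guard ordering in the diff: the zero check runs first,
  // so a shape containing both 0 and a negative entry is treated as empty.
  def checkReadShape(shape: Array[Int]): Either[String, String] =
    if (shape.contains(0))
      Right("empty") // zero-size request: short-circuit with an empty result
    else if (shape.exists(_ < 0))
      Left(s"Trying to read negative shape from DatasetArray: ${shape.mkString(",")}")
    else
      Right("read") // proceed to compute chunk indices and read

  def main(args: Array[String]): Unit = {
    assert(checkReadShape(Array(4, 0, 2)) == Right("empty"))
    assert(checkReadShape(Array(4, -1, 2)).isLeft)
    assert(checkReadShape(Array(4, 1, 2)) == Right("read"))
    println("ok")
  }
}
```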