(Work in progress) Implementation of adaptive Dataset class which adapts to different data structures

Closed (Work in progress) Implementation of adaptive Dataset class which adapts to different data structures


ofhkr requested to merge tr_val_te_splits into main Jan 31, 2024

The original implementation of the Dataset class only supported one specific directory structure:

|-- train
|   |-- images # ordered with numbers
|   |-- labels # same order as 'images'
|-- test
|   |-- images
|   |-- labels

It cannot be assumed that users already have their data split into training and testing folders. Instead, the data may be organized in one of the following structures:

Case 1: There are no subfolders - all images and targets are stored in the same data directory. Each image and its corresponding target have similar names (e.g. img1.tif, img1_mask.tif).

        |-- data
            |-- img1.tif
            |-- img1_mask.tif
            |-- img2.tif
            |-- img2_mask.tif
            |-- ...
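
A minimal sketch of how Case 1 pairs could be collected, not the code from this branch. The function name `pair_flat_directory`, the `_mask` suffix and the `.tif` extension are assumptions taken from the example tree above.

```python
from pathlib import Path


def pair_flat_directory(data_dir, mask_suffix="_mask", ext=".tif"):
    """Pair each image with its mask when both live in the same folder.

    Hypothetical helper: assumes a mask is named <image stem><mask_suffix><ext>.
    """
    data_dir = Path(data_dir)
    pairs = []
    for image_path in sorted(data_dir.glob(f"*{ext}")):
        if image_path.stem.endswith(mask_suffix):
            continue  # this file is itself a mask, not an image
        mask_path = image_path.with_name(f"{image_path.stem}{mask_suffix}{ext}")
        if mask_path.exists():
            pairs.append((image_path, mask_path))
        # images without a matching mask are skipped silently in this sketch
    return pairs
```

Skipping unmatched images keeps the sketch short; raising an error instead may be preferable when every image is expected to have a target.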

Case 2: There are two folders - one with all the images and one with all the targets.

        |-- data
            |-- images
                |-- img1.tif
                |-- img2.tif
                |-- ...
            |-- masks
                |-- img1_mask.tif
                |-- img2_mask.tif
                |-- ...
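
For Case 2, the pairing could look roughly like the sketch below. It is an illustration under assumptions, not the code from this branch: the helper name `pair_two_folders`, the folder names `images` and `masks`, and the `_mask` suffix are taken from the example tree or invented for the example.

```python
from pathlib import Path


def pair_two_folders(data_dir, image_dir="images", mask_dir="masks", mask_suffix="_mask"):
    """Pair files from an image folder with files from a mask folder.

    Hypothetical helper: assumes the mask stem is <image stem><mask_suffix>.
    """
    data_dir = Path(data_dir)
    # Index the masks by stem so each image lookup is a dictionary access.
    masks = {p.stem: p for p in (data_dir / mask_dir).iterdir() if p.is_file()}
    pairs = []
    for image_path in sorted((data_dir / image_dir).iterdir()):
        if not image_path.is_file():
            continue
        mask_path = masks.get(image_path.stem + mask_suffix)
        if mask_path is None:
            raise FileNotFoundError(f"no mask found for {image_path.name}")
        pairs.append((image_path, mask_path))
    return pairs
```

Matching by name rather than by sorted position avoids silently misaligned pairs when one folder has a missing or extra file.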

Case 3: There are many folders - one folder per case (e.g. a patient), each containing multiple images and their targets.

        |-- data
            |-- patient1
                |-- p1_img1.tif
                |-- p1_img1_mask.tif
                |-- p1_img2.tif
                |-- p1_img2_mask.tif
                |-- ...
            |-- patient2
                |-- p2_img1.tif
                |-- p2_img1_mask.tif
                |-- p2_img2.tif
                |-- p2_img2_mask.tif
                |-- ...
            |-- ...
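
One possible way to collect Case 3 data is to keep the case name attached to its samples, so that later steps can still tell which patient a sample belongs to. The sketch below is an assumption-laden illustration, not the code from this branch: `pair_case_folders`, the `_mask` suffix and the `.tif` extension are hypothetical choices based on the example tree.

```python
from pathlib import Path


def pair_case_folders(data_dir, mask_suffix="_mask", ext=".tif"):
    """Return {case folder name: [(image_path, mask_path), ...]}.

    Hypothetical helper: every subfolder of data_dir is treated as one case,
    and a mask is assumed to be named <image stem><mask_suffix><ext>.
    """
    cases = {}
    for case_dir in sorted(Path(data_dir).iterdir()):
        if not case_dir.is_dir():
            continue
        pairs = []
        for image_path in sorted(case_dir.glob(f"*{ext}")):
            if image_path.stem.endswith(mask_suffix):
                continue  # this file is itself a mask
            mask_path = image_path.with_name(f"{image_path.stem}{mask_suffix}{ext}")
            if mask_path.exists():
                pairs.append((image_path, mask_path))
        cases[case_dir.name] = pairs
    return cases
```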

The dataloader is also changed so that the user can specify whether they have a dedicated train, validation and/or test folder.
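
As a rough illustration of what such an option could look like (the actual parameter names in this branch may differ), the hypothetical helper `resolve_split_dirs` below takes optional folder names and reports which splits have a dedicated folder. A value of `None` means that split would have to be drawn from the remaining data.

```python
from pathlib import Path


def resolve_split_dirs(data_dir, train=None, val=None, test=None):
    """Map each split to its dedicated folder, or None if none was given.

    Hypothetical helper: the keyword names train/val/test are assumptions.
    """
    data_dir = Path(data_dir)
    resolved = {}
    for split, folder in {"train": train, "val": val, "test": test}.items():
        if folder is None:
            resolved[split] = None  # no dedicated folder for this split
            continue
        path = data_dir / folder
        if not path.is_dir():
            raise NotADirectoryError(f"{split} folder not found: {path}")
        resolved[split] = path
    return resolved
```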

For the last case (Case 3), keeping all images from the same patient in the same split is not implemented yet (a #TODO can be found where Case 3 is implemented).
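
One way the open TODO could be approached is to split at the level of cases rather than individual images. The sketch below is only a suggestion, not code from this branch. It assumes the samples are available as a dictionary mapping each case name to its list of samples; `split_by_group`, the split names and the default fractions are hypothetical.

```python
import random


def split_by_group(cases, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split samples into train/val/test without dividing any case.

    cases: dict mapping a case name (e.g. a patient) to a list of samples.
    The fractions apply to the number of cases, not the number of samples.
    """
    names = sorted(cases)
    random.Random(seed).shuffle(names)  # seeded, so the split is reproducible
    n_train = int(fractions[0] * len(names))
    n_val = int(fractions[1] * len(names))
    assignment = {
        "train": names[:n_train],
        "val": names[n_train:n_train + n_val],
        "test": names[n_train + n_val:],  # remaining cases go to the test split
    }
    return {
        split: [sample for name in group for sample in cases[name]]
        for split, group in assignment.items()
    }
```

Because whole cases are assigned to a split, the resulting sample counts only approximate the requested fractions when cases contain different numbers of images.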

This branch lets the user specify the path to their data with more flexibility in how the dataset is structured.

Edited Jan 31, 2024 by ofhkr