(Work in progress) Adaptive Dataset class that handles different data directory structures
The original Dataset class only accounted for one specific data structure:
|-- train
| |-- images # ordered with numbers
| |-- labels # same order as 'images'
|-- test
| |-- images
| |-- labels
It cannot be assumed that the user already has their data prepared in training and testing folders. Instead, the data may be in one of the following structures:
Case 1: There are no folders - all images and targets are stored in the same data directory. Each image and its corresponding target have similar names (e.g. img1.tif, img1_mask.tif).
|-- data
| |-- img1.tif
| |-- img1_mask.tif
| |-- img2.tif
| |-- img2_mask.tif
| |-- ...
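As an illustration of how Case 1 could be handled, here is a minimal sketch that pairs each image with its mask by file name. The function name `pair_flat_directory`, the `_mask` suffix and the `.tif` extension are assumptions taken from the example tree, not necessarily what this branch implements.

```python
from pathlib import Path


def pair_flat_directory(root, mask_suffix="_mask", ext=".tif"):
    """Pair images with masks stored side by side in one directory (Case 1).

    Hypothetical helper: assumes a mask is named <image stem><mask_suffix><ext>.
    """
    root = Path(root)
    pairs = []
    for image in sorted(root.glob(f"*{ext}")):
        if image.stem.endswith(mask_suffix):
            continue  # this file is itself a mask, not an image
        mask = image.with_name(f"{image.stem}{mask_suffix}{ext}")
        if mask.exists():
            pairs.append((image, mask))
    return pairs
```

Images without a matching mask are silently skipped here; raising an error instead may be preferable depending on how strict the loader should be.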
Case 2: There are two folders - one with all the images and one with all the targets.
|-- data
| |-- images
| | |-- img1.tif
| | |-- img2.tif
| | |-- ...
| |-- masks
| | |-- img1_mask.tif
| | |-- img2_mask.tif
| | |-- ...
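For Case 2, one possible approach is to index the masks folder by file stem and look up each image's mask from there. This is only a sketch: `pair_two_folders` and the `_mask` suffix are assumed names based on the example tree.

```python
from pathlib import Path


def pair_two_folders(images_dir, masks_dir, mask_suffix="_mask"):
    """Pair files from an images folder with files from a masks folder (Case 2).

    Hypothetical helper: assumes the mask stem is <image stem><mask_suffix>.
    """
    images_dir, masks_dir = Path(images_dir), Path(masks_dir)
    # Index masks by stem so each lookup is a dictionary access.
    masks = {m.stem: m for m in masks_dir.iterdir() if m.is_file()}
    pairs = []
    for image in sorted(p for p in images_dir.iterdir() if p.is_file()):
        mask = masks.get(image.stem + mask_suffix)
        if mask is None:
            raise FileNotFoundError(f"no mask found for {image.name}")
        pairs.append((image, mask))
    return pairs
```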
Case 3: There are many folders - each folder holds one case (e.g. a patient) with multiple images.
|-- data
| |-- patient1
| | |-- p1_img1.tif
| | |-- p1_img1_mask.tif
| | |-- p1_img2.tif
| | |-- p1_img2_mask.tif
| | |-- ...
| |-- patient2
| | |-- p2_img1.tif
| | |-- p2_img1_mask.tif
| | |-- p2_img2.tif
| | |-- p2_img2_mask.tif
| | |-- ...
| |-- ...
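For Case 3, a sketch of one way to collect the pairs is to walk each case folder and record the folder name alongside every image/mask pair, so the case is still known when the data is split later. `collect_patient_pairs`, the `_mask` suffix and the `.tif` extension are assumptions for illustration only.

```python
from pathlib import Path


def collect_patient_pairs(root, mask_suffix="_mask", ext=".tif"):
    """Collect (patient_id, image, mask) records from per-patient folders (Case 3).

    Hypothetical helper: the folder name is used as the patient id.
    """
    root = Path(root)
    records = []
    for patient_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image in sorted(patient_dir.glob(f"*{ext}")):
            if image.stem.endswith(mask_suffix):
                continue  # this file is itself a mask, not an image
            mask = image.with_name(f"{image.stem}{mask_suffix}{ext}")
            if mask.exists():
                records.append((patient_dir.name, image, mask))
    return records
```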
The dataloader is also changed so that the user can specify whether they have a dedicated train, validation and/or test folder.
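As a rough sketch of the idea, dedicated split folders could also be detected by checking which of them exist under the data root. The helper `find_split_dirs` and the exact folder names are assumptions here; in this branch the user specifies the folders explicitly.

```python
from pathlib import Path

# Assumed folder names, taken from the description above.
SPLIT_NAMES = ("train", "validation", "test")


def find_split_dirs(root, split_names=SPLIT_NAMES):
    """Return {split name: path} for every dedicated split folder under root."""
    root = Path(root)
    return {name: root / name for name in split_names if (root / name).is_dir()}
```

An empty result would mean there are no dedicated folders, so the splits have to be created from the data itself.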
For the last case (Case 3), I haven't yet implemented keeping all images from the same patient in the same split (a #TODO can be found where Case 3 is implemented).
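One possible way to address this TODO is to split at the patient level rather than the image level, so a patient's images can never end up on both sides. The sketch below is not part of this branch; `split_by_group`, `val_fraction` and `seed` are hypothetical names.

```python
import random
from collections import defaultdict


def split_by_group(samples, val_fraction=0.2, seed=0):
    """Split (group_id, item) samples so that each group stays in one split.

    Hypothetical helper: returns (train_items, val_items).
    """
    by_group = defaultdict(list)
    for group_id, item in samples:
        by_group[group_id].append(item)

    # Shuffle the groups (e.g. patients), not the individual images.
    groups = sorted(by_group)
    random.Random(seed).shuffle(groups)

    # Hold out at least one group, unless there is only a single group.
    n_val = max(1, round(len(groups) * val_fraction)) if len(groups) > 1 else 0
    val_groups = set(groups[:n_val])

    train, val = [], []
    for group_id in groups:
        target = val if group_id in val_groups else train
        target.extend(by_group[group_id])
    return train, val
```

Because whole groups are assigned, the realised validation fraction only approximates `val_fraction` when patients have different numbers of images.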
This branch lets the user specify the path to their data with more flexibility in how the dataset is structured.