geom2bscan module

This module contains code to:

  1. Train a CNN-based black box neural network model to approximate the B-scan response from a given dataset.

  2. Use a trained model to quickly compute B-scan predictions from a (possibly) geometry-only dataset.

src.dataset_creation.geom2bscan.build_network()

Builds a geom2bscan Keras model.

Returns:
tf.keras.Model

the geom2bscan keras model

src.dataset_creation.geom2bscan.check_shapes_equal(dataset_output_path: str | Path, model) -> bool

Checks if the shapes of model input and dataset samples are equal.

Parameters:
dataset_output_path : str | Path

dataset output folder path.

model : keras.Model

model used for inference.

Returns:
bool

True if the model input and geometry shapes are equal, False otherwise.

src.dataset_creation.geom2bscan.filter_PML_bug_bscans(geometries: ndarray, bscans: ndarray, upper_limit: float = 1400000.0)

Filters the dataset to remove the samples in which the PML bug appears. This is done by thresholding the sum of the absolute values of the B-scan pixels: samples affected by the PML bug exhibit many more reflections, so they are well separated from the others.

Parameters:
geometries : np.ndarray of shape [B, C, H, W]

sample geometries

bscans : np.ndarray of shape [B, H, W]

sample bscans

upper_limit : float, optional

the upper threshold. A value of 1.4e6 was found empirically to separate the dataset cleanly.

Returns:
tuple[np.ndarray, np.ndarray]

the filtered geometries and bscans
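
The thresholding described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the module's actual code: the helper name `filter_by_abs_sum` is hypothetical, and whether the real comparison is strict (`<`) or inclusive (`<=`) is an assumption.

```python
import numpy as np


def filter_by_abs_sum(geometries, bscans, upper_limit=1.4e6):
    """Keep samples whose summed absolute B-scan amplitude is below the limit.

    Hypothetical helper: the strict '<' comparison is an assumption.
    """
    # One scalar per sample: sum of |pixel| over the H and W axes.
    abs_sum = np.abs(bscans).sum(axis=(1, 2))
    keep = abs_sum < upper_limit
    return geometries[keep], bscans[keep]


# Toy data: three samples, the second with abnormally strong reflections.
geometries = np.zeros((3, 2, 4, 4))
bscans = np.ones((3, 4, 4))
bscans[1] *= 1e6  # summed |amplitude| = 1.6e7, above the 1.4e6 threshold

kept_geoms, kept_bscans = filter_by_abs_sum(geometries, bscans)
print(kept_geoms.shape, kept_bscans.shape)
```

The same boolean mask indexes both arrays, so geometries and B-scans stay aligned after filtering.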

src.dataset_creation.geom2bscan.filter_initial_wave(train_labels: ndarray, test_labels: ndarray)

Filters the initial wave out of the labels by subtracting the median of the pixel values of the train labels.

Parameters:
train_labels : np.ndarray of shape [B, H, W, C]

the train B-scan labels

test_labels : np.ndarray of shape [B, H, W, C]

the test B-scan labels

Returns:
tuple[np.ndarray, np.ndarray, np.ndarray]

train labels, test labels, median image used for the filtering
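
As a rough sketch of the idea, the median can be taken per pixel across the training samples and subtracted from both sets. The helper name `remove_median_image` is hypothetical, and the per-pixel median over the batch axis is an assumption inferred from the "median image" return value.

```python
import numpy as np


def remove_median_image(train_labels, test_labels):
    """Subtract the per-pixel median of the train labels from both label sets.

    Hypothetical helper; taking the median over the batch axis is an assumption.
    """
    median_image = np.median(train_labels, axis=0)  # shape [H, W, C]
    return train_labels - median_image, test_labels - median_image, median_image


# Toy labels: every sample contains the same "initial wave",
# and one training sample also contains a target reflection.
rng = np.random.default_rng(0)
initial_wave = rng.normal(size=(8, 6, 1))
train = initial_wave + np.zeros((5, 8, 6, 1))
train[0, 2, 3, 0] += 10.0
test = initial_wave + np.zeros((2, 8, 6, 1))

train_f, test_f, median_image = remove_median_image(train, test)

# Adding the median image back restores the original labels.
restored = train_f + median_image
```

Because the median is computed on the train labels only, the test labels do not influence the filter.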

src.dataset_creation.geom2bscan.load_dataset(dataset_output_path: str | Path = PosixPath('dataset_bscan/gprmax_output_files'), indexes_interval: tuple[int, int] | None = None, verbose=True)

Loads the B-scan dataset at the specified location. Performs filtering of the PML bug related to steel sleepers.

Parameters:
dataset_output_path : str | Path, optional

location of the output folder of the dataset, by default Path("dataset_bscan/gprmax_output_files")

indexes_interval : tuple[int, int] | None

If specified, the interval of indexes to load, upper limit excluded. Default: None.

verbose : bool

Controls whether to print info and show a progress bar for the data loading. Default: True.

Returns:
tuple[np.ndarray, np.ndarray, list[str]]

data, labels, sample names

src.dataset_creation.geom2bscan.predict(dataset_output_path: str | Path, model_checkpoint_path: str | Path, output_dir: str | Path, label_mask_path: str | Path | None, memory_batch_size: int | None = None)

Predicts B-scans for the full dataset given and stores them as numpy arrays.

Parameters:
dataset_output_path : str | Path

Path to the dataset output folder with samples to predict.

model_checkpoint_path : str | Path

Path to the Keras model checkpoint to use for prediction.

output_dir : str | Path

Directory in which the predictions will be stored.

label_mask_path : str | Path | None

Path to the label median mask used for preprocessing labels during training. This will be added to the predictions to obtain the output B-scan. If None, no postprocessing is done.

memory_batch_size : int | None

Size of the sample batches loaded into memory for each prediction cycle. If None, the full dataset is loaded into memory.

Returns:
np.ndarray | None

predictions, only if memory_batch_size is None.
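
How predict iterates internally is not documented here. One plausible way to honour memory_batch_size is to walk the dataset in half-open index intervals, which matches the "upper limit excluded" convention of indexes_interval in load_dataset. The helper name `iter_index_intervals` below is hypothetical.

```python
def iter_index_intervals(n_samples, memory_batch_size):
    """Yield half-open (start, stop) intervals that cover range(n_samples).

    Hypothetical helper: each tuple could presumably be passed as
    `indexes_interval` to load_dataset, which excludes the upper limit.
    """
    for start in range(0, n_samples, memory_batch_size):
        # The last interval is clipped so it never runs past the dataset.
        yield start, min(start + memory_batch_size, n_samples)


intervals = list(iter_index_intervals(10, 4))
print(intervals)  # [(0, 4), (4, 8), (8, 10)]
```

Only one interval's worth of samples needs to be in memory at a time, which may explain why predictions are returned only when memory_batch_size is None.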

src.dataset_creation.geom2bscan.predict_batch(geometries: ndarray, model, mask: ndarray | None)

Predicts the B-scans for the specified geometries.

Parameters:
geometries : np.ndarray

Input geometries

model : tf.keras.Model

Model used for inference

mask : np.ndarray | None

Mask added to the predictions of the model to calculate the B-scans, or None.

Returns:
np.ndarray

predictions

src.dataset_creation.geom2bscan.predict_batch_wide(geometries: ndarray, model, mask: ndarray | None)

Splits wide geometries into multiple inputs for the model, then combines the predictions back into a single B-scan.

Uses a sliding window approach for predictions, with an offset of half an image (192 / 2 = 96 pixels for the pretrained model).

Parameters:
geometries : np.ndarray

input wide geometries.

model : tf.keras.Model

Model used for inference.

mask : np.ndarray | None

label mask, or None.

Returns:
np.ndarray

predictions
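
The sliding-window scheme can be sketched as follows. Several details are assumptions made for illustration and are not confirmed by the documentation: the channels-last input layout [B, H, W, C], a prediction as wide as its input window, and averaging of the overlapping regions. The names `predict_wide` and `predict_fn` are hypothetical; `predict_fn` stands in for the model call.

```python
import numpy as np


def predict_wide(geometries, predict_fn, window=192):
    """Tile wide inputs with a half-window stride and average the overlaps.

    Assumptions: geometries is [B, H, W, C]; predict_fn maps
    [B, H, window, C] -> [B, H, window]; overlaps are averaged.
    """
    stride = window // 2  # 96 pixels for a 192-pixel window
    batch, height, width, _ = geometries.shape
    if width < window or (width - window) % stride != 0:
        raise ValueError("width must be window + k * stride for some k >= 0")

    total = np.zeros((batch, height, width))
    counts = np.zeros(width)
    for start in range(0, width - window + 1, stride):
        tile = geometries[:, :, start:start + window, :]
        total[:, :, start:start + window] += predict_fn(tile)
        counts[start:start + window] += 1
    # Each column is divided by the number of windows that covered it.
    return total / counts


# Stand-in "model" that returns the first input channel unchanged.
rng = np.random.default_rng(0)
wide = rng.normal(size=(2, 5, 384, 2))
combined = predict_wide(wide, lambda tile: tile[..., 0])
print(combined.shape)
```

With a 384-pixel input, the windows start at 0, 96 and 192, so the central columns are covered twice and the outer 96 columns on each side once.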

src.dataset_creation.geom2bscan.preprocess_data(geoms: ndarray, bscans: ndarray)

Preprocesses the data by rescaling the relative permittivity values by 1/80 and applying the cube root to both the conductivity and B-scan values.

Parameters:
geoms : np.ndarray of shape [B, 2, H, W]

sample geometries

bscans : np.ndarray of shape [B, H, W]

sample bscans

Returns:
tuple[np.ndarray, np.ndarray]

processed geometries and bscans
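
A minimal sketch of these two transforms, assuming that channel 0 holds the relative permittivity and channel 1 the conductivity (the channel order is not stated here, and the helper name `preprocess` is hypothetical):

```python
import numpy as np


def preprocess(geoms, bscans):
    """Rescale permittivity by 1/80 and take cube roots of conductivity and B-scans.

    Assumption: geoms[:, 0] is relative permittivity, geoms[:, 1] is conductivity.
    """
    out = geoms.astype(np.float64, copy=True)  # leave the input untouched
    out[:, 0] = out[:, 0] / 80.0
    out[:, 1] = np.cbrt(out[:, 1])
    # np.cbrt keeps the sign, so negative B-scan amplitudes stay negative.
    return out, np.cbrt(bscans)


geoms = np.array([[[[80.0]], [[8.0]]]])  # shape [1, 2, 1, 1]
bscans = np.array([[[-27.0]]])           # shape [1, 1, 1]
proc_geoms, proc_bscans = preprocess(geoms, bscans)
print(proc_geoms.ravel(), proc_bscans.ravel())
```

The cube root compresses the dynamic range while remaining invertible: cubing a processed value returns the original amplitude.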

src.dataset_creation.geom2bscan.save_predictions(preds: ndarray, output_dir: str | Path, sample_names: list[str])

Saves the predictions, each in its own file.

Parameters:
preds : np.ndarray

Predictions

output_dir : str | Path

Directory in which to save the files

sample_names : list[str]

ordered list of names for the samples to save
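
One file per prediction could be written as in the sketch below. The `<name>.npy` file naming and the helper name `save_each` are assumptions; the actual naming scheme used by save_predictions is not documented here.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_each(preds, output_dir, sample_names):
    """Save preds[i] to <output_dir>/<sample_names[i]>.npy (naming is assumed)."""
    if len(preds) != len(sample_names):
        raise ValueError("preds and sample_names must have the same length")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for pred, name in zip(preds, sample_names):
        np.save(output_dir / f"{name}.npy", pred)


# Round trip in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    save_each(np.arange(6.0).reshape(2, 3), tmp, ["scan_0000", "scan_0001"])
    saved = sorted(p.name for p in Path(tmp).iterdir())
    reloaded = np.load(Path(tmp) / "scan_0001.npy")

print(saved)
```

The order of sample_names must match the first axis of preds, since the two are paired positionally.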

src.dataset_creation.geom2bscan.split_dataset(geometries: ndarray, bscans: ndarray, random_state: int = 42)

Splits the dataset into train and test set.

Parameters:
geometries : np.ndarray

dataset geometries (input data)

bscans : np.ndarray

dataset bscans (labels)

random_state : int, optional

seed for the dataset split, by default 42

Returns:
tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]

train data, test data, train labels, test labels.
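
A seeded split with the same return order can be sketched with NumPy alone. The 20% test fraction and the helper name `split_arrays` are assumptions; the split ratio and the library actually used by split_dataset are not documented here.

```python
import numpy as np


def split_arrays(geometries, bscans, test_fraction=0.2, random_state=42):
    """Shuffle sample indexes with a fixed seed, then split both arrays alike.

    Hypothetical helper; test_fraction=0.2 is an assumed value.
    """
    rng = np.random.default_rng(random_state)
    order = rng.permutation(len(geometries))
    n_test = int(round(len(geometries) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (geometries[train_idx], geometries[test_idx],
            bscans[train_idx], bscans[test_idx])


geometries = np.arange(10.0).reshape(10, 1)
bscans = geometries * 10  # labels tied to inputs so alignment can be checked
train_x, test_x, train_y, test_y = split_arrays(geometries, bscans)
print(train_x.shape, test_x.shape)
```

Fixing the seed makes the split reproducible, so a model can be evaluated later on exactly the samples it never saw during training.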

src.dataset_creation.geom2bscan.train(dataset_output_path: str | Path, output_path: str | Path, epochs: int, batch_size: int)

Trains a geom2bscan model with the (labelled) data in the dataset.

Parameters:
dataset_output_path : str | Path

dataset output folder.

output_path : str | Path

Directory in which all training results will be saved.

epochs : int

number of training epochs.

batch_size : int

training batch size.