geom2bscan module
This module contains code to:
- Train a CNN-based black-box neural network model that approximates the B-scan response from a given dataset.
- Use a trained model to quickly compute B-scan predictions from a dataset that may contain only geometries.
- src.dataset_creation.geom2bscan.build_network()[source]
Builds a geom2bscan keras model
- Returns:
- tf.keras.Model
the geom2bscan keras model
- src.dataset_creation.geom2bscan.check_shapes_equal(dataset_output_path: str | Path, model) → bool[source]
Checks if the shapes of model input and dataset samples are equal.
- Parameters:
- dataset_output_path : str | Path
Dataset output folder path.
- model : keras.Model
Model used for inference.
- Returns:
- bool
True if the model input and geometry shapes are equal, False otherwise.
- src.dataset_creation.geom2bscan.filter_PML_bug_bscans(geometries: ndarray, bscans: ndarray, upper_limit: float = 1400000.0)[source]
Removes the dataset samples in which the PML bug appears. A sample is discarded when the sum of the absolute pixel values of its B-scan exceeds a threshold. Samples with the PML bug exhibit many more reflections, so their sums are well separated from those of the other samples.
- Parameters:
- geometries : np.ndarray of shape [B, C, H, W]
Sample geometries.
- bscans : np.ndarray of shape [B, H, W]
Sample B-scans.
- upper_limit : float, optional
The upper limit threshold. A value of 1.4e6 has been found empirically to separate the dataset cleanly.
- Returns:
- tuple[np.ndarray, np.ndarray]
the filtered geometries and bscans
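The thresholding described above can be illustrated with a minimal NumPy sketch. It is not the module's implementation: the function name `filter_by_abs_sum` is hypothetical, and the strict `<` comparison is an assumption, since the documentation does not say how a sample exactly at the limit is treated.

```python
import numpy as np

def filter_by_abs_sum(geometries, bscans, upper_limit=1.4e6):
    """Keep the samples whose B-scan absolute pixel sum is below upper_limit.

    Hypothetical helper, written only to illustrate the documented idea.
    """
    # One score per sample: sum of |pixel| over the H and W axes.
    scores = np.abs(bscans).sum(axis=(1, 2))
    # Assumption: samples strictly below the limit are kept.
    keep = scores < upper_limit
    return geometries[keep], bscans[keep]

# Three toy samples; the second one has a very large absolute sum.
geoms = np.zeros((3, 2, 4, 4))
bscans = np.stack([
    np.full((4, 4), 1.0),
    np.full((4, 4), 1e6),
    np.full((4, 4), -2.0),
])
kept_geoms, kept_bscans = filter_by_abs_sum(geoms, bscans)
```

The same boolean mask indexes both arrays, so geometries and B-scans stay paired after filtering.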
- src.dataset_creation.geom2bscan.filter_initial_wave(train_labels: ndarray, test_labels: ndarray)[source]
Filters the initial wave in the labels by removing the median of the values present in the pixels of the train labels.
- Parameters:
- train_labels : np.ndarray of shape [B, H, W, C]
The train B-scan labels.
- test_labels : np.ndarray of shape [B, H, W, C]
The test B-scan labels.
- Returns:
- tuple[np.ndarray, np.ndarray, np.ndarray]
train labels, test labels, median image used for the filtering
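A minimal sketch of this kind of median filtering follows. It assumes the median is taken per pixel across the batch axis of the train labels, which matches the "median image" in the return value but is not confirmed by the documentation. The name `subtract_train_median` is hypothetical.

```python
import numpy as np

def subtract_train_median(train_labels, test_labels):
    """Subtract the per-pixel median of the train labels from both sets.

    Hypothetical helper; the per-pixel median over axis 0 is an assumption.
    """
    # Median over the batch axis gives one image of shape [H, W, C].
    median_image = np.median(train_labels, axis=0)
    # The train statistic is reused for the test set to avoid leakage.
    return train_labels - median_image, test_labels - median_image, median_image

# Toy labels: three constant train images with values 1, 5 and 9.
train = np.array([1.0, 5.0, 9.0]).reshape(3, 1, 1, 1) * np.ones((3, 2, 2, 1))
test = np.full((2, 2, 2, 1), 7.0)
train_f, test_f, median_image = subtract_train_median(train, test)
```

Adding `median_image` back to a filtered label recovers the original label, which is the role the label mask plays at prediction time.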
- src.dataset_creation.geom2bscan.load_dataset(dataset_output_path: str | Path = PosixPath('dataset_bscan/gprmax_output_files'), indexes_interval: tuple[int, int] | None = None, verbose=True)[source]
Loads the B-scan dataset at the specified location. Performs filtering of the PML bug related to steel sleepers.
- Parameters:
- dataset_output_path : str | Path, optional
Location of the output folder of the dataset, by default Path("dataset_bscan/gprmax_output_files").
- indexes_interval : tuple[int, int] | None
If specified, the interval of indexes to load, upper limit excluded. Default: None.
- verbose : bool
Controls whether to print info and show a progress bar during data loading. Default: True.
- Returns:
- tuple[np.ndarray, np.ndarray, list[str]]
data, labels, sample names
- src.dataset_creation.geom2bscan.predict(dataset_output_path: str | Path, model_checkpoint_path: str | Path, output_dir: str | Path, label_mask_path: str | Path | None, memory_batch_size: int | None = None)[source]
Predicts B-scans for the full dataset given and stores them as numpy arrays.
- Parameters:
- dataset_output_path : str | Path
Path to the dataset output folder with samples to predict.
- model_checkpoint_path : str | Path
Path to the Keras model checkpoint to use for prediction.
- output_dir : str | Path
Directory in which the predictions will be stored.
- label_mask_path : str | Path | None
Path to the label median mask used for preprocessing labels during training. This will be added to the predictions to obtain the output B-scan. If None, no postprocessing is done.
- memory_batch_size : int | None
Number of samples loaded into memory for each prediction cycle. If None, the full dataset is loaded into memory.
- Returns:
- np.ndarray | None
The predictions, returned only if memory_batch_size is None.
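The effect of memory_batch_size can be pictured as splitting the sample indexes into half-open intervals, in the same "upper limit excluded" convention that load_dataset uses for indexes_interval. The sketch below only computes those intervals. The helper name `index_intervals` is hypothetical, and whether predict chunks the dataset exactly this way is an assumption.

```python
def index_intervals(n_samples, memory_batch_size=None):
    """Return half-open (start, stop) index intervals covering n_samples.

    Hypothetical helper illustrating batched loading; not part of the module.
    """
    if memory_batch_size is None:
        # No batching: a single interval spanning the whole dataset.
        return [(0, n_samples)]
    if memory_batch_size <= 0:
        raise ValueError("memory_batch_size must be a positive integer")
    return [
        (start, min(start + memory_batch_size, n_samples))
        for start in range(0, n_samples, memory_batch_size)
    ]

intervals = index_intervals(10, 4)
```

Each interval could then be loaded, predicted, and saved before the next one is read, so that only one batch of samples is held in memory at a time.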
- src.dataset_creation.geom2bscan.predict_batch(geometries: ndarray, model, mask: ndarray | None)[source]
Predicts the B-scans for the specified geometries.
- Parameters:
- geometries : np.ndarray
Input geometries.
- model : tf.keras.Model
Model used for inference.
- mask : np.ndarray | None
Mask added to the predictions of the model to calculate the B-scans, or None.
- Returns:
- np.ndarray
predictions
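As a rough illustration of how the mask enters the prediction, the sketch below uses a plain callable in place of the Keras model so that it runs without TensorFlow. The names `predict_batch_sketch` and `fake_model` are hypothetical, and the array shapes are assumptions chosen for the example. The sketch does not undo any other preprocessing step.

```python
import numpy as np

def predict_batch_sketch(geometries, predict_fn, mask=None):
    """Run predict_fn on the geometries and add the mask if one is given.

    Hypothetical helper; predict_fn stands in for the model's predict call.
    """
    preds = np.asarray(predict_fn(geometries))
    if mask is not None:
        # The mask has no batch axis, so it broadcasts over all samples.
        preds = preds + mask
    return preds

def fake_model(x):
    # Stand-in model: returns an all-zero B-scan for every sample.
    return np.zeros((len(x), 4, 4, 1))

geoms = np.ones((3, 2, 4, 4))
mask = np.full((4, 4, 1), 0.5)
with_mask = predict_batch_sketch(geoms, fake_model, mask)
without_mask = predict_batch_sketch(geoms, fake_model, None)
```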
- src.dataset_creation.geom2bscan.predict_batch_wide(geometries: ndarray, model, mask: ndarray | None)[source]
Splits wide geometries into multiple inputs for the model, then combines the predictions back into a single B-scan.
Uses a sliding-window approach for predictions, with an offset of half the image width (192 / 2 = 96 pixels for the pretrained model).
- Parameters:
- geometries : np.ndarray
Input wide geometries.
- model : tf.keras.Model
Model used for inference.
- mask : np.ndarray | None
Label mask, or None.
- Returns:
- np.ndarray
predictions
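The sliding-window idea can be sketched as follows. Several details are assumptions, because the documentation does not state them: geometries are taken to have shape [B, C, H, W], each window prediction is assumed to have shape [B, H, window], and overlapping predictions are averaged. The name `predict_wide_sketch` and the `predict_fn` stand-in for the model are hypothetical.

```python
import numpy as np

def predict_wide_sketch(geometries, predict_fn, window=192):
    """Sliding-window prediction with a stride of half a window.

    Hypothetical helper; overlapping predictions are averaged, which is
    an assumption about how the windows are combined.
    """
    stride = window // 2
    b, _, h, w = geometries.shape
    if w < window or (w - window) % stride != 0:
        raise ValueError("this sketch needs width = window + k * stride")
    total = np.zeros((b, h, w))
    counts = np.zeros((1, 1, w))
    for start in range(0, w - window + 1, stride):
        stop = start + window
        # Predict one window-sized slice along the width axis.
        total[..., start:stop] += predict_fn(geometries[..., start:stop])
        counts[..., start:stop] += 1
    # Every column is covered by at least one window, so counts > 0.
    return total / counts

# Toy check: a "model" that returns channel 0 of its input unchanged.
rng = np.random.default_rng(0)
wide = rng.normal(size=(2, 2, 3, 8))
stitched = predict_wide_sketch(wide, lambda tile: tile[:, 0], window=4)
```

With an identity-like `predict_fn`, averaging the overlaps reproduces the input channel exactly, which makes the stitching logic easy to check before plugging in a real model.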
- src.dataset_creation.geom2bscan.preprocess_data(geoms: ndarray, bscans: ndarray)[source]
Preprocesses the data by rescaling the relative permittivity values by 1/80 and taking the cube root of both the conductivity and B-scan values.
- Parameters:
- geoms : np.ndarray of shape [B, 2, H, W]
Sample geometries.
- bscans : np.ndarray of shape [B, H, W]
sample bscans
- Returns:
- tuple[np.ndarray, np.ndarray]
processed geometries and bscans
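A minimal sketch of these two transformations is shown below. The channel order (channel 0 for relative permittivity, channel 1 for conductivity) is an assumption, as the documentation gives only the channel count. The name `preprocess_sketch` is hypothetical.

```python
import numpy as np

def preprocess_sketch(geoms, bscans):
    """Rescale permittivity by 1/80 and cube-root conductivity and B-scans.

    Hypothetical helper; the channel order is an assumption.
    """
    out = geoms.astype(np.float64, copy=True)
    out[:, 0] = out[:, 0] / 80.0    # assumed: channel 0 = relative permittivity
    out[:, 1] = np.cbrt(out[:, 1])  # assumed: channel 1 = conductivity
    # np.cbrt is defined for negative values, unlike a fractional power.
    return out, np.cbrt(bscans)

geoms = np.zeros((1, 2, 2, 2))
geoms[:, 0] = 80.0
geoms[:, 1] = 8.0
bscans = np.full((1, 2, 2), -27.0)
proc_geoms, proc_bscans = preprocess_sketch(geoms, bscans)
```

Using `np.cbrt` rather than `x ** (1 / 3)` matters for B-scans, whose values can be negative: the fractional power would produce NaN for them.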
- src.dataset_creation.geom2bscan.save_predictions(preds: ndarray, output_dir: str | Path, sample_names: list[str])[source]
Saves the predictions, each in its own file.
- Parameters:
- preds : np.ndarray
Predictions.
- output_dir : str | Path
Directory in which to save the files.
- sample_names : list[str]
Ordered list of names for the samples to save.
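One way to store each prediction in its own file is sketched below. The `.npy` format and the `<sample name>.npy` naming scheme are assumptions, since the documentation does not specify them, and `save_each` is a hypothetical name.

```python
from pathlib import Path
import tempfile

import numpy as np

def save_each(preds, output_dir, sample_names):
    """Save every prediction to output_dir/<sample name>.npy.

    Hypothetical helper; the file naming scheme is an assumption.
    """
    if len(preds) != len(sample_names):
        raise ValueError("preds and sample_names must have the same length")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for pred, name in zip(preds, sample_names):
        path = output_dir / f"{name}.npy"
        np.save(path, pred)
        paths.append(path)
    return paths

# Round trip in a temporary directory.
preds = np.arange(8.0).reshape(2, 2, 2)
with tempfile.TemporaryDirectory() as tmp:
    paths = save_each(preds, tmp, ["sample_a", "sample_b"])
    saved_names = sorted(p.name for p in Path(tmp).iterdir())
    reloaded = np.load(paths[1])
```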
- src.dataset_creation.geom2bscan.split_dataset(geometries: ndarray, bscans: ndarray, random_state: int = 42)[source]
Splits the dataset into train and test set.
- Parameters:
- geometries : np.ndarray
Dataset geometries (input data).
- bscans : np.ndarray
Dataset B-scans (labels).
- random_state : int, optional
Seed for the dataset split, by default 42.
- Returns:
- tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
train data, test data, train labels, test labels.
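A seeded split with the same return order can be sketched in plain NumPy. This is not the module's implementation: the `test_fraction` parameter and its value are hypothetical, because the documentation does not state the split ratio or the splitting routine, and `split_sketch` is a hypothetical name.

```python
import numpy as np

def split_sketch(geometries, bscans, random_state=42, test_fraction=0.2):
    """Shuffle sample indexes with a fixed seed and split them.

    Hypothetical helper; test_fraction is an assumption, not a documented value.
    """
    rng = np.random.default_rng(random_state)
    order = rng.permutation(len(geometries))
    n_test = int(round(len(order) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    # The same indexes are applied to both arrays, so pairs stay aligned.
    return (geometries[train_idx], geometries[test_idx],
            bscans[train_idx], bscans[test_idx])

# Toy dataset where each label is ten times its geometry value.
geoms = np.arange(10).reshape(10, 1)
labels = np.arange(10) * 10
x_train, x_test, y_train, y_test = split_sketch(geoms, labels)
x_train_again = split_sketch(geoms, labels)[0]
```

Fixing the seed makes the split reproducible, so a model evaluated later sees the same test samples that were held out during training.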
- src.dataset_creation.geom2bscan.train(dataset_output_path: str | Path, output_path: str | Path, epochs: int, batch_size: int)[source]
Trains a geom2bscan model with the (labelled) data in the dataset.
- Parameters:
- dataset_output_path : str | Path
Dataset output folder.
- output_path : str | Path
Directory in which all training results will be saved.
- epochs : int
Number of training epochs.
- batch_size : int
Training batch size.