The Westernmost Tethys Blog: Geology mapping, basin analysis and 3D modeling


Python libraries and k-Nearest neighbors algorithms to delineate syn-sedimentary faults

This paper introduces a methodology based on Python libraries and machine-learning k-Nearest Neighbors (KNN) algorithms to create an interactive 3D HTML model (3D_Vertical_Sections_Faults_LRD.html) that combines 2D grain-size KNN-prediction vertical maps (vertical sections) from which syn-sedimentary faults and other features in sedimentary porous media can be delineated. The model can be viewed and manipulated in conventional web browsers. Grain size is a physical parameter that is measurable, constant over instrumental time, and mathematically tractable, and its range can be associated with lithological classes. Grain-size input data come from a public database of 433 boreholes in the Llobregat River Delta (LRD) in NE Spain. Four lithological classes were defined: Pre-Quaternary basement, and Quaternary gravel, sand, and clay–silt. Using a new KNN-prediction algorithm, seven NW–SE (transversal) and three SW–NE (longitudinal) vertical sections were created following the orientation of faults identified at the surface and detected in reflection seismic surveys. Exploratory K values in the 1–75 range were used. K around 25 provides the general, smooth shape of the basement top surface, whereas K = 1 is an optimal value to represent the short-distance heterogeneity of the LRD. Using a new KNN-prediction confidence algorithm inspired by the Similarity Ratio algorithm for machine-learning KNN, the overall confidence of the vertical sections was evaluated as satisfactory. Confidence generally decreases as data density decreases with depth and from inland to seaward. The vertical sections created with K = 1 show horizontal interruptions (displacements or vertical steps) in the basement continuity and in the Quaternary coarse bodies (gravel and sand) attributable to the action of Quaternary active faults.
These faults have been linked or correlated with well-known active faults in the area, related in many cases to the opening of the Valencia Trough. Moreover, several faults detected at the surface and others identified in this paper for the first time have been revealed as fault zones made of fault branches with different steps in an en echelon-like arrangement.
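
To make the role of K concrete, the following is a minimal pure-Python sketch of a KNN prediction on one vertical section, not the paper's code. The borehole samples, the `z_scale` vertical weighting, and the function names are hypothetical. The vote fraction it returns is only a crude stand-in for a confidence measure; it is not the Similarity Ratio-inspired algorithm described above.

```python
from collections import Counter
from math import hypot

# Hypothetical borehole samples projected onto one vertical section:
# (distance along section in m, depth below sea level in m, lithological class).
SAMPLES = [
    (0.0, 5.0, "clay-silt"), (0.0, 20.0, "sand"),
    (0.0, 40.0, "gravel"), (0.0, 60.0, "basement"),
    (500.0, 5.0, "clay-silt"), (500.0, 25.0, "sand"),
    (500.0, 45.0, "gravel"), (500.0, 80.0, "basement"),
    (1000.0, 10.0, "clay-silt"), (1000.0, 30.0, "clay-silt"),
    (1000.0, 55.0, "gravel"), (1000.0, 85.0, "basement"),
]


def knn_predict(samples, x, z, k, z_scale=10.0):
    """Return (majority class, vote fraction) of the k nearest samples.

    z_scale (an assumption) stretches depth so that vertical separation
    counts more than horizontal separation, as in an exaggerated section.
    """
    ranked = sorted(samples, key=lambda s: hypot(s[0] - x, (s[1] - z) * z_scale))
    votes = Counter(label for _, _, label in ranked[:k])
    # On a tie, most_common keeps the class encountered first (the nearest one).
    label, count = votes.most_common(1)[0]
    return label, count / k


def predict_section(samples, xs, zs, k):
    """Predict a class for every (x, z) node of a regular section grid."""
    return [[knn_predict(samples, x, z, k) for x in xs] for z in zs]


if __name__ == "__main__":
    xs = [0.0, 250.0, 500.0, 750.0, 1000.0]
    zs = list(range(0, 100, 10))
    for k in (1, 5):
        print(f"K = {k}")
        for z, row in zip(zs, predict_section(SAMPLES, xs, zs, k)):
            cells = " ".join(f"{label[:4]}:{frac:.1f}" for label, frac in row)
            print(f"  z={z:3d} m  {cells}")
```

With K = 1 every node simply copies its nearest sample, which preserves abrupt lateral changes, while a larger K averages over more samples and smooths them out.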

Longitudinal (SW–NE) vertical sections A–B to M–N with the grain-size (lithological) classes, showing the location of correlated syn-sedimentary faults and the lateral migration of syn-tectonic gravel bodies and channels. The boundary between the Lower and Upper Detrital Complexes and the location of the boreholes (highlighted according to their proximity) are also indicated.

Faulting seems to be more evident in the Pleistocene Lower Detrital Complex and much less active or inactive in the Holocene Upper Detrital Complex. Fault-controlled syn-tectonic gravel channels, progradation of gravel lobes, and lateral migration of channel bars were also observed. At its current development stage, this methodology could also be applied to other geological environments with minor modifications to the code. It is especially suitable for reducing the high (usually unmeasurable) uncertainty associated with the qualitative geological data used in more complex numerical tools aimed at modelling many geological resources (groundwater, minerals, geothermal, petroleum) or different Earth phenomena.

Cite as: Martín-Martín, M., Bullejos, M., Cabezas, D., Alcalá, F.J., 2023. Using python libraries and k-Nearest neighbors algorithms to delineate syn-sedimentary faults in sedimentary porous media. Marine and Petroleum Geology, 153, 106283. doi: 10.1016/j.marpetgeo.2023.106283



K-nearest neighbors algorithm used for classifying geological variables.

The k-nearest neighbors (KNN) algorithm is a non-parametric supervised machine learning classifier that uses proximity and similarity to make classifications or predictions about the grouping of an individual data point. This ability makes the KNN algorithm well suited to classifying datasets of geological variables and parameters prior to 3D visualization. This paper introduces a machine learning KNN algorithm and Python libraries for visualizing the 3D stratigraphic architecture of sedimentary porous media in the Quaternary onshore Llobregat River Delta (LRD) in northeastern Spain. A first HTML model showed a consecutive set of horizontal sections, spaced 5 m apart, of the granulometry classes created with the KNN algorithm from 0 to 120 m below sea level in the onshore LRD. A second HTML model showed the 3D mapping of the main Quaternary gravel and coarse sand sedimentary bodies (lithosomes) and the basement (Pliocene and older rocks) top surface created with Python libraries. These results reproduce well the complex sedimentary structure of the LRD reported in recent scientific publications and prove the suitability of the KNN algorithm and Python libraries for visualizing the 3D stratigraphic structure of sedimentary porous media, which is a crucial stage in making decisions in different environmental and economic geology disciplines.
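
As an illustration of the first model's layout, the sketch below classifies a small 3D grid and slices it into horizontal sections every 5 m from 0 to 120 m below sea level. It is a simplified pure-Python example under stated assumptions, not the authors' implementation: the sample coordinates, the class labels, the default `k=3`, and the function names are all hypothetical.

```python
from collections import Counter
from math import dist

# Hypothetical 3D samples:
# (easting in m, northing in m, depth below sea level in m, granulometry class).
SAMPLES_3D = [
    (0, 0, 10, "fine sediment"), (0, 0, 50, "gravel"), (0, 0, 110, "basement"),
    (100, 0, 15, "coarse sand"), (100, 0, 60, "gravel"), (100, 0, 100, "basement"),
    (0, 100, 20, "fine sediment"), (0, 100, 70, "coarse sand"),
    (0, 100, 115, "basement"),
    (100, 100, 30, "coarse sand"), (100, 100, 90, "basement"),
]


def classify(point, samples, k=3):
    """Majority class among the k samples closest to point (x, y, depth)."""
    nearest = sorted(samples, key=lambda s: dist(s[:3], point))[:k]
    return Counter(s[3] for s in nearest).most_common(1)[0][0]


def horizontal_sections(samples, eastings, northings, k=3, z_max=120, step=5):
    """Map each depth level to a 2D grid (rows = northings) of predicted classes."""
    sections = {}
    for depth in range(0, z_max + 1, step):
        sections[depth] = [
            [classify((x, y, depth), samples, k) for x in eastings]
            for y in northings
        ]
    return sections


if __name__ == "__main__":
    grid_axis = [0, 50, 100]
    sections = horizontal_sections(SAMPLES_3D, grid_axis, grid_axis)
    print(f"{len(sections)} horizontal sections")
    for depth in (0, 60, 120):
        print(depth, sections[depth])
```

Each entry of the returned dictionary is one horizontal section, so the levels can be stacked or rendered one by one with whatever plotting library is preferred.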

The 3D stratigraphic architecture (coarse lithosomes and the basement top surface (BTS)) of the onshore LRD. (A) Gravel and coarse sand lithosomes and BTS. (B) Gravel lithosomes and BTS. (C) Coarse sand lithosomes and BTS. (D) Basement top surface. The color assigned to each granulometry class is cyan for gravel, yellow for coarse sand, and reddish-brown for the basement. An interactive 3D HTML version of this model is included in the Supplementary Materials.

Interactive figures here

Cite as: Bullejos, M., Cabezas, D., Martín-Martín, M., Alcalá, F.J., 2022. A K-Nearest Neighbors Algorithm in Python for Visualizing the 3D Stratigraphic Architecture of the Llobregat River Delta in NE Spain. J. Mar. Sci. Eng.
