Smoothing
Gaussian kernel spatial smoothing of features.
featurely.smoothing
Spatial kernel smoothing of features.
Smoothing replaces each row's feature value with a weighted average over its spatial neighborhood, suppressing row-level noise while preserving regional structure. This is Nadaraya-Watson kernel regression with the kernel truncated to each row's n_neighbors nearest neighbors for tractability. Only feature columns are smoothed, never the target, so the resulting candidates do not introduce target leakage.
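For reference, the truncated estimator described above can be written as follows. This is a sketch under assumptions: the symbols (x_i for a row's coordinates, f_j for a neighbor's feature value, N_k(i) for the k = n_neighbors nearest rows to row i, h for the bandwidth) are notation introduced here, and the factor of 2 in the Gaussian exponent is a common convention that the library may or may not follow.

```latex
\hat{f}(x_i) = \frac{\sum_{j \in N_k(i)} K_h(d_{ij})\, f_j}{\sum_{j \in N_k(i)} K_h(d_{ij})},
\qquad
K_h(d) = \exp\!\left(-\frac{d^2}{2h^2}\right),
\qquad
d_{ij} = \lVert x_i - x_j \rVert
```

Because d_{ii} = 0, the row's own value receives weight K_h(0) = 1 before normalization, which is the largest weight any single neighbor can have.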
compute_spatial_smoothed(df, features, lat_col='Latitude', lon_col='Longitude', n_neighbors=50, bandwidth=None, prefix='smooth')
Return Gaussian-kernel smoothed feature candidates.
For each row, the smoothed value is a Gaussian-weighted average of the
feature over its n_neighbors nearest points in latitude-longitude
space (each row is its own nearest neighbor, so the original value gets
the largest single weight).
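The computation described above can be sketched in plain NumPy. This is an illustrative sketch, not the code in src/featurely/smoothing.py: the function name `gaussian_knn_smooth` is hypothetical, distances are assumed to be Euclidean in raw coordinate units, and the neighbor search is brute force, which is only practical for small inputs.

```python
import numpy as np


def gaussian_knn_smooth(coords, values, n_neighbors=3, bandwidth=1.0):
    """Gaussian-weighted average of `values` over each point's nearest neighbors.

    Hypothetical helper for illustration; assumes Euclidean distance on the
    raw coordinates and a Gaussian kernel exp(-d^2 / (2 * bandwidth^2)).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    k = min(n_neighbors, len(coords))

    # Pairwise distances, shape (n, n). Brute force: O(n^2) time and memory.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Indices of the k nearest points per row. Each row includes itself
    # at distance 0, so it is its own nearest neighbor.
    idx = np.argsort(dist, axis=1)[:, :k]
    d = np.take_along_axis(dist, idx, axis=1)

    # Gaussian weights, normalised so that each row's weights sum to 1.
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    w /= w.sum(axis=1, keepdims=True)

    # Weighted average of the neighbors' feature values.
    return (w * values[idx]).sum(axis=1)
```

With `n_neighbors=1` the function returns the input unchanged, since each row's only neighbor is itself. A constant feature stays constant for any neighborhood size, because the weights in each row sum to 1.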
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | Input frame; not modified. | required |
| `features` | `list[str]` | Columns to smooth. | required |
| `lat_col` | `str` | Name of the latitude column. | `'Latitude'` |
| `lon_col` | `str` | Name of the longitude column. | `'Longitude'` |
| `n_neighbors` | `int` | Neighborhood size for the truncated kernel. | `50` |
| `bandwidth` | `float \| None` | Gaussian kernel width in coordinate units. When None it defaults to the median distance to the farthest retained neighbor, which adapts the width to local point density. | `None` |
| `prefix` | `str` | Prefix for output column names. | `'smooth'` |
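The default for `bandwidth` can be illustrated with a short sketch. It assumes one reading of the description: for each row, take the distance to the farthest of its `n_neighbors` nearest points (counting the row itself), then take the median of those distances across all rows. The helper name `median_kth_neighbor_distance` is hypothetical, and the library's actual computation may differ in detail.

```python
import numpy as np


def median_kth_neighbor_distance(coords, n_neighbors):
    """Median over rows of the distance to the farthest retained neighbor.

    Hypothetical helper for illustration; assumes Euclidean distance and
    that each row counts as its own nearest neighbor.
    """
    coords = np.asarray(coords, dtype=float)
    k = min(n_neighbors, len(coords))

    # Pairwise distances, shape (n, n). Brute force for clarity.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Sort each row ascending. Column 0 is the row itself (distance 0),
    # so column k - 1 is the farthest of the k retained neighbors.
    kth = np.sort(dist, axis=1)[:, k - 1]
    return float(np.median(kth))
```

For four points spaced one unit apart on a line, `n_neighbors=2` gives a bandwidth of 1.0: every row's farthest retained neighbor is its adjacent point.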
Returns:
| Type | Description |
|---|---|
| `DataFrame` | A frame of smoothed candidate columns. |
Source code in src/featurely/smoothing.py