I have a dataset with point data within a given country. Let's say my dataset looks somewhat like this:
    tree_id | species | age  | geom
    ------------------------------
    0       | Ash     | null | …
    1       | Beech   | 70   | …
    2       | Ash     | 10   | …
    3       | Beech   | 70   | …
    4       | Beech   | null | …
    5       | Beech   | 60   | …
    …       | …       | …    | …
As you can see, the dataset has some missing data. For instance, tree_id 0 has no age. Therefore I would like to interpolate those missing values from the trees within a 100 meter radius.
I am looking for the mean age of the trees of the same species within that radius. The result should also include the number of sample trees used. A result table could then look like this:
    tree_id | age_avg | samples
    ---------------------------
    0       | 11.8    | 113
    3       | 12.2    | 97
    5       | 50.7    | 272
    …       | …       | …
Could you get me started with some PostgreSQL query code, please?
OK, so let's start again. This answer does what you want, but the result will only be useful in a context where the values carry no analytical meaning, for instance to render a 3D scene where some data are missing and you want to draw a "local average tree" for each species.
I'm assuming your original table is called "mytrees".
Create two aliases, a and b, of your table mytrees, join b to a wherever b lies within your search radius, then summarize the data for each point using aggregates.
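    SELECT a.tree_id, a.species,
           avg(b.age)   AS age_avg,  -- avg() ignores neighbours whose age is NULL
           count(b.age) AS sample,   -- count only the neighbours avg() actually used
           a.geom
    FROM mytrees a
    LEFT JOIN mytrees b
           ON ST_DWithin(a.geom, b.geom, 100)  -- 100 is in the units of geom's SRS (assumed metres)
          AND a.species = b.species
    GROUP BY a.tree_id, a.species, a.geom
    ORDER BY a.tree_id;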
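The query above returns an average for every tree, including those that already have an age. If you only want values for the trees with a missing age, a variant along these lines may be closer to the result table in the question. It is only a sketch under a few assumptions: geom is stored in a projected coordinate system whose unit is the metre (otherwise the 100 is not 100 m), tree_id is unique, and the index name mytrees_geom_idx is just a placeholder I made up.

```sql
-- Optional: a spatial index lets ST_DWithin avoid comparing every pair of trees.
-- The index name is hypothetical; skip this if geom is already indexed.
CREATE INDEX IF NOT EXISTS mytrees_geom_idx ON mytrees USING gist (geom);

SELECT a.tree_id,
       a.species,
       avg(b.age)   AS age_avg,   -- NULL when no neighbour with a known age exists
       count(b.age) AS samples    -- number of neighbours that went into the average
FROM mytrees a
LEFT JOIN mytrees b
       ON b.age IS NOT NULL                    -- only neighbours with a known age
      AND a.species = b.species                -- same species only
      AND ST_DWithin(a.geom, b.geom, 100)      -- within 100 units (assumed metres)
WHERE a.age IS NULL                            -- only trees that need a value
GROUP BY a.tree_id, a.species
ORDER BY a.tree_id;
```

Because a is restricted to trees without an age and b to trees with one, a tree never ends up in its own sample. Thanks to the LEFT JOIN, trees with no suitable neighbour still appear, with age_avg NULL and samples 0.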
Again, a last warning: it will work, but DON'T USE IT for meaningful data analysis. Use it only for rendering or as a proof of concept.
Edit: now using ST_DWithin, as suggested by John Barça, which is much simpler.