Air quality information for Frankfurt/Main (DE)

Hi again,

@oliverr asked at Which OS and version of the is best for the luftdatenpumpe · earthobservations/luftdatenpumpe · Discussion #29 · GitHub

I wanted to see if it is easy to load only the historical data of Frankfurt, or if it is easier to load all historical data and then filter it later.

On this matter, I want to link to Air quality information for specific regions, where we show different examples of how this can be done and which might serve as a starting point for others.

The answer is: We put some effort into bringing geospatial query features into the mix, based on importing location data into PostGIS. So, reusing this will be the most convenient way to query data by geographic region.

With kind regards,
Andreas.

Hi,

I started loading the historical data using

wget --mirror --continue --no-host-directories --directory-prefix=/var/spool/archive.luftdaten.info --accept-regex='2020' http://archive.luftdaten.info/

but this is very slow. Is there another way to load it a bit faster?
I am not good with regex. What must I use to get data from 2015 until now (2021)?

Once I have the historical data, I want to transfer it to a database. What is the best database for this, PostgreSQL or InfluxDB?

Regards,
Oliver

Hi Oliver,

maybe using httrack works better for you?

httrack \
    --continue --sockets=8 \
    --path=$HOME/var/spool/archive.sensor.community \
    '-*' '+*2016*' https://archive.sensor.community/ --verbose

To get the data for all years, just omit all filtering parameters, that is, --accept-regex='2020' for wget and '-*' '+*2016*' for httrack.
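
If you would rather keep a filter, a range of years can be expressed as a single regular expression. The following is only a sketch: the expression, the helper name in_year_range and the sample URLs are my own illustrations, and it assumes that the archive uses date-named directories (like /2015-10-01/) and that wget's default POSIX regex type accepts extended syntax such as alternation. It checks the expression with grep -E first, so you can try it before starting a long-running mirror.

```shell
# Illustrative assumption: archive paths contain a "/YYYY-" date component.
# Matches the years 2015 through 2021.
YEAR_REGEX='/20(1[5-9]|2[01])-'

# Hypothetical helper: succeeds if the given URL falls into the year range.
in_year_range() {
    # grep -E uses POSIX extended regular expressions; -q suppresses output.
    printf '%s\n' "$1" | grep -Eq "$YEAR_REGEX"
}

# Try the expression against a few made-up URLs.
for url in \
    "http://archive.luftdaten.info/2014-12-31/" \
    "http://archive.luftdaten.info/2015-10-01/" \
    "http://archive.luftdaten.info/2021-03-15/"
do
    if in_year_range "$url"; then
        echo "keep $url"
    else
        echo "skip $url"
    fi
done

# The same expression could then take the place of '2020' in the wget call:
#   wget --mirror --continue --no-host-directories \
#       --directory-prefix=/var/spool/archive.luftdaten.info \
#       --accept-regex="$YEAR_REGEX" http://archive.luftdaten.info/
```

Anchoring on the slash and the trailing dash keeps the expression from matching unrelated numbers elsewhere in a URL, such as a sensor ID that happens to contain 2015.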

With kind regards,
Andreas.

P.S.: We also have a ticket outlining how to improve the ingest performance by reading the Parquet files at http://archive.sensor.community/parquet/ instead of the CSV files, see also [LDI] Improve performance of historical data import · Issue #11 · earthobservations/luftdatenpumpe · GitHub.

When using luftdatenpumpe for processing the data, timeseries data (measurement values) will go into InfluxDB while geospatial data (station/location information) will go into PostGIS.

We also just added a dashboard, Luft: LDI » luftdaten.info map for Frankfurt/Main (DE). Enjoy.