Help! When pushing data from NSIDC into InfluxDB, it croaks with
{"error":"engine: error rolling WAL segment: error opening new segment file for wal (2): open /var/lib/influxdb/wal/nsidc/autogen/55885/_00001.wal: too many open files"}
You might want to have a look at this…
If it’s really the client that is hitting the limit, as also outlined in [1], please try invoking
ulimit -n 65535
before running your import program.
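To illustrate, a minimal shell sketch of that procedure (the import program’s name is a placeholder; the raise only works up to the hard limit, and only affects the current shell session):

```shell
# Show the current soft limit for open file descriptors
ulimit -n

# Raise the soft limit for this shell session; fails gracefully if the
# hard limit (see `ulimit -Hn`) is lower than the requested value
ulimit -n 65535 || echo "hard limit too low, raise it as root first"

# ...then start the import from the same shell, e.g.:
# ./import_nsidc.py   # hypothetical name of your import program
```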
[1] Too many open file on client · Issue #4569 · influxdata/influxdb · GitHub
According to /lib/systemd/system/influxdb.service, the InfluxDB service itself is already running with
LimitNOFILE=65536
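For completeness: if the server-side limit ever did need to go up, the usual systemd way would be a drop-in override rather than editing the unit file directly (a sketch; the file name and the 131072 value are assumptions):

```
# /etc/systemd/system/influxdb.service.d/limits.conf
[Service]
LimitNOFILE=131072
```

followed by systemctl daemon-reload and a restart of the service.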
Currently, I’m hesitant to increase the overall server limits even further.
However, it looks like InfluxDB is creating a huge number of shards on the nsidc
database, which might not be intended.
root@eltiempo:~# l /var/lib/influxdb/wal/nsidc/autogen/ | wc -l
1479
Would you be able to share your import program with us? Maybe we can optimize this detail.
So, it makes sense for InfluxDB to operate like that when the time series covers a huge timespan. Is this the case with your specific dataset?
At least I can say: we have at most one record per day. Within the last ~20 years the data is daily; before that (back to 1978), there are 2-4 days between each … four records.
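That cadence explains the shard count. A back-of-the-envelope sketch (an illustration only; InfluxDB buckets data into calendar-aligned shard groups, so the real numbers differ slightly):

```shell
# Rough shard count for ~45 years of roughly daily data
years=45
days=$(( years * 365 ))
echo "with the default 7d shard group duration: ~$(( days / 7 )) shards"
echo "with a 52w (364d) shard group duration:   ~$(( days / 364 )) shards"
```

With the default 7-day shard groups of the autogen retention policy, ~45 years of data yields on the order of two thousand mostly tiny shards, each holding open file handles, which matches the symptom above.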
So, you might consider creating the database with a specific shard group duration.
See also:
SHARD DURATION
According to the recommendation for backfilling data cited above, this might help you along:
CREATE DATABASE <database_name> WITH SHARD DURATION 52w
When you say “we highly recommend temporarily setting a longer shard group duration so fewer shards are created”, how and to what value should I revert afterwards?
Or can I just leave it as it is, since it should reasonably match the time resolution of this dataset?
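In case you do want to change it later: the shard group duration of an existing retention policy can be adjusted with ALTER RETENTION POLICY (a sketch assuming the default autogen policy; note it only applies to shards created after the change):

```
ALTER RETENTION POLICY "autogen" ON "nsidc" SHARD DURATION 1w
```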
The current database contains just 45 shards (probably matching the number of years, i.e. blocks of 52 weeks each)
root@eltiempo:~# l /var/lib/influxdb/data/nsidc/autogen | wc -l
45
each containing only a few kB worth of data
root@eltiempo:~# du -sch /var/lib/influxdb/data/nsidc/autogen/*
44K /var/lib/influxdb/data/nsidc/autogen/61615
44K /var/lib/influxdb/data/nsidc/autogen/61616
44K /var/lib/influxdb/data/nsidc/autogen/61617
44K /var/lib/influxdb/data/nsidc/autogen/61618
So, when querying and processing it, nobody will suffer.
P.S.: Unless there are further experiences regarding this…