3 questions about externalizing…
1. Where to put the Cutoff?
According to the doc there is a line in the sand for what to put into external storage and what to keep in the db. This line is adjusted with $ExternalStorageCutoffSize
. So there is a default value but how is that selected. There is of course plenty of things to consider when selecting where to put this line. But thinking about this I consider moving virtually everything out of the db and into regular files. What are the penalties of this?
Without knowing exactly how the size is calculated I assumed that it was the size of Content
and produced the following query to get an idea about where to put the limit:
mysql> SELECT
-> digits,
-> cnt,
-> FLOOR(POW(10,digits+0.5)*cnt) AS sum_bytes,
-> FLOOR(POW(10,digits+0.5)*cnt/(1024*1024)) AS sum_Mb,
-> FLOOR(POW(10,digits+0.5)*cnt/(1024*1024*1024)) AS sum_Gb
-> FROM (
-> SELECT
-> FLOOR(LOG10(CHAR_LENGTH(Content))) as digits,
-> count(*) AS cnt
-> FROM Attachments
-> GROUP BY 1
-> ORDER BY 1) foo;
+--------+--------+-------------+--------+--------+
| digits | cnt | sum_bytes | sum_Mb | sum_Gb |
+--------+--------+-------------+--------+--------+
| NULL | 375821 | NULL | NULL | NULL |
| 0 | 19620 | 62043 | 0 | 0 |
| 1 | 55758 | 1763222 | 1 | 0 |
| 2 | 837271 | 264768337 | 252 | 0 |
| 3 | 612762 | 1937723583 | 1847 | 1 |
| 4 | 41616 | 1316013471 | 1255 | 1 |
| 5 | 25654 | 8112507109 | 7736 | 7 |
| 6 | 7971 | 25206515229 | 24038 | 23 |
| 7 | 114 | 3604996532 | 3437 | 3 |
+--------+--------+-------------+--------+--------+
9 rows in set, 16432 warnings (4 min 3,97 sec)
- The log-column is the
length-1
of the size of Content, in other words The size of the row. - The cnt-column is the count of rows of a specific size.
- The sum-columns is calculated as the estimated average (
+0.5
) size times the number of rows of that size.
So if we set the cutoff at 10Kb (digits = 4) 37Gb will be moved out of the db and 1 will be left). This would be a huge benefit when handling and doing backups.
Where is the drawbacks of this?
2. Reverting externalized objects?
Is there a way to move objects back into the db ones they have been externalized?
3. More Cutoff-options?
To rt-shredder
you can define several rules of what to shred so that only object older than X days are shredded. Is there some similar thing for rt-externalize-attachments
? Our tickets have an active life of at most one month I would say, but we need to keep all for backtracking. So for us it would be convenient to externalize everything older than a ~month regardless of size since there is no speed-requirements on those old tickets. How could this be achieved?