One of my Arq Backups finally hit the disk limit. This was expected as I had not set a budget, so the backup kept growing until it failed. Unlike Time Machine, when Arq hits a limit it does not automatically delete old backups. This is good because it lets me decide if I want to move the backup to a larger disk. If I don’t, I can always set a budget to be slightly less than the size of the disk. I like that it’s my choice.
So I set the budget and manually told Arq to enforce the budget. For 240 GB over 185 backups, it took about 3 hours to figure out what needed to be deleted. But the results were … uh … curious.
I had to quit the Arq app while the agent did the budget enforcement. That’s curious as no other action I’ve used has that limitation. But, whatever, it’s not a big deal.
Arq’s backup log told me exactly which backups were being kept. As it turned out, all of them. The log then said,
Total size of all backups (239.995 GB) is under budget
Finished budget enforcement on backup data at Arq Backup Red
Deleting 25243 unreferenced objects at Arq Backup Red
Deleted 25243 unreferenced objects at Arq Backup Red
But, if all backups are under budget, why are there 25,243 unreferenced objects totalling almost 19 GB?
Arq’s documentation says,
Deleting Unreferenced Objects
When Arq Agent finishes dropping old backups, it deletes any objects that are no longer referenced by any backup version.
Since I have thinning enabled, which reduces hourly backups into daily and weekly backups, I can see how objects would become unreferenced. But Arq claims it will delete these objects. So why did it take enabling budget enforcement for Arq to clean up 25,000+ objects?
Since Arq’s documentation is not geared towards people who want to understand how their backup software works, which would be every I.T. person at least, we’re left to wonder what’s going on.
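For what it’s worth, here is a toy model in Python of how thinning could leave orphaned objects behind. It is only a sketch of the general mark-and-sweep idea. The function name, the backup names and the object IDs are all made up for illustration, and I have no idea whether Arq’s agent works this way internally.

```python
def find_unreferenced(backups, stored_objects):
    """Return the stored object IDs that no remaining backup record references."""
    referenced = set()
    for object_ids in backups.values():
        referenced.update(object_ids)      # "mark" everything still in use
    return set(stored_objects) - referenced  # whatever is left can be swept

# Hypothetical backup records: name -> set of object IDs they point at.
backups = {
    "hourly-09:00": {"a", "b"},
    "hourly-10:00": {"a", "c"},
    "daily": {"a", "d"},
}
stored_objects = {"a", "b", "c", "d"}

# Thinning drops the hourly records but leaves their objects on disk.
thinned = {name: ids for name, ids in backups.items()
           if not name.startswith("hourly")}

print(sorted(find_unreferenced(thinned, stored_objects)))  # ['b', 'c']
```

In this model, objects "b" and "c" sit on disk taking up space until something runs the sweep, which matches what I saw: the cleanup only happened once budget enforcement ran.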
So I recommend using a budget even if you don’t think you need one. Then, at least once every 30 days (the default, which you can change), Arq will clean up after itself.
Now, here’s another weird thing about budget enforcement. It’s not considered to be a backup so your pre/post scripts don’t run. Usually, when a backup runs, it halts all other backups. This is one of the worst “features” of Arq—all backups can halt for hours, days or weeks while it does maintenance. But in this case, the backups to that set kept running on schedule and complaining that there was no more disk space. Arq chose to run a backup for a set it was doing budget enforcement on.
I like this multitasking if budget enforcement can be done in parallel with backups. But, when the backup is already failing due to lack of space, it doesn’t make sense to keep doing something over and over again that’s doomed to fail. Isn’t that the definition of insanity?
Anyway, now that budget enforcement is enabled, I shouldn’t see any out-of-space errors unless a lot of data is backed up in that 30-day window. If I do see errors, I can get a larger disk, set a smaller budget, or shorten the enforcement window.
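The trade-off between those three knobs is simple arithmetic. Here is a rough sketch in Python. The function name and all the numbers (a 250 GB disk, half a gigabyte of new data per day) are hypothetical and only there to show the shape of the calculation. The 30-day default is the enforcement interval mentioned above.

```python
def suggest_budget_gb(disk_gb, growth_gb_per_day, interval_days=30):
    """Largest budget that still leaves room for one full enforcement
    interval of growth. Assumes growth is steady, which it rarely is."""
    headroom = growth_gb_per_day * interval_days
    return max(0.0, disk_gb - headroom)

# Hypothetical numbers, not measurements from my setup.
print(suggest_budget_gb(250, 0.5))                    # 235.0
print(suggest_budget_gb(250, 0.5, interval_days=7))   # 246.5
```

Shortening the window lets the budget sit closer to the disk size, at the cost of running enforcement more often.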
Earlier I reported on a problem with Arq not connecting to my 1and1.com SFTP server because Arq’s connection policy is hostile towards servers. I often get failure notifications. This latest time though I noticed something special in the log during object cleanup:
Skipping budget enforcement
But, but, I never configured that destination to even do budget enforcement. So yeah, I sure hope it would skip it. 🙄 I’ve only configured budget enforcement on the destination that ran out of disk space, yet Arq issues this confusing message for an unrelated destination. I sure hope this isn’t another serious bug like the one where configuring a pause in one destination acts like a global pause for all backups. 🤦🏻‍♂️ [It is! See the next update.]
In the evening, Arq started complaining that the first destination that started this post was once again full. This occurred faster than expected, but I guess I didn’t set the budget low enough. Did Arq then automatically trigger budget enforcement? Nope. It was content to keep running the backups and generating errors. So, yeah, insanity.
About 4 hours later, Arq ran budget enforcement (why now?!) and sent me an email saying there were 2 errors. But the email contained no error lines. I thought Arq’s counting bug had been fixed, but it’s still there under some circumstances.
When budget enforcement is turned on, Arq should regulate its disk usage like Time Machine does. On every backup it should check whether it has run over budget and, if so, trim older backups. Don’t wait until the disk fills up and the backup fails; that’s silly. As it stands now, I have to guess how much free space to leave on the disk, or how long to make the interval between enforcements, so that a backup won’t fill the disk and fail.
I shouldn’t have to babysit my backups so much, but that’s what Arq forces me to do.
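To be concrete about the behavior I’m asking for, here is a minimal Python sketch of a check that would run after every backup. It is a wish, not a description of Arq: the function name and the sizes are hypothetical, and it pretends each backup has its own independent size, which ignores the deduplication a real backup tool does.

```python
def enforce_budget(backups, budget_gb):
    """backups: list of (name, size_gb) tuples, oldest first.
    Drop the oldest backups until the total fits the budget,
    always keeping at least the newest one."""
    kept = list(backups)
    dropped = []
    total = sum(size for _, size in kept)
    while total > budget_gb and len(kept) > 1:
        name, size = kept.pop(0)   # oldest backup goes first
        dropped.append(name)
        total -= size
    return kept, dropped

# Hypothetical sizes in GB.
history = [("mon", 100), ("tue", 80), ("wed", 60)]
kept, dropped = enforce_budget(history, budget_gb=150)
print(kept)     # [('tue', 80), ('wed', 60)]
print(dropped)  # ['mon']
```

Run after each backup, a check like this would trim before the disk fills up instead of after the backup has already failed.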
As part of updating my monitor scripts to handle Arq 5.11’s new parallel background validation, I’ve observed that Arq is doing budget enforcement for destinations for which I have no budget set. Since budget enforcement does not run in parallel with backups, this uncalled-for activity blocks my backups until enforcement is finished, which could take hours. It looks like Arq does budget enforcement for all destinations if even just one has it enabled.
I have the Budget checkbox unchecked and the “Limit total size of backups to” field set to 0 GB. So just what limit does Arq think it’s enforcing?
Since the documentation is weak I don’t know for sure, but I’m hoping this unscheduled, uncalled-for budget enforcement is just Arq’s way of reclaiming unreferenced objects that it should have reclaimed when thinning occurred.