Tuesday, 16 May 2017

Finding deleted files and summing all numbers in a column


Here are two handy things, which can be combined into one!

First, is there a mismatch between what du says is used on disk and what df is reporting as used? You probably have some processes holding deleted files open: the directory entries have been removed, so du can't include their size in its totals, but the blocks are still allocated on disk, so df still reports them as used space.
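
A quick way to see the mismatch (assuming the filesystem in question is mounted at /var; substitute your own mount point):

df -h /var
sudo du -sh -x /var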

You often see this with log files where someone has gone in to clean up files because of a disk space alert, but a process is still writing to them. As an aside, it's better to avoid this whole situation: if you can't restart the process right away, truncate the file with >logfile.name rather than deleting it with rm.
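
For example (the log path here is just a placeholder), truncating frees the space while leaving the process's open file handle intact:

# truncate in place: the space is freed and the writer keeps its open file handle
> /var/log/myapp/app.log
# rm would instead leave the space allocated until the process is restarted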

Point one: find the deleted files. You can do this easily with:

sudo lsof | grep deleted

Also, if you know which process is holding the files open, you can use the proc filesystem and look in the process's fd directory; the links will have (deleted) after their names if the files have been removed (a short example follows below). Historical tip: the Flash plugin used to protect streaming videos (such as YouTube) by saving the file in /tmp and then deleting it. You could just go into proc and retrieve it if you liked; that's changed now of course with HTML5 and DRM. Anyway, back on topic.
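
A minimal sketch of the proc approach, assuming the process ID is 1234 (substitute the real PID from lsof or ps):

# deleted-but-open files show up as symlinks ending in "(deleted)"
sudo ls -l /proc/1234/fd | grep deleted

# the contents can still be read back through the file descriptor, e.g. fd 5
sudo cat /proc/1234/fd/5 > /var/tmp/recovered.log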

The second part of this is: now I have the list of deleted files how do I sum up all the size values to see how much disk space is used?

Here you can use awk to extract the size column, paste to replace the newlines with '+' (awk can probably do the whole job, but this is easier for me to remember), then pipe that into bc to get a total:

sudo lsof | grep deleted | awk '{ print $7}' | paste -sd+ | bc

Easy. That number should be the same as the mismatch between du and df values.

Wednesday, 10 May 2017

Tar and rename files using substring replacement

In my day-to-day work I'm often asked to tar up logs, frequently from hosts that all have the same log file paths. I've got this little snippet saved that renames the log file paths inside the tar file, so we don't clobber log files when extracting at the destination end.

find /var/log/jbossas/standalone -mtime -1 -type f -print | \
   xargs tar --transform "flags=r;s|var/log/jbossas/standalone|$(hostname)|" \
   -cvf /var/tmp/logs_$(hostname)_$(date +%Y%m%d).tgz

The first part of the command runs find, looking for any files modified in the past 24 hours (add -daystart to measure from the start of today instead). We print any filenames found and pass them to xargs, which runs tar and adds the files to the output archive. In the middle is the transform option: it does a simple substring replacement of "var/log/jbossas/standalone" with "$(hostname)" (note the double quotes around the transform expression so the shell expands $(hostname)).
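
To confirm the rename worked before shipping the archive off, listing it should show the hostname in place of the original directory (same output filename pattern as above):

tar -tvf /var/tmp/logs_$(hostname)_$(date +%Y%m%d).tgz | head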

And that's it. Simple tar filename transformation.

This is of course completely ignoring solutions like Splunk or Graylog, but vendors often want the raw log files to look at.

Extract start/end dates of SSL certificates across a list of servers

This openssl command will extract the certificate start and end dates from a server:

HOST=randomnamehost.com
openssl s_client -showcerts -connect ${HOST}:443 </dev/null 2>/dev/null | \
openssl x509 -noout -dates

This is useful for extracting the dates for monitoring/checking (although about a thousand other solutions could work too).
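
For a simple pass/fail check in a monitoring script, openssl can also test whether the certificate expires within a given number of seconds (2592000 = 30 days here):

openssl s_client -showcerts -connect ${HOST}:443 </dev/null 2>/dev/null | \
openssl x509 -noout -checkend 2592000 && echo "${HOST}: OK" || echo "${HOST}: expires within 30 days"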

Here's a quick application of that in a loop to check a list of servers I have saved:

for HOST in $(cat /host/name/file); do
    echo "==========="
    echo $HOST
    openssl s_client -showcerts -connect ${HOST}:443 </dev/null 2>/dev/null | \
    openssl x509 -noout -dates
done

There's most likely a Python module that does the same thing and would make comparing dates easier in a monitoring script. I should look for one.
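
In the meantime, a rough shell-only sketch (GNU date assumed; the host is just the example from above) for turning notAfter into days remaining:

HOST=randomnamehost.com
NOTAFTER=$(openssl s_client -connect ${HOST}:443 </dev/null 2>/dev/null | \
openssl x509 -noout -enddate | cut -d= -f2)
# GNU date can parse the "May 16 12:00:00 2018 GMT" style string openssl prints
echo "$HOST: $(( ($(date -d "$NOTAFTER" +%s) - $(date +%s)) / 86400 )) days remaining"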