February 18, 2008
Top-and-tailing log files

Some of our log files are huge - I have a 10 GB file on my HDD right now. These files are unwieldy to say the least, and you usually have a good idea of when a particular problem occurred, so it's often handy to be able to chop out a specified time range. I keep forgetting how to do this, and re-inventing the process. So, here for my own benefit are a couple of methods.

The hard way

From the bash shell:

wc -l your.log

This will count the lines in your file.

grep -n 12:00: your.log | head -n 1

This will give you the line number of the 1st line containing "12:00:" - in this example, I want log entries starting at midday, so this is the first line that I want.

grep -n 12:10: your.log | tail -n 1

This will give you the line number of the last line containing "12:10:" - in this example, I want log entries up to ten past twelve, so this is the last line that I want.

tail -n $((total-first+1)) your.log | head -n $((last-first+1)) > your_focused.log

Replace first, last and total with the values you got above, and you'll end up with a file containing only the time range that you wanted. (If you only want to look at the file once, you can just pipe into less or whatever rather than redirecting to an output file.)
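
For the record, here is a rough Python sketch of the same line-number approach, in case the arithmetic is easier to follow that way. The function names find_bounds and extract are made up for illustration, and the sample lines are invented; it assumes the timestamps appear literally in each line, as in the grep commands above.

```python
import itertools

def find_bounds(lines, start_marker, end_marker):
    """Return (first, last) as 1-based line numbers: the first line
    containing start_marker and the last line containing end_marker.
    Either value is None if the marker never appears."""
    first = last = None
    for number, line in enumerate(lines, start=1):
        if first is None and start_marker in line:
            first = number
        if end_marker in line:
            last = number
    return first, last

def extract(lines, first, last):
    """Yield lines first..last inclusive (1-based), which is
    last - first + 1 lines in total."""
    return itertools.islice(lines, first - 1, last)

# Invented sample data standing in for your.log.
sample = [
    "11:59:58 starting up\n",
    "12:00:01 first entry wanted\n",
    "12:05:00 something happened\n",
    "12:10:30 still wanted\n",
    "12:10:59 last entry wanted\n",
    "12:11:00 not wanted\n",
]
first, last = find_bounds(sample, "12:00:", "12:10:")
wanted = list(extract(sample, first, last))
```

With a real file you would open your.log twice (once for find_bounds, once for extract), which mirrors the several passes that the shell version makes.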

The easy way

python -c "import itertools, sys; sys.stdout.writelines(itertools.takewhile(lambda item: '12:10:' not in item, itertools.dropwhile(lambda item: '12:00:' not in item, open('your.log'))))" > your_focused.log

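Much the same thing, only this will read through the file just once. One difference: it stops at the first line containing "12:10:", so the 12:10 entries themselves are left out.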
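
If the one-liner is too dense to read at a glance, it can be unpacked into a small function. This is only a sketch; the name time_window and the sample lines are invented for illustration, and it relies on nothing beyond itertools.dropwhile and itertools.takewhile.

```python
import itertools

def time_window(lines, start_marker, end_marker):
    """Yield lines from the first one containing start_marker up to,
    but not including, the first later line containing end_marker."""
    # Skip everything before the start marker first appears.
    from_start = itertools.dropwhile(
        lambda line: start_marker not in line, lines)
    # Then keep going until the end marker shows up.
    return itertools.takewhile(
        lambda line: end_marker not in line, from_start)

# Invented sample data standing in for your.log.
sample = [
    "11:59:58 starting up\n",
    "12:00:01 first entry wanted\n",
    "12:05:00 something happened\n",
    "12:10:30 ten past\n",
    "12:10:59 still ten past\n",
    "12:11:00 eleven past\n",
]
up_to_ten_past = list(time_window(sample, "12:00:", "12:10:"))
through_ten_past = list(time_window(sample, "12:00:", "12:11:"))
```

To run it over the real file, pass open('your.log') as lines and hand the result to sys.stdout.writelines, as the one-liner does. If you want the 12:10 entries included, passing "12:11:" as the end marker should do it, assuming the log actually has an entry in that minute; otherwise it will carry on to the end of the file.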

Now, I'm fully expecting someone to come and tell me the real easy way. ;-)

Posted to Linux by Simon Brunning at February 18, 2008 12:25 PM