When developing the new website for Stats NZ, one thing was clear from the start: we needed a better content strategy.
How many pages?
How much content we even had was anyone's guess. In fact, when I joined the project I was told we had about 200,000 pages. Not believing it could be that high, I kept asking and got 70,000. Apparently there was no way to recover a page count from the CMS, and GA hadn't been configured to merge duplicate URLs that differed only by a trailing slash or a file extension.
Communicate the problem simply
In the end I got as accurate a picture as I could by exporting a year's worth of data, then running it through a small C# script I wrote to merge pages. I then did some further processing on the result and managed to produce a single image that resonated with senior management and succinctly justified the case for reducing content.
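For anyone wanting to try the same kind of merge, here is a minimal sketch of the idea. It is not the original C# script; it is a Python illustration written under assumptions: the export is taken to be rows of (URL, page views), the extension list in PAGE_EXTENSIONS is hypothetical, and lower-casing paths is my own guess at a sensible rule. The names normalise_path and merge_page_views are made up for this example.

```python
from collections import Counter
from urllib.parse import urlsplit
import posixpath

# Hypothetical set of extensions treated as duplicates of the bare path.
# Adjust to whatever the site actually serves.
PAGE_EXTENSIONS = {".html", ".htm", ".aspx"}


def normalise_path(url: str) -> str:
    """Reduce a URL to a canonical path so duplicate pages collapse together."""
    # Keep only the path; query strings and fragments are dropped.
    path = urlsplit(url).path
    # Assumption: paths are case-insensitive on this site.
    path = path.lower()
    # "/about/" and "/about" are the same page.
    path = path.rstrip("/")
    # "/about.html" and "/about" are the same page.
    root, ext = posixpath.splitext(path)
    if ext in PAGE_EXTENSIONS:
        path = root
    # The site root ends up empty after stripping, so restore it.
    return path or "/"


def merge_page_views(rows):
    """Sum page views per canonical path from (url, views) pairs."""
    totals = Counter()
    for url, views in rows:
        totals[normalise_path(url)] += views
    return totals


if __name__ == "__main__":
    sample = [
        ("/about/", 10),
        ("/about.html", 5),
        ("/about", 1),
        ("/contact", 2),
    ]
    merged = merge_page_views(sample)
    print(len(merged), "unique pages")
    for path, views in merged.most_common():
        print(path, views)
```

The number of keys in the merged result gives the de-duplicated page count, and the per-path totals are the starting point for any further processing, such as ranking pages by traffic.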