This is just "thinking out loud", and only a bare outline of a solution, but...
* Definition: a position that appears in a chess database is "plausible"
* the larger the database, the more positions will be repeated
* use statistics, curve fitting, maths and common sense to build the following function f:
Probability of a position being repeated = f ( move number, number of positions at that move number in the database )
* rearrange expressions from the coupon collector's problem (link) to have a guess at how many "plausible" positions there are (so instead of asking "how many cereal boxes must I buy to get all 10 plastic toys?", you're asking "I've bought 20 cereal boxes, I've got 8 different toys, how many toys are there in total?")
* decide which move number to stop at (100 might be sensible: there probably aren't enough "plausible" games lasting beyond that to make a significant difference)
* sum the estimated number of plausible positions over move numbers 1 to 100
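
To make the cereal-box step a bit more concrete, here is a rough Python sketch of the "inverse" coupon collector estimate. It leans on one big assumption that is certainly false for chess: every plausible position at a given move number is equally likely to turn up. Under that assumption, n draws from N items give an expected N * (1 - (1 - 1/N)^n) distinct items, and the code solves that for N by bisection. The function names and the per-move counts in the usage note are made up for illustration; nothing here comes from a real database.

```python
import math


def expected_distinct(total, draws):
    """Expected number of distinct items seen after `draws` uniform draws
    from `total` equally likely items. Requires total > 1."""
    # total * (1 - (1 - 1/total)**draws), written with log1p/expm1
    # so it stays accurate when total is very large.
    return -total * math.expm1(draws * math.log1p(-1.0 / total))


def estimate_total(draws, distinct, upper=1e12, tol=1e-6):
    """Estimate how many items exist, given that `draws` samples
    contained `distinct` different ones.

    ASSUMPTION: items are equally likely (not true of chess positions).
    """
    if distinct >= draws:
        # No repeats observed: the sample gives no upper bound at all.
        return float("inf")
    lo, hi = float(distinct), float(upper)
    # expected_distinct grows with total, so plain bisection works.
    while hi - lo > tol * lo:
        mid = (lo + hi) / 2.0
        if expected_distinct(mid, draws) < distinct:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def plausible_positions(counts_by_move):
    """Sum the per-move estimates.

    `counts_by_move` maps move number -> (positions seen, distinct positions).
    """
    return sum(estimate_total(n, d) for n, d in counts_by_move.values())
```

For the cereal example, `estimate_total(20, 8)` comes out between 8 and 9 toys. For the chess version you would call something like `plausible_positions({1: (20, 8), 2: (50, 30)})` with real counts for each move number up to your cut-off. Because popular positions are repeated far more often than obscure ones, the uniform assumption will tend to underestimate the true count, which is where the fitted function f would have to come in.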
