Originally Posted by eepjr24:
The problem you have with real place name generation is hard to overcome without either very large sample sets, an exclusion list or significantly increasing the temperature as you noted.
Yeah, my dataset had nearly 6000 entries, which I thought would be plenty, but I suppose more is always better.

My main issue with including real place names is that I wanted everything to be fictitious. I can usually pick out the real names from my Scottish list, but if I wanted to use different datasets from around the world, I'd have no clue what was real and what wasn't. Your idea of an exclusion list would work, though. I can get my Python script to check whether each generated name is in the original list and, if so, discard it. That won't be too hard, even for me!
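For anyone curious, here is a rough sketch of how that exclusion check might look. It is not my actual script: the function name filter_real_names and the sample names are made up for illustration, and I'm assuming the generated names and the training list are both plain lists of strings. The comparison ignores case and stray whitespace, and it also drops repeated outputs, which is an extra I added rather than part of the original idea.

```python
def filter_real_names(generated, real_names):
    """Return the generated names that do not appear in the real-name list."""
    # Normalise case and surrounding whitespace so "oban " still matches "Oban".
    known = {name.strip().casefold() for name in real_names}
    kept = []
    seen = set()  # normalised names already kept, to skip repeats
    for name in generated:
        key = name.strip().casefold()
        if not key or key in known or key in seen:
            continue  # empty, a real place from the dataset, or a repeat
        seen.add(key)
        kept.append(name.strip())
    return kept


# Sample values for illustration only (hypothetical data)
real_names = ["Glasgow", "Inverness", "Oban"]
generated = ["Glasgow", "Invermore", "oban", "Strathlochy", "Invermore"]
print(filter_real_names(generated, real_names))
```

Using a set for the lookup keeps the check fast even with a few thousand entries. Note that it only catches names that are in the training list itself, so a real place that was never in the dataset would still slip through.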


If you like rabbit holes, check out some of the more advanced routines that attempt to generate hypocoristic alternates.
Hehe, I think I'd very quickly get out of my depth with this. Sounds very interesting, though.