Computers do tend to be bitchy sometimes

Yeah, the lists' quality is passable at best. Actually, for Portuguese, English, Spanish and Italian words I was able to get high-quality lists from my university's NLP Lab, but they only had those languages. The lists you saw are compiled from movie subtitles (apparently pretty crappy ones). And even though I tried to clean them up a bit, I reckon that at least 10% of their words are just rubbish.

What software do you use to analyze the lists?
I wrote the software myself; I don't have a name for it yet. But there is a lot of room for improvement. I have a sketch of a pretty cool way to find patterns in words using artificial neural networks, I just don't have the time to implement it right now.

The biggest problem (besides the fact that it takes forever) with my method is that it's a bit TOO consistent. In later stages, it's a struggle to come up with origins for "figure of speech" type phrases, slang terms, contractions... things like that.
I hadn't really thought about that, but I can see how a rigid structure could make things harder after a while. Then again, I guess you could just let your language get a bit 'corrupted' as you evolve it, just like what happens in real languages over time.

Perhaps I should give you my word lists around the half-way point and let your software figure out the hard parts.
I'll be happy to help. I'm not absolutely sure about the software's output quality as of now, but I do have plans for improvement. It does serve me well, though.

I'll post the written alphabet as soon as my camera stops fighting me.
Hahaha I'll be waiting.

I'd love to see yours as well!
I used the opportunity to make the pronunciation cheat-sheet that I've been meaning to make. It is still kinda incomplete, since some letter combinations change the sound a bit: