Pronunciations like "expecially," "expresso," "aks" (ask), and "excetera" are generally markers of a lower socio-economic background. It is natural for a child learning a language to substitute a more familiar sound for a less familiar one. "Ex" sounds a lot like "esp," and "ex" is much more prevalent because it is a Latin prefix. Usually these mistakes are pointed out by parents or friends and corrected at a young age. However, if your parents and friends make the same mistake, it never gets corrected and in fact seems totally natural. These traits are passed down through generations within particular speech communities. Somewhere down the line, groups of people started mispronouncing these words, and no one corrected them because no one knew any better and speaking correctly simply wasn't vital to their livelihoods. These groups would have had little or no formal education and worked manual labor jobs. Their kids inherited the speech trait and it went viral. This is obviously an oversimplification, but it is how regional dialects form.
I'll bet that everyone here could reflect on speech patterns they inherited growing up and later realized sounded incredibly regional/hayseedy. Many (most?) people who matriculate at a university and/or become professionals actively work to remove these markers from their speech. Ironically, many of those same people later integrate regional/hayseedy features back into their speech in order to project or feel a sense of belonging in a community.