escaping query string with special characters with python
- by that_guy
I got some pretty messy urls that i got via scraping here, problem is that they contain spaces or other special characters in the path and query string, here is some example
http://www.example.com/some path/to the/file.html
http://www.example.com/some path/?file=path to/file name.png&name=name.me
so, is there an easy and robust way to escape the urls so that i can pass them to urlopen?
i tried urlib.quote, but it seems to escape the '?', '&', and '=' in the query string as well, and it seems to escape the protocol as well,
currently, what i am trying to do is use regex to separate the protocol, path name, and query string and escape them separately, but there are cases where they arent separated properly
any advice is appreciated