Why does b' (and sometimes b' ') show up when I split some HTML source? [Python]
- by Oliver
I'm fairly new to Python and programming in general. I have done a few tutorials and am about 2/3 of the way through a pretty good book. I've been trying to get more comfortable with Python and programming by just trying things out in the standard library.
I recently ran into a weird quirk that I'm sure is the result of my own incorrect or un-"pythonic" use of the urllib module (with Python 3.2.2):
import urllib.request
HTML_source = urllib.request.urlopen("http://www.somelink.com").read()
print(HTML_source)
When this is run in the interactive interpreter, it prints the HTML source of somelink, but prefixes it with b'
For example:
b'<HTML>\r\n<HEAD> (etc). . . .
If I split the string into a list on whitespace, every item in the list is prefixed with b' as well.
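A minimal sketch that reproduces this without any network access, assuming that what read() hands back is a bytes object rather than a str (in Python 3 the b prefix marks a bytes literal). The sample content and the choice of UTF-8 are assumptions made for illustration only:

```python
# Stand-in for the value returned by urlopen(...).read().
# Assumption: read() returns bytes, which is why the output starts with b'.
html_source = b"<HTML>\r\n<HEAD> <TITLE>Example</TITLE>"

# Splitting bytes yields a list of bytes objects, so every item
# is displayed with its own b prefix.
pieces = html_source.split()
print(type(html_source))
print(pieces)

# Decoding turns the bytes into a str; UTF-8 is assumed here, the real
# page may use a different encoding.
text = html_source.decode("utf-8")
words = text.split()
print(type(text))
print(words)
```

If that assumption holds, the b' is not part of the data itself, only part of how Python displays bytes.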
I'm not really trying to accomplish anything specific, just trying to familiarize myself with the standard library. I would like to know why this b' is getting prefixed.
Also, a bonus question: is there a better way to get HTML source WITHOUT using a third-party module? I know all that jazz about not reinventing the wheel and whatnot, but I'm trying to learn by "building my own tools".
Thanks in advance!
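On the bonus question, one possible standard-library-only approach is sketched below. The function name fetch_html and the fallback_encoding parameter are hypothetical names introduced for illustration, and the sketch assumes a Python 3 version in which the object returned by urlopen can be used in a with statement:

```python
import urllib.request


def fetch_html(url, fallback_encoding="utf-8"):
    """Fetch a URL and return its body as a str (hypothetical helper)."""
    # The with block closes the response once the body has been read.
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes
        # Use the charset declared in the Content-Type header if present,
        # otherwise fall back to the assumed default.
        charset = response.headers.get_content_charset() or fallback_encoding
    # errors="replace" avoids an exception on bytes that do not decode cleanly.
    return raw.decode(charset, errors="replace")
```

Calling fetch_html("http://www.somelink.com") would then give a plain str, so printing or splitting it should no longer show the b prefix.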