translating play in HTML to python
- by aharon
So, I'd like to represent one of Shakespeare's plays, Hamlet, into the following objects (maybe this isn't the best representation, if so please tell me):
class Play():
acts = []
...
def add_act(self, act): acts.append(act)
class Act():
scenes = []
...
def add_scene(self, scene): scenes.append(scene)
class Scene():
elems = []
def __init__(self, title, setting=""): ...
def add_elem(self, elem): elems.append(elem)
...
class StageDirection(): # elem
def __init__(self, text): ...
class Line(): # elem
def __init__(self, id, text, character = None): ...
# A None character represents a continuation from the previous line
# id could be, for example, 1.1.1
There are other methods, of course, for printing and such in each of the classes.
The question is, how do I get a structure based on these classes (or something like them) from HTML 4 code that looks like this:
<H3>ACT I</h3>
<h3>SCENE I. Elsinore. A platform before the castle.</h3>
<p><blockquote>
<i>FRANCISCO at his post. Enter to him BERNARDO</i>
</blockquote>
<A NAME=speech1><b>BERNARDO</b></a>
<blockquote>
<A NAME=1.1.1>Who's there?</A><br>
</blockquote>
<A NAME=speech2><b>FRANCISCO</b></a>
<blockquote>
<A NAME=1.1.2>Nay, answer me: stand, and unfold yourself.</A><br>
</blockquote>
<A NAME=speech3><b>BERNARDO</b></a>
<blockquote>
<A NAME=1.1.3>Long live the king!</A><br>
</blockquote>
<A NAME=speech4><b>FRANCISCO</b></a>
<blockquote>
<A NAME=1.1.4>Bernardo?</A><br>
</blockquote>
<A NAME=speech5><b>BERNARDO</b></a>
<blockquote>
<A NAME=1.1.5>He.</A><br>
</blockquote> <!-- for more, see the source of shakespeare.mit.edu/hamlet/full.html -->
translating that into something like this:
play = Play()
actI = Act()
sceneI = Scene("Scene I", "Elsinore. A platform before the castle.")
sceneI.add_elem(StageDirection("Francisco at his post. Enter to him Bernardo."))
sceneI.add_elem(Line("Bernardo", "Who's there?"))
...
Of course, I don't expect all the code—but what libraries and, when there aren't libraries, logic should I use?
Thanks.
(This is for a future opensource project and me learning Python for fun—not homework.)