A quick guide to using pso


Version:$Revision: 1.8 $
Author: thanos vassilakis
Document url: http://sourceforge.net/docman/display_doc.php?docid=11314&group_id=49265
Contact: thanos@businessfour.com
Feed-back: pso-development@lists.sourceforge.net
Copyright:thanos vassilakis 2001,2002
Contributions: Ming Huang and Vadim Shaykevich.
$Id: pso-guide.html,v 1.8 2003/01/03 17:25:06 thanos Exp $

  1. Introduction
  2. System Requirements
  3. Quick Example
  4. pso.request.ServiceRequest - servicing requests

  5. pso.session - session handling
  6. pso.parser - template parsing
  7. Installation
  8. Unix Installation Example

  1. Introduction:
  2. We developed pso for the following reasons:
    1. We wanted to develop web systems that would work as CGI scripts and mod_python request handlers on Apache, or as NSAPI handlers on Netscape, without thinking about httpd implementation details. To do this, pso includes the class ServiceRequest, which is a bridge between the server and any code we develop. This bridge also gives us a consistent and easy interface to the server and the client.
    2. We needed a fast and easy-to-use parser to extract SGML tags (which includes HTML and XML) and replace them with the result of a rendered object. We also wanted to parse the templates only once and render the object tree as needed. The pso parser is fast and simple, returning an object tree that is easily rendered or processed using the visitor pattern.
    3. We wanted to be able to just add a new tag to a template and drop in the relevant classes without changing the application code. Every tag represents an object whose class can be subclassed. You only need to place the new class in the python import path for it to be recognized and used by the parser.
    4. We found that only trivial systems did not need session handling. When you use pso, session handling is available by default.
    5. We wanted lots of useful methods to handle redirection, cookies, targets and status, as well as file uploads, other form handling, and url coding. pso offers these.
    We are programmers, so pso has been kept simple and basic. The template system offers no built-in tags; you have to build your own or subclass those contributed by ourselves or other users. We decided on this spartan approach on the basis that keeping pso simple and light would make it easier to maintain and keep error free. Who uses pso? Well, let us say the biggest stock market uses pso on the floor and in the back offices for their most used and important internet trading service. So will pso be maintained? They will!

  3. System Requirements
  4. Quick Example
  5. Let's start with a really simple CGI:
    A CGI script:
    #!/usr/bin/env python2.1
    
    
    def testHandler():
    	print "content-type: text/html"
    	print
    	print "hello world"
    
    if __name__ == '__main__':
    	testHandler()
    
    The same thing as a mod_python script:
    from mod_python import apache
    
    def testHandler(req):
    	req.send_http_header()
    	req.write( "hello world")
    	return apache.OK
    
    The above programs rewritten in pso:
    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler
    
    def testHandler(serviceRequest):
    	print "Hello World!"  
    
    if __name__ == '__main__':
    	ServiceHandler().run(testHandler)
    

    Test: http://www.0x01.com/%7Ethanos/pso/guide/testGuide.cgi [ mod_python test ]

  6. pso.request.ServiceRequest - servicing requests
  7. Form Input
    The ServiceRequest object has the following methods:
    • hasInputs(self, *keys) - tests if a field or fields in the form were filled.
    • getInputs(self, key=None) - given a key, returns a list of all the values associated with that key. If none exist it returns an empty list. When getInputs() is called without a key, a cgi.FieldStorage object is returned.
    • getInput(self, key, default=None, index=None) - returns the given form field value as a string. If there are multiple values under the same key, it returns the first in the list, unless index is given. If no value is found it returns "", unless default is given.
    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=input - mod_python test]
    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler, OK
    
    
    def testInput(serviceRequest):
    	if serviceRequest.hasInputs("submit"):
    		name = serviceRequest.getInput("name")
    		options =  ",".join(serviceRequest.getInputs("option"))
    		print """<pre>
    			name: %s,
    			options: %s
    			</pre>""" % (name, options)
    	else:
    		print """
    			<form >
    			<input name="name">
    			<input name="option" value="alpha" type="checkbox">alpha
    			<input name="option" value="omega" type="checkbox">omega
    			<input type="hidden" name="test" value="input">
    			<input type="submit" name="submit" value="submit">
    			</form>
    			"""
    	return OK
    
    if __name__ == '__main__':
    	ServiceHandler().run(testInput)
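
    The lookup rules of getInputs and getInput can be tried out without pso. The sketch below is not pso's implementation: get_inputs and get_input are hypothetical helpers built on the standard library's urllib.parse.parse_qs, the query string is made up, and the code is written for a current Python 3 so that it runs on its own.

```python
from urllib.parse import parse_qs

def get_inputs(fields, key):
    """Return every value submitted under key, or an empty list."""
    return fields.get(key, [])

def get_input(fields, key, default=None, index=None):
    """Return one value as a string: the first, or the one at index."""
    values = fields.get(key, [])
    if not values:
        # Mirrors the documented rule: "" unless a default is given.
        return "" if default is None else default
    return values[index or 0]

# Hypothetical query string, shaped like the form in the example above.
fields = parse_qs("name=thanos&option=alpha&option=omega&submit=submit")
name = get_input(fields, "name")
options = ",".join(get_inputs(fields, "option"))
second = get_input(fields, "option", index=1)
missing = get_input(fields, "colour", default="none")
```

    Here both checkbox values end up in options, while get_input picks a single one.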
    

    Server, request header & environment variables
    pso emulates the CGI standard, and these variables are available through ServiceRequest.getEnviron(self, key=None, default=None), which takes a key and returns either a string or default. If no key is given, a dictionary of these variables is returned.
    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=environ - mod_python test]
    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler, OK
    
    def testEnviron(serviceRequest):
    	print "<ul>"
    	for keyValue in serviceRequest.getEnviron().items():
    		print "<li>%s: %s" % keyValue
    	print "</ul>"
    	return OK
    
    
    if __name__ == '__main__':
    	ServiceHandler().run(testEnviron)
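
    The calling convention of getEnviron (one value for a key, the whole dictionary without one) can be pictured with a few lines of plain Python. This is only a sketch under assumptions: get_environ is a hypothetical stand-in, not part of pso, and the variables are passed in by hand so that it runs outside a web server on a current Python 3.

```python
import os

def get_environ(key=None, default=None, environ=None):
    """With no key, return a dict of all variables; otherwise one value."""
    source = os.environ if environ is None else environ
    if key is None:
        return dict(source)
    return source.get(key, default)

# Hypothetical CGI-style variables, passed in so the sketch runs anywhere.
cgi_vars = {"REQUEST_METHOD": "GET", "QUERY_STRING": "test=environ"}
method = get_environ("REQUEST_METHOD", environ=cgi_vars)
agent = get_environ("HTTP_USER_AGENT", "unknown", environ=cgi_vars)
everything = get_environ(environ=cgi_vars)
```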
    

    Handling cookies
    • You get cookies using ServiceRequest.getCookies(self) that returns a dictionary of cookies and their values.
    • You get a cookie using ServiceRequest.getCookie(self, key, default=None) which returns the cookie value requested by key otherwise returns default.
    • You set a cookie using ServiceRequest.setCookie(self, key, value, **attrs), which sets cookie key to value. **attrs is a keyword parameter through which you can pass the cookie attributes as defined in RFC 2109:
      • Comment
      • Domain
      • Max-Age
      • Path
      • Secure
      • Version
      • expires - used by Netscape et al. This takes a string date such as "Tue, 12 Nov 2002 13:04:56 GMT", as defined in RFC 2068 section 3.3.1 (see also RFC 822 and RFC 1123).

    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=cookie - mod_python test ]

    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler, OK
    
    def testCookie(serviceRequest):
    	if not serviceRequest.getCookie("MyTest"):
    		print 'setting cookie "MyTest" to "is Tasty" reload to see this cookie'
    		serviceRequest.setCookie('MyTest','is Tasty')
    	print"<ul>"
    	for cookieValue in serviceRequest.getCookies().items():
    		print "<li>%s: %s" % cookieValue
    	print "</ul>"
    	return OK
    
    if __name__ == '__main__':
    	ServiceHandler().run(testCookie)
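
    To see how keyword attributes of the kind setCookie accepts could end up in a response header, here is a small sketch using the standard library's http.cookies module on a current Python 3. It does not use pso; build_set_cookie is a hypothetical helper, and spelling Max-Age as max_age is an assumption made only because keyword names cannot contain a hyphen.

```python
from http.cookies import SimpleCookie

def build_set_cookie(key, value, **attrs):
    """Return a Set-Cookie header line for key=value plus its attributes."""
    cookie = SimpleCookie()
    cookie[key] = value
    for name, attr_value in attrs.items():
        # Keyword names cannot contain "-", so max_age stands for Max-Age.
        cookie[key][name.replace("_", "-")] = attr_value
    return cookie.output()

header = build_set_cookie("MyTest", "is Tasty", path="/", max_age=3600)
```

    SimpleCookie rejects attribute names it does not know, so a misspelt attribute fails loudly instead of being sent to the browser.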
    

    File Uploads
    pso makes file uploading very easy. It offers the method ServiceRequest.getFile(self, key) that returns a PSOFields object, a subclass of cgi.Field. The returned object has some additional member fields:
    • filename - the original file's name
    • file - a file object holding the actual file
    • tempname - is None until the member method keep() is called.
    The returned field object also has some new methods:
    • keep() - For each uploaded file the Python standard cgi library opens a temporary file and immediately deletes (unlinks) it. The trick (on Unix!) is that the file can still be used, but it can't be opened by any other process, and it will automatically be deleted when it is closed or when the current process terminates. keep() gives this temporary file a new temporary name, which is then available as tempname. This is especially useful for forms that have a confirmation screen.
    • save(pathName) - renames an uploaded file.
    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=upload - mod_python test ]
    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler, OK
    
    def testUpload(serviceRequest):
    	if serviceRequest.hasInputs('file'):
    		file = serviceRequest.getFile('file')
    		file.keep()
    		print """
    		<form >
    			Save As: <input name="saveAs" type="text" >
    			<input name="tempfile" type="hidden" value="%s">
    			<input name="test" type="hidden" value="upload">
    		</form> """ % file.tempname
    
    	elif serviceRequest.hasInputs('saveAs'):
    		tempFile = serviceRequest.getInput('tempfile')
    		saveAs = serviceRequest.getInput( 'saveAs')
    		import os
    		print "renaming", tempFile, 'to', '/tmp/'+saveAs
    		os.rename(tempFile, '/tmp/'+saveAs)
    		print """
    		DONE - Thank you
    		"""
    	else:
    		print """
    			<form  enctype="multipart/form-data" method="POST">
    			file: <input name="file" type="file" >
    			<input name="test" type="hidden" value="upload">
    			<input name="action" type="submit" value="Upload">
    		</form>"""
    	return OK
    
    
    if __name__ == '__main__':
    	ServiceHandler().run(testUpload)
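
    The idea behind keep() and save() can be illustrated with the standard tempfile module. This is a sketch of the technique on a current Python 3, not pso's code: the function keep below is a hypothetical stand-in that copies an anonymous temporary file to a named one so that a later request can still find it.

```python
import os
import shutil
import tempfile

def keep(upload):
    """Copy an anonymous upload to a named temporary file; return its path."""
    upload.seek(0)
    with tempfile.NamedTemporaryFile(delete=False, suffix=".upload") as named:
        shutil.copyfileobj(upload, named)
    return named.name

# Stand-in for an uploaded file: TemporaryFile has no usable name on Unix.
upload = tempfile.TemporaryFile()
upload.write(b"uploaded bytes")

tempname = keep(upload)
with open(tempname, "rb") as saved:
    kept = saved.read()

# A later request (for example after a confirmation screen) can now
# rename the file into place, much like save(pathName).
final_path = tempname + ".saved"
os.rename(tempname, final_path)
```

    Passing a name like tempname through a hidden form field, as the example above does, is convenient for a demonstration; a real service would want to check that the name it gets back is one it handed out.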
    

    Redirection
    Redirection is essential and pso makes this easy. Just call ServiceRequest.redirect(someUrl) and your script will terminate and redirect the client's browser.

    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=redirect - mod_python test ]

    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler,OK
    
    def testRedirect(serviceRequest):
    	url =  serviceRequest.getInput('url')
    	if not url:
    		print """
    		<form >
    		Redirect to : <input name="url" type="text" size="40">
    		<input name="test" type="hidden" value="redirect">
    		</form>"""
    	else:
    		serviceRequest.redirect(url)
    	return OK
    
    
    
    if __name__ == '__main__':
    	ServiceHandler().run(testRedirect)
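
    For readers curious what a redirect amounts to at the CGI level, the sketch below builds the header block by hand: a Status line and a Location line, followed by a blank line. It is an illustration on a current Python 3, not pso's implementation; redirect_response and its permanent flag are hypothetical names.

```python
from http import HTTPStatus

def redirect_response(url, permanent=False):
    """Return the CGI header block for a redirect to url."""
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    lines = [
        "Status: %d %s" % (status.value, status.phrase),
        "Location: %s" % url,
        "",  # a blank line ends the headers
    ]
    return "\r\n".join(lines) + "\r\n"

temporary = redirect_response("http://www.0x01.com/")
permanent = redirect_response("http://www.0x01.com/", permanent=True)
```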
    

    Status
    By default pso always sets the return status to 200. When invoking redirect, pso will set the status code to either 301 or 302. To set the code explicitly you have a choice of invoking:
    • sendStatus(self, status) - this immediately sends the status code and terminates your handler.
    • setStatus(self, status) - this sets the status code but your request handler will continue execution.
    For a complete list of these codes you should refer to RFC 2616, section 6.1.1.

    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=status - mod_python test ]

    #!/usr/bin/env python2.1
    
    from pso.service import ServiceHandler,OK
    
    def testStatus(serviceRequest):
    	import time
    	print time.asctime()
    	if not serviceRequest.hasInputs('status'):
    		print """
    		<form>
    			<input name="status" type="text" size="3" maxlength="3">
    			<input name="test" type="hidden" value="status">
    		</form>"""
    	else:
    		status = serviceRequest.getInput('status', '204')
    		serviceRequest.sendStatus(status)
    	return OK
    
    if __name__ == '__main__':
    	ServiceHandler().run(testStatus)
    

    Target and other http headers
    pso sets many headers for you (status, content type, cookies), and it is easy to set your own using:
    • setHeaderOut(self, key, value) - replaces the header entry with the same key.
    • addHeaderOut(self, key, value) - adds the header entry.

    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=headerOut - mod_python test ]

    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler,OK
    
    def testHeaderOut(serviceRequest):
    	if serviceRequest.hasInputs('url','target'):
    		serviceRequest.setHeaderOut('Window-target', serviceRequest.getInput('target'))
    		print "<ul>" 
    		table = serviceRequest.getHeadersOut()
    		for k in table.keys(): 
    			print "<li>",k, table[k] 
    		print "</ul>"
    		print serviceRequest.getInput('message')	
    	else:		
    		print """
    		Window target seems to only work with netscape, please try and let us know.
    		<form>
    			target: <input name="target" type="text" > 
    			message: <input name="message" type="text" size=50 > 
    			<input type="hidden" name="test" value="headerOut">
    			<input name="action"  type="submit" value="write to target"> 
    		</form>"""
    
    
    if __name__ == '__main__':
    	ServiceHandler().run(testHeaderOut)
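
    The difference between setHeaderOut (replace) and addHeaderOut (append) can be made concrete with a small table of headers. The class below is a hypothetical model written for a current Python 3, not pso's data structure, and the case-insensitive comparison of keys is an assumption.

```python
class HeadersOut:
    """Ordered outgoing headers; keys compare case-insensitively."""

    def __init__(self):
        self.entries = []  # list of (key, value) pairs

    def add(self, key, value):
        """Append an entry, keeping any existing ones with the same key."""
        self.entries.append((key, value))

    def set(self, key, value):
        """Drop existing entries with the same key, then append the new one."""
        self.entries = [(k, v) for k, v in self.entries
                        if k.lower() != key.lower()]
        self.entries.append((key, value))

headers = HeadersOut()
headers.set("Content-Type", "text/html")
headers.add("Set-Cookie", "a=1")
headers.add("Set-Cookie", "b=2")           # both cookies are kept
headers.set("content-type", "text/plain")  # replaces the earlier entry
```

    Appending matters for headers such as Set-Cookie that may legitimately appear more than once in a response.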
    

    sys.stdout
    When using pso, sys.stdout is buffered until the termination of the request handler or the invocation of ServiceRequest.send_http_header(self, content_type='text/html'). This lets you use print without worrying about when you set the headers or cookies, or when you want to redirect. Buffering can hurt performance on very large return screens, yet it simplifies the program logic, and in most cases web services try to fit their results on one screen. When you want to stop buffering, just send the headers using the method send_http_header. If you want to write directly to sys.stdout before sending the headers you can use:
    serviceRequest.write(someString)
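
    Why buffering makes it safe to print before setting headers can be shown in a few lines. The sketch below is an illustration on a current Python 3 and does not use pso: run_buffered and late_header_handler are hypothetical names, and the headers are kept in a plain dictionary for simplicity.

```python
import contextlib
import io
import sys

def run_buffered(handler, out=sys.stdout):
    """Run handler with print output captured, then emit headers and body."""
    headers = {"content-type": "text/html"}
    body = io.StringIO()
    with contextlib.redirect_stdout(body):
        handler(headers)  # the handler may print first and set headers later
    for key, value in headers.items():
        out.write("%s: %s\n" % (key, value))
    out.write("\n")  # blank line between headers and body
    out.write(body.getvalue())

def late_header_handler(headers):
    print("hello world")
    headers["set-cookie"] = "MyTest=1"  # set after output was produced

captured = io.StringIO()
run_buffered(late_header_handler, out=captured)
response = captured.getvalue()
```

    Because nothing reaches the client until the handler returns, the cookie set after the print still lands in the header block.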
    

    sys.stderr and logging
    By default the output to sys.stderr is posted to the httpd log, but this can easily be changed using one of the following httpd directives:
    CGI:        SetEnv PSOLog=/path/to/some/Log
    mod_python: PythonOption PSOLog=/path/to/some/Log
    pso.ServiceRequest also has a member function log(self, *listToPost) that will post a line to the log starting with a timestamp then followed by the parameters in listToPost.

    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=error - mod_python test ]

    #!/usr/bin/env python2.1
    from pso.service import ServiceHandler, OK
    
    class ErrorToBeLogged(Exception): pass
    class ErrorForTraceBack(Exception): pass
    
    def testError(serviceRequest):
    	print "<pre>"
    	try:
    		raise ErrorToBeLogged()
    	except ErrorToBeLogged, e:
    		serviceRequest.log('just caught this error:', e.__class__, "<br>")
    		try:
    			raise ErrorForTraceBack()
    		except:
    			import traceback
    			traceback.print_exc()
    	import os
    	print os.popen("tail -50 /tmp/psotestlog").read()
    	return OK
            
                   
    
    if __name__ == '__main__':
    	ServiceHandler().run(testError)
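
    The shape of a log line (a timestamp followed by the posted values) can be sketched as follows. This is a hypothetical stand-in written for a current Python 3, not the code behind ServiceRequest.log; the timestamp format is an assumption, and an in-memory stream takes the place of the log file.

```python
import io
import time

def log(stream, *list_to_post):
    """Write one line: a timestamp followed by the given values."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    text = " ".join(str(item) for item in list_to_post)
    stream.write("%s %s\n" % (stamp, text))

log_file = io.StringIO()  # stands in for the file named by PSOLog
log(log_file, "just caught this error:", ValueError)
line = log_file.getvalue()
```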
    
    building urls
    Any web request handler will be building many urls. This is tedious and error prone so pso.ServiceRequest offers a set of member methods to make things easy for you:
    • baseUri(self) - TODO.
    • buildUri(self, parts, clean, **kws) - TODO.
    • serviceUri(self, clean=1, **kws) - TODO.
    • uriParts(self) -TODO.

    Session Handling
    By default every pso request has a session object; getSession() retrieves it. It is a mutable dictionary whose contents are saved between invocations of the request handler. The documents "Easy mod_python session handling using pso.session" and "Easy CGI session handling using pso.session" describe session handling with pso in greater detail.

    Here is an example: [http://www.0x01.com/~thanos/pso/tests/guide/testGuide.cgi?test=session - mod_python test ]

    #!/usr/bin/env python2.1
    
    from pso.service import ServiceHandler, OK
    	
    def testSession(serviceRequest):
    	session = serviceRequest.pso().session
    	try:
    		session['reloads'] += 1
    	except KeyError:
    		# first visit: no counter stored in the session yet
    		session['reloads'] = 0
    	print "<br>hello World!  ~ Your number of reloads: %(reloads)d ~ Try Reload !" %  session
    	print "<br>session: %s" %  session.__dict__
    	return OK
    
    
    if __name__ == '__main__':
    	ServiceHandler().run(testSession)
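
    A dictionary whose contents survive between invocations can be modelled with pickle and a file per session. The sketch below runs on a current Python 3 and is not how pso.session is implemented: the Session class, the file layout and the session id are all assumptions chosen for illustration.

```python
import os
import pickle
import tempfile

class Session(dict):
    """A dictionary that can be saved to, and loaded from, a file."""

    def __init__(self, path):
        dict.__init__(self)
        self.path = path
        if os.path.exists(path):
            with open(path, "rb") as stored:
                self.update(pickle.load(stored))

    def save(self):
        with open(self.path, "wb") as stored:
            pickle.dump(dict(self), stored)

def handle_request(path):
    """One invocation of a handler: load, count a reload, save."""
    session = Session(path)
    session["reloads"] = session.get("reloads", -1) + 1
    session.save()
    return session["reloads"]

# Hypothetical session id; a real service would read it from a cookie.
session_path = os.path.join(tempfile.mkdtemp(), "SESSION_ID-60123")
counts = [handle_request(session_path) for _ in range(3)]
```

    Each call to handle_request starts from nothing but the path, which is what a fresh CGI process has to work with on every request.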
    

  8. pso.parser - template parsing

    pso templates
    The pso parser will process any file, looking for sgml tags. On parsing a template the pso parser generates a renderer object tree. This tree is saved under the template's name plus the extension ".pso". Until the template is changed again, the compiled version of the template is used. The question often asked is why SGML and not XML. We found that for the sites and services we developed, the designers would create html templates. These templates were often not canonical: the designers used many tricks to create their desired effect. It proved impossible to use the standard available XML parsers on non-standard markup. Hence we have tried to make the parser as robust as possible.
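
    The rule "reuse the saved tree until the template changes" can be sketched with a modification-time check. This is an illustration on a current Python 3 under stated assumptions, not the pso parser: load_tree and toy_parse are hypothetical, pickle is assumed as the storage format, and comparing file modification times is only one way to detect a changed template.

```python
import os
import pickle
import tempfile

def load_tree(template_path, parse):
    """Return the parsed tree, reusing template_path + '.pso' when fresh."""
    cache_path = template_path + ".pso"
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(template_path)):
        with open(cache_path, "rb") as cached:
            return pickle.load(cached)
    with open(template_path) as source:
        tree = parse(source.read())
    with open(cache_path, "wb") as cached:
        pickle.dump(tree, cached)
    return tree

calls = []

def toy_parse(text):
    """Stand-in parser: records each call and splits the text into words."""
    calls.append(text)
    return text.split()

template = os.path.join(tempfile.mkdtemp(), "template.html")
with open(template, "w") as handle:
    handle.write("hello <pso /> world")

first = load_tree(template, toy_parse)
second = load_tree(template, toy_parse)  # served from template.html.pso
```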
    pso tags
    • When used in a template the tags can be of three forms:
       <pso  pso="tagpackage.tagmodule:SomeTagClass" /> 
      or
       <input  pso="tagpackage.tagmodule:MyInput" /> 
      or
      <tagpackage.tagmodule:SomeTagClass />
      The first is much faster to parse. As you can probably guess, the tag's name includes the package path terminated by the module name, followed by a colon and then the class that is responsible for rendering the tag. The second lets you lace existing tags, typically HTML, with your own renderer. The third form lets you parse existing pages (XML, or for screen scraping and robot type activities). The pso parser (as will be shown below) can be used to parse any sgml tag such as
       <a  href="http://www.0x01.com/" >a great web site</a> 
      Tag names can be any valid entity name; only pso is reserved. As you can see, pso tags don't have to be singlets and can be written as:
      <pso pso="tagpackage.tagmodule:SomeTagClass" > what ever you want </ tagpackage.tagmodule:SomeTagClass> 
      They can be nested, which is not the case in most templating systems.
      Your Favourite Drink:
      <form pso="mytags:DrinkPoll">
      Water: <input type=radio pso="questionaire:Drink" /><br>
      Beer: <input type=radio pso="questionaire:Drink" /><br>
      </form>
      
      
      The tags can have attributes.

      <pso pso="mytags:DbTextField" table="clients" /> 
    • The coding of tags is straightforward. Your class should subclass pso.parser.Tag and implement a render member method that returns a string or None. The default tree renderer invokes the Tag object itself, so if you use the default renderer you just need to override __call__. Your render method will usually return a string.

      from  pso.parser import Tag
      
      class  Welcome(Tag):
      	def __call__(self, renderer, cdata=''):
      		if not loggedIn():
      			return self.getAttrs()['default']
      		return "Welcome %s" % getName()
      
      
      The parameter renderer is the visitor that traverses the object tree, invoking each object's render method. The parameter cdata is the pre-parsed and rendered data between the beginning of this tag and its end. So this tag
      <pso pso="mytags:Welcome"  intro="Welcome %s" >Please Login</pso> 
      and the code below have the same effect as the previous example:
      from  pso.parser import Tag
      
      class  Welcome(Tag):
      	def __call__(self, renderer, cdata=''):
      		if not loggedIn():
      			return cdata
      		return self.getAttrs()['intro'] % getName()
      
      pso tags are instantiated with their attributes as their constructor's parameters. So you could do:
      from  pso.parser import Tag
      
      class  MyWelcome(Welcome):
      	def __init__(self, **attrs):
      		attrs.setdefault('intro','have a good time: %s')
      		Welcome.__init__(self, **attrs)
      
      The above example really shows the power of the OO model of pso.parser.Tag.

    • pso tags come with some useful member attributes and methods:
      • getAttrs(self) - returns a case-insensitive map of the tag's attributes.
      • getChildren(self) - returns a list of the tags directly nested within this tag.
      • traverse(self, renderer=None) - invokes the method renderer on every tag in this tag's tree of children.
      • preProcess(self) - this is a callback that is called just before a tag's children are rendered.
        class MyForm(Tag):
        	def validator(self, obj, cdata):
        		if obj:
        			try:
        				obj.validate()
        			except ValidationError, e:
        				self.errors.append(e)
        
        	def preProcess(self):
        		self.errors = []
        		self.traverse(self.validator)
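
    The "package.module:ClassName" naming scheme used in the pso attribute maps naturally onto Python's import machinery. The sketch below shows one way such a name could be resolved on a current Python 3; it is an assumption about the general technique, not the parser's actual code, and resolve_tag_class is a hypothetical helper. A standard library class stands in for a tag class.

```python
import importlib

def resolve_tag_class(name):
    """Turn 'package.module:ClassName' into the class object it names."""
    module_path, _, class_name = name.partition(":")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# A standard-library class stands in for a tag class here.
tag_class = resolve_tag_class("xml.dom.minidom:Element")
```

    Since the lookup goes through the normal import path, dropping a new module into that path is enough to make its classes reachable by name.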
        

    pso parser
    Parsing is easy. You invoke the parser which will return a tree of objects.
    from pso.parser import Parser
    tree = Parser().parseFile('some_template.html')
    
    This means that in persistent http systems you only have to parse a template once. Then on each request just render the object tree:
    from pso.parser import Parser
    from pso.service import ServiceHandler
    tree = Parser().parseFile('template.html')
    	
    def testParser(serviceRequest):
    	print tree.render()
    
    if __name__ == '__main__':
    	ServiceHandler().run(testParser)
    
    Parsing a template only once is a terrific saving in processing time.

    The Parser constructor can take a useful parameter:

    parser = Parser(defaultModule="html")
    This allows the parser to recognize tags such as:
    <a href="http://www.sourceforge.net/">sourceforge</a>
    and write code such as this:
    #mypackages/html.py
    
    from pso.parser import Parser, Tag
    
    class a(Tag):
    	# the default renderer calls tag(renderer, cdata)
    	def __call__(self, renderer, cdata=''):
    		href = self.getAttrs().get('href')
    		if href:
    			return """%s [<a href="%s">%s</a>]""" % (cdata, href, href)
     
    class A(a): pass
     
     
    psoParser = Parser(defaultModule=a.__module__)
    
    When you parse a template there are several parameters you can pass to the parser:
      Parser.parseFile(self, filePath, oPath='', noCache=0, reload=0, allTags=0)
    • filePath - is the template.
    • oPath - this is the directory where the pso files (object trees) are saved.
    • cache - setting cache to 1 turns on the creation and use of saved object trees. By default this is set to 0. Normally, if you are using <pso> style tags, parsing a template can be as fast as pickling and unpickling the object trees. While you are developing, you will be making changes to your code that won't be reflected in the saved template, which can waste serious debugging time. In a well set up system, or when you are parsing very complex templates or non <pso> style tags, setting cache to 1 will give you some gains.
    • reload - if reload is set then each time a tag is parsed its module will be imported and reloaded. By default a module is only imported the first time. When you are debugging and developing it is a good idea to set reload on.
    • taglevel -
      • 0 = parse only pso tags. Very Fast
      • 1 = The default. Parse only pso tags and tags having a pso attribute. Fast
      • 2 = parse all tags. Not so fast.
        from pso.parser import Parser, Tag
        
        class a(Tag):
        	def __call__(self, renderer, cdata=''):
        		href = self.getAttrs().get('href')
        		if href:
        			return """%s [<a href="%s">%s</a>]""" % (cdata, href, href)
         
        class A(a): pass
         
         
        psoParser = Parser(defaultModule=a.__module__)
        # 'template.html' is a placeholder for the path of your template
        tree = psoParser.parseFile('template.html', taglevel=2)
        print tree.render()
        	
    pso tree
    The object tree is returned by the parser. Every object on the tree represents either a tag entity object or cdata. The pso.parser.Tree class has a member method render. render is used to process the tree with a visitor method. The default visitor is pso.parser.Tree.renderer:
    def renderer(self, object, cdata=''):
    	if not object:
    		return cdata
    	return object(self, cdata)
    	
    Used with the default render method this renderer will replace the tag with the result of the tag's render method. As parameters, a renderer method must accept the object to be rendered and the cdata (if any) that is between the start and end tags. What the renderer does with the tag is your choice!

    This renderer just returns the cdata

    def renderer(self, object, cdata=''):
    	if not object:
    		return cdata
    	
    This one just returns the docstring of the tag:
    def renderer(self, object, cdata=''):
    	if  object:
    		return object.__doc__
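
    How a visitor walks an object tree and how its return values are joined together can be tried on a toy tree. The code below is a self-contained model written for a current Python 3; Node, render, default_renderer and text_only are hypothetical names and are not the classes in pso.parser.

```python
class Node:
    """A tag (or the root) holding child nodes and plain text (cdata)."""

    def __init__(self, name, *children):
        self.name = name
        self.children = children

    def __call__(self, renderer, cdata=""):
        """Default tag behaviour: wrap the rendered children."""
        return "[%s:%s]" % (self.name, cdata)

def render(node, renderer):
    """Render children first, concatenate, then visit the node itself."""
    if isinstance(node, str):
        return renderer(None, node)
    cdata = "".join(render(child, renderer) for child in node.children)
    return renderer(node, cdata)

def default_renderer(obj, cdata=""):
    """Modelled on the default visitor: call the tag, pass text through."""
    if not obj:
        return cdata
    return obj(default_renderer, cdata)

def text_only(obj, cdata=""):
    """Alternative visitor: ignore the tags and keep only the text."""
    return cdata

tree = Node("form", "Water: ", Node("input"), " Beer: ", Node("input"))
rendered = render(tree, default_renderer)
plain = render(tree, text_only)
```

    Swapping the visitor changes what the same tree produces without touching the tag classes, which is the point of the pattern.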
    	

    By default the return values of the visitor are concatenated. Again you can do things differently if you need to:

    import sys
    from pso.parser import Parser
    class TagDocumentor:
    	def document(self, object, cdata=''):
    		if object:
    			self.documentation += '\t<dt>%s</dt><dd>%s</dd>\n' % (object.__class__, object.__class__.__doc__)
    	
    	def do(self, infile):
    		self.documentation = ''
    		psoParser = Parser()
    		psoTree = psoParser.parseFile(infile)
    		psoTree.render(self.document)
    		print "<dl>%s</dl>" % self.documentation
    
    if __name__ =="__main__":
    	TagDocumentor().do(sys.argv[1])
    

    This concludes our quick guide to pso. Please take a look at our pso-example as it shows you how to build a pretty usable site in about one hour!
    B.T.W We really welcome any questions: pso@0x01.com

  9. Installation
  10. pso should be installed using Distutils, which comes standard with python 1.6 and up.

    1. Download the distribution file at sourceforge.net
    2. Uncompress the distribution file
    3. Change into the directory created during the uncompression of distribution file
    4. execute the command:
      python setup.py install

  11. UNIX Installation Example
    1. create your project directory:
      $ mkdir ~/public_html/testpso
    2. create sub-directory:
      $ mkdir ~/public_html/testpso/templates
    3. Download and un-tar the latest pso package:
      $ wget  http://pso.sourceforge.net/dist/current.tgz
      $ tar zvfx current.tgz
      or for MS lovers: [ http://pso.sourceforge.net/dist/current.exe]
      	
    4. cd into the pso directory and then install it locally into your testpso directory.
      $ cd  pso-XX/
      $ python2.1 setup.py install --install-purelib=~/public_html/testpso
      
    5. Create a .htaccess file with the following:
      #.htaccess for testing pso using CGI
      #
      # Options ExecCGI                 directs apache to allow CGIs to be run from this directory.
      # AddHandler cgi-script .cgi      treats any .cgi file as a script.
      # SetEnv PSOServiceId MyPSOTest   names the session cookie MyPSOTest.
      
      Options ExecCGI Indexes 
      AddHandler cgi-script  .cgi 	
      SetEnv PSOServiceId MyPSOTest	
      
    6. Now we will create a test handler and check that it works. We will call the file test.py, but create a link to it named test.cgi. [We make this link, and set Options Indexes and AddHandler cgi-script .cgi, for only one reason: to allow you to view and browse the source code from the web.]
      #!/usr/bin/env python2.1
      #
      from pso.service import ServiceHandler
      def testHandler(serviceRequest):
      	print "hello world"
      
      if __name__ == '__main__':
      	ServiceHandler().run(testHandler)
      
      Now we will set the file permissions, and do the link:
      $ chmod a+x  test.py
      $ ln test.py test.cgi
      
      Then try it from the command line, which should give you this:
      python:~/public_html/testpso# ./test.cgi
      set-cookie: SESSION_ID=@60123.0SESSION_ID;
      content-type: text/html
      
      hello world
      python:~/public_html/testpso# 
      
      Now let's try it using a browser: http://www.yourhost.com/~yourid/testpso/test.cgi

    See Greg Ward's "Installing Python Modules": http://www.python.org/doc/current/inst/inst.html. If all else fails, just copy the pso directory somewhere in the python PATH or to the directory of your request handler.
