Appengine is a new Google service that lets you write an application in Python and then use Google’s infrastructure when your application needs to scale. We will use Google’s Appengine to create a search website using Yahoo’s Developer API. Ah, the irony. You can see the complete application on appspot.
We will build the app in Python, so you need to know Python. No other knowledge is assumed.
Appengine has two parts: the Appengine servers on Google’s infrastructure, where you will deploy your code, and an SDK, which you will use to develop code locally. Download the SDK, and make sure that you add dev_appserver.py and appcfg.py to the system PATH.
You can download the completed application from here. The complete application consists of five files, which we will explore in detail below.
You need to provide a configuration file to Appengine, with information about your application. The configuration is done using a YAML file, which is a very simple markup language. Create a directory where you will store all your application files and create a file named app.yaml in it. Edit this file to contain these lines:
    application: asdf
    version: 1
    runtime: python
    api_version: 1

    handlers:
    - url: /.*
      script: search.py
Let us dissect each of these lines to see what they do.
application: asdf: This is the name of the application. On your local web server you can use any name, but when you deploy to Appspot you must own an application with that name for uploads to work.
version: 1: This sets the major version of your application and is mostly used for versioning at Google’s end.
runtime: python: This selects the runtime to use. As of now Python is the only supported runtime.
api_version: 1: The version of the API to use. Currently 1 is the only supported value.
    handlers:
    - url: /.*
      script: search.py
Handlers map a URL pattern to the script to call when that pattern is encountered. Patterns are specified using regular expressions. The pattern url: /.* tells Appengine to send every URL to the Python script search.py.
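To make the pattern matching concrete, here is a minimal sketch in plain Python 3 of how a handler table could be consulted. It is not Appengine's own code: the HANDLERS list and the pick_script function are hypothetical names, and the sketch assumes that patterns are tried from top to bottom and must match the whole path.

```python
import re

# Hypothetical handler table mirroring the app.yaml above:
# (url pattern, script) pairs, tried in order.
HANDLERS = [
    (r"/.*", "search.py"),
]


def pick_script(path, handlers=HANDLERS):
    """Return the script of the first handler whose pattern matches the whole path."""
    for pattern, script in handlers:
        if re.fullmatch(pattern, path):
            return script
    return None  # no handler matched


print(pick_script("/"))            # search.py
print(pick_script("/about/team"))  # search.py
```

Because /.* matches any path that starts with a slash, a single handler is enough for this application. A more specific pattern would have to be listed before it to take effect.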
Here is the Python code, which we will walk through in detail below:
    import wsgiref.handlers
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp import template
    from google.appengine.api import urlfetch
    from django.utils import simplejson
    import urllib
    import logging
    from StringIO import StringIO


    class MainPage(webapp.RequestHandler):
        def get(self):
            self.response.headers['Content-Type'] = 'text/html'
            query = self.request.get('q', '')
            if query:
                logging.debug('query: %s' % query)
                # The first argument is the Yahoo application id.
                results = get_search_results('YLPjx2rV34F4hXcTnJYqYJUj9tANeqax76Ip2vADl9kKuByRNHgC4qafbATFoQ', query)
                results = results['Result']
                payload = dict(results=results, query=query)
                resp = template.render('index.html', payload)
            else:
                resp = template.render('form.html', {})
            self.response.out.write(resp)


    def get_search_results(appid, query, region='us', type='all', results=10,
                           start=0, format='any', adult_ok="", similar_ok="",
                           language="", country="", site="", subscription="",
                           license=''):
        # Capture the arguments before defining any other local, so that
        # only the search parameters end up in the query string.
        params = dict(locals())
        base_url = u'http://search.yahooapis.com/WebSearchService/V1/webSearch?'
        result = _query_yahoo(base_url, params)
        return result['ResultSet']


    def _query_yahoo(base_url, params):
        params['output'] = 'json'
        payload = urllib.urlencode(params)
        url = base_url + payload
        # urlfetch returns the body as a string; wrap it so simplejson can read it.
        response = StringIO(urlfetch.fetch(url).content)
        result = simplejson.load(response)
        return result


    def main():
        application = webapp.WSGIApplication(
            [('/', MainPage)],
            debug=True)
        wsgiref.handlers.CGIHandler().run(application)


    if __name__ == "__main__":
        main()
main is the first function called when our script runs. It creates a WSGI application, which has the job of mapping URLs to handler classes. The next line runs the WSGI application.
The class MainPage is used in response to requests for the / URL. This class defines a method get, which is invoked in response to HTTP GET requests. Similarly, you can define put or post to handle the corresponding requests. Here our form only makes GET requests, so we define get. The line self.response.headers['Content-Type'] = 'text/html' sets a header on the response telling the browser that we will be sending HTML back.
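The dispatch described above can be sketched without Appengine at all. The following is a rough Python 3 illustration using only the standard library, not the webapp framework's actual implementation: the ROUTES table, the application function and the call helper are hypothetical names, and the stand-in MainPage returns plain text rather than rendering a template.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


class MainPage:
    """Hypothetical stand-in for a webapp.RequestHandler subclass."""

    def get(self, environ):
        # Read the "q" parameter, defaulting to an empty string.
        query = parse_qs(environ.get("QUERY_STRING", "")).get("q", [""])[0]
        return "200 OK", "query was: %s" % query


ROUTES = {"/": MainPage}


def application(environ, start_response):
    """Minimal WSGI app: pick a class by path, then a method by HTTP verb."""
    handler_class = ROUTES.get(environ["PATH_INFO"])
    if handler_class is None:
        status, body = "404 Not Found", "not found"
    else:
        verb = environ["REQUEST_METHOD"].lower()  # "GET" -> "get"
        method = getattr(handler_class(), verb, None)
        if method is None:
            status, body = "405 Method Not Allowed", "method not allowed"
        else:
            status, body = method(environ)
    start_response(status, [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]


def call(path, query="", verb="GET"):
    """Drive the app without a server, using a fake WSGI environ."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update(PATH_INFO=path, QUERY_STRING=query, REQUEST_METHOD=verb)
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(application(environ, start_response))
    return seen["status"], body.decode("utf-8")


print(call("/", "q=python"))   # ('200 OK', 'query was: python')
print(call("/", verb="POST"))  # ('405 Method Not Allowed', 'method not allowed')
```

Since the stand-in class only defines get, a POST to the same URL finds no matching method, which mirrors why the tutorial's handler only needs get for its form.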
The GET or POST data is in the request object, so we get the user’s query from request.get. get_search_results queries Yahoo to find web pages matching the query. Once we have the results, we can show them by rendering the data with our templates. Let’s take a small diversion to learn about templates.
To create a web page with dynamic data, webapp uses templates. You create the structure of the HTML, while providing placeholders for the variables which you need to insert. Appengine uses Django templates, which provide programming constructs like looping and if conditions using tags. Let’s look at the template for the search results page:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
      <head>
        <title>searching {{query}}</title>
      </head>
      <body>
        <h2>You searched for {{query}}</h2>
        {% for res in results %}
        <div class="result">
          <h3><a href="{{res.ClickUrl}}">{{res.Title}}</a></h3>
          <div class="summary">
            {{res.Summary}}
          </div>
          <div class="extra">
            <small>{{res.Url}}</small>
          </div>
        </div>
        {% endfor %}
      </body>
    </html>
Most of this is simple HTML, but you can see a few new constructs, such as {{query}} and {% for res in results %}. {{...}} lets you insert variables you have passed from your Python script to this page. {% ... %} gives you access to looping, conditionals and other constructs. Here we used {% for res in results %} to loop over an array which we passed to this template. The end of the loop is signified by {% endfor %}. Inside the for loop you have access to the variable defined in the {% for ... %} tag, so inside the {% for %} we can use {{res}}. As results is an array of dictionaries, {{res}} is a dictionary. We can access any element in {{res}} using dotted notation, which we did with {{res.Summary}} and {{res.Url}}.
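For readers who think more easily in Python than in template tags, here is a rough plain-Python 3 counterpart of what the loop and the dotted lookups do. It is a sketch, not Django's template engine: resolve and render_results are hypothetical helpers, the sample data is made up, and the HTML escaping is added here as a precaution (whether the template engine escapes values for you depends on its version).

```python
from html import escape


def resolve(context, dotted_name):
    """Follow a dotted name such as 'res.Summary' through dicts and attributes."""
    first, *rest = dotted_name.split(".")
    value = context[first]
    for part in rest:
        if isinstance(value, dict):
            value = value[part]  # {{res.Summary}} -> res["Summary"]
        else:
            value = getattr(value, part)
    return value


def render_results(query, results):
    """Plain-Python counterpart of the {% for res in results %} loop."""
    lines = ["<h2>You searched for %s</h2>" % escape(query)]
    for res in results:
        context = {"res": res}
        lines.append('<h3><a href="%s">%s</a></h3>' % (
            escape(resolve(context, "res.ClickUrl")),
            escape(resolve(context, "res.Title")),
        ))
        lines.append("<div>%s</div>" % escape(resolve(context, "res.Summary")))
    return "\n".join(lines)


# Made-up sample data with the same keys the tutorial's template reads.
sample = [
    {"Title": "Example", "Summary": "An example page.",
     "Url": "http://example.com/", "ClickUrl": "http://example.com/click"},
]
print(render_results("python", sample))
```

The template version keeps the HTML readable and leaves the lookups to the engine, which is the main reason to prefer it over string building like this.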
Let’s see the other template:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
      <head>
        <title>Search</title>
      </head>
      <body>
        <form action="." method="get">
          <input type="text" name="q" value="" size="80" />
          <br />
          <input type="submit" name="submit" value="search" />
        </form>
      </body>
    </html>
You will see that this is a simple HTML file with no template tags. Here we just needed a form, so we wrote a plain HTML page, but we still use Python to render it.
If the user has done a search, the code to render the template is:
    payload = dict(results=results, query=query)
    resp = template.render('index.html', payload)
payload is a dictionary of the variables we want to use in the template. We pass the results and the query string to the template.
If the user has not done any search, the code which runs is resp = template.render('form.html', {}), which renders the form.html template with an empty dictionary.
We have two helper functions defined to talk to the Yahoo search API:
    def get_search_results(appid, query, region='us', type='all', results=10,
                           start=0, format='any', adult_ok="", similar_ok="",
                           language="", country="", site="", subscription="",
                           license=''):
        # Capture the arguments before defining any other local, so that
        # only the search parameters end up in the query string.
        params = dict(locals())
        base_url = u'http://search.yahooapis.com/WebSearchService/V1/webSearch?'
        result = _query_yahoo(base_url, params)
        return result['ResultSet']


    def _query_yahoo(base_url, params):
        params['output'] = 'json'
        payload = urllib.urlencode(params)
        url = base_url + payload
        # urlfetch returns the body as a string; wrap it so simplejson can read it.
        response = StringIO(urlfetch.fetch(url).content)
        result = simplejson.load(response)
        return result
Appengine is a sandboxed Python runtime, so there are some limitations on which Python functions you can call. urllib.urlopen is one such disallowed function. To access external resources, we need to use Appengine’s urlfetch module instead. Using urlfetch.fetch, we get the results for the query in JSON format. We want to use them in Python, so we call simplejson.load to get the Python representation. simplejson is bundled with Django, which is bundled with Appengine.
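The two steps in these helpers, building the query string and decoding the JSON body, can be tried offline. The sketch below is written for Python 3's standard library so it runs outside Appengine: build_url and parse_results are hypothetical names, the application id is a placeholder, and SAMPLE_BODY is a made-up response that only imitates the keys the tutorial code reads (ResultSet, Result, Title, Url). No network request is made.

```python
import json
from io import StringIO
from urllib.parse import urlencode

BASE_URL = "http://search.yahooapis.com/WebSearchService/V1/webSearch?"


def build_url(appid, query, results=10, start=0):
    """Turn the function's own arguments into a query string."""
    # locals() is called first, so it holds exactly the four arguments.
    params = dict(locals())
    params["output"] = "json"
    return BASE_URL + urlencode(params)


# Made-up response body, shaped like the keys the tutorial code reads.
SAMPLE_BODY = (
    '{"ResultSet": {"Result": '
    '[{"Title": "Example", "Url": "http://example.com/"}]}}'
)


def parse_results(body):
    """Decode a JSON body from a file-like object and pull out the result list."""
    return json.load(StringIO(body))["ResultSet"]["Result"]


url = build_url("demo-app-id", "app engine")
print(url)
print(parse_results(SAMPLE_BODY)[0]["Title"])  # Example
```

Using locals() saves typing every parameter name twice, at the cost of picking up any other local variable defined before the call, so it is worth calling it on the first line of the function.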
You can test the code locally using the SDK which you installed. Navigate to the directory where you saved the files and run dev_appserver.py . (the trailing dot stands for the current directory). This will start the server on localhost:8080, where you can use your application. If you have an Appengine account, go to http://appengine.google.com and register your application. Edit the app.yaml file to change the name of the application, application: asdf, to your application’s name, and deploy the application using the command appcfg.py update . from the same directory.
I hope this tutorial has got you excited about the potential of Appengine. An infinitely scalable solution seems a tall order, but if Google delivers on it, it takes away a lot of the headaches of web development. You can learn more about Appengine at Google Code.