Lxml Gramplet

From Gramps
Revision as of 07:04, 15 March 2013 by Romjerome

The lxml gramplet is an experimental gramplet for POSIX platforms. It reads, transforms and writes the content of a Gramps XML file on the fly, without importing it into the database (Gramps session). It never writes to the original file, which is only opened in a safe, read-only state.

Dependencies and file format

Goals

The idea of this experimental lxml gramplet is to provide a way to use basic lxml features with Gramps XML files.

lxml supports XPath, XSLT, XML dumps, and RelaxNG and XSD validation. It provides an API very close to the ElementTree (etree) module shipped with Python 2.5 and later.

The experimental lxml gramplet aims to use these lxml features[1] by parsing a Gramps XML file generated by Gramps 3.4.x (or 3.3.x) and generating sample output based on open W3C standards (XML, web design, web services, etc.).


[1] see also lxml.objectify
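
As a rough illustration of reading a Gramps XML file without importing it, the sketch below lists people using the ElementTree-style API that lxml shares with the standard library. It is not the gramplet's code: the sample document is a hand-written, simplified stand-in for a real export, and the namespace URI, the `list_people` helper and the sample names are assumptions made for the example.

```python
# Sketch only: SAMPLE is a simplified, hand-written stand-in for a Gramps XML
# export, and the namespace URI below is an assumption (check your own file).
try:
    from lxml import etree  # adds full XPath, XSLT, RelaxNG and XSD support
except ImportError:
    import xml.etree.ElementTree as etree  # close API, limited XPath subset

GRAMPS_NS = "http://gramps-project.org/xml/1.5.0/"  # assumed namespace
NS = {"g": GRAMPS_NS}

SAMPLE = b"""<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <people>
    <person handle="_a1" id="I0001">
      <gender>M</gender>
      <name type="Birth Name"><first>John</first><surname>Smith</surname></name>
    </person>
    <person handle="_a2" id="I0002">
      <gender>F</gender>
      <name type="Birth Name"><first>Anna</first><surname>Garner</surname></name>
    </person>
  </people>
</database>"""


def list_people(xml_bytes):
    """Return (id, first name, surname) tuples; no Gramps database is touched."""
    root = etree.fromstring(xml_bytes)
    people = []
    for person in root.findall("g:people/g:person", namespaces=NS):
        first = person.findtext("g:name/g:first", default="", namespaces=NS)
        surname = person.findtext("g:name/g:surname", default="", namespaces=NS)
        people.append((person.get("id"), first, surname))
    return people


people = list_people(SAMPLE)
for gramps_id, first, surname in people:
    print(gramps_id, first, surname)
```

The same path expressions work with both libraries because only the common subset of the API is used; XSLT and schema validation would require lxml itself.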

Screenshots

  1. Titles, labels and the footer are translated (they are written in the Python code).
  2. Full separation of presentation and content for the generation.


  • Local output with custom XML data in buffer and XSLT transformation
Dynamic output


  • Local output without stylesheet
Dynamic output without stylesheet


  • View via HTML view
Within Gramps


  • Pseudo dynamic code generation (xml + xslt = html file)
Dynamic code generation


  • Action on surname (sort, remove duplicates)
Sorted surnames list


  • Action on place title (sort, enable cross search on place fields)
Sorted places list


  • Hardcoded list written in Python and translated by Gramps into our locale (if a translation exists)
Hardcoded list (gramps translations)
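
The surname action listed above (sort, remove duplicates) could look roughly like the following. This is a minimal sketch and not the gramplet's implementation: the `sorted_unique_surnames` helper is hypothetical, and treating names that differ only by case or surrounding whitespace as duplicates is an assumption.

```python
def sorted_unique_surnames(surnames):
    """Sort surnames and drop duplicates.

    Assumption: names differing only by case or surrounding whitespace are
    duplicates; the first spelling encountered is the one that is kept.
    """
    seen = {}
    for surname in surnames:
        cleaned = surname.strip()
        if cleaned:  # skip empty surnames
            seen.setdefault(cleaned.casefold(), cleaned)
    return sorted(seen.values(), key=str.casefold)


# Hypothetical input, e.g. surnames collected from a parsed Gramps XML file.
surnames = ["Smith", "Garner", "smith", " Garner ", "Zieliński", ""]
result = sorted_unique_surnames(surnames)
print(result)
```

Sorting by `str.casefold` is a simple stand-in; locale-aware collation would need additional handling.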


Test it

  • You can get a copy of this simple draft from the addons repository:

http://svn.code.sf.net/p/gramps-addons/code/trunk/contrib/lxml

Currently, this addon is a quick exploration of several approaches. Feel free to modify it for your own use.

Go further

Bibliography gramplet ?

  • CherryTree is a hierarchical note-taking application featuring rich text and syntax highlighting. It stores all data (including images) in a single XML file with the .ctd extension, and integration with Zotero content is planned.
  • Zim is a graphical text editor used to maintain a collection of wiki pages. All pages you create in zim are saved as plain text files with wiki formatting. This means that you can access your content with any other editor or file manager without being dependent on zim. You can even have your pages in a revision control system like CVS or use a Makefile to compile your notes into a webpage. Any images you add are just image files which are linked from the text files. This means that zim can call your standard programs to edit images. When you embed an image in a page the context menu for the image will offer to open it with whatever image manipulation programs you have installed. After editing you just reload the page to see the result. See also third party contributions.

Collaborative indexes

  • Tiny Tafel [1]
  • Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing. It should support Gramps XML, Gramps CSV and Gramps JSON.

Clients library for FamilySearch API

Serialization in the C and Objective-C client libraries is done with libxml2.

Comments on DB API Idea

I was basically approaching it from the leave gen.lib alone and
implement a "fully blown" SimpleAccess-esque solution.

At the moment I basically have a 'DB' object which represents an open
database. This at the moment is populated from a Gramps XML file. This
is then basically stored as lxml.objectify objects. Internally a graph
structure is built to represent the linking inside the database (so
relationships and ref. integrity is made easier).

'DBItem' objects consist of the 'node' data, the basic save/delete
etc... Deleting an event automatically removes all other references to
it (which has caught me out previously).

class Person(DBItem):
    DBTYPE = 'person'

Basically registers an object that 'wraps' a basic DBItem, but
containing useful attributes/methods. So for a person, we can write
attributes such as .birth, .mother, .families etc... etc... It can also
over-ride how it should be saved/retrieved etc...

I chose this approach because it keeps the process incremental. We can
still access the 'raw' data in a DBItem for the stuff I'm not caring
about at the moment, but someone can write a 'Place' class later for
instance.

The DB itself is an xpath queryable object (adds a bit of flexibility
for selections that don't have convenient attributes as of yet).

I'll see if I can get the code example out this week.

Anyway, does this seem a reasonable approach? 

source: Archive (Dec 07, 2009) on gramps-devel mailing list
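
The approach described in the message above can be sketched as follows. This is only an illustration under stated assumptions, not the code the author refers to: it uses the standard library ElementTree instead of lxml.objectify, a plain handle index instead of a graph structure, and the `DB`, `find`, `get` and `delete` names, the sample document and the namespace URI are all hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://gramps-project.org/xml/1.5.0/"}  # assumed namespace

# Simplified, hand-written stand-in for a Gramps XML file.
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.5.0/">
  <events>
    <event handle="_e1" id="E0001"><type>Birth</type></event>
  </events>
  <people>
    <person handle="_p1" id="I0001">
      <name type="Birth Name"><first>John</first><surname>Smith</surname></name>
      <eventref hlink="_e1" role="Primary"/>
    </person>
  </people>
</database>"""


class DBItem:
    """Generic wrapper around a raw XML node; subclasses register by DBTYPE."""
    DBTYPE = None
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.DBTYPE:
            DBItem.registry[cls.DBTYPE] = cls

    def __init__(self, db, node):
        self.db, self.node = db, node  # raw data stays reachable via .node

    @property
    def handle(self):
        return self.node.get("handle")


class Person(DBItem):
    DBTYPE = "person"

    @property
    def surname(self):
        return self.node.findtext("g:name/g:surname", default="", namespaces=NS)

    @property
    def events(self):
        refs = self.node.findall("g:eventref", NS)
        return [self.db.get(ref.get("hlink")) for ref in refs]


class DB:
    """An open 'database' populated from Gramps XML and queryable by path."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)
        self.by_handle = {n.get("handle"): n
                          for n in self.root.iter() if n.get("handle")}

    def _wrap(self, node):
        tag = node.tag.split("}")[-1]  # strip the namespace part of the tag
        return DBItem.registry.get(tag, DBItem)(self, node)

    def get(self, handle):
        return self._wrap(self.by_handle[handle])

    def find(self, path):
        return [self._wrap(n) for n in self.root.findall(path, NS)]

    def delete(self, handle):
        """Remove the object and every reference (hlink) pointing to it."""
        self.by_handle.pop(handle)
        for parent in list(self.root.iter()):
            for child in list(parent):
                if handle in (child.get("handle"), child.get("hlink")):
                    parent.remove(child)


db = DB(SAMPLE)
person = db.find("g:people/g:person")[0]
events_before = len(person.events)
db.delete("_e1")  # also drops the person's eventref
events_after = len(person.events)
print(type(person).__name__, person.surname, events_before, events_after)
```

Objects without a registered wrapper (here, events) fall back to a plain `DBItem`, which mirrors the incremental idea in the message: a `Place` class could be added later without changing the rest.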

Database compare and merge

  • GrampsCompare.py, a python script for comparing data in 2 Gramps XML files.

source: Archive (Oct 02, 2011) on gramps-devel mailing list

Database backend

Data transfer

  • Akara is a platform for developing data services available on the Web, using a REST architecture. Akara is open source software written in Python and C. It is used, for example, by the Recollection project for the Library of Congress. See the user guide or screencasts (Shockwave Flash) [2], [3], [4].

Environment

Faceted classification

Faceted classification is a system proposed by Shiyali Ramamrita Ranganathan, who also formulated the "five laws of library science". See also Folksonomy.

HTML class

  • Gramps

Libhtml is an HTML/XML class for Gramps, see API.

  • Gtk3

GTK+3 provides an HTML backend that allows GTK applications to run natively within an HTML5 web browser.

See sample1, sample2, sample3.

Interface

Performances

See Gramps performances for comparison on large datasets between different Gramps versions.

Web applications

  • GEPS 013 describes a web-based application that runs in your browser, and requires a server. A prototype is now on-line at http://gramps-connect.org/ which is running trunk on a sample database (id=admin1, password=gramps).
  • DenominoViso plugin for GRAMPS is a third party plugin that creates an interactive graphical representation of a family tree. DenominoViso creates a graphical webpage in SVG/XHTML/JavaScript.

XQuery

"Or something close to SQL like XQuery so you can do querys on Gramps XML database similar to SQL Query. It can works even in internet browser thru plugins. XML is quite self-explanatory. Zorba provide python bindings for XQuery."

source: Archive (Oct 28, 2009) on gramps-user mailing list