pschmidtke's blog

Painless XML parsing using Python

I recently had to parse some XML data and, as usual, I used Python for the task. There are several XML parsers available, but XML is a pain and so is parsing it. I first used the minidom interface, and it worked quite well until I had to parse a 40 MB file, which tried to use more than 2 GB of RAM during parsing. This is not acceptable. So I dug around a bit and saw that the Python SAX interface should be more stable. But using SAX is also a pain.
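To make the minidom versus SAX difference concrete, here is a minimal sketch of the streaming style using the standard-library xml.sax module. The TagCounter class, the count_tags helper and the sample document are made up for illustration; the point is only that the handler sees one element at a time, so no full tree is built in memory.

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag while the parser streams the input.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(source):
    """Stream-parse a file name or file-like object and return the tag counts."""
    handler = TagCounter()
    xml.sax.parse(source, handler)
    return handler.counts


# Tiny in-memory sample standing in for a large file on disk.
sample = io.BytesIO(b"<root><item id='1'/><item id='2'/><note>hi</note></root>")
counts = count_tags(sample)
print(counts)
```

The price of this style is visible even here: any state you need (current element, nesting, collected text) has to be tracked by hand in the handler, which is where the pain comes from.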

Running Client Side Python

When dealing with web-browser development, especially in scientific areas, we often have to work with many applications that run on the client side. They are usually controlled using Java applets, JavaScript, Ajax, etc. If you are not at all into webserver development, it can be a pain to get into yet another programming language. One popular programming language is Python (although some might not agree with me here... get your hands on it and you'll see).

Finding the PyMol source code

PyMol is definitely one of the most widely used molecular visualizers. For a long time, PyMol has been distributed free of charge as a very old beta version, while donors had easy access to newer binary versions. However, everyone always had the possibility to get the source code of PyMol and compile it themselves to get a newer version free of charge. Access to the source code from the PyMol website was always a bit hidden, yet possible. Since the passing of PyMol's creator, Warren DeLano, the New York-based company Schrödinger has taken over this piece of software, claiming to keep it open source.

Mapping Chemical Features on Molecules using RDKit

This is a little blog post on some nice features that are available within RDKit. In case you don't know RDKit, have a look here. Basically, it's a C++-based Python library for small-molecule handling. Apart from a getting-started guide and some pieces of documentation here and there, many features implemented in RDKit are not well documented or, if they are, not very visible to the end user.

The art of passing integers from Python to CUDA

I recently had some problems with some Python -> PyCUDA -> CUDA implementations that cost me quite a lot of debugging time.

As a general rule, when passing an integer to CUDA via PyCUDA, always specify explicitly the number of bits used for it.

Usually you can pass an integer to CUDA using npy.uint(2). Given numpy's naming, one would expect to receive an unsigned integer in the CUDA code. That is not the case: I was receiving everything but the value I transmitted. Instead, one has to specify npy.uint32(2) to transmit an unsigned integer.

So here is a little help for conversions (untested):

numpy     CUDA (C)
uint16    unsigned short
int16     short
uint32    unsigned int
int32     int

I skip the long int types, as they are a bit weird to declare in C.
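To illustrate why the bit width matters, here is a small sketch using only the standard-library struct module. It is not PyCUDA's actual argument-packing code; it just assumes that kernel arguments end up as raw bytes in one buffer and that a hypothetical kernel reads two 4-byte unsigned ints from it. The names KERNEL_LAYOUT, pack_args and read_as_kernel are made up for this example.

```python
import struct

# Hypothetical kernel signature: __global__ void k(unsigned int n, unsigned int m)
# Assumption: the device side reads two consecutive 4-byte unsigned ints.
KERNEL_LAYOUT = "<II"


def pack_args(first_fmt, n, m):
    """Pack n with the given struct width code, then m as a 4-byte unsigned int."""
    return struct.pack("<" + first_fmt + "I", n, m)


def read_as_kernel(buf):
    """Interpret the start of the buffer the way the kernel layout expects."""
    return struct.unpack_from(KERNEL_LAYOUT, buf)


# 'I' is 4 bytes wide, matching uint32 on the host and unsigned int in C.
good = read_as_kernel(pack_args("I", 2, 7))

# 'Q' is 8 bytes wide, standing in for a host integer type without a fixed width.
bad = read_as_kernel(pack_args("Q", 2, 7))

print(good)  # (2, 7)
print(bad)   # (2, 0)
```

In this little-endian sketch the first value happens to survive, but the extra four bytes shift everything behind it, so the second argument arrives as 0 instead of 7. With more arguments or a different layout the damage can look quite different, which is why stating the width explicitly on the host side is the safer habit.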

