… and we’re back. I realise I haven’t posted for a couple of weeks. This is down to a nasty bout of flu which laid me very low and generally made the world seem an iller place. This week I want to talk a bit about 50Hz technology in movies, and the storm of slightly odd criticism being thrown at The Hobbit. I realise no one is terribly interested in what I think about this, but hey, when does that ever stop the average blogger?

First a bit of background. Most movies update at 24 frames per second – this is pretty well-known. Yes, yes, yes TV is often interlaced but the complete image updates 24 times per second, even if the lines are staggered. As far as I can tell, this dates back to the earliest days of cinema, when it was recognised as the lowest frame rate that gave a convincing illusion of movement rather than a sequence of still images. Like all the best standards, it’s persisted for an impressively long time.

Recently, though, a plethora of HD technologies have been foisted on an unsuspecting public. We’ve gone relatively rapidly from VHS to DVD to Blu-ray and 3D. The TV in my front room has eye-wateringly crisp definition and magnificent contrast and depth of colour. It’s a joy to watch and the movies I play on it are incredibly well mastered and encoded.

All of which makes it a bit odd that by default it came with literally dozens of digital post-processing algorithms that try to reprocess poor quality video input into, well, I assume they are intended to make the picture better but in fact all the dynamic contrast enhancement, edge detection, smart vector deinterlacing and deblocking conspired to make my blu-rays look frankly ludicrous. I spent 3 weeks finding all the settings on the TV and switching everything off, then doing the same with my Blu-ray player software and again on my graphics card driver (I use an HTPC). Having finally gotten to the bottom of them all, I still find I occasionally have to repeat the process after software updates helpfully turn a few randomly back on.

Of all of these things, two are the most infuriating. The first is smart deinterlacing, which is so smart that it tries to deinterlace content that is not in fact interlaced in the first place and manages to turn 1080p into something resembling over-played VHS. Seriously, turn this one off and you’ll see a VAST improvement in picture quality. If in doubt, deactivate deinterlacing completely: you won’t regret it.

The second, however, is the real troll under the bridge: smoothing. On my TV this is called MotionPlus, but technically it’s interpolation in the time domain. From a technical standpoint this is actually a pretty obvious thing to do. All the improvements in definition I mentioned above improve the spatial resolution of the individual images in the movie: smaller dots, larger images, simply more pixels. This is great, but it’s only half the story. Because the images are animated, you could also add in extra frames in between the 24 you’ve already got each second.

And this is where it starts to get interesting, because we’re starting to talk about things that don’t just involve the technology but also involve human perception. Throughout the history of cinema it’s been assumed that 24fps is enough to fool a person into thinking that a sequence of still images is actually a smooth flow of animation, but as I mentioned at the beginning, it was adopted because it was the slowest speed that did the job, and this does make a difference.

If you try hard, however, (and largely ignore the detail of what you’re watching) it is possible to see the transitions between frames. This is actually easier with HD stuff – the picture is so crisp that the tiny judder due to the frame rate is easier to spot. The reason we overlook this, I believe, is that all of us have grown up watching 24fps video and accepting it as such. We subconsciously ignore the fact that the illusion isn’t quite perfect.

So, enter temporal interpolation. With hardware acceleration, you can interpolate frames in real-time, and with a decent enough algorithm you can increase the apparent frame rate quite a lot. My TV’s top setting is at 200Hz. That’s pretty smooth. Newer sets are (discreetly) claiming frame rates of 400Hz. That’s a hell of a lot of interpolation.
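To make the idea concrete, here is a minimal sketch of the simplest possible temporal interpolation: a straight linear blend between neighbouring frames. It is an illustration only. I’m assuming greyscale frames stored as float arrays, and the names (`FrameBlend`, `blend`, `upsample`) are made up for the example. Real TVs almost certainly do something cleverer, such as motion-compensated interpolation, which isn’t shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: naive temporal interpolation by linear blending. */
final class FrameBlend {

    /** Blend two equally sized frames; t=0 gives a, t=1 gives b. */
    static float[] blend(float[] a, float[] b, float t) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("frames must be the same size");
        }
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (1.0f - t) * a[i] + t * b[i];
        }
        return out;
    }

    /** Emit `factor` frames per source interval, then the final source frame. */
    static List<float[]> upsample(List<float[]> frames, int factor) {
        List<float[]> out = new ArrayList<>();
        for (int f = 0; f + 1 < frames.size(); f++) {
            for (int k = 0; k < factor; k++) {
                // k = 0 reproduces the source frame, k > 0 are the in-betweens
                out.add(blend(frames.get(f), frames.get(f + 1), (float) k / factor));
            }
        }
        if (!frames.isEmpty()) {
            out.add(frames.get(frames.size() - 1)); // keep the final source frame
        }
        return out;
    }

    public static void main(String[] args) {
        List<float[]> src = new ArrayList<>();
        src.add(new float[] {0f, 0f});
        src.add(new float[] {1f, 2f});
        // factor 2 is the 24fps-to-48fps case: one extra frame per interval
        System.out.println(upsample(src, 2).size());
    }
}
```

With a factor of 2 this doubles the frame rate; n source frames become (n - 1) × factor + 1 output frames.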

I don’t use it. Why not? Because it makes beautifully directed movies with high production values look about as convincingly real as local amateur dramatics (I’m not having a go at amateur dramatics, it’s just that production values tend to be a bit cheap and cheerful. This, of course, is not why you go!). A lot of the time it’s completely unwatchable. Interestingly, this is all very similar to the criticisms leveled at The Hobbit, which was shot at 48fps, which suggests that it’s not an artifact of the interpolation but the increased temporal resolution itself that’s to blame.

But why on earth should this be? Improving the frame rate should reduce judder and Peter Jackson is quoted as saying that filming at 48fps almost eliminates motion blur, both of which should improve the viewing experience, not make things unwatchable.

So, here’s my 2 cents-worth: the effect is perceptual. I’ve already mentioned that because we’ve always watched movies at 24fps we’re conditioned to it. You could extend this argument to suggest that, subconsciously, we have a mental category for filmed material – we accept things animated in this way as fictions, separate from reality.

To develop this idea completely I need to introduce one more concept, the Uncanny Valley. The Uncanny Valley is a well-documented effect in computer graphics, but it is basically perceptual. It works like this: imagine a simple cartoon, like a stick-man. We accept this as a rough representation of a person. It’s not very accurate, but we accept it as being a representation of a person. If we add more detail, like a face or some feet, we accept this as well. In fact, this cartoon is slightly more convincing than before.

Adding more and more detail: more realistic shape, colouring, clothing, more nuanced behaviour and we find the illusion more and more convincing, whilst still being aware that it is a cartoon. This continues up to a point, but once a representation of a person gets very close to being completely realistic we start to reject it – the cartoon character becomes a doll-eyed automaton and the illusion is ruined.

This is the Uncanny Valley, and it’s a valley of perception. At a certain level of realism there is a shift in perception: we cease to accept it as a cartoon and start to put it in the same category as actual people. At this stage different mechanisms kick in and we apply different standards: this is an illusion, our inner cave-men will not be fooled by this sorcery. Ha!

This is a difficult thing to overcome. Polar Express is often touted as a movie that suffered hugely from the uncanny valley. It attempted to be photorealistic and failed. The same was true of Final Fantasy: The Spirits Within, but isn’t true of (say) Tin-Tin, which has people who objectively look nothing like real people but are paradoxically easier to accept as such.

And this is what I think the problem is with higher frame rates – they cause us to categorise what we are seeing as real life rather than movies and as a consequence they look like actors on film sets rather than convincing illusions. Of course, with practice you can train yourself to recognise them as movies and all becomes right with the cosmos. I watched season 2 of Game of Thrones at a friend’s house on their 200Hz TV. It took seven episodes before I could look past the overt, hyper-reality of the interpolation and I’m still not completely past it but at home I just deactivate it as it only distracts from enjoying the movie.

There’s one final thing to add here. This effect is made worse by bad lighting. Interpolated content shot outside with natural lighting doesn’t grate nearly as much as poorly lit studio shots. I’m guessing this just adds to the unreality.

So that’s my thought on the subject. Perceptual shifts causing movies to look unrealistic. If this is true, I would imagine it would be an interesting challenge for a director: can you make a convincing-looking movie at 48fps? It would probably end up being spectacular.

Update: It seems that the situation in early cinema with regard to frame rate is quite interesting. Silent movies typically had frame rates of 20-26 fps, although this was more of a function of equipment design than anything perceptual. Thomas Edison apparently insisted that 46 fps was the minimum, and that “anything less will strain the eye.” Interesting.

Furthermore, the perception of very short events is quite complex, with flashes of darkness as short as 10ms being perceptible, but flashes of light causing a “persistence of vision” effect that can cause 10ms flashes of light to appear more like 100ms long, and cause consecutive flashes of different colours to merge together so that, for example, red plus green is perceived as yellow.

It’s been a while since I posted anything techie, so this week I thought I’d revisit the visualiser I mentioned a while back. Before we continue, however, a short disclaimer: this post contains maths and (a little) code. The maths involves nothing more strenuous than vector cross-products, and the code is a few lines of JoGL but if those things don’t float your boat then by all means feel free to ignore the rest of this post. I won’t hold it against you.

So, that said: I’ve been doing some work adding interactivity to Seer, the visualiser for the Camino Monte-Carlo simulation. The visualiser is primarily a public engagement and demonstration tool, but it’s also pretty handy for debugging. I’ve talked about it before here and here. What the Seer does is visualise a live diffusion MRI simulation. It shows the diffusion environment and the positions of diffusing particles. It currently looks like this:

What we’ve got here is the tissue mesh I’ve talked about before with spins executing random walks rendered in red. Their positions are updated by a Camino simulation running on the same mesh in a separate thread. The smaller plot in the bottom-left is a 3D scatter plot of the net displacement of each spin, also updated live. This shows the sort of data that we would measure from the simulation using a diffusion MRI pulse sequence.

### Point picking & JoGL code

What we decided, though, was that it needed a bit more interactivity. Specifically we want to be able to reset all the spins to a location that you select with the mouse. Since I’m already using a single click to move, the right mouse button to reset the visualisation, and the mouse wheel to zoom, I decided to go with a double-click to set spin positions.

This presents an interesting challenge, though. How do you translate the mouse coordinates (which are 2D) into a position in 3D space? Setting out I had this plan about projecting a vector into the OpenGL viewport and checking for intersections with the mesh, sorting them along the arclength and then projecting into visualisation (modelview) coords and then on to simulation (substrate) coordinates. What was quite nice, though, was that it turns out that OpenGL, or rather the GLU API, does quite a bit of this for you.

Point picking works by taking the coordinates of the pixel you click on, converting to a position in the plane at the front of the view frustum (this can be done in 2D), then projecting into the scene along the current z-axis until you hit something. You then use the z-coordinate of the object as your third coordinate. This gives you a 3D point that you then project into the model coordinates via the current projection and modelview matrices. OpenGL and GLU provide methods to do this: glReadPixels(), which is part of OpenGL itself, reads back the depth, and gluUnProject() does the unprojection. There’s an excellent tutorial on NeHe’s website here.

Because this is a tutorial for OpenGL in C/C++, I’ll also add my code snippet in JoGL:

 public final void resetWalkerPositions(GLAutoDrawable drawable, Mesh mesh){

     GL gl= drawable.getGL();

     IntBuffer viewport= BufferUtil.newIntBuffer(4);
     DoubleBuffer modelview= BufferUtil.newDoubleBuffer(16);
     DoubleBuffer projection= BufferUtil.newDoubleBuffer(16);

     gl.glGetIntegerv(GL.GL_VIEWPORT, viewport);

     int winx= walkerX;
     int winy= viewport.get(3)-walkerY;

     FloatBuffer posZ= BufferUtil.newFloatBuffer(1);
     DoubleBuffer pos= BufferUtil.newDoubleBuffer(3);

     gl.glReadPixels(winx, winy, 1, 1, GL.GL_DEPTH_COMPONENT, GL.GL_FLOAT, posZ);

     gl.glGetDoublev(GL.GL_MODELVIEW_MATRIX, modelview);
     gl.glGetDoublev(GL.GL_PROJECTION_MATRIX, projection);

     glu.gluUnProject((double)winx, (double)winy, (double)posZ.get(0), modelview, projection, viewport, pos);

     // transform into substrate coords
     boolean onMesh= mesh.GLtoSubstrate(pos, subsCoords);

     // if the coordinates are on the mesh, reset walker positions
     if(onMesh){
         // tell the simulation thread to reset walker positions
         SimulationThread.resetWalkers(subsCoords);
     }
     else{
         System.err.println("coords are off-mesh. walkers not reset.");
     }
 }

In addition to the use of gluUnProject(), there’s one additional JoGL-specific issue here: the GL object itself. The way I’d designed the code meant that the method that catches the double-click event was nowhere near the rendering code that does the unprojection and talks to the simulation. I spent a bit of time trying to get hold of a GL object and hand it over to the event handler, but nothing I tried worked so instead I realised that all the event handler actually needed to do was to provide the mouse coordinates and instruct the render method to do the rest. So all it does is set a flag and hand over the coords via a class-level variable. That’s a theme that’s emerged a little recently: passing instructions to other parts of the system via global flags rather than method calls. It works pretty well when you’ve got functionality that’s spread across different parts of the code. (I suppose I could also have used static variables but the principle is the same and this way things are more self-contained).
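The flag-and-coordinates handoff can be sketched in isolation like this. This is not the Seer code: the class and method names (`PickRequest`, `request`, `poll`) are hypothetical, and the `synchronized` keywords are my own assumption, added because the event handler and the renderer may well run on different threads.

```java
/** Hypothetical sketch of the flag-and-coordinates handoff (names are not Seer's). */
final class PickRequest {
    private boolean pending = false;
    private int x;
    private int y;

    /** Called from the mouse-event handler on a double-click. */
    synchronized void request(int mouseX, int mouseY) {
        x = mouseX;
        y = mouseY;
        pending = true;
    }

    /**
     * Called from the render method, where a GL object is available.
     * Returns {x, y} once per request, or null if nothing is pending.
     */
    synchronized int[] poll() {
        if (!pending) {
            return null;
        }
        pending = false; // clear the flag so the request is handled only once
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        PickRequest pick = new PickRequest();
        pick.request(120, 45);       // event-handler side
        int[] coords = pick.poll();  // render side: the unprojection would go here
        System.out.println(coords[0] + ", " + coords[1]);
        System.out.println(pick.poll() == null); // the flag was cleared
    }
}
```

The event handler never touches OpenGL; the render method checks `poll()` once per frame and only does the unprojection when it gets coordinates back.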

### Planes, projection and a bit of maths

So: sorted. Well, actually no. Unfortunately the meshes that I’m clicking on have a lot of holes in them, and sometimes I want to click on a hole instead of a triangle. In this case, gluUnProject() gives a point at infinity, which isn’t what I want. I want a point half way across my substrate. This means there’s a special case to catch. Fortunately, points at infinity are easy enough to catch as the depth coordinate will be equal to 1, but what to do once you’ve caught it?

Firstly, we need to recognise that this is essentially a projection into a plane. The plane in question bisects the substrate half way along its z-axis and so is easily defined but in viewport coords will depend on the current view of the mesh (the modelview matrix). Given a plane $ax + by + cz + d =0$, and a point $\left( X, Y, Z \right)$ we just choose a new z-coord such that

$Z' = -\frac{aX + bY +d}{c}$
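As a quick sketch of how that special case might look in code: the class and method names below are hypothetical, and the test for "clicked on a hole" assumes the depth buffer was cleared to the default value of 1.

```java
/** Hypothetical helper for the hole-in-the-mesh special case. */
final class PlaneDepth {

    /** Solve a*X + b*Y + c*Z' + d = 0 for Z'. */
    static double zOnPlane(double a, double b, double c, double d, double x, double y) {
        if (Math.abs(c) < 1e-12) {
            // the plane is edge-on to the viewer, so there is no unique z for (x, y)
            throw new ArithmeticException("plane is parallel to the z-axis");
        }
        return -(a * x + b * y + d) / c;
    }

    /** True if the depth read back means nothing was drawn at that pixel. */
    static boolean missedMesh(float depth) {
        return depth >= 1.0f; // assumption: depth buffer cleared to the default 1.0
    }

    public static void main(String[] args) {
        // the plane z = 2 written as 0x + 0y + 1z - 2 = 0
        System.out.println(zOnPlane(0, 0, 1, -2, 5, 7));
        System.out.println(missedMesh(1.0f));
    }
}
```

The guard on $c$ matters: if the plane ends up edge-on to the view direction the division blows up, so that case needs handling separately.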

The tricky part is knowing what $a$, $b$, $c$ and $d$ are. My initial thought was to back rotate into substrate coords and project into the appropriate plane, but this requires you to invert the modelview matrix, which frankly I cannot be bothered to write code to do (and in any case is an expensive operation) so I need to be working in viewport coordinates, not modelview coordinates. So then I thought I’d use the modelview matrix to rotate the plane normal but it turns out that plane normals transform with the inverse transpose of the matrix, so once again we’re back to square one.

The answer is to define the plane using three points and use the cross product to get the plane normal. Any three non-collinear points define a plane. These points transform using the modelview matrix, not the inverse, and the components of the normal to the plane are the coefficients we want. The algebra works out like this [Cracks knuckles], [flexes fingers in the manner of concert pianist]:

$\hat{\mathbf{n}}= \frac{\mathbf{n}}{|\mathbf{n}|}$

$\mathbf{n} = \left(\mathbf{v}_3 \times \mathbf{v}_1\right)-\left(\mathbf{v}_2 \times \mathbf{v}_1\right)-\left(\mathbf{v}_3 \times \mathbf{v}_2\right)$

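because the normal is just the cross product of two edge vectors, $(a - b) \times (c - b)$ with $a = \mathbf{v}_3$, $b = \mathbf{v}_2$ and $c = \mathbf{v}_1$, and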

$(a - b) \times (c - b) = a \times (c-b) - b \times (c-b) = (c-b) \times b - (c-b) \times a$

$(a - b) \times (c - b) = c \times b - b \times b - c \times a + b \times a$

$(a - b) \times (c - b) = c \times b - c \times a + b \times a$

and we’re away. Three cross products and no matrix inversion.
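Here is a minimal, self-contained sketch of that calculation, assuming the three points have already been transformed by the modelview matrix. The names (`PlaneFromPoints`, `normal`, `plane`) are mine. The offset $d$ isn’t spelled out above; I’m taking it from the requirement that $\mathbf{v}_1$ lies on the plane, i.e. $d = -\hat{\mathbf{n}} \cdot \mathbf{v}_1$.

```java
/** Hypothetical sketch: plane coefficients from three points, no matrix inversion. */
final class PlaneFromPoints {

    static double[] cross(double[] u, double[] v) {
        return new double[] {
            u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]
        };
    }

    /** n = (v3 x v1) - (v2 x v1) - (v3 x v2), which equals (v3 - v2) x (v1 - v2). */
    static double[] normal(double[] v1, double[] v2, double[] v3) {
        double[] c31 = cross(v3, v1);
        double[] c21 = cross(v2, v1);
        double[] c32 = cross(v3, v2);
        double[] n = new double[3];
        for (int i = 0; i < 3; i++) {
            n[i] = c31[i] - c21[i] - c32[i];
        }
        return n;
    }

    /** Returns {a, b, c, d} for the plane ax + by + cz + d = 0, with unit normal. */
    static double[] plane(double[] v1, double[] v2, double[] v3) {
        double[] n = normal(v1, v2, v3);
        double len = Math.sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
        if (len == 0.0) {
            throw new IllegalArgumentException("points are collinear");
        }
        double a = n[0] / len;
        double b = n[1] / len;
        double c = n[2] / len;
        double d = -(a * v1[0] + b * v1[1] + c * v1[2]); // v1 must satisfy the plane equation
        return new double[] {a, b, c, d};
    }

    public static void main(String[] args) {
        // three points in the plane z = 5
        double[] p = plane(new double[] {1, 0, 5}, new double[] {0, 0, 5}, new double[] {0, 1, 5});
        double zPrime = -(p[0] * 3.0 + p[1] * 4.0 + p[3]) / p[2]; // Z' for X = 3, Y = 4
        System.out.println(zPrime);
    }
}
```

The sign of the normal depends on the order of the three points, but that doesn’t affect $Z'$: flipping the normal flips $d$ as well, and the two signs cancel.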

I’ll call it a day there. I’ll post some code snippets once they’re done.

As promised, here are some code snippets for the visualiser. First up, assemble the vertex and normals buffers.

### Constructing the geometry

In my case, I’ve already read the data into Triangle objects which hold their vertices and normals. I rearrange things into FloatBuffer objects:

Collection<Triangle> triangles= Seer.triangles;
scale= (float)(1.0/Seer.maxSubsSize);

vertexCount= triangles.size()*3; // three vertices per triangle
vertices= BufferUtil.newFloatBuffer(vertexCount*3); // three coords per vertex
normals= BufferUtil.newFloatBuffer(vertexCount*3); // one normal per vertex

// add all triangles in the mesh to the vertex float buffer
for(Iterator<Triangle> triIt= triangles.iterator(); triIt.hasNext(); ){

    Triangle triangle= triIt.next();
    double[] normal= triangle.getNormal();

    for(int i=0; i<3; i++){ // loop over vertices in the triangle
        double[] vertex= triangle.getVertex(i);

        for(int j=0; j<3; j++){ // loop over coords of the vertex
            vertices.put(substrateToGL(vertex[j], j)); // transform substrate space into openGL coords
            normals.put(-(float)normal[j]); // normals are (ahem) normalised and polar, so no transform
        }
    }
}
vertices.flip();
normals.flip();

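I loop over the Collection of Triangles and re-parse everything into FloatBuffers for vertices and normals, which are then flipped. Note that flip() doesn’t reverse the order of the data: it sets the limit to the current position and rewinds the position to zero, so the buffer is ready to be read from the start.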
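For anyone unsure what `flip()` does to a java.nio buffer, here is a tiny standalone demonstration. It uses a plain heap `FloatBuffer.allocate()` as a stand-in for the buffers that BufferUtil creates, and the class name `FlipDemo` is made up.

```java
import java.nio.FloatBuffer;

/** Hypothetical demo of what flip() does to a java.nio buffer. */
final class FlipDemo {

    static FloatBuffer filled() {
        FloatBuffer buf = FloatBuffer.allocate(8);
        buf.put(1f).put(2f).put(3f); // position is now 3, limit is still 8
        buf.flip();                  // limit becomes 3, position becomes 0
        return buf;
    }

    public static void main(String[] args) {
        FloatBuffer buf = filled();
        System.out.println("position=" + buf.position() + " limit=" + buf.limit());
        System.out.println("first=" + buf.get(0) + " last=" + buf.get(2)); // order is unchanged
    }
}
```

After the flip, anything that reads the buffer sees exactly the elements that were written, in the order they were written.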

There are a couple of details to note: first, normals are repeated three times (once for each vertex), and second that I’ve flipped the normals. This turns out to be important for correct rendering in my case because the code that generates the meshes in the first place always makes them point inward.

### Build the VBOs

Next up we need to generate and bind the buffers for the VBO. We need separate buffers for vertices and normals (we’d need separate ones for colour and texture data if we were doing that as well).

// Generate and bind the vertex buffer
gl.glGenBuffersARB(1, VBOVertices, 0); // Get A Valid Name
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOVertices[0]); // Bind The Buffer
gl.glBufferDataARB(GL.GL_ARRAY_BUFFER_ARB, vertexCount * 3 * BufferUtil.SIZEOF_FLOAT, vertices, GL.GL_STATIC_DRAW_ARB);
// generate and bind the normals buffer
gl.glGenBuffersARB(1, VBONormals, 0);
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBONormals[0]);
gl.glBufferDataARB(GL.GL_ARRAY_BUFFER_ARB, vertexCount * 3 * BufferUtil.SIZEOF_FLOAT, normals, GL.GL_STATIC_DRAW_ARB);

vertices = null;
normals = null;

So here we’ve generated buffer “names” (which are just integer identifiers), bound each name to the array buffer target and then copied the data into the bound buffer. After that we don’t need the original FloatBuffers any more and can drop the references so the memory can be reclaimed.
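It’s worth a quick back-of-envelope check on how much data those two calls send to the card. The sketch below just repeats the size arithmetic from the glBufferDataARB calls; the class name `VboSize` is made up, and I’m assuming 4-byte floats (which is what Java’s `Float.BYTES` reports).

```java
/** Hypothetical helper: size in bytes of one VBO for a triangle mesh. */
final class VboSize {
    static final int SIZEOF_FLOAT = Float.BYTES; // 4 bytes

    /** Three vertices per triangle, three floats per vertex. */
    static long bufferBytes(long triangleCount) {
        long vertexCount = triangleCount * 3;
        return vertexCount * 3 * SIZEOF_FLOAT;
    }

    public static void main(String[] args) {
        long perBuffer = bufferBytes(1_000_000L);
        System.out.println(perBuffer);     // prints 36000000
        System.out.println(2 * perBuffer); // vertices plus normals: prints 72000000
    }
}
```

So a million-triangle mesh needs about 36MB each for vertices and normals, or roughly 72MB of video memory in total.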

### Drawing the object

Now we’re all set, and just need to be able to render the mesh whenever we feel like it. I’ve added a method to my mesh object that renders it which looks like this.

// Enable Pointers
gl.glEnableClientState(GL.GL_NORMAL_ARRAY);
gl.glEnableClientState(GL.GL_VERTEX_ARRAY);

gl.glDisable(GL.GL_COLOR_MATERIAL);
gl.glDisable(GL.GL_TEXTURE_2D);

gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, this.VBONormals[0]);
gl.glNormalPointer(GL.GL_FLOAT, 0, 0);

// Set The Vertex Pointer To The Vertex Buffer
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, this.VBOVertices[0]);
gl.glVertexPointer(3, GL.GL_FLOAT, 0, 0);

gl.glDrawArrays(GL.GL_TRIANGLES, 0, this.vertexCount); // Draw All Of The Triangles At Once

// Disable Pointers
gl.glDisableClientState(GL.GL_VERTEX_ARRAY);  // Disable Vertex Arrays
gl.glDisableClientState(GL.GL_NORMAL_ARRAY); // Disable Normal Arrays


What turns out to be important here is that you do the vertices LAST. Activate the client state for normals before the vertices, specify the normals pointer before the vertices. Then make the call to gl.glDrawArrays to actually instruct the card to render. I also disable the client states in the opposite order to enabling (vertices first), which may not be essential, but does work.

### And we’re done…

That about wraps it up. I’ve been able to render meshes with about a million triangles at upwards of 60fps and 5 million at around 30fps. The complete application links to the Camino simulation and renders diffusive dynamics restricted by the mesh. The simulation ends up running in a separate thread so that it doesn’t pull down the frame rate in the visualiser, and also renders a small displacement plot in the lower left corner that co-rotates with the main plot. It’s also got arcball rotation and mouse-wheel zoom.

After a couple of more introspective blogs, this week I’m going to be a little more geeky. To wit: 3D graphics programming. I’m a fan of graphics coding – it involves interesting maths and methods and can also be quite rewarding when your vision for a little application actually appears on the screen. Everyone should have a hobby, even an exceptionally nerdy one.

All this meant that when the news came down that we would need to make a demo for the upcoming CMIC open day, I jumped at the opportunity to build a cool-looking 3D realisation of my diffusion simulation. The idea is to render the tissue substrate we use to restrict the motion of the spins and the diffusing spins themselves and a couple of plots of spin displacements. It’s written in JoGL and interfaces with the Camino project, which contains the simulation code.

I should say at this point that it’s not yet finished, so this is something of a part one, but I’ve learned some interesting things this week so I thought I’d blog it.

First up, there was the basecode. This opens a window, sets up OpenGL and the rendering environment, and initialises an arcball so that you can rotate the view with the mouse. In the spirit of not inventing the wheel, I used some code that someone else had written. Specifically, IntelliJ’s port of the NeHe OpenGL tutorials, which I can highly recommend. Lesson 48 contains the arcball code.

The meshes from the diffusion simulation can be fairly big, with hundreds of thousands or millions of triangles so to render them efficiently meant doing something radical: using some OpenGL functionality that was developed after 1996. Shocking, I know, but needs must.

Back in 2001 when, as a fresh(er) faced young PhD student, learning OpenGL seemed like a good idea, you rendered triangles one at a time with normals and colour specified in immediate mode. This, I have learned, is now rather akin to wearing a stove-pipe hat and working exclusively with steam-driven technology*. Instead there are these new-fangled things called Vertex Buffer Objects (VBOs).

A VBO is just a way of shunting as much of the rendering as possible off onto the GPU. Assemble your vertex data into the right order, configure the graphics pipeline, shove the data into video RAM and then tell the card to get on with it.

It works VERY well.

I wanted to render my meshes with lighting, so I needed to specify normals as well as vertices. It turned out that code examples showing how to construct and render VBOs with normals were a little hard to come by, so I ended up stumbling through this part on my own. I’ve got it working, though, and I’ll be posting some code snippets to show what I did. I’m not claiming this is the best code in the world, but it works and has pretty decent performance.

In the process of getting things working, I learned some important things:

• VBOs can easily handle normals (and colours, and textures, for that matter) but OpenGL is a little fussy about the order in which you do things. You need to generate and bind the normals object and specify the normals pointer before the vertices or you’ll get an error. I’m sure there’s a good reason for this, but my knowledge is too superficial to know what it is.
• Specifying projection geometry can be a tricky business. The view frustum won’t work with a negative clipping plane, but more importantly a clipping plane at zero can cause your z-buffering to stop working (I presume this is due to singular values in some projection matrix). Moving the clipping plane away from zero will fix this.
• By default OpenGL only lights one side of the triangles. This is great for a closed surface, but my meshes are unholy messes derived from stacks of microscope images – you can see inside and need to render both sides of the triangles. This has nothing to do with the VBO or even the shader model, you change it by specifying a two-sided lighting model with glLightModelf(GL_LIGHT_MODEL_TWO_SIDE, 1.0f).
• VBOs are almost supernaturally efficient. This morning I loaded a mesh with over 1000000 triangles. I can render it at over 30fps with no special sorting or optimisation at all on my laptop within my IDE.
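On the clipping-plane point in the list above, a small worked example suggests why a near plane at zero breaks depth testing. For the standard perspective projection with the default depth range, a point at distance $z$ in front of the eye gets window depth $\frac{f}{f-n}\left(1 - \frac{n}{z}\right)$, where $n$ and $f$ are the near and far plane distances. The sketch below just evaluates that formula; the class and method names are made up, and I’m assuming this is the mechanism behind what I saw rather than having confirmed it.

```java
/** Hypothetical sketch: window-space depth under a standard perspective projection. */
final class DepthPrecision {

    /**
     * Depth in [0, 1] for a point at distance z in front of the eye,
     * with clipping planes at distances near and far (near <= z <= far).
     */
    static double windowDepth(double near, double far, double z) {
        return (far / (far - near)) * (1.0 - near / z);
    }

    public static void main(String[] args) {
        // sensible near plane: depth varies with distance
        System.out.println(windowDepth(1.0, 100.0, 2.0));
        System.out.println(windowDepth(1.0, 100.0, 50.0));
        // near plane at zero: every distance maps to the same depth
        System.out.println(windowDepth(0.0, 100.0, 2.0));
        System.out.println(windowDepth(0.0, 100.0, 50.0));
    }
}
```

With the near plane at zero the $n/z$ term vanishes and everything lands at depth 1, so the depth test can no longer tell near fragments from far ones.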

So now I have some code that renders a complex mesh with arcball rotation and lighting. I’ve added some extra functionality for a little graph in the bottom-left corner that I’ll be adding to over the next week or so. In the mean time, here’s a screenshot:

A plant tissue mesh visualised with my new code

… as a bonus, I can render any mesh I’ve got a ply file for, so we can now simulate diffusion inside a cow

Diffusion in hollow cows is an under-studied problem

or a sailing ship…

This would make an interesting crossing-fibres problem

I’ll post some updates once there’s some more functionality. Next up: the diffusing particles, and proper displacement profile plots.

* i.e. kind of steampunk. Not quite what I was going for.