PShape performance could be improved using buffer object streaming #196

Closed
processing-bot opened this issue May 18, 2021 · 16 comments
Labels
enhancement New feature or request opengl

Comments

@processing-bot
Collaborator

Created by: codeanticode

Performance was quite good with GLModel (from the GLGraphics library) in certain cases where PShape still struggles, for example when modifying the coordinates of many vertices at once, because GLModel used a technique called buffer object streaming. This technique allows one to write data directly into a regular Java buffer that is internally mapped to video memory. It relies on the function glMapBufferRange, which was available only on desktop, not on mobile (OpenGL ES), back when the OpenGL renderer in Processing 2 & 3 was implemented. To keep code parity between the desktop and mobile versions of the OpenGL renderer in Processing, buffer streaming was not used in PShapeOpenGL. glMapBufferRange eventually became available in OpenGL ES 3.0, so it should now be possible to re-implement buffer streaming to improve PShape performance, perhaps in combination with other PShape enhancements such as processing/processing#2280, for both Processing 4 and new versions of the Android mode. [to be tagged as proposed enhancement]
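
To make the idea concrete, here is a minimal Java sketch of the write pattern that buffer object streaming enables. It makes no OpenGL calls: the hypothetical fakeMap helper allocates a direct buffer as a stand-in for the region that glMapBufferRange would hand back, and the class and method names are illustrative only, not part of Processing or GLGraphics.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

class StreamingSketch {
  // Hypothetical stand-in for a mapped buffer region. A real renderer would
  // obtain this from glMapBufferRange; here it is plain off-heap memory.
  static FloatBuffer fakeMap(int numFloats) {
    return ByteBuffer.allocateDirect(numFloats * Float.BYTES)
                     .order(ByteOrder.nativeOrder())
                     .asFloatBuffer();
  }

  // Shifts every xyz triple in place, writing straight into the buffer
  // instead of staging the data in a separate float[] and copying it over.
  static void offsetVertices(FloatBuffer verts, float dx, float dy, float dz) {
    for (int i = 0; i + 2 < verts.capacity(); i += 3) {
      verts.put(i, verts.get(i) + dx);
      verts.put(i + 1, verts.get(i + 1) + dy);
      verts.put(i + 2, verts.get(i + 2) + dz);
    }
  }

  public static void main(String[] args) {
    FloatBuffer verts = fakeMap(6);             // room for two vertices
    verts.put(new float[] {0, 0, 0, 1, 1, 1});  // initial coordinates
    offsetVertices(verts, 0.5f, -0.5f, 2f);
    System.out.println(verts.get(3) + " " + verts.get(4) + " " + verts.get(5));
  }
}
```

With a real mapped buffer, the same loop would update vertex data in driver-owned memory, so there would be no extra copy from a Java array into the vertex buffer.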

@processing-bot
Collaborator Author

Created by: benfry

Since you're the maintainer of that code, are you offering to implement it? Or is this a request for someone else to do so?

@processing-bot
Collaborator Author

Created by: codeanticode

I'd like to implement it. Just could not assign issues to myself in this repo :-)

@processing-bot
Collaborator Author

Created by: benfry

That's odd, it should have similar permissions as the 3.x repo, but let me know if I need to set something for you.

@processing-bot
Collaborator Author

Created by: codeanticode

Looks like I'm able only to unassign myself from the issue, but that's ok. You can add "enhancement" and "opengl" as labels.

@processing-bot
Collaborator Author

Created by: codeanticode

Explored stream object copy, actually not that hard to implement: https://github.com/codeanticode/processing4/commit/a23aaa51980e0bfc1dc2dc7e3c1c47ba3ff8648f

That could help, but what makes PShape sluggish compared with the old GLModel class is that the latter operated directly on the tessellated geometry that's rendered through GL. PShape adds a lot of overhead necessary to properly tessellate primitives, arbitrary polys, stroke lines, etc.

One way to bypass this overhead is to allow direct modification of the tessellated geometry. There was some "experimental" API in PShapeOpenGL that allowed retrieving the tessellated vertices, but I don't think it was very useful. Something more useful in terms of performance could be a beginTessUpdate/endTessUpdate block, where every setVertex/Fill/Normal call inside it operates on the tessellated geometry. I tried that out (https://github.com/codeanticode/processing4/commit/6595427e117b31e685fe3e53477410854f88c0bb) and it seems to make a difference. From something like:

for (int ci = 0; ci < group.getChildCount(); ci++) {
  PShape child = group.getChild(ci);
  float rx = random(-1, 1);
  float ry = random(-1, 1);
  float rz = random(-1, 1);
  child.translate(rx, ry, rz);
}

to

group.beginTessUpdate(TRIANGLES);
for (int ci = 0; ci < group.getChildCount(); ci++) {
  PShape child = group.getChild(ci);
  float rx = random(-1, 1);
  float ry = random(-1, 1);
  float rz = random(-1, 1);	      
  for (int vi = 0; vi < child.getVertexCount(); vi++) {
    child.getVertex(vi, v);
    v.x += rx;
    v.y += ry;
    v.z += rz;
    child.setVertex(vi, v.x, v.y, v.z);
  }
}
group.endTessUpdate();

the fps difference can be quite significant (3X) for large geometries. Perhaps it is worth removing the getTessellation() methods from PShape and adding beginTessUpdate/endTessUpdate (or better-named equivalents).

@processing-bot
Collaborator Author

Created by: PratiquesAlgorithmiques

Hello Andrés, I'd like to test this. What is v in the above examples ? Thanks !

@processing-bot
Collaborator Author

Created by: codeanticode

Oh sorry, v is just a temporary PVector used to read the current vertex coordinates. Here is the full code, as a Java application:

package test;

import processing.core.PApplet;
import processing.core.PMatrix3D;
import processing.core.PShape;
import processing.core.PVector;

public class Sketch extends PApplet {

  PShape group;

  int numBoxes = 162500;
  //int numBoxes = 500;
  float boxSize = 2;

  PMatrix3D mat = new PMatrix3D();

  int fcount, lastm;
  float frate;
  int fint = 3;

  public void settings() {
    size(600, 600, P3D);
    //fullScreen(P3D);
    noSmooth();
  }

  public void setup() {
    noStroke();
    group = createShape(GROUP);

    for (int i = 0; i < numBoxes; i++) {
      PShape s = createShape(BOX, boxSize, boxSize, boxSize);
      s.setFill(color(255, 0, 0));
      s.translate(random(-width/2, width/2), random(-height/2, height/2), random(-1000, 1000));
      group.addChild(s);
    }
  }

  public void draw() {
    background(0);
    lights();
    PVector v = new PVector();

    group.beginTessUpdate(TRIANGLES);
    for (int ci = 0; ci < group.getChildCount(); ci++) {
      PShape child = (PShape)group.getChild(ci);
      float rx = random(-1, 1);
      float ry = random(-1, 1);
      float rz = random(-1, 1);
      for (int vi = 0; vi < child.getVertexCount(); vi++) {
        child.getVertex(vi, v);
        v.x += rx;
        v.y += ry;
        v.z += rz;
        child.setVertex(vi, v.x, v.y, v.z);
      }
    }
    group.endTessUpdate();

    translate(width/2, height/2, 0);
    rotateY(frameCount * 0.001f);
    rotateX(frameCount * 0.001f);
    shape(group);

    //if (frameCount % 60 == 0) {
    //  println(frameRate);
    //}
    fcount += 1;
    int m = millis();
    if (m - lastm > 1000 * fint) {
      frate = (float)(fcount) / fint;
      fcount = 0;
      lastm = m;
      println("fps: " + frate);
    }
  }

  public static void main(String[] args) {
    PApplet.main("test.Sketch");
  }
}

You would need to build this branch in my fork of processing4 to test this.

@processing-bot
Collaborator Author

Created by: PratiquesAlgorithmiques

30fps with 65000 boxes. Things are looking up ! Thanks Andrès ! Jeff

@processing-bot
Collaborator Author

Created by: codeanticode

Nice, no hiccups?

@processing-bot
Collaborator Author

Created by: codeanticode

@benfry Implemented buffer object streaming here. Also used the opportunity to finalize the attribute API in PShape, and do some refactoring/cleanup in PGraphicsOpenGL.

As discussed in previous comments, the new buffer object streaming adds two new methods to PShape; right now they are called beginTessellationUpdate and endTessellationUpdate. All setter/getter calls inside this begin/end block work on the tessellated geometry instead of the input geometry, and the setter calls save the new values directly into the corresponding mapped vertex buffers without triggering a re-tessellation. This enables much faster geometry updates in certain cases. An example of use:

PShape group;
int numBoxes = 150000;
float boxSize = 2;

void setup() {
  //size(600, 600, P3D);
  fullScreen(P3D);
  noStroke();
  group = createShape(GROUP);

  for (int i = 0; i < numBoxes; i++) {
    PShape s = createShape(BOX, boxSize, boxSize, boxSize);
    s.setFill(color(255, 0, 0));
    s.translate(random(-width/2, width/2), random(-height/2, height/2), random(-1000, 1000));
    group.addChild(s);
  }
}

void draw() {
  background(0);
  lights();

  PVector pos = new PVector();
  group.beginTessellationUpdate();
  for (PShape child : group.getChildren()) {
    float rx = random(-1, 1);
    float ry = random(-1, 1);
    float rz = random(-1, 1);
    for (int idx = 0; idx < child.getVertexCount(); idx++) {
      child.getVertex(idx, pos);
      pos.add(rx, ry, rz);
      child.setVertex(idx, pos);
    }
  }
  group.endTessellationUpdate();

  translate(width/2, height/2, 0);
  rotateY(frameCount * 0.001f);
  rotateX(frameCount * 0.001f);
  shape(group);

  if (frameCount % 60 == 0) {
    println(frameRate);
  }
}

Any thoughts, comments?
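
The begin/end semantics described in this comment can be illustrated with a toy model in plain Java. ToyShape and all of its members are hypothetical and far simpler than PShapeOpenGL: "tessellation" is just an array copy, and a counter records how often it runs. The sketch assumes only that setters outside the block mark the shape for re-tessellation, while setters inside the block write to the tessellated data directly.

```java
class TessUpdateSketch {
  // Hypothetical toy shape, not the PShape API.
  static class ToyShape {
    final float[] input;     // geometry as the user specified it
    float[] tess;            // render-ready copy ("tessellated" data)
    boolean dirty = true;    // input changed since the last tessellation
    boolean direct = false;  // true between beginTessellation/endTessellation
    int tessellations = 0;   // how many times the expensive step ran

    ToyShape(float[] coords) {
      input = coords.clone();
    }

    // Stand-in for the expensive tessellation step.
    void tessellate() {
      tess = input.clone();
      tessellations++;
      dirty = false;
    }

    void beginTessellation() {
      if (dirty) tessellate();  // make sure there is tessellated data to edit
      direct = true;
    }

    void endTessellation() {
      direct = false;
    }

    void setCoord(int i, float value) {
      if (direct) {
        tess[i] = value;        // write straight into the render data
      } else {
        input[i] = value;       // edit the input and request a re-tessellation
        dirty = true;
      }
    }

    void draw() {
      if (dirty) tessellate();
      // a real renderer would submit tess to the GPU here
    }
  }

  public static void main(String[] args) {
    ToyShape s = new ToyShape(new float[] {0, 0, 0});
    s.draw();                // first tessellation
    s.setCoord(0, 1f);
    s.draw();                // re-tessellates because the input changed
    s.beginTessellation();
    s.setCoord(0, 2f);       // direct write, no re-tessellation
    s.endTessellation();
    s.draw();
    System.out.println("tessellations: " + s.tessellations + ", x: " + s.tess[0]);
  }
}
```

In this model the input array is left stale after a direct write; how the real implementation reconciles the two copies is outside the scope of the sketch.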

@processing-bot
Collaborator Author

Created by: benfry

Ok, let's use beginTessellation()/endTessellation()… we don't use three word API calls anywhere in the API unless it's completely unavoidable.

I'm nervous about a major change like this coming in late, but we'll see how it goes.

@processing-bot
Collaborator Author

Created by: codeanticode

I get your concern. I ended up making more changes than I originally anticipated, but I also ran a fair amount of tests, including with complex libraries such as Hemesh :-) I will go ahead, change the names of the methods, and create the PR.

@processing-bot
Collaborator Author

Created by: codeanticode

The other PR with all the recent gl changes: #238

BTW, how can I add the new example for the tessellation update mode? The default examples don't seem to be included with the main repo anymore.

@processing-bot
Collaborator Author

Created by: benfry

@codeanticode Are we clear to close this one?

@processing-bot
Collaborator Author

Created by: codeanticode

Yep, closing.

@processing-bot
Collaborator Author

Created by: github-actions[bot]

This issue has been automatically locked. To avoid confusion with reports that have already been resolved, closed issues are automatically locked 30 days after the last comment. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 17, 2024