Welcome, Guest. Please Login or Register.  • Help
SMF Underground
+ SHMUP-DEV » SHMUP DEV FORUMS » Assistance
|-+ Multithreading a shmup engine?

Pages: [1]   Go Down
0 Members and 1 Guest are viewing this topic. Topic Tools  
Read April 08, 2011, 08:46:13 AM #0
Arpeggiodragon

Multithreading a shmup engine?

Recently I started working on my game again, this time on the dual-core computer, and found that it was using only 50% cpu maxed out. I've never actually coded anything on more than one core before so I just figured compiling with \MT multi-threaded switch would ..you know, ..multithread the app for extra core cpu's. Suprise! Nope, it doesn't work like that. Tongue (I seriously did not know this.) So after reading a bit on it seems I have to explicitly create threads and give them tasks to run in parallel with each other. Obvious one would be splitting up rendering and update logic into separate threads, but then it dawned on me: There is no way that my code is going to be thread-safe for a task so large. Likely scenario; Thread A is compacting vector<something> while thread B tries to read it. ...I just can't see how this is possible? Granted, the threading concept involving anything more than a simple timer is relatively new to me. I don't know but it seems like hardware is advancing faster than it can be utilized effectively by your average software. (Just a though, I could be way off base here)

What do you guys do here? Is there a relatively easy solution to this problem?

Offline  
Read April 08, 2011, 09:04:58 AM #1
Hornet600S

Re: Multithreading a shmup engine?

First of all: I never used multi-threading in a shmup before. The reason is simple: CPUs with 2 or more cores are usually that fast with one core that I don't need more for such a not really CPU intensive task...

Besides: be happy if you don't use much CPU time, because this means less energy wasted Smiley I am currently programming an Atomic Bomberman remake and the CPU-meter shows 0-5 %, so it's more like Eco Bomberman now Wink

But if you want to use it I suggest to look for rather isolated tasks. Like for example the enemy movement (asuming they are all independant). Instead of looping over 100 objects the traditional way create up to as many worker threads as your CPU has cores, lets say 2, and run 2 loops, one goes from 0-49, the other from 50-99.

There is that great tool OpenMP which I can highly recommend. It's supported by most decent C/C++ compilers and greatly simplifies especially such things as described above. With OpenMP that's as simple as that:

Code:
#pragma omp parallel sections num_threads(2)
{
  #pragma omp section
  {
    for(unsigned int t=0;t<50;++t) {
      object[t].DoIsolatedWork();
    }
  }

  #pragma omp section
  {
    for(unsigned int t=50;t<100;++t) {
      object[t].DoIsolatedWork();
    }
  }

}

Oh, one general advice: if you find yourself cluttering your code with lockings and guards then you most likely did something conceptually wrong. Better throw everything away then and start over.
« Last Edit: April 08, 2011, 09:16:44 AM by Hornet600S »

"When the Amiga came out, everyone at Apple was scared as hell."
(Jean-Louis Gassée - Head of Macintosh development at Apple)
Offline  
Read April 17, 2011, 10:55:46 PM #2
the2bears

Re: Multithreading a shmup engine?

I've never used multiple threads in a shmup.  It's just not necessary as modern CPUs are more than up to the job. You'll only complicate matters when there's no need to.

Bill


the2bears - the indie shmup blog
Offline  
Read June 17, 2011, 02:43:10 AM #3
Tiao_Ferreira

Re: Multithreading a shmup engine?

how to apply these concepts in a game done in Flash? I've been noticing that my game "PIXEL FIGHTERS" ( www.goldensox.org/Air/aviao4.swf ) is sometimes slow, remember that the version that is in the air was made using an INTEL QUAD CORE 2.4 Gz, and today I am using a Celeron 2.5. I think we regressed, in terms of hardware ...

To what extent does this influence in a game meant to run the browser?


"O melhor da escola é a merenda!" - Alf, o E.T.imoso
Offline  
Read June 17, 2011, 07:00:12 AM #4
LorenzoGatti

Re: Multithreading a shmup engine?

how to apply these concepts in a game done in Flash? I've been noticing that my game "PIXEL FIGHTERS" ( www.goldensox.org/Air/aviao4.swf ) is sometimes slow
There are two main ways to be slow: doing unneeded work, which doesn't depend on multithreading or lack thereof, and distributing work unevenly between frames and updates, making the game "sometimes slow" even if it's fast at other times.
Regarding your game, lacking obvious expensive features I suspect generic inefficient and unpredictable algorithms as slowdown causes. Have you tried using a profiler?
Offline  
Read June 19, 2011, 01:05:18 AM #5
relsoft

Re: Multithreading a shmup engine?

how to apply these concepts in a game done in Flash? I've been noticing that my game "PIXEL FIGHTERS" ( www.goldensox.org/Air/aviao4.swf ) is sometimes slow, remember that the version that is in the air was made using an INTEL QUAD CORE 2.4 Gz, and today I am using a Celeron 2.5. I think we regressed, in terms of hardware ...

To what extent does this influence in a game meant to run the browser?

My friend who codes in Flash told me that "for each" is way faster than a generic "for".

Have you tried that?



Hello
Offline  
Read October 08, 2011, 06:47:43 AM #6
DeadlyRedCube

Re: Multithreading a shmup engine?

(I don't know what the rules are for posting on older threads on this board, so hopefully this is okay)

I actually have some threading in my current project, but I've done it in such a way that it's still deterministic.

Basically, the collision detection works in phases

Essentially, it does the collision pass first, and THEN it resolves the collision.

For instance, if a bullet hits the player, it'll still test all other bullets (maybe more than one will hit?). Similar to bullets that hit enemies.

Only after all collisions have been processed, then it runs through the list of things that collided and resolves the collisions. At this point, damage is assigned, ships are told to explode, bullets are removed, etc.

The collisions themselves, then, are now completely independent of the others. You can thread them and run them in whatever order and it doesn't matter.

Now, the reason that I'm threading is because I had to get my collision (3D meshes colliding with other 3D objects) working at a decent speed on the Xbox using XNA. The game was actually too slow until I did this. My standard model is: do it the stupid, straightforward way first. See if it's fast enough. If it is, you saved yourself a lot of time doing it the hard way. If it's not, well, now you know Smiley

i.e. always try the easiest sane way first, and only go crazy with optimization when you have to. That includes threading, which isn't the simplest business ever.

Good luck!
Offline  
Read October 10, 2011, 01:42:17 PM #7
mpersano

Re: Multithreading a shmup engine?

Hey, this is interesting!

Let me see if I understood this... you have a thread that updates your game objects, and another one for collision detection? How do these two threads communicate? Do you have a queue between them?
Offline  
Read October 10, 2011, 05:32:33 PM #8
DeadlyRedCube

Re: Multithreading a shmup engine?

Hey, this is interesting!

Let me see if I understood this... you have a thread that updates your game objects, and another one for collision detection? How do these two threads communicate? Do you have a queue between them?

A diagram may help a little:


Basically, the game thread moves all of the enemies to their new positions, then queues up collision work for the worker threads, then waits for that work to complete before continuing.  As to the number of collision threads: I have one worker thread for each core that I detect the current PC has. If the current PC is a single core, the game doesn't bother threading it at all.

Each item in the work queue is a pair of things to be tested. Note that, at least in my game, the collection of enemy bullets is one thing. The collection of player bullets is a different thing. That is, there's not a queue item for, say, every enemy bullet vs. every enemy, there's one item per enemy for "test against bullets".

So the game thread does (effectively) the following:

MoveAllEntities();

for each (pair of things that need to be tested)
  collisionQueue.Add(pair);
collisionTarget = collisionQueue.Length();

collisionQueueReadyEvent.Signal();
WaitForEvent(collisionsCompleteEvent);

ResolveAllCollisions();


And the collision threads look kinda like:

while (true)
{
  WaitForEvent(collisionQueueReadyEvent);
  Pair testPair;
  do
  {
    testPair = null;
    lock(queue)
    {
      if(queue.Length() > 0)
      {
        testPair = queue.Pop();
      }
    }

    if(testPair != null)
    {
      // Note that the test collision function will set Object.Collided value on
      // anything that has tested as colliding during this frame. This flag will
      // get used during the Resolve Collisions step later in the game thread.
      TestCollision(testPair.ObjectA, testPair.ObjectB);

      collisionTarget--; // NOTE: This should be an interlocked decrement or on the inside of a lock for thread safety
      if(collisionTarget == 0)
      {
        // We've run out of collisions to process, signal the main thread that we're done!
        collisionsComplete.Signal();
      }
    }
  }
  while (testPair != null);
}


I hope that all makes sense (and is right, I don't have my code in front of me right now, so I'm not positive that this is exactly right...but it should give you the gist).
Offline  
Read October 10, 2011, 06:36:12 PM #9
mpersano

Re: Multithreading a shmup engine?

Wow... thank you very much for your detailed explanation!

I hope that all makes sense (and is right, I don't have my code in front of me right now, so I'm not positive that this is exactly right...but it should give you the gist).

Yes, it all makes perfect sense! Thank you!

If you hadn't mentioned that multithreading helped with your game performance, I would have thought that an architecture like this wouldn't work on a real-time game (synchronization incurs in a lot of overhead). I definitely feel like giving something like this a try...

With how many threads did you try this?
Offline  
Read October 10, 2011, 08:43:33 PM #10
DeadlyRedCube

Re: Multithreading a shmup engine?

If you hadn't mentioned that multithreading helped with your game performance, I would have thought that an architecture like this wouldn't work on a real-time game (synchronization incurs in a lot of overhead). I definitely feel like giving something like this a try...

With how many threads did you try this?

Originally I set this up on the Xbox, using 3 worker threads (one for each CPU). In my PC builds, I check how many cores the system has and use that (and skip threading entirely for single-core systems).

I'll note that the only reason, really, that this is necessary for Procyon is because I'm doing full 3D mesh collisions, and in managed code on the Xbox it was, really, rather slow.

If you're not having performance issues with collision, threading it probably isn't going to do anything except slow you down. It's definitely not worth putting the effort into something that's already fast enough Smiley
Offline  
Read October 14, 2011, 12:17:10 PM #11
Hornet600S

Re: Multithreading a shmup engine?

I don't get why you don't split and parallelize the work itself:

Instead of that
Code:
for each (pair of things that need to be tested)
  collisionQueue.Add(pair);

I'd do that (here in the 2 core case):
Code:
int nr_of_pair_of_things_that_need_to_be_tested=xxx;

main thread:
for(t=0;t<nr_of_pair_of_things_that_need_to_be_tested/2;++t)
  collisionQueue[0].Add(pair[t]);

secondary thread:
for(t=nr_of_pair_of_things_that_need_to_be_tested/2;t<nr_of_pair_of_things_that_need_to_be_tested;++t)
  collisionQueue[1].Add(pair[t]);

main thread:
wait for secondary to complete (shouldn't take long, since the avg. amount of work is the same)

Note the use of a separate collisionQueue for each thread to avoid locks.
My point is:
I'd keep the logic serial and only parallelize dedicated time consuming loops over objects that are logically seperated. So no locks and only minimal waiting (which eventually can be removed if the next logic step is splitted in the same way again).

Same thing with the collision tester which you packed in one thread parallel to the main thread. I'd rather split the tester's work itself into multiple threads like above, so each thread only checks one dedicated collisionQueue[X].

Personally I find that way much more convinient and less error prone, since the overall program flow is still 100% serial.

Have you considered doing it that way?
« Last Edit: October 14, 2011, 12:18:42 PM by Hornet600S »

"When the Amiga came out, everyone at Apple was scared as hell."
(Jean-Louis Gassée - Head of Macintosh development at Apple)
Offline  
Read October 14, 2011, 09:15:52 PM #12
DeadlyRedCube

Re: Multithreading a shmup engine?

I don't get why you don't split and parallelize the work itself:

<snip>

Have you considered doing it that way?

First (and it's possible that you understand this and I'm misinterpreting something that you've said), there are actually multiple work threads. They're all identical, though, so there's only one code block. That is, there are multiple threads that, simultaneously, are doing collision work. They all just happen to be pulling from the same, central queue.

My rationale for sharing the queue was simple: some collision testing takes longer than others, and with split queues you could end up with threads finishing there work while there are still other jobs queued. Here's a hypothetical example:

Say I have three threads and (to make this nice and even), 9 collision tests to do.  Since I'm making up numbers, let's say that, this frame, collision test A is going to take 5 units of time, where all of the rest of them (B through I) will take I.

With three queues, You'd end up with something like the following (with the unit of time each takes in parentheses):

Queue 1: A(5) D(1) G(1)
Queue 2: B(1) E(1) H(1)
Queue 3: C(1) F(1) I(1)

Queue 1, total, takes 7 units of time to complete.
TOTAL TIME: 7 units


Note that, by the time that threads 2 and 3 finish their work, Queue 1 is still working on its first item, and there are two more queued after that, which means that there are two threads doing nothing that could, instead, be doing actual work.

Instead, if there's one shared queue (A B C D E F G H I)

Thread 1 grabs A and starts going
2 grabs B, 3 grabs C, 2 D, 3 E, etc.

You end up with:

Thread 1: A(5)
Thread 2: B(1) D(1) F(1) H(1)
Thread 3: C(1) E(1) G(1) I(1)

Thread 1 takes 5 units of time, the other two take 4. Overall, you've used less time.
TOTAL TIME: 5 units


I specifically know that I have collision sets that are going to take longer in the average case (there's the "test all bullets against the player" case and "test all player bullets against a specific enemy" case, which I treat as one "pair" to test, so that I don't churn the queues with a collision for every bullet. These tests I queue up FIRST, then follow up with the normal ship/ship collision tests.

Locking is such a small percentage of the thread's run-time (compared to the collision tests, which are heftier), and because a simple critical section is only really "slow" when there's contention, it seemed worth the trade-off of having a shared, locked queue to allow other worker threads to pick off the smaller work items while some worker threads get bogged down in heftier tests.

Does that make sense?
Offline  
Read October 15, 2011, 09:29:04 AM #13
Hornet600S

Re: Multithreading a shmup engine?

Does that make sense?
Yes, since your payload differs that much it makes perfect sense Smiley


"When the Amiga came out, everyone at Apple was scared as hell."
(Jean-Louis Gassée - Head of Macintosh development at Apple)
Offline  
Read December 16, 2011, 01:36:23 PM #14
klaim

Re: Multithreading a shmup engine?

Hi, I've read this thread and started another related : http://www.shmup-dev.com/forum/index.php/topic,3723.0.html

I was plannign to use such parallel algoritm for movement and collision detection too, but I also want to have concurrent tasks to exploit the system as much as I can where possible.
See the thread to view my potential problem.
Offline  
Read December 16, 2011, 05:08:02 PM #15
kdmiller3

Re: Multithreading a shmup engine?

If your shmup needs multithreading, you're doing it wrong.  Grin

(Using a shmup as a testbed to learn multithreaded programming is another matter entirely.)

If it isn't meeting its performance targets, use a profiler to see what's taking so long.  Collision detection and rendering are two of the most likely culprits so they're a good place to start.

The important thing with collision detection is to avoid testing pairs of objects that can't collide.  Testing everything against everything is quadratic in the number of objects (typically n*(n-1)/2 tests).  The easiest way to avoid that is to restrict what objects can interact (e.g. player ship only collides with obstacles, enemy ships, and enemy bullets; enemy ships only collide with player bullets).  That alone eliminates a huge number of tests.  If that's not enough, you can apply spatial partitioning, object partitioning, or sort-and-sweep algorithms to speed up spatial queries.  These approaches are useful for collision systems in general--not just shmups--so they're worth knowing.

If you're using DirectX or OpenGL, it's way too easy to submit geometry in ways that slow things down.  The important thing is that submitting geometry has a relatively high fixed cost so you want to do it as few times per frame as possible.  If you have a thousand bullet sprites, submit them in one large batch (one draw-primitive call) instead of one at a time (a thousand draw-primitive calls).  Treat bullets like a particle system.  Smiley
« Last Edit: December 16, 2011, 07:23:05 PM by kdmiller3 »
Offline  
Pages: [1]   Go Up
Jump to:  

Page created in 0.74 seconds with 17 queries.