Maya 2011 and Mental Ray Satellite – always benchmark first!

UPDATE: Using an AMD 6-core machine (@3.5GHz), I ran the same renders mentioned in the writeup. The machine managed 7 frames per minute solo. When any combination of the machines below (1, 2, or 4 of them) was added to maya.rayhosts and I rendered using the network machines, performance always dropped! It seems that once your master is fast enough, you're not likely to see any performance increase by rendering simple scenes over the network.
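For anyone who hasn't set this up before: Mental Ray Satellite picks up its slave machines from a plain-text maya.rayhosts file, one machine per line, with an optional :port suffix. The addresses below are placeholders for illustration:

```
# maya.rayhosts - one satellite machine per line
# hostname or IP, optionally followed by :port
192.168.1.20
192.168.1.21:7005
```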

Note that I also tested a more complex render which you can check out in Part 2!

I did a little testing with Maya 2011. If you use Maya and have other machines on your network, you might be tempted to throw satellite on them to speed up renders.

However, you’ll find that this may either help or hinder your overall speed.

I used a test scene in Maya that rendered with mental ray. The scene was created by someone a couple of years ago in Maya 8.5, and while the animation is about 12,000 frames long, most of it isn't overly complex; output speed is measured in images per minute. Obviously, while the render is being done in 2011, the scene only uses effects/features that were available in 8.5.

Here are the results. For the "frames per minute" column, I looked at the file timestamps after each run and wrote down the number of images completed in each of the first 5 minutes or so:

| # machines and network | machine details | frames per minute | notes |
|---|---|---|---|
| 1 machine (solo) | i3 3.2GHz | 3/4/3/3/3/4 | 3 to 4 (3 heavy) – this is the standard "reference" with no networked machines |
| 2 machines (wireless N) | i3 3.2GHz | 3/3/3/3/3/3 | 3 – you'll notice that performance went DOWN over a wireless network |
| 2 machines (wired 1Gbit) | i3 3.2GHz | 4/4/5/4/4/4/5/4/4/4/5 | 4-5 (4 heavy) – same machines as above, but on a wired gigabit connection. Better results than the previous test and solo. |
| 3 machines (wired 1Gbit) | i3 3.2GHz, Core2Duo 2.26GHz | 4/4/5/5/4/5/4/5/4/5/4/5 | 4-5 (half and half) – better results still |
| 5 machines (wired 1Gbit) | i3 3.2GHz, Core2Duo 2.26GHz, 2× Pentium Dual-Core 1.6GHz (E2140) | 3/4/… | I didn't bother writing down the rest at the time. Performance dropped to the same as a solo machine by adding the pair of "budget" machines to the mix. Before anyone asks, I checked every single machine to ensure it was actually "working" (Task Manager / Activity Monitor depending on the machine). |
| 4 machines (wired 1Gbit) | i3 3.2GHz, Core2Duo 2.26GHz, Pentium Dual-Core OC (1.6GHz → 2.3GHz) | 5/5/5/5/5/4/3/5 | Aside from a hiccup towards the end, I was getting predominantly 5 frames rendered per minute. This was the best result yet. Note that it was back down to 4 machines, with the weakest (dual-core Pentium) cranked up to 2.3GHz. I would have liked to overclock the other machine as well, but the RAM couldn't handle it. |
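The frames-per-minute counts in the table came from eyeballing file timestamps. If you want to do the same without squinting at a file browser, a short script can bin the rendered frames' modification times by minute. The directory path and `.iff` extension here are assumptions; point them at your actual render output:

```python
import os
from collections import Counter

def frames_per_minute(frame_dir, ext=".iff"):
    """Count rendered frames completed in each minute, judged by file
    modification times. frame_dir/ext are hypothetical examples."""
    mtimes = [
        os.path.getmtime(os.path.join(frame_dir, name))
        for name in os.listdir(frame_dir)
        if name.endswith(ext)
    ]
    if not mtimes:
        return []
    start = min(mtimes)
    # Bucket each frame by whole minutes elapsed since the first frame.
    counts = Counter(int((t - start) // 60) for t in mtimes)
    return [counts.get(m, 0) for m in range(max(counts) + 1)]
```

Calling `frames_per_minute("C:/renders/test")` after a run gives you a list like the slash-separated columns above.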

So what can we take from this?

  • The render doesn't scale linearly with the number of machines. If a single machine does 3-4 frames per minute, you'd expect 2 machines to manage close to 6-8 frames per minute, but it's not even close: there's overhead in distributing the data over the network. You can see this while rendering. On a single machine, the CPU load sits near 100% almost all the time; as soon as you add network machines, the CPU idles at least 1/4 of the time (on both the master and the slaves). On simple scenes (10-20 seconds per frame) you're not likely to see as much benefit as on complex scenes (minutes per frame), because so much time is spent idle just distributing/allocating each frame.
  • Never use machines on a wireless network. Peak network traffic from the master was typically around 5-6 MB/second, which should have been well within wireless N's bandwidth, so wireless seems to introduce some other issue that chokes the satellite render.
  • Always test the network configuration you’re planning to use against a solo configuration. Remember that my 5-machine test was as slow as the solo configuration.
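To put a number on how far from linear the scaling is, you can compute speedup and parallel efficiency from the frames-per-minute figures. The rough averages below (solo ~3.5 fpm, two wired machines ~4.3 fpm) are read off the table above:

```python
def scaling(fpm_solo, fpm_networked, n_machines):
    """Speedup and parallel efficiency of a networked render
    relative to the solo baseline."""
    speedup = fpm_networked / fpm_solo
    efficiency = speedup / n_machines
    return speedup, efficiency

# Rough averages from the table: solo ~3.5 fpm, two wired machines ~4.3 fpm.
# Doubling the hardware bought only ~23% more throughput (~61% efficiency).
```

With perfect scaling, efficiency would stay at 1.0 as you add machines; here it falls off immediately, and the 5-machine run drops it to roughly the solo level.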

Why was the 5-machine test so slow?

All I have are theories, but the 2 I’m leaning towards are as follows:

  1. Slow machines may bottleneck the render. I don't know what method Maya uses to decide what chunks to distribute, but if it simply splits the work into X equal chunks and sends them off (where X = the number of machines or CPUs), it would stand to reason that a slow enough machine would still be working on its chunk while all the others sit waiting. Again, this is just a theory.
  2. There may come a point where the increased overhead per machine added results in diminishing returns.
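Theory #1 is easy to sanity-check with arithmetic. Assuming (and this is purely an assumption, since Maya's actual distribution method is unknown) that the work is split into equal chunks up front, the slowest machine's chunk sets the wall-clock time:

```python
def static_split_time(total_frames, speeds_fpm):
    """Wall-clock minutes if frames are split into equal chunks up
    front: the render finishes when the slowest machine does.
    (Hypothetical model of theory #1, not Maya's known behaviour.)"""
    chunk = total_frames / len(speeds_fpm)
    return max(chunk / speed for speed in speeds_fpm)

def solo_time(total_frames, speed_fpm):
    """Wall-clock minutes for the master rendering everything itself."""
    return total_frames / speed_fpm
```

With a 3.5 fpm master and a 1.5 fpm budget machine splitting 100 frames evenly, the slow machine grinds away for 50/1.5 ≈ 33 minutes while the master finishes its half in ~14 and idles; the master alone would have been done in ~29. Under this model, adding a slow enough machine really does make the render slower than solo.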

A couple final tidbits:

Note that during a network render, the CPU load on each machine alternates between idle and full load (after a machine renders its piece of the scene/frame, it sits and waits until it gets the next one). This can help keep the heat down (and thus possibly keep stability up) on all the machines, and keeps them usable if they're needed for other tasks. However, time not spent at 100% load is often time not well spent.

If a machine drops out (goes to sleep, network connection dies, etc.), the network batch render just ends. If you have an unstable machine, don't use it unless you're able to monitor the render process in case it conks out.

In any case, the important thing to take from this is to benchmark before setting off a network render. You don’t have to spend hours doing it (nor should you), but at the very least render a few frames on just the master, and then try your networked configuration. If there’s an improvement, great – go ahead and do the whole thing. If you see a decrease, do the entire thing on the master. Don’t go crazy (like me) and start benchmarking various configurations – you’ll generally spend more time testing than you’ll save during the render process.