Josh Sutphin

The Raft

RED Games | Senior Engineer | April 2017 - March 2018

The Raft is a four-player co-op same-space VR shooter commissioned by Starbreeze Studios for Emarr Entertainment's PVRK in Dubai. You and three friends will board the raft and purge the swamp of an infestation of supernatural creeps.

Gameplay detail

The Raft is essentially a room-scale VR shooter where the "room" moves through an environment; the room is the raft, and the raft moves along the river through ever-changing surroundings and a variety of combat encounters, building up to a climactic boss fight in the middle of a lake.

Up to four players strap on backpack PCs and Vive headsets and enter the same physical space. Players can physically interact with each other during gameplay, e.g. giving high-fives. In the virtual world, each player is represented by a unique avatar. You don't know this going in, and you don't get to pick who you'll be, so there's this fun (and often funny) moment when you get in-game and look down to discover you've grown a big ol' beer belly, then look over at your friend and discover they've become a pissed-off biker grandma.

The raft is equipped with two mounted machine guns, a mounted grenade launcher, and several portable shotguns and automatic rifles. As with other VR games you simply hold out your hand and press the button to pick up and drop stuff, and pull the trigger to fire. The portable guns have mechanical sights which are correctly calibrated, so you can shoot with precision if you desire, but I also implemented fairly generous aim assist because most people turned out not to be physically comfortable holding a gun that way, nor even necessarily familiar with how to do it. To date, nobody has actually noticed that the game is helping them aim, which I see as a great success (more on this implementation below).

As the raft moves downriver, various creatures emerge and attack. Each enemy type has a distinct appearance and unique movement and attack patterns. Of particular technical interest are the massive swarms of bat-like aliens in the midgame; more on these below.

Players blast away at the enemies, competing for kills while defending the raft. Enemies can damage specific sections of the raft, which catch on fire. Fire extinguishers are available for putting out fires, but if fires are left to burn too long, they'll grow in intensity and eventually permanently destroy that section of the raft. Because this is intended to be a fun VR experience and not necessarily a "hardcore" game, we don't actually let the raft sink -- that would cut short an experience players paid good money for, and feels arbitrary and cruel in this context -- but it turns out players really don't like having everything in close proximity turn into a raging inferno, and they spring into action with the fire extinguishers despite the absence of any true fail state.

(We do, however, allow players to fail the final boss fight; there are two separate endings.)

The whole experience averages about 11 minutes. It's been played about a zillion times by now, and the consistent feedback has been that the experience feels longer than that (this seems true of most VR titles), that players enjoyed every minute of it, and that even for the perceived longer duration they never became tired or bored. That feels like success!

Development overview

The game was purpose-built for deployment to the PVRK in Dubai. It's built primarily for Vive and StarVR, and we also supported the Oculus Rift, which we used for early prototyping. Each client runs on an HP Omen X compact desktop, which has a backpack form factor allowing untethered/wireless play; this was crucial to executing the same-space concept (all four players share the same physical space and can physically interact with each other during gameplay).

I was the design and development lead on the project, and for most of its life I was also the sole engineer. The game is built in Unreal Engine 4 and nearly all game logic is implemented in C++. Key tech pillars for this project were the avatar posing system, which translates the HMD and hand controller positions into convincing full-body articulation complete with walk cycles and facial expressions; a transparent aim assist algorithm which helps players feel like sharpshooters while avoiding the feeling that the game is aiming for them; and a multi-layered flocking behavior for the massive enemy bat swarms that appear during the middle section of the experience.

As the project wrapped, I also concepted, shot, and edited the promo trailer you see above, and coordinated a final pass of After Effects work that was provided by a colleague.

Avatar articulation

The avatars are a high point of the experience. Despite the fact that our only inputs are the HMD position and the two Vive hand controllers, I was able to derive sufficient data to give the avatars a convincingly full (if cartoonish/whimsical) articulation. The head and hands are easy to pose, because we have that data directly, and these alone are incredibly expressive.

We used IK to pose the arms based on the transform of the hands, but Unreal's basic two-bone IK solution proved insufficient: for a whole range of poses, the elbows would invert. I ended up building a fairly sophisticated animation blueprint which builds on the two-bone IK plus a bunch of other transforms to keep the elbows natural depending on whether the hands are in front of or behind the body, out to the sides or pulled in toward the middle, and above or below the shoulders. In the final implementation, players could convincingly wave, cross their arms over their chest, and clasp their hands behind their back with natural elbow positioning and extremely minimal geometry interpenetration.

For the legs, we naturally don't have the data to do truly accurate posing, like letting the player kick or stand on one leg. We did animate a walk cycle, which I drove with the lateral movement of the head, dynamically matching the animation speed to minimize foot-sliding.

Of course, the head also moves laterally when players bend over. I solved that by adding a "spine bend" pass to the animation blueprint, driven by the pitch of the head. This arches the back forward and backward within a range, giving a convincing "bending over" pose. Through the magic of math, I offset the root position accordingly so the feet would stay planted during these bends. This solution isn't perfect, but it did a pretty good job of distinguishing between lateral head movements due to the player walking around, versus lateral head movements due to the player bending over.
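As a rough sketch of that math (engine-agnostic, with illustrative constants and names; this is not the shipped animation blueprint logic):

```cpp
#include <algorithm>
#include <cmath>

// Map head pitch to a clamped spine-bend angle, then estimate how far that
// bend pushes the head forward, so the avatar root can be offset backward
// by the same amount and the feet stay planted. Constants are assumptions.
struct SpineBendResult {
    float bendDegrees;       // how far to arch the spine
    float rootOffsetForward; // root shift that keeps the feet planted
};

SpineBendResult ComputeSpineBend(float headPitchDegrees, float spineHeight)
{
    const float kMaxBendDegrees = 60.0f; // assumed clamp range
    float bend = std::clamp(headPitchDegrees, -kMaxBendDegrees, kMaxBendDegrees);
    // Arching the spine by `bend` about the hips moves the head forward by
    // roughly spineHeight * sin(bend); pull the root back to compensate.
    float radians = bend * 3.1415926535f / 180.0f;
    return { bend, -spineHeight * std::sin(radians) };
}
```

Attributing this portion of the head's lateral movement to the bend is what lets the rest of it be treated as actual walking.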

Vertical head movement drives a dynamic kneeling pose which looks fantastically natural when players crouch or kneel to pick up weapons off the floor, or to duck behind cover.

The cherry on top was the randomized facial expression animations which trigger whenever two players make eye contact. Despite these being completely random, players invented so much contextual storytelling from them. They're a huge source of "happy accidents" throughout the entire experience. There's nothing quite like catching your friend's eye, seeing their avatar go absolutely slack-jawed with terror, then turning around to see a giant monster looming behind you!

In practice, the mapping of real-world movement to virtual animation is rough and imprecise in a lot of little ways, but we nailed the broad strokes well enough that literally nobody notices. On the contrary, we routinely have players express surprise that they were able to reach out and touch their friend and that the physical and virtual poses matched up way better than they were expecting (this is especially true for the arms). Players can technically break the simulation by doing weird things with their legs (e.g. sitting down on the ground with legs extended) but in practice people just... don't do that. There's no gameplay reason to do so, and the experience is intense enough that they stay fully engaged, on their feet and moving for the full duration.

Aim assist

I've seen a bunch of aim assist implementations during my career, but none of them felt quite right: usually they're too "sticky", and you notice the game is helping you aim. The better the implementation, the less often that happens.

For The Raft I tried out a different approach which appears to have been really successful in that it helps you a lot and yet nobody seems to really notice it's there. (This turns out to be crucial in VR shooters because most people have never shot a gun before and have no idea how to properly aim one.)

The basic idea here was to project the target's bounding sphere onto a camera-aligned plane that intersects the target center point, cast the aim ray through that plane, and derive a radial distance from the target center to the aim ray intersect point. This effectively turns a 3D problem into a 2D one, which is much easier to reason about.

The target has an aim assist radius -- which varies from target to target based on the target's size, relative importance, and a bit of subjective "feel" on my part -- and I compare the derived radial distance to that radius. If you're outside the radius, you get no aim assist. The closer you get to the center, the more I ramp up the aim assist amount (non-linearly); this prevents a "snap" of assist the moment you cross the radius "edge". The calculated assist amount is then used as the percentage of a blend between your raw aim vector, and the vector to the target center point.
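Here's a simplified, engine-agnostic sketch of that calculation (the vector type, the quadratic ramp, and the function names are illustrative; the shipped version works on Unreal's types):

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
    Vec3 operator+(const Vec3& o) const { return {x + o.x, y + o.y, z + o.z}; }
    Vec3 operator-(const Vec3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator*(float s) const { return {x * s, y * s, z * s}; }
    float Dot(const Vec3& o) const { return x * o.x + y * o.y + z * o.z; }
    float Length() const { return std::sqrt(Dot(*this)); }
    Vec3 Normalized() const { float l = Length(); return {x / l, y / l, z / l}; }
};

// Blend the raw aim direction toward the target center. Assist strength is
// zero outside the target's assist radius and ramps up non-linearly toward
// the center, so there's no "snap" when crossing the radius edge.
Vec3 AssistedAim(Vec3 muzzle, Vec3 rawAimDir, Vec3 targetCenter,
                 Vec3 cameraForward, float assistRadius)
{
    rawAimDir = rawAimDir.Normalized();
    // Intersect the aim ray with the camera-aligned plane through the
    // target center (plane normal = camera forward).
    float denom = rawAimDir.Dot(cameraForward);
    if (denom <= 1e-6f) return rawAimDir;        // ray parallel or behind
    float t = (targetCenter - muzzle).Dot(cameraForward) / denom;
    Vec3 hit = muzzle + rawAimDir * t;
    float dist = (hit - targetCenter).Length();  // radial miss distance
    if (dist >= assistRadius) return rawAimDir;  // outside: no assist at all
    float ramp = 1.0f - dist / assistRadius;     // 0 at edge, 1 at dead center
    float assist = ramp * ramp;                  // quadratic ease-in (assumed)
    Vec3 toCenter = (targetCenter - muzzle).Normalized();
    return (rawAimDir * (1.0f - assist) + toCenter * assist).Normalized();
}
```

The quadratic ease-in is just one choice of curve; any monotonic ramp that starts at zero on the radius edge preserves the no-snap property.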

That assist ramp is the key to making this method feel transparent. In other games, when using an automatic weapon you can sometimes feel when the assist kicks in because your bullet stream abruptly changes direction (even if only subtly). That doesn't happen here.

Flocking behavior

At one point during the midgame you encounter swarms of dozens of bat-like monsters in the midst of a cypress swamp. These needed to fly around in a visually believable fashion, flock naturally around each other and the towering cypress trees, and not wreck the game's performance while doing so.

The obvious solution was "boids" flocking, which forms the foundation of this behavior: each agent steers toward faraway neighbors for cohesion, away from nearby neighbors for separation, and to match the average alignment of neighbors for general direction.
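Those three rules can be sketched in a few lines of plain C++ (a minimal, engine-agnostic version; the radii and weights here are placeholders, not the shipped tuning):

```cpp
#include <cmath>
#include <vector>

struct V3 {
    float x, y, z;
    V3 operator+(const V3& o) const { return {x + o.x, y + o.y, z + o.z}; }
    V3 operator-(const V3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    V3 operator*(float s) const { return {x * s, y * s, z * s}; }
    float Length() const { return std::sqrt(x * x + y * y + z * z); }
};

struct Agent { V3 pos, vel; };

// One frame's steering for a single boid: cohesion toward the neighborhood
// centroid, separation from crowding neighbors, alignment with the average
// neighbor velocity.
V3 BoidsSteer(const Agent& self, const std::vector<Agent>& flock,
              float neighborRadius, float separationRadius)
{
    V3 centroid{0,0,0}, separation{0,0,0}, heading{0,0,0};
    int n = 0;
    for (const Agent& other : flock) {
        V3 toOther = other.pos - self.pos;
        float d = toOther.Length();
        if (d <= 0.0f || d > neighborRadius) continue; // skip self & far agents
        ++n;
        centroid = centroid + other.pos;
        heading  = heading + other.vel;
        if (d < separationRadius) // push away harder the closer they are
            separation = separation - toOther * (1.0f - d / separationRadius);
    }
    if (n == 0) return {0,0,0};
    V3 cohesion  = centroid * (1.0f / n) - self.pos; // toward group center
    V3 alignment = heading  * (1.0f / n) - self.vel; // match group velocity
    // Weights are illustrative; separation typically needs to dominate.
    return cohesion * 0.01f + separation * 1.0f + alignment * 0.05f;
}
```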

I added a layer of cylinder primitives to represent the tree trunks; these are inserted into the boids system as repulsors with a special behavior that generates a vector tangential to the cylinder based on the approach vector, which allows the flock to flow naturally around the cylinder, rather than bouncing off it.
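The tangential deflection might look something like this (an illustrative reconstruction assuming a Z-up world, not the shipped code):

```cpp
#include <cmath>

struct V3 {
    float x, y, z;
    V3 operator-(const V3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    V3 operator*(float s) const { return {x * s, y * s, z * s}; }
    float Dot(const V3& o) const { return x * o.x + y * o.y + z * o.z; }
};

// Steer an agent around a vertical cylinder (tree trunk) by returning a
// vector tangent to the cylinder, chosen to agree with the agent's current
// direction of travel, with strength ramping up near the trunk surface.
V3 FlowAroundCylinder(V3 agentPos, V3 agentVel, V3 cylinderAxisPoint,
                      float trunkRadius, float influenceRadius)
{
    // Work in the horizontal plane: ignore height.
    V3 outward = agentPos - cylinderAxisPoint;
    outward.z = 0.0f;
    float d = std::sqrt(outward.Dot(outward));
    if (d <= 0.0f || d > influenceRadius) return {0,0,0};
    outward = outward * (1.0f / d);
    // Two candidate tangents; keep the one the agent is already leaning toward.
    V3 tangentA{-outward.y,  outward.x, 0.0f};
    V3 tangentB{ outward.y, -outward.x, 0.0f};
    V3 tangent = (agentVel.Dot(tangentA) >= agentVel.Dot(tangentB)) ? tangentA
                                                                    : tangentB;
    // Full strength at the trunk surface, fading to zero at the influence edge.
    float t = 1.0f - (d - trunkRadius) / (influenceRadius - trunkRadius);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return tangent * t;
}
```

Because the tangent follows the agent's approach vector, the flock splits around a trunk and rejoins on the far side rather than reflecting off it.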

Floor and ceiling planes provide upward and downward influences (respectively) as agents approach them, to keep everyone penned in to a reasonable altitude range that feels good to shoot at.
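A minimal version of that soft floor/ceiling fence (the margin and linear ramp are assumptions):

```cpp
// Soft altitude fence: as an agent nears the floor it gets an upward push,
// and near the ceiling a downward push, each ramping linearly with
// proximity. Returns a signed vertical steering influence in [-1, 1].
float AltitudeInfluence(float z, float floorZ, float ceilingZ, float margin)
{
    if (z < floorZ + margin)
        return (floorZ + margin - z) / margin;      // push up, stronger lower
    if (z > ceilingZ - margin)
        return -(z - (ceilingZ - margin)) / margin; // push down, stronger higher
    return 0.0f; // comfortably inside the band: no influence
}
```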

The raft itself acts as the universal goal, and agents receive a strong influence toward it, except when they're nearby but facing away (e.g. immediately after completing a pass). This creates really effective near-miss fly-bys and a long, graceful turnaround arc as they come around for another pass.
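That exception boils down to a heading check against the direction to the raft; a sketch (thresholds and names are illustrative):

```cpp
#include <cmath>

struct V3 {
    float x, y, z;
    V3 operator-(const V3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    V3 operator*(float s) const { return {x * s, y * s, z * s}; }
    float Dot(const V3& o) const { return x * o.x + y * o.y + z * o.z; }
    float Length() const { return std::sqrt(Dot(*this)); }
};

// Strong influence toward the raft, suppressed when the agent is close but
// heading away (it just finished a pass), so it flies out and arcs back
// around instead of turning on a dime.
V3 RaftGoalInfluence(V3 agentPos, V3 agentVel, V3 raftPos,
                     float nearRadius, float strength)
{
    V3 toRaft = raftPos - agentPos;
    float d = toRaft.Length();
    if (d <= 0.0f) return {0,0,0};
    V3 dir = toRaft * (1.0f / d);
    bool headingAway = agentVel.Dot(dir) < 0.0f;
    if (d < nearRadius && headingAway)
        return {0,0,0}; // let it escape and come back for another pass
    return dir * strength;
}
```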

Visually, I play wing flap animations while climbing and a no-flap dive animation while falling, and apply a bit of roll to the transform while turning.

For audio, it was ridiculous to play idle loops on every agent: it sounded like a cacophony and was virtually un-mixable. Instead, we created a sound event in FMOD with a parameter that ramps up from a single agent flapping to a whole swarm flapping. I wrote code that logically assigns all the agents to four radial slices around the raft: agents to the north go in the north group, those to the east in the east group, and so on. There's also a group for all agents (in any direction) that are within a very small radius (i.e. right overhead).

Each of those groups gets an instance of that sound event, with the parameter driven by the number of agents currently within that group. The sound's position is the average of the positions of the agents in that group. If the group has no agents in it, the sound is disabled (but not deallocated). This gave us a performant audio solution that scales and spatializes well, was predictable to mix, and sounds awesome and terrifying in practice.
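The assignment logic itself is simple; here's a sketch working on top-down positions relative to the raft (the quadrant test and overhead radius are illustrative):

```cpp
#include <cmath>
#include <vector>

struct P2 { float x, y; }; // agent position relative to the raft, top-down

enum Group { North, East, South, West, Overhead, GroupCount };

// Count agents per radial slice around the raft, with a fifth group for
// anything within a small radius (right overhead). In-game, each group's
// count would drive the swarm-size parameter on its FMOD event instance,
// and the event would be positioned at the group's average agent position.
std::vector<int> CountAgentsPerGroup(const std::vector<P2>& agents,
                                     float overheadRadius)
{
    std::vector<int> counts(GroupCount, 0);
    for (const P2& a : agents) {
        if (std::sqrt(a.x * a.x + a.y * a.y) < overheadRadius) {
            ++counts[Overhead];
            continue;
        }
        // Quadrant by dominant axis: +y is north, +x is east, and so on.
        if (std::fabs(a.y) >= std::fabs(a.x))
            ++counts[a.y >= 0.0f ? North : South];
        else
            ++counts[a.x >= 0.0f ? East : West];
    }
    return counts;
}
```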

Lessons learned

After using Unity almost exclusively for years, I came into Unreal Engine 4 pretty cold. I remembered some stuff from having worked on Gem Feeder 14 years prior, but that was two engine generations ago and a lot had changed, so I had to get up to speed quickly.

UE4 is really different in that its full engine source is available, which meant wrapping my head around a whole new buildchain. For StarVR support we had to integrate an engine patch, and over the course of development I had to make a few other engine-level changes as well: mostly addressing bugs, in particular a crucial network timing issue. This was a depth and intensity of C++ programming I hadn't done in a while, and it was definitely an adjustment.

I also learned a ton about VR development, in particular what you can get away with in VR. The avatar articulation system, while clever and effective, is way less sophisticated in implementation than I'd have thought it would need to be to feel convincing. That turned out to be true for lots of things, and might be chalked up to the fact that, for now, VR as a concept still has that new-car smell with most players. I'm sure the bar will go much higher over the next few years, as adoption spreads and players become more familiar and experienced with VR.

Finally, this project really crystallized for me something that's been nagging at me for a long time: that games can be awesome without being punishing. The decision to omit a fail state for the bulk of the experience was deeply controversial throughout development, with many arguments made that without a fail state the game would have no stakes and would end up being boring.

In practice, that turned out to be wildly false... for this use case.

To be fair: this is a location-based experience that most people will play through just the once, and its target audience is vastly more casual and VR-curious types than hardcore gamers. But for this use case, taking a more forgiving design stance and just letting people enjoy the content, regardless of their skill level, was absolutely the right decision.

Created 11/12/2022 • Updated 11/29/2023