Embodied Play in Baby Steps and Death Stranding
- Ryan Zhao

- 3 days ago
2025’s comedic blockbuster (and ball-buster) walking sim Baby Steps was announced in 2023 with a trailer that, while lacking direct head-nods to Kojima Productions’ Death Stranding (2019), seemed to quietly parody aspects of its visual iconography. Nate emerges into the game’s world by wading out of the water in a cave of grey rock covered in mossy greens, evoking the Iceland-inspired opening zone of the first Death Stranding. The trailer’s camera frames Nate in dramatic and detached angles similar to Stranding’s documentary-like wide shots. Nate’s attire, a toddler-like onesie, while more explicitly a commentary on Nate’s infantile passivity, can be read as an ironic echo of Stranding avatar Sam Porter Bridges’ single-color hiking outfit, inverting the utilitarian sleekness of Bridges’ ensemble to underline Nate’s absolute unpreparedness.
Coincidentally releasing in the same year as Death Stranding’s sequel, Baby Steps serves as both a parody and extension of Stranding’s mechanical innovations, and both games work together to advance (or perhaps argue against) an emerging philosophy in game design pertaining to virtual embodiment.

Abstraction
For the purpose of this exploration, we can consider “abstractness” in games along two axes: pertaining to the depiction of the simulated game world and pertaining to the player’s virtual presence or interaction with it. For instance, SUPERHOT has a highly-abstract visual style, reducing agents and the game world to minimalist, iconographic representations in its distinctive, stylized, and highly-readable visual presentation. Conversely, Red Dead Redemption 2 strives for greater fidelity to depictive naturalism; it does feature a significant amount of abstraction / simplification for the purpose of simulating its world on a computer, but its degree of abstraction is obviously lower. It aims to replicate the sights and sounds of our world without drawing attention to the ways in which it is mediated and simplified. I will refer to this dichotomy of abstraction as depictive abstraction, or d-abstraction.
Abstraction does not necessarily imply a greater deviation from reality. Put another way, our physical reality is not the singular benchmark against which abstraction is measured. Games like The Witcher 3: Wild Hunt and Hogwarts Legacy include magic and fantastical creatures, but great care is put into the lighting, physics, and construction of their worlds, such that we can imagine that they demonstrate with great fidelity what such worlds with magic and fantastical creatures would look like, were they real. They are fantasy worlds depicted with a low degree of abstraction.
We can separately consider the abstraction of presence and/or interaction with game worlds. For example, Beat Saber has fairly low abstraction of interaction and presence. The player is made to feel like she is standing in a large, abstract space but observing that space naturally through her own eyes, holding colorful sabers in each of her hands that respond to her hand movements naturally and intuitively. Though her body is (depictively) abstracted away, her sense of proprioception within the game world is precise and necessary (used for dodging obstacles), and her ability to act within the game world (by swinging the sabers) feels relatively un-mediated.
Conversely, The Secret of Monkey Island presents a highly-mediated sense of presence and interaction. To the degree to which the player inhabits the game world, it is through the avatar Guybrush Threepwood, a character whom the player does not directly control (clicking on objects and locations within the game world feels more like issuing suggestions that Guybrush may or may not choose to act upon) and who does not serve as a locus of projective embodiment. The player is never encouraged to conceive of herself as Guybrush.
Furthermore, avatars may be used to communicate sensory information about game worlds. For example, the movements and animations of Mario in Super Mario Bros. 3 communicate physical properties of the game world such as slipperiness of slopes, viscosity of water, inertia…, eventually forming a relationship of projective embodiment through virtual vicario-sensation. I refer to this polarity of fidelity of interaction and presence as embodied abstraction, or e-abstraction.

(There are other forms of abstraction in gameplay that we may consider, such as procedural abstraction: simplification of processes in their simulation, such as the simple cause-effect relationships that lead to population behaviors in SimCity that do not represent the full complexity of possible influences on group behaviors in the real world. But, for the purpose of this exploration, I will focus on d- and e-abstraction.)
Typically, game worlds high in d-abstraction are acted upon with abstract actions, and realistically-presented game diegeses lend themselves well to more naturalistic forms of interaction. As the Madden games push closer to photorealism, abstraction in the player’s virtual-physical presence becomes more incongruent; thus, they become more slavishly adherent to naturalistic interactions. The more heightened action of, say, NFL Blitz, is accompanied by a greater degree of d-abstraction. Furthermore, the low d-abstraction of Gran Turismo 7 leads players to expect a higher level of fidelity to the functionality and handling of actual automobiles than games like Mario Kart World and Crazy Taxi, which communicate their “arcadey” sensibilities with a greater degree of d-abstraction.
Deliberate “mismatches” between d- and e-abstraction levels can serve a number of purposes. art of rally and Absolute Drift have highly-stylized aesthetics, making use of visual minimalism, but feature vehicles that control with a high fidelity of naturalistic physics. The visual abstraction is intended to represent a disciplined focus on the specificity of car handling, like the sense of being “in the zone” when focusing on a task and letting the distracting details of the world fade away.

Embodiment Fidelity
Every time that a video game avatar makes a choice separate from player input, this represents another degree of abstraction of embodiment — not necessarily in a bad way (it would be tedious to have to remind Mario to breathe in and breathe out every few seconds), but it’s a degree of separation and mediation put between players and their virtual avatars.
Consider inverse kinematics (IK) systems of technical animation that allow avatar bodies to naturalistically interact with objects and topographies within game environments. Correctly applied, an IK system allows an avatar to place her foot atop a raised portion of flooring, bending her leg and redistributing her weight and balance in a way that looks natural.
It can be fun to play with the limits of an IK system, micro-adjusting an avatar’s position on a staircase or near a narrow hallway to observe the “decisions” the system makes to pose the character naturally. In action, IK systems typically help with embodied immersion, making avatars feel physically present within their worlds, reacting appropriately to steps, slopes, and obstacles. But IK also represents a vector of abstraction: a simplification of the act of movement and the requirement of proprioception within the game world. IK represents an advancement in depictive fidelity that typically does not reflect in interactive fidelity. Whereas stairs in Super Mario 64 were treated like any other flat or slanted surface, advancement in IK shows the avatar sensing and adjusting to a greater degree of nuance in the game world. The experience of walking up stairs in Super Mario 64 and Dark Souls is fairly consistent — the latter does not require additional consideration from the player — but the Dark Souls avatar implies “this is a different type of surface, and I will navigate it for you” in a way that Mario never implies.
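To make the "decisions" of such a system concrete, here is a minimal sketch of the textbook two-bone IK solve for a single leg in a 2D side view. It is not the implementation used by any game discussed here; the function name, the coordinate convention (avatar facing +x, y pointing up), and the limb lengths are all assumptions chosen for illustration. Given where the foot should land, the law of cosines yields how far the knee must bend, which is exactly the kind of adjustment the avatar makes "for you" on a stair.

```python
import math

def solve_leg_ik(hip, foot_target, thigh_len, shin_len):
    """Pose a two-segment leg (2D side view) so the foot reaches foot_target.

    Returns (knee_position, knee_bend), where knee_bend is in radians and
    0.0 means a fully straight leg. All names here are illustrative.
    """
    dx = foot_target[0] - hip[0]
    dy = foot_target[1] - hip[1]
    # Clamp the hip-to-foot distance to what the leg can physically span.
    max_reach = thigh_len + shin_len
    min_reach = max(abs(thigh_len - shin_len), 1e-9)
    dist = min(max(math.hypot(dx, dy), min_reach), max_reach)

    # Law of cosines: interior angle at the knee.
    cos_knee = (thigh_len**2 + shin_len**2 - dist**2) / (2 * thigh_len * shin_len)
    knee_bend = math.pi - math.acos(max(-1.0, min(1.0, cos_knee)))

    # Law of cosines again: angle between the thigh and the hip-to-foot line.
    cos_hip = (thigh_len**2 + dist**2 - shin_len**2) / (2 * thigh_len * dist)
    hip_offset = math.acos(max(-1.0, min(1.0, cos_hip)))

    # Rotate the thigh off the hip-to-foot line; for a leg hanging downward
    # this pushes the knee toward +x, the direction the avatar faces.
    thigh_angle = math.atan2(dy, dx) + hip_offset
    knee = (hip[0] + thigh_len * math.cos(thigh_angle),
            hip[1] + thigh_len * math.sin(thigh_angle))
    return knee, knee_bend

# Hypothetical avatar: hip one unit above the floor, two half-unit leg bones.
hip = (0.0, 1.0)
_, flat_bend = solve_leg_ik(hip, (0.0, 0.0), 0.5, 0.5)     # foot on flat floor
knee, step_bend = solve_leg_ik(hip, (0.2, 0.3), 0.5, 0.5)  # foot on a raised step
```

On flat ground the leg stays straight, while the raised step forces a pronounced knee bend. The player supplies none of this; the solver quietly absorbs the terrain.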

Death Stranding is a thesis about e-abstraction. Kojima and team observe the abstraction of IK and seek a way to gamify it and to bring it back into active consideration for the player. IK is more than just a system of aesthetic naturalism; it is another vector of information being fed back to the player. Kojima feels that, if the player is being given this information, she should have to act upon it. Thus, every stone and slope in the world of Death Stranding presents a challenge and a responsibility to the player. Though the game’s movement is still highly-abstracted (hold a thumbstick forward to engage a fairly-automated process of human ambulation), there is a greater degree of nuance pertaining to avatar proprioception than in most contemporary games.
In a sense, Death Stranding poses a challenge to its contemporaries in the pursuit of high-fidelity photorealism: if we are going to include a greater amount of environmental detail in the visual information conveyed to the player, it follows that we should likewise increase the interactive elements that foster the sense of projective embodiment.

A Travesty of Projective Embodiment
If that is the gauntlet that Death Stranding threw, Baby Steps exaggerates that philosophy to the point of absurdity. In the way that the grotesque caricatures of appearances in political cartoons satirize their recognizable targets, or codes of manners in farce comedies draw out the absurdities of social systems, the deliberately-obtuse physics and controls of Baby Steps serve as a grotesque of embodied virtual movement. With respect to Death Stranding’s procedural thesis, Baby Steps is its interactive travesty.
A travesty is “a type of parody or humorous distortion” in which “the action of a serious […] work is retold in comic, vulgar, and often anachronistic language” (Martha Bayless, Encyclopedia of Humor Studies, p. 774; Ritchie Robertson, Encyclopedia of Humor Studies, p. 514). Parody and travesty are both forms of burlesque that reframe or reinterpret existing works, with parody being high burlesque (elevation of a trivial or “low” subject with higher artistic form) and travesty being low burlesque (degradation of an elevated subject matter with lower artistic form or portrayal). For example, “Weird Al” Yankovic’s song “Lasagna” is parody / high burlesque because it elevates a trivial subject matter using a higher artistic form (Mexican folk song “La Bamba”, specifically as interpreted by Ritchie Valens in 1958), whereas Monty Python’s film Life of Brian is travesty / low burlesque, because it trivializes an elevated subject matter (the life of Christ) in the form of a vulgar and irreverent farce.
The subject matter that Baby Steps degrades is virtual embodiment, and it does this by drawing a grotesque of IK and embodied vicario-sensation. If Death Stranding posits that the nuance and fidelity introduced through IK represents an opportunity for embodied, proprioceptive challenge, Baby Steps extends that impulse to the degree of absurdity. Topographical anomalies do not merely introduce small challenges and moments of consideration that ground the player in the virtual world; they pose unrealistically-magnified challenges, requiring amounts of scrutiny that shoot far past that which would be realistic in the scenarios that the game depicts.

I remember a conversation I observed when “snap” auto-aiming in third-person shooters was still somewhat controversial. To players who grew up with games like Quake and Counter-Strike, in which the avatar’s gun points towards the center of the screen and the challenge of shooter gameplay involves quickly and skillfully orienting that cursor towards moving targets, snap auto-aim, as seen in games like Grand Theft Auto V and UNCHARTED 2: Among Thieves, can seem like a simplification that takes much of the skill out of aiming. When the player engages the “aim” function (typically by pulling the left trigger), the crosshair immediately snaps to its nearest target, sometimes also following it as it moves (as in Metroid Prime). Some players objected to this trivialization of aiming as a decrease in realism for the sake of ease-of-use, but others insightfully argued that it actually matches the experience of aiming better than the Quake model.
“Find a detail in the distance and point at it. You reach out your arm pointing directly at it; you don’t start in a neutral position and move your arm up towards it”. Requiring players to perform what are realistically subconscious actions is not naturalistic interaction (whether or not this change creates better gameplay). It is more realistic to automate subconscious actions.
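The automation being debated is small enough to sketch. The following is a hypothetical, minimal version of snap auto-aim in a top-down 2D world, not the logic of any game named above: when the aim button is pressed, it picks whichever target lies closest in angle to the direction the avatar is already facing, within an assumed cone. The function name, the 15-degree default, and the coordinates are all illustrative assumptions.

```python
import math

def snap_target(origin, aim_dir, targets, max_angle_deg=15.0):
    """Return the target nearest in angle to aim_dir, or None if none is in the cone."""
    best = None
    best_angle = math.radians(max_angle_deg)
    aim_len = math.hypot(aim_dir[0], aim_dir[1])
    for target in targets:
        to_x = target[0] - origin[0]
        to_y = target[1] - origin[1]
        to_len = math.hypot(to_x, to_y)
        if aim_len == 0 or to_len == 0:
            continue  # no meaningful direction to compare
        # Angle between the current aim and the direction to this target.
        cos_a = (aim_dir[0] * to_x + aim_dir[1] * to_y) / (aim_len * to_len)
        angle = math.acos(max(-1.0, min(1.0, cos_a)))
        if angle <= best_angle:
            best, best_angle = target, angle
    return best

enemies = [(10.0, 1.0), (5.0, 3.0), (-4.0, 0.0)]
locked = snap_target((0.0, 0.0), (1.0, 0.0), enemies)      # facing the first enemy
nothing = snap_target((0.0, 0.0), (0.0, 1.0), enemies)     # facing away from all of them
```

The conscious part of aiming (deciding roughly where to look) stays with the player, and the subconscious part (bringing the arm exactly onto the target) is handed to the system, which is the division the pointing example describes.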
Certainly, that’s the distinction that Kojima is prying into with Death Stranding. IK is naturalistic in the sense that I do not have to consciously consider how to shift my body weight to balance upright on a slanted surface, but I do have to recognize uneven terrain and anticipate the differences in effort and body position that will be required to pass it (if a step up or down in real life has ever caught you by surprise, you know the feeling).
Baby Steps tasks players with both the conscious and subconscious aspects of virtual proprioception. Though there is an IK system at work, Baby Steps requires players to think through weight distribution, footstep grounding, and other aspects of basic ambulation that would typically be automated. In some ways, it extends Death Stranding’s vision and presents a more naturalistic model of traversal. Despite Stranding’s increased level of movement nuance and environmental fidelity, the manner of approach for any single obstacle is still abstracted; operating Sam Bridges is like driving a vehicle over rough terrain; you push forward and see whether or not the vehicle has the ability to surpass the obstacle from the chosen trajectory of approach. Little consideration is given to the positioning of each limb. Steps gives players control of each foot and lets them position and plant each step and tentatively test the stability of each grounded position before committing to it — a familiar exercise for rock climbers and hikers who often need to wedge and contort themselves in strange ways to surpass certain terrains.
But, on the other hand, Baby Steps exaggerates the clumsiness of the avatar in ways that do not reflect reality, perhaps commenting on the alienation we feel from our avatars’ virtual bodies when stumbles and falls feel disconnected from the cues we receive from vicariosensation. Going a little too quickly down a hill in Death Stranding or completely losing one’s footing during what looks like a minor tremor in Shadow of the Colossus often leads to sighs and frustration from players; the fidelity of projective proprioception is not great enough to make the fall feel inevitable or earned. “Come on, why can’t you stay on your feet?”, I find myself asking, depersonalizing the avatar from the “I” pronoun to the second-person “you”, signaling an expulsion from my feeling of projective embodiment.
In its exaggeration of proprioceptive effort, Baby Steps locks the player into that embodied “I” perspective. Every misstep and stumble, though unrealistic, is the direct result of the player’s input. Nate does not act independently of the player’s control, even to perform realistically subconscious actions (apart from humorous cutscene interludes).

Being this “locked-in” to the projective embodied experience, the player becomes strongly attuned to vicariosensory information. Micro-adjustments in the movement of Nate’s feet and body upon setting each foot down communicate a tremendous amount about the texture and stability of the terrain on which he stands. To a greater degree than even Death Stranding, the Baby Steps player feels the slipperiness of gravel beneath Nate’s feet.
Though it’s used for the purpose of comedic exaggeration, this is a notable achievement in game design. Baby Steps communicates visual information that assists in sensitive, high-fidelity vicariosensation to a greater degree than any other game I have played. It does this by creating a particularly strong and uninterrupted projective link between player and avatar, and by making proprioception consequential.
What Lessons Can We Take from Baby Steps?
While I believe Baby Steps represents an advancement in avatar embodiment, I do not suggest that every game should embrace its deliberately clumsy and manual control scheme. Obviously, fumblecore is not the future of mainstream video gaming. But there are lessons that we can take from its example, particularly in the ways in which it extends and improves upon the ambitions of Death Stranding.
The book is not closed on conventions of avatar control. Baby Steps’ limb-based control scheme concentrates the loci of vicariosensation feedback. Though Death Stranding has a high degree of proprioceptive feedback compared to most games, it treats Bridges’ body as a singular whole. When Bridges stumbles on a slippery rock or topographical abnormality, he exhibits a full-body fall, making it difficult to sense what went wrong. This is partially because Bridges controls like a vehicle with standard 3D avatar control in which pointing in a direction initiates fairly automatic ambulation.
Since Bridges moves as a single unit, the player only has one unit with which she can test and learn about the environment. Baby Steps allows players to control each leg individually, so the player can test the stability and texture of topographical features with two separate loci of vicariosensation. Stumbling when moving the right foot communicates instability in the environment on the avatar’s right side. The analog sensitivity of trigger motion also means that Nate can test his footing slowly and cautiously before putting his full weight onto an unknown surface. This creates a higher-sensitivity, more naturalistic sense of connection between the avatar and environment.
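The idea of testing a foothold with an analog input can be illustrated with a toy model. Everything in it is an assumption made for the sake of the sketch: the `Foothold` type, the notion that each surface holds some fraction of body weight before slipping, and the mapping of an analog value in [0, 1] to committed weight. It does not describe how Baby Steps actually maps trigger travel to leg motion or how it simulates terrain.

```python
from dataclasses import dataclass

@dataclass
class Foothold:
    # Hypothetical property: fraction of body weight held before the foot slips.
    holding_capacity: float

def foot_feedback(committed_weight, foothold):
    """Return how far the foot slips for a given committed weight in [0, 1]."""
    load = max(0.0, min(1.0, committed_weight))
    return max(0.0, load - foothold.holding_capacity)

def ease_onto(foothold, steps=10, slip_tolerance=0.05):
    """Commit weight in small increments, stopping once slip feedback appears.

    Returns the largest committed weight that produced no worrying slip.
    """
    safe_weight = 0.0
    for i in range(1, steps + 1):
        weight = i / steps
        if foot_feedback(weight, foothold) > slip_tolerance:
            break  # the player feels the foot give and backs off
        safe_weight = weight
    return safe_weight

solid_rock = Foothold(holding_capacity=1.0)
loose_gravel = Foothold(holding_capacity=0.42)
rock_limit = ease_onto(solid_rock)
gravel_limit = ease_onto(loose_gravel)
```

A binary "step here" command would only reveal whether the gravel holds at full weight. The incremental probe reveals roughly how much it holds, and it does so separately for each foot.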
In addition, Steps’ per-leg movement system leads the player into mindful, rhythmic movement in a manner more evocative of real hiking than Stranding’s more automatic locomotion. Walking on uneven terrain and past the point of exhaustion increases the amount of mindful attention we pay to our gait, and Steps’ rhythmic walking (once the proper rhythm is “felt” in the hands of the player) feels analogous to the rhythmic swinging of one’s arms when hiking. It also represents a realistic modulation of the amount of attention it requires, becoming fairly automatic on flat, safe surfaces but requiring the player to be more thoughtful about the rhythm when on inclines or near large drop-offs.
Stranding simulates the modulation of mindfulness with its “grip” command, which I feel is one of its weaker mechanics. The left and right triggers can be held in to cause Sam to grip the straps of his backpack and stabilize himself, moving slower but also being significantly less likely to trip. It’s not comfortable to perpetually hold both triggers (it’s an expenditure of player effort), and it lowers the maximum speed of the journey, which encourages the player to only use it when necessary (on steep slopes, crossing rivers…). Essentially, it is a “mindfulness” button combination; a command to Sam to “please do not trip”. In that sense, it rewards thoughtful observation of changes in the environment, but it feels more like toggling on proprioceptive auto-pilot than actually engaging more mindfully with difficult terrain. It’s telling that Stranding’s solution to uneven terrain is a control pertaining to the avatar’s hands rather than his legs; a slightly odd choice from an embodiment perspective.
Greater consequence leads to greater attunement to virtual sensation. It’s well known that greater challenge and consequence necessitate greater attention to game mechanics. I am far more thoughtful about my sword swings in Dark Souls than I am in Dynasty Warriors, because Souls has greater consequences for errant, frivolous attacks. The same can be said for factors that contribute to virtual embodiment.
Greater propensity for error causes players to lean in and pay greater attention to the micro-feedback in the physicalities of avatars. Consider the instances of navigating perilously thin walkways in Banjo-Kazooie; those are the moments in which the player is most sensitive to precisely how minor controller adjustments translate to the virtual performance and what animation cues can be read as warnings of dangerous falls.
Baby Steps and Death Stranding both feature significant consequences for missteps, but Stranding externalizes many of those consequences (with the most lasting consequences being damage to or loss of parcels) while Steps keeps consequences centered around the avatar body. A fall may require the player to re-navigate a portion of the environment. Steps centralizes incentives: the only thing that matters is the journey’s progress, as opposed to Stranding’s distributed incentive and resource systems. As such, Baby Steps players spend relatively little time thinking about the game world in terms separate from and outside of the body.
Conclusion
Nate and Sam Bridges mirror each other in many ways. Both are fiercely individualistic and recoil at the friendly familiarity of others. Nate sublimates his embarrassment at his utter childlike reliance on his parents by going out of his way to not be an imposition on others. Sam actively rejects affection, bristling at the lightest of touches from his peers.
Nate is confronted with his self-image, accepting the help of a giant mother-figure late in the game, who cradles him in her arms and lifts him past an otherwise impassable obstacle. Conversely, Sam sees the unsustainability of his strict isolationism in his infant companion, a child ripped from another world, unable to survive except in the complete physical isolation of its containment vessel.
Nate’s journey concludes with the defeat of his infantile self-image; through determination and hard work, he achieves his goal and proves to himself that he is capable and valuable separate from the protection of his parents. He is shown in what appears to be a loving (or at least mutually-respectful) relationship with companion Moose, having finally accepted himself and become capable of letting someone else into his life as well.
Sam moves in the opposite direction, overly-confident in his self-reliance but coming to understand that there are aspects of life that can benefit from vulnerability and reliance on others. He breaks down his own social barriers and symbolically frees BB from his isolation tank, each subjected to the risks and benefits of existing as social and connected beings.
Gameplay-wise, this is reflected in the evolution of the relationship between the player and avatar. Nate is an unwieldy avatar, and this clumsiness only increases as environments become more challenging to traverse throughout the game, but progress is made as the player (as Nate) overcomes the initial stark alienation from the (virtual) body and grows more confident in the rhythms of its operation.
Conversely, progression through Death Stranding sees an increase in alienation from the virtual body as more and more vehicles and gadgets assist Sam on his expeditions. There are more and more layers of mediation between the virtual body and the environment. By the end of the game, journeys on foot — the use of Sam’s body at all — are rare. He (and the player) has externalized and accepted his place as a member of a wider, connected world.

