Which would you choose? Darwin and concurrent reinforcement schedules.

In the journey to turn Darwin into a willing learner, one who actively participates in teaching sessions and engages in a mutually positively reinforcing contingency, there is still a lot of ground to cover. Some days I feel we are really getting somewhere. On others, I just want to give up and let him be a “dog” -- give him two free bowls of food per day, let him roam between the sofa, his bed and his crate, put him on a permanent long line when out in the park to avoid recall failure -- because it’s just too hard to compete with the environment. Sometimes I think being a behaviour analyst is more of a hindrance than an advantage in our relationship. As the only one with verbal behaviour, I experience constant conflict between my expectation of being a good teacher and the product of my teaching. When my learner does not behave in accordance with my expectations, it reflects poorly on my skills. The impostor syndrome monster, which I successfully keep dormant in my daily job, awakens with a vengeance.

By a mutually positively reinforcing experience, I mean an interaction in which both individuals take turns mediating each other’s reinforcers and in which 1) these reinforcers are positive and 2) when possible, the consumption of the reinforcer itself involves both individuals. I am essentially defining a social contingency, one in which the social stimuli, that is, the movements of the other individual (an action, a smile, a nod, a sound, a word), are active elements of the reinforcer or at least precede the tangible reinforcer. This is different from the teacher simply being the agent who hands over the reinforcing item. For example, the adult who gives a child the iPad after the child has engaged in some academic task is the agent through which the reinforcer is delivered, but the reinforcing experience itself does not include the adult. The same can be said about food; it is typically consumed alone.

As food becomes increasingly valuable for Darwin, I find myself thinking about further enhancing its value through my actions; put another way, about my actions entering the reinforcing experience. If Darwin could have the same food freely accessible from a bowl or delivered through interaction with me, which would he choose? At this stage in our mutual training journey, I am still not placing any specific demand, but I am beginning to deliver food in such a way as to evoke some measurable behavioural change. For example, through strategic food tossing and short time delays, I am beginning to establish my eyes as the Sd for making eye contact, correlated with the food delivery experience (bowling, chasing, flicking). Whereas before I was just looking for a head turn in my direction before he chased the treat or watched it being flicked, I am now delaying those actions to evoke eye contact as the behaviour that will produce the reinforcing action.

The term “contrafreeloading” is used in the animal world to describe the phenomenon of an animal “choosing” to earn food through work when the same food is also freely available. In behaviour analysis, “choice” is viewed as behaviour itself (response allocation) and is therefore subject to change when certain variables are manipulated (e.g., MO, task effort). Thus, both the behaviour of choosing and the extent to which choice can be influenced are studied. Whether experimental or applied, the arrangement of “work for food” versus “free food” is described as a concurrent schedule in which reinforcement is delivered either dependent on a response or independent of a response. In other words, when the participant is presented with a choice between contingent and noncontingent reinforcement, we look at where responses are mainly allocated. The preference for “yearning and earning” over the “all you can have buffet” occurs in children, too.
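For anyone who wants to put a number on where responses are mainly allocated, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the option labels ("contingent" for food earned through interaction, "free" for food taken from the box) and the session log are hypothetical, not data from Darwin.

```python
from collections import Counter


def response_allocation(choices):
    """Return the proportion of responses allocated to each option.

    `choices` is a sequence of labels, one per response, naming the
    option the learner responded on (labels are hypothetical).
    """
    counts = Counter(choices)
    total = sum(counts.values())
    if total == 0:
        # No responses recorded, so there is nothing to allocate.
        return {}
    return {option: n / total for option, n in counts.items()}


# Hypothetical session log (made-up data): each entry records where one
# piece of kibble was obtained during a session with both options available.
session = [
    "contingent", "free", "contingent", "contingent", "free",
    "contingent", "contingent", "contingent", "free", "contingent",
]

allocation = response_allocation(session)
for option, proportion in sorted(allocation.items()):
    print(f"{option}: {proportion:.0%} of responses")
```

A proportion consistently above one half for the contingent option, across sessions, would be consistent with contrafreeloading; a single session like this one would not be enough to say so.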

Video: The box on the floor contains the same kibble I am using in our interaction. He is free to eat from it anytime. We live adjacent to the park, so the window is a constant source of EO competition.

Reinforcement delivery: the how matters as much as the what and when

When we talk about learning, we talk about a measurable behavioural change, in other words some observable difference in an organism’s movement in response to certain antecedent stimuli. Learning occurs whenever a learner is awake, of course, and the environment is always teaching, in a way. I like to think about teaching as the intentional arrangement and manipulation of antecedent and consequent stimuli to evoke that measurable change and gain stimulus control over the target behaviour. Stimulus control doesn’t always involve a person giving a verbal instruction. For behaviour that we wish to become free of a verbal request (e.g., when we are building independent self-help skills in human learners), the controlling stimulus needs to be in the environment. For Darwin, it’s going to the mat in the kitchen when I am cooking, without being asked to do so. Anything can become a discriminative stimulus for a specific operant given sufficient reinforcement history: a noise, a thing seen, a location, a word heard. Teaching involves the active manipulation of the contingency so that the target behaviour is evoked by the chosen antecedent stimuli: going to the mat when I am standing in the kitchen, getting dressed in the morning upon seeing clothes laid on the chair.

When establishing novel behaviour, we have two main concerns. The first relates to procedures to establish the topography itself (what the specific movement should look like, what the learner does). The second relates to the establishment of stimulus control: bringing the emission of that target movement under the control of a specific antecedent. Stimuli are fluctuating in the environment at all times. For a specific stimulus to acquire discriminative properties, it must predict the occurrence of reinforcement. Thus, reinforcement produces two outcomes: not only does it increase the probability that the target behaviour will occur under similar conditions, but it also strengthens the relationship between the antecedent stimulus and the behaviour it evokes. In other words, it turns what is just a change in the environment (a stimulus) into a discriminative one. Of course, an MO must be present for the learning unit to commence.

HOW stimulus control is achieved is, I would argue, as important as whether it is achieved. As much as possible, I like to avoid a negative and automatic reinforcement contingency, where my learner does what he is asked because that is the quickest way to escape the interaction and return to self-reinforcing behaviours that do not require the presence of the teacher. In other words, as much as possible, I would like the experience of the teaching contingency to be reinforcing for both individuals. I would like the reinforcer to be more than the delivery of the tangible item; I would like to be part of that reinforcing event. When this happens in the human intervention world, it is magic! The learner is not content just to be given the toy and walk away to play on his own, but actively seeks your engagement because your presence enhances the reinforcing value of playing with the toy. I can go to a 3-star Michelin restaurant on my own and the food will taste great. I can also go to a 3-star Michelin restaurant with a close friend. The food will likely taste just as good, but the whole event is likely to be much more enjoyable. That’s what I would like to achieve with Darwin -- not just to be the waitress delivering the food, but to be the agent that enhances the reinforcing experience of eating the food.

Video: We are at the park and, for the first time, he is taking food and engaging in our usual pattern games instead of checking for smells and for other dogs to play with. The line is on the ground, but I am not actually holding it.

